Maintenance
This section covers potential maintenance activities for the flexFS metadata service.
Overview
When run as root, and unless another path is explicitly set via the `--dbFolder` flag of the `start` command, all metadata state is persisted by default in the `/root/.flexfs/meta/data` folder. Within that data folder, there is a metadata journal file named `<volume-uuid>.journal` for each volume that has had mutating activity. This journal file is an append-only log of all mutations to the volume state from some initial state at time t0. Initially, the t0 state is effectively null, indicating the absolute beginning of the volume state.
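For instance, a data folder serving one snapshot-store volume that has been pruned once might contain the following (file names are illustrative, matching the example volume used below):

$ ls /root/.flexfs/meta/data
2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal
2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot
2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot.t0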
Journal pruning
Over time, it may be desirable to archive or remove journal state by materializing a new `.t0` state file. To prune metadata journal state, first stop the metadata service (this will flush all current metadata state to disk).
While the metadata service is offline, all mount clients using the metadata store will freeze and wait for the metadata service to come back online before consistently resuming any in-process operations. FlexFS is designed to support metadata service maintenance while mounts are actively being used.
Once the metadata service is stopped:
- Archive or remove any existing `.t0` files.
- Archive or remove any existing `.journal` files.
- Copy any `.sqlite` or `.snapshot` file to a file with the same name, but with an added `.t0` extension.
- Start the metadata service.

There will intentionally be no `.journal` files present in the data folder when we start the server. They will appear automatically as new mutations on the volume occur.
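As a sketch, the pruning steps above might look like the following for the example volume, assuming the default data folder and a hypothetical archive location; how you stop and start the metadata service depends on your deployment:

# with the metadata service stopped:
cd /root/.flexfs/meta/data
mkdir -p /root/.flexfs/meta/archive               # hypothetical archive location
mv *.t0 *.journal /root/.flexfs/meta/archive/     # archive old journal state
cp 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot.t0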
Journal replay
The metadata service can replay journal files entirely, or to a specified point in time. This feature can be used to restore state (for example, after an improper shutdown of an in-memory service, or after accidental deletion or corruption of on-disk sqlite files) and for data recovery. Journal replay can also be used as a debugging tool, as it can output a human-readable representation of all mutations to the file system captured in the journal (by specifying the `--format txt` flag). As with journal pruning, the metadata service should be stopped before performing these actions.
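For example, the following would render the example volume's journal in human-readable form (the command shape mirrors the restore example below; where the textual output is written is not covered here):

sudo meta.flexfs replay --format txt 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal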
The flexFS metadata service detects any missing volume state on startup and automatically performs a full restore of the lost state. However, the `replay` command can also be run manually, which is typically only needed for data recovery or store type change scenarios.
When a `.t0` state file exists, `replay` can only safely restore state in the same format as the `.t0` state file. Be careful which `--format` value is provided when using `replay` to restore state: for example, if the existing file is `<volume-uuid>.snapshot.t0`, restore with `--format snapshot`.
Example: full restore
We have a `.journal` file, and possibly a `.t0` state file, but no current `.snapshot` state file on disk.
sudo meta.flexfs replay --format snapshot 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal
This operation will generate an output file called `2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot`, which is the volume's current state after replaying all mutating operations in `2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal` relative to any existing `2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot.t0`.
Example: data recovery
This capability is limited by the `retention` parameter, which is configured on a per-volume basis. Even though it may be possible to revert the metadata state to an earlier point in time, the data blocks for files that have been modified or deleted need to be retained in the block store in order to recover their state. The default `retention` value is set to one week, which enables data recovery for up to one week in the past. The value for this parameter can be adjusted as desired.
We want to revert the volume to an earlier point in time (in read-only mode) to recover deleted or corrupted data.
- Back up all state files in the data folder for the target volume.
- Remove the current `.snapshot` or `.sqlite` state file for the target volume.
- Use the `replay` command, along with the `--toTime` flag, to specify the unix timestamp to replay until (see the sketch after this list).
- Switch the volume to read-only mode in the administrative service (possibly contact Paradigm4), or ensure all mounts will be mounted read-only, to avoid any potential volume corruption.
- Start the metadata service.
- Recover the desired file(s) to another file system.
- Stop the metadata service.
- Restore the previously backed up state files for the target volume into the data folder.
- Switch the volume to read-write mode in the administrative service (possibly contact Paradigm4).
- Start the metadata service.
- Copy the recovered file(s) back into flexFS.
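As a sketch, the point-in-time replay in the steps above might look like the following for the example volume, assuming a snapshot store; the timestamp is a hypothetical value, and it must fall within the volume's `retention` window for the underlying data blocks to still be recoverable:

sudo meta.flexfs replay --format snapshot --toTime 1712345678 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal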
Example: store type change
We want to switch from one store type to another (i.e. from an on-disk sqlite store to an in-memory snapshot store or vice versa):
- Ensure all volumes have a current `.snapshot` or `.sqlite` state file on disk. If necessary, a full restore can be performed.
- Archive or remove any existing `.t0` files.
- Archive or remove any existing `.journal` files.
- Perform a journal extraction on each volume's `.snapshot` or `.sqlite` state file (a sketch follows this list).
- Archive or remove any existing `.snapshot` or `.sqlite` files.
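As a sketch, a sqlite-to-snapshot change for the example volume might look like the following, assuming the default data folder and a hypothetical archive location; the same shape applies in the other direction:

# with the metadata service stopped:
cd /root/.flexfs/meta/data
mkdir -p /root/.flexfs/meta/archive               # hypothetical archive location
mv *.t0 *.journal /root/.flexfs/meta/archive/     # archive old journal state
sudo meta.flexfs extract 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite
mv 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite /root/.flexfs/meta/archive/

When the service is next started with the new store type configured, it will detect the missing state and restore it from the extracted journal, per the automatic detection described under Journal replay.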
Journal extraction
The metadata service can `extract` a synthetic (non-historic) journal file from a volume's current state. To do so, there must be a current `.snapshot` or `.sqlite` state file on disk from which to extract the journal. This feature can be used as a log compaction mechanism, or as part of the process of changing the store type. As with journal pruning, the metadata service should be stopped before performing these actions.
Extracted journals lack the historic context needed to perform data recovery operations. Do not perform this operation without archiving the source state if a near-term data recovery operation is anticipated.
To perform a journal extraction (e.g. of a `.snapshot` state file):
sudo meta.flexfs extract 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot
This operation will generate an output file called `2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal` from the `2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot` state file. If a file by that name already exists, and you want to replace it, use the `--clobber` flag. Any future mutations to the `2d8ff191-f286-44b0-9db1-4c850fdcb3f6` volume will be appended to the `2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal` file.
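For example, to regenerate the journal and replace the existing one (flag placement here is an assumption mirroring the other examples):

sudo meta.flexfs extract --clobber 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot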