Maintenance
This section covers potential maintenance activities for the flexFS metadata service.
Overview
When run as root, and unless another path is explicitly set with the --dbFolder flag of the start command, all metadata state is persisted in the default folder /root/.flexfs/meta/data. Within that data folder, there is a metadata journal file named <volume-uuid>.journal for each volume that has had mutating activity. This journal file is an append-only log of all mutations to the volume state from some initial state at time t0. Initially, the t0 state is effectively null, indicating the absolute beginning of the volume state.
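For illustration, the data folder for a single volume backed by an on-disk sqlite store might contain the following (the volume UUID is hypothetical, and a .t0 file only appears after journal pruning, described below):
/root/.flexfs/meta/data/2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite
/root/.flexfs/meta/data/2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal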
Journal pruning
Over time, it may be desirable to archive or remove journal state by materializing a new .t0 state file. To prune metadata journal state, first stop the metadata service (this will flush all current metadata state to disk).
While the metadata service is offline, all mount clients using the metadata store will freeze and wait for it to come back online, then consistently resume any in-progress operations. FlexFS is designed to support metadata service maintenance while mounts are actively in use.
Once the metadata service is stopped:
- Archive or remove any existing .t0 files.
- Archive or remove any existing .journal files.
- Copy any .sqlite or .snapshot file to a file with the same name, but with an added .t0 extension.
- Start the metadata service.
There will intentionally be no .journal files present in the data folder when the service is started. They will reappear automatically as new mutations to the volume occur.
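A minimal shell sketch of the pruning steps above for a single volume with an on-disk sqlite store (the UUID, archive path, and use of cp/mv are illustrative; stop and start the metadata service using whatever mechanism your deployment provides):
cd /root/.flexfs/meta/data
mkdir -p /root/flexfs-archive
mv *.t0 /root/flexfs-archive/ 2>/dev/null        # archive any existing .t0 files
mv *.journal /root/flexfs-archive/ 2>/dev/null   # archive any existing .journal files
# materialize a new t0 state from the current state file
cp 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite.t0
# start the metadata service; new .journal files will reappear as mutations occur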
Journal replay
The metadata service can replay journal files in their entirety, or up to a specified point in time. This feature can be used to restore state (for example, after an improper shutdown of an in-memory service, or accidental deletion or corruption of on-disk sqlite files), or for data recovery. Journal replay can also be used as a debugging tool, since it can output a human-readable representation of all mutations to the file system captured in the journal (by specifying the --format txt flag). As with journal pruning, the metadata service should be stopped before performing these actions.
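For example, to dump a human-readable representation of a volume's journal for debugging (the exact output destination of the txt format is not specified here; consult the command's help output):
sudo meta.flexfs replay --format txt 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal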
The flexFS metadata service automatically detects any missing volume state on startup and performs a full restore of any lost state. However, replay can also be run manually, which is typically only needed for data recovery or store type change scenarios.
When a .t0 state file exists, replay can only safely restore state in the same format as the .t0 state file. Be careful which --format value is provided when using replay to restore state.
Example: full restore
We have a .journal file, and possibly a .t0 state file, but no current .snapshot state file on disk.
sudo meta.flexfs replay --format snapshot 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal
This operation will generate an output file called 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot, which is the volume's current state after replaying all mutating operations in 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal relative to any existing 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot.t0.
Example: data recovery
This capability is limited by the retention parameter, which is configured on a per-volume basis. Even though it may be possible to revert the metadata state to an earlier point in time, the data blocks for files that have been modified or deleted need to be retained in the block store in order to recover their state. The default retention value is set to one week, which enables data recovery for up to one week in the past. The value for this parameter can be adjusted as desired.
We want to revert the volume to an earlier point in time (in read-only mode) to recover deleted or corrupted data.
- Back up all state files in the data folder for the target volume.
- Remove the current .snapshot or .sqlite state file for the target volume.
- Use the replay command, along with the --toTime flag, to specify the unix timestamp to replay until.
- Switch the volume to read-only mode in the administrative service (possibly contact Paradigm4), or ensure all mounts will be mounted read-only, to avoid any potential volume corruption.
- Start the metadata service.
- Recover the desired file(s) to another file system.
- Stop the metadata service.
- Restore the previously backed up state files for the target volume into the data folder.
- Switch the volume to read-write mode in the administrative service (possibly contact Paradigm4).
- Start the metadata service.
- Copy the recovered file(s) back into flexFS.
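A hedged shell sketch of the recovery steps above for a snapshot-store volume (the UUID, backup path, and target timestamp are illustrative; the administrative read-only/read-write switches and the service stop/start steps are noted only as comments, and --toTime is assumed to accept a unix timestamp in seconds):
cd /root/.flexfs/meta/data
mkdir -p /root/flexfs-backup
cp 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.* /root/flexfs-backup/   # back up all state files for the volume
rm 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot                 # remove the current state file
TO_TIME=$(date -u -d '2024-01-15 12:00' +%s)                     # point in time to replay until (GNU date)
sudo meta.flexfs replay --format snapshot --toTime "$TO_TIME" 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal
# switch the volume to read-only, start the service, and recover the desired
# file(s) to another file system, then stop the service again
cp /root/flexfs-backup/2d8ff191-f286-44b0-9db1-4c850fdcb3f6.* .  # restore the backed up state files
# switch the volume back to read-write, start the service, and copy the
# recovered file(s) back into flexFS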
Example: store type change
We want to switch from one store type to another (i.e. from an on-disk sqlite store to an in-memory snapshot store or vice versa):
- Ensure all volumes have a current .snapshot or .sqlite state file on disk. If necessary, a full restore can be performed.
- Archive or remove any existing .t0 files.
- Archive or remove any existing .journal files.
- Perform a journal extraction on each volume's .snapshot or .sqlite state file.
- Archive or remove any existing .snapshot or .sqlite files.
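A sketch of these steps for a single volume moving from an on-disk sqlite store to an in-memory snapshot store (the UUID and archive path are illustrative; reconfiguring the service's store type is deployment-specific and not shown):
cd /root/.flexfs/meta/data
# precondition: the volume has a current .sqlite state file on disk
mkdir -p /root/flexfs-archive
mv *.t0 *.journal /root/flexfs-archive/ 2>/dev/null                    # archive old .t0 and .journal files
sudo meta.flexfs extract 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite   # extract a synthetic journal from current state
mv 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.sqlite /root/flexfs-archive/   # archive the old-format state file
# reconfigure the store type and start the metadata service; missing volume state
# is rebuilt automatically from the extracted journal on startup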
Journal extraction
The metadata service can extract a synthetic (non-historic) journal file from a volume's current state. To do so, there must be a current .snapshot or .sqlite state file on disk from which to extract the journal. This feature can be used as a log compaction mechanism, or as part of the process of changing the store type. As with journal pruning, the metadata service should be stopped before performing these actions.
Extracted journals lack the historic context needed to perform data recovery operations. Do not perform this operation without archiving the source state if a near-term data recovery operation is anticipated.
To perform a journal extraction (e.g. of a .snapshot state file):
sudo meta.flexfs extract 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot
This operation will generate an output file called 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal from the 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot state file. If a file by that name already exists, and you want to replace it, use the --clobber flag. Any future mutations to the 2d8ff191-f286-44b0-9db1-4c850fdcb3f6 volume will be appended to the 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.journal file.
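For example, if the journal already exists and should be regenerated, the same command can be run with the --clobber flag (the flag placement relative to the positional argument is an assumption):
sudo meta.flexfs extract --clobber 2d8ff191-f286-44b0-9db1-4c850fdcb3f6.snapshot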