High Availability
This guide covers high availability considerations for each flexFS component. flexFS is designed so that most component failures are either tolerated gracefully or recoverable with minimal downtime.
Admin Server
The admin server (`admin.flexfs` or `free.flexfs`) is queried at mount time for volume settings and periodically for auto-updates. It is not on the data path during normal filesystem operations.
Impact of failure: Active mounts continue operating. New mounts and credential initialization will fail. Auto-updates pause until the server returns.
Recommendations:
- Run on a reliable host with systemd `Restart=always` configured.
- Back up the admin database directory regularly.
- The admin server is a single instance per deployment. For HA, use VM-level redundancy (e.g., auto-restart, warm standby).
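A systemd unit along the lines of the following keeps the admin server restarting automatically. This is a sketch: the unit name, binary path, and restart interval are illustrative assumptions, not values taken from flexFS documentation.

```ini
# /etc/systemd/system/flexfs-admin.service (sketch; paths are assumptions)
[Unit]
Description=flexFS admin server
After=network-online.target
Wants=network-online.target

[Service]
# Illustrative binary path; adjust to your installation.
ExecStart=/usr/local/bin/admin.flexfs
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now flexfs-admin` so the service survives host reboots as well as process crashes.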
Metadata Server
The metadata server (`meta.flexfs`) is on the critical path for all filesystem operations. Every file open, directory listing, and attribute lookup goes through it.
Impact of failure: Active mounts become unresponsive. Mount clients will retry and reconnect automatically when the server returns. No data is lost — all metadata is persisted to disk.
Recommendations:
- Run on high-reliability infrastructure with local SSD storage.
- Configure systemd with `Restart=always`.
- Enable `--sync` for crash durability at the cost of write performance.
- Back up the database folder regularly. The metadata database supports online backup.
- Consider separate metadata servers for separate volume groups to limit blast radius.
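A scheduled archive of the backup output is one way to automate the regular backups recommended above. This crontab sketch assumes the online backup has already written a consistent copy to an output directory; the paths shown are illustrative assumptions.

```
# Crontab sketch: nightly archive of the metadata online-backup output.
# /var/lib/flexfs/meta-backup and /backup are assumed paths.
0 2 * * * tar -czf /backup/flexfs-meta-$(date +\%F).tar.gz -C /var/lib/flexfs meta-backup
```

Archiving the online-backup output (rather than the live database folder) avoids capturing a half-written state.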
Proxy Servers
Proxy servers (`proxy.flexfs`) are on the read path but are not required for correctness.
Impact of failure: Mount clients automatically fall back to direct object storage access. Reads continue without interruption, though with higher latency for uncached blocks.
Recommendations:
- Deploy multiple proxy servers per proxy group for redundancy.
- Rendezvous hashing redistributes blocks when a proxy leaves the group.
- Dynamic membership allows adding replacement proxies without restarting mounts.
- Monitor cache hit rates and disk usage via the proxy’s `/metrics` endpoint.
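The redistribution property mentioned above is what makes proxy failure cheap: with rendezvous (highest-random-weight) hashing, removing one proxy only moves the blocks that proxy owned, while every other block keeps its owner. The sketch below illustrates the idea; the hash function and naming are assumptions, not the actual flexFS implementation.

```python
import hashlib

def rendezvous_pick(block_id: str, proxies: list[str]) -> str:
    """Pick the proxy with the highest hash weight for this block.

    Illustrative highest-random-weight hashing; the real flexFS
    hash function and proxy addressing scheme are not shown here.
    """
    def weight(proxy: str) -> int:
        digest = hashlib.sha256(f"{proxy}:{block_id}".encode()).hexdigest()
        return int(digest, 16)
    return max(proxies, key=weight)

proxies = ["proxy-a:9000", "proxy-b:9000", "proxy-c:9000"]
owner = rendezvous_pick("block-1234", proxies)
# Key property: removing any proxy other than `owner` from the group
# cannot change which proxy this block maps to.
```

This is why a proxy leaving the group only invalidates the cache entries it held, rather than reshuffling the whole keyspace.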
Mount Client
The mount client (`mount.flexfs`) runs as a daemon on each compute host.
Impact of failure: The mount point becomes stale (“Transport endpoint is not connected”). The `mount.flexfs start` command detects stale mounts and cleans them up automatically.
Recommendations:
- Use fstab entries with `_netdev` and `nofail` for automatic remounting.
- The auto-update mechanism performs seamless FUSE session handoff, so updates do not cause mount interruptions.
- For containerized workloads, the CSI driver manages mount lifecycle automatically.
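An fstab entry using the options above might look like the following sketch. The volume name, mount point, and filesystem type string are illustrative assumptions; check your deployment for the exact form.

```
# /etc/fstab (sketch; volume name, mount point, and fs type are assumptions)
myvolume  /mnt/flexfs  fuse.flexfs  _netdev,nofail  0  0
```

`_netdev` delays the mount until networking is up, and `nofail` lets the host finish booting even if the mount cannot be established.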
Object Storage
Object storage (S3, GCS, Azure Blob, OCI) provides the durability layer for block data.
Impact of failure: Reads and writes to blocks will fail. This is catastrophic but exceedingly rare — cloud object storage services offer 99.99%+ availability SLAs.
Recommendations:
- Use the default storage class for your cloud provider.
- Enable versioning on the bucket as a defense-in-depth measure.
- Enable S3 server-side encryption (`--sse`) for at-rest protection.
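On AWS, the versioning and encryption recommendations above can be applied with the standard AWS CLI; the bucket name below is a placeholder.

```shell
# Enable versioning as a defense-in-depth measure.
aws s3api put-bucket-versioning \
  --bucket my-flexfs-bucket \
  --versioning-configuration Status=Enabled

# Enable default server-side encryption on the bucket.
aws s3api put-bucket-encryption \
  --bucket my-flexfs-bucket \
  --server-side-encryption-configuration \
  '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
```

GCS and Azure Blob encrypt at rest by default; versioning (or soft delete) has equivalent per-provider switches.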
Active-Passive Setup
For deployments requiring minimal downtime on the metadata server or admin server, an active-passive configuration can be used. In this setup, a standby instance is ready to take over if the primary fails. This approach applies to both `meta.flexfs` and `admin.flexfs`.
Shared storage approach
Both the active and passive nodes mount a shared block device (e.g., an EBS volume, Azure Managed Disk, or a SAN LUN) containing the service’s database folder. Only one node runs the service at a time. On failover, the shared storage is detached from the failed node, attached to the standby, and the service is started.
Replicated storage approach
Use a block-level replication solution (e.g., DRBD, or cloud-native disk replication) to mirror the database folder from the active node to the standby. On failover, the standby promotes its replica and starts the service.
Failover mechanics
In both approaches:
- Stop the service on the failed node (or confirm it is down).
- Ensure the standby node has access to the current database folder.
- Start the service on the standby node with the same `--bindAddr` and credentials.
- If the standby has a different IP address, update the relevant address record:
  - For `meta.flexfs`: `configure.flexfs update meta-store <id> --address <new-address>`
  - For `admin.flexfs`: update the `adminAddr` in the credentials files of dependent services (`meta.flexfs`, `configure.flexfs`, and mount clients)
- Active mount clients will reconnect automatically once the services are reachable.
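For `meta.flexfs`, the steps above can be sketched as a failover sequence. The commands for attaching storage are deployment-specific and shown only as comments; the `configure.flexfs` invocation is taken from the steps above, with placeholders left as-is.

```shell
# Failover sketch for meta.flexfs (active-passive).
# 1. Fence the failed primary first (see "Fencing and STONITH" below),
#    then confirm the service is down on it.
# 2. Attach the shared or replicated database volume to the standby
#    (cloud/SAN-specific; e.g., detach + attach of the block device).
# 3. Start the service on the standby with the same --bindAddr and credentials.
manage.flexfs start meta
# 4. If the standby's IP address differs, update the meta-store address:
configure.flexfs update meta-store <id> --address <new-address>
# 5. Active mounts reconnect automatically once the server is reachable.
```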
Fencing and STONITH
In an active-passive cluster, it is critical to ensure that the failed node is truly stopped before the standby takes over. Without proper fencing, a “split-brain” scenario can occur where both nodes access the database simultaneously, leading to corruption.
Use a fencing mechanism to guarantee that the failed node is powered off or isolated before failover:
- STONITH (Shoot The Other Node In The Head): Cluster managers like Pacemaker/Corosync support STONITH agents that forcibly power off or reset the failed node via IPMI, cloud provider APIs (e.g., AWS EC2 `stop-instances`, Azure `vm deallocate`), or PDU power control.
- Storage fencing: With shared block devices, use SCSI persistent reservations or cloud-level disk detach operations to ensure the failed node cannot write to the shared storage after failover.
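Using cloud APIs, fencing the failed node before failover can look like the following sketch; the instance ID, resource group, and VM name are placeholders.

```shell
# AWS: force the failed node off before the standby takes over.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0 --force

# Azure: deallocate the failed VM.
az vm deallocate --resource-group my-rg --name meta-primary
```

Only proceed with failover after the API confirms the instance is stopped or deallocated; an in-flight stop is not yet a fence.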
Recovery Procedures
Metadata Server Recovery
- Stop the metadata service: `manage.flexfs stop meta`
- Restore the database folder from backup.
- Start the metadata service: `manage.flexfs start meta`
- Active mounts will reconnect automatically.
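As a copyable sequence, the recovery steps above look like this; the archive name and database folder location are illustrative assumptions.

```shell
# Metadata server recovery sketch.
manage.flexfs stop meta
# Restore the database folder from the most recent backup
# (assumed archive path and data directory).
tar -xzf /backup/flexfs-meta-latest.tar.gz -C /var/lib/flexfs
manage.flexfs start meta
# Active mounts reconnect automatically.
```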
Full Cluster Recovery
- Start the admin server first: `manage.flexfs start admin`
- Start metadata servers: `manage.flexfs start meta`
- Start proxy servers: `manage.flexfs start proxy`
- Remount on clients: `mount.flexfs start <name> <mount-point>` or use `update.flexfs --mount`
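The startup order above matters because each tier depends on the one before it: mounts query the admin server, and reads may route through proxies once metadata is available. As a single sequence (placeholders left as-is):

```shell
# Full cluster startup order, per the steps above.
manage.flexfs start admin
manage.flexfs start meta
manage.flexfs start proxy
# On each client host:
mount.flexfs start <name> <mount-point>
```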