# Scaling to 1000+ Mounts
This guide covers tuning recommendations for deployments with hundreds or thousands of concurrent mount clients accessing a single metadata server.
## Metadata Server Sizing

The metadata server is the primary scaling bottleneck. Each mount client maintains a persistent RPC session.
### Hardware Recommendations

| Mount Count | CPU Cores | RAM | Storage |
|---|---|---|---|
| 100-500 | 4-8 | 16-32 GiB | SSD |
| 500-1000 | 8-16 | 32-64 GiB | NVMe SSD |
| 1000+ | 16+ | 64+ GiB | NVMe SSD |
### Memory Tuning

The metadata server's database memory cache is controlled by the internal `--dbMemCapacity` flag (default: 40% of system RAM). The default is generally appropriate even for high session counts; monitor the metadata server's Prometheus metrics to confirm cache hit rates stay high.
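One way to check the cache hit rate is to compute it from the hit/miss counters in the Prometheus metrics output. A minimal sketch, using hypothetical metric names (`flexfs_meta_db_cache_hits_total` / `flexfs_meta_db_cache_misses_total` — substitute whatever counters your metadata server actually exposes) and a sample scrape saved to a file:

```sh
# Sample scrape; metric names here are assumptions, check your /metrics output.
cat > /tmp/metrics.txt <<'EOF'
flexfs_meta_db_cache_hits_total 980000
flexfs_meta_db_cache_misses_total 20000
EOF

# Compute hits / (hits + misses) as a percentage.
awk '/cache_hits_total/   {h=$2}
     /cache_misses_total/ {m=$2}
     END {printf "hit rate: %.1f%%\n", 100*h/(h+m)}' /tmp/metrics.txt
```

In production you would pipe a live scrape (e.g. `curl -s` against the server's metrics endpoint) into the same `awk` filter instead of a sample file.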
## Proxy Group Sizing

For large deployments, proxy groups reduce the load on object storage and improve read performance.
### Sizing Guidelines

- 2-4 proxy servers per group for up to 500 mount clients.
- 4-8 proxy servers per group for 500-2000 mount clients.
- Rendezvous hashing distributes blocks evenly across group members.
- Each proxy server should have fast local SSD storage sized to the active working set.
## Mount Client Tuning

### FUSE Tuning

For high-throughput workloads with many concurrent readers:
- Ensure the default `max_pages` setting is used (do not set `--noMaxPages`).
- The default `--attrValid` (3600 seconds) and `--entryValid` (1 second) values are appropriate for most workloads. Increase `--entryValid` for read-heavy workloads where the directory structure rarely changes.
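As an illustration, the flags above might be combined like this on a read-heavy node. The `60` second `--entryValid` value is an arbitrary example chosen for this sketch, not a recommendation from this guide:

```sh
# Raise --entryValid from its 1 s default on a read-heavy node whose
# directory tree rarely changes; 60 s is an illustrative value only.
# --attrValid is left at its 3600 s default (written out for clarity).
mount.flexfs start my-volume /mnt/flexfs \
  --attrValid 3600 \
  --entryValid 60
```

The trade-off: a larger `--entryValid` means renames and deletes made by other clients can go unnoticed for up to that long.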
### Block Cache

For compute nodes with available local disk:

```sh
mount.flexfs start my-volume /mnt/flexfs \
  --diskFolder /local-ssd/cache \
  --diskQuota 80%
```

This keeps repeated reads from hitting the metadata or proxy layer.
## Kernel Tuning

### Network Tuning

For hosts running many mount clients or a metadata server handling many sessions:

```sh
# Increase TCP connection backlog limits
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
```
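`sysctl -w` changes are lost on reboot. A sketch of making these settings persistent via a drop-in file (the `90-flexfs.conf` filename is a convention chosen here, not anything flexfs requires):

```sh
# Persist the backlog settings across reboots.
cat > /etc/sysctl.d/90-flexfs.conf <<'EOF'
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
EOF

# Reload all sysctl configuration files.
sysctl --system
```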
## Deployment Patterns

### Separate Volume Groups

For very large deployments, split workloads across multiple volumes with separate metadata servers. This removes any single metadata server as a shared bottleneck:
- Volume A (team 1): meta-server-1, block-store-1
- Volume B (team 2): meta-server-2, block-store-2
Each volume can use the same or different proxy groups.
### fstab-Based Mass Deployment

For deploying mounts across many hosts, use fstab entries:

```
my-volume /mnt/flexfs flexfs _netdev,nofail 0 0
```

Combined with configuration management (Ansible, Puppet, Chef), credential initialization and fstab entries can be rolled out to thousands of hosts.
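A minimal, idempotent rollout sketch — suitable as the body of a configuration-management task run as root on each host (the mount point and volume name mirror the example above; adapt both to your deployment):

```sh
# Append the fstab entry only if it is not already present, then mount.
FSTAB_LINE='my-volume /mnt/flexfs flexfs _netdev,nofail 0 0'
grep -qxF "$FSTAB_LINE" /etc/fstab || echo "$FSTAB_LINE" >> /etc/fstab

mkdir -p /mnt/flexfs
mount /mnt/flexfs
```

The `grep -qxF` guard makes repeated runs safe, which matters when the same play is applied to thousands of hosts on every convergence cycle.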
### Kubernetes

The CSI driver automatically manages mount lifecycle in Kubernetes. For large clusters, deploy the CSI node DaemonSet on all worker nodes and use PersistentVolumeClaims for pod access.
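A sketch of what such a PersistentVolumeClaim might look like. The `storageClassName` and the `ReadWriteMany` access mode are assumptions for illustration — check your CSI driver's documentation for the class it provisions and the access modes it supports:

```yaml
# Hypothetical PVC; storageClassName "flexfs" is assumed, not defined here.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flexfs-data
spec:
  accessModes:
    - ReadWriteMany          # assumed: many pods across nodes mount concurrently
  storageClassName: flexfs   # assumed class backed by the CSI driver
  resources:
    requests:
      storage: 100Gi
```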
## Monitoring at Scale

Monitor these key metrics across all metadata servers:
- RPC operations per second: Tracks overall load.
- RPC latency percentiles: Detects degradation.
- Active sessions: Counts connected mount clients.
- Volume size gauges: Capacity planning.
Set up Prometheus alerting for:
- RPC latency p99 exceeding thresholds.
- Session count approaching known limits.
- Metadata server process restarts.
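A sketch of what such alerting rules could look like. Metric names, thresholds, and durations here are placeholders — substitute the names your metadata server's metrics endpoint actually exposes and thresholds derived from your own baselines:

```yaml
# Hypothetical Prometheus rule file; all metric names are assumptions.
groups:
  - name: flexfs-metadata
    rules:
      - alert: FlexfsRpcLatencyHigh
        # p99 RPC latency over 100 ms for 10 minutes (threshold is illustrative).
        expr: histogram_quantile(0.99, sum(rate(flexfs_meta_rpc_duration_seconds_bucket[5m])) by (le)) > 0.1
        for: 10m
        labels:
          severity: warning
      - alert: FlexfsSessionCountHigh
        # Session count approaching a known limit (900 is illustrative).
        expr: flexfs_meta_active_sessions > 900
        for: 15m
        labels:
          severity: warning
```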