
Scaling to 1000+ Mounts

This guide covers tuning recommendations for deployments with hundreds or thousands of concurrent mount clients accessing a single metadata server.

The metadata server is the primary scaling bottleneck: each mount client maintains a persistent RPC session with it.

Mount Count   CPU Cores   RAM         Storage
100-500       4-8         16-32 GiB   SSD
500-1000      8-16        32-64 GiB   NVMe SSD
1000+         16+         64+ GiB     NVMe SSD

The metadata server’s database memory cache is controlled by the internal --dbMemCapacity flag (default 40% of system RAM). For high session counts, this is generally appropriate. Monitor the metadata server’s Prometheus metrics for cache hit rates.
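As a sketch, the cache hit rate can be derived from a pair of counter metrics like the following. The metric names here are assumptions for illustration; substitute the names your metadata server actually exports.

```promql
# Hypothetical metric names -- check your server's exported metrics.
rate(meta_db_cache_hits_total[5m])
  / (rate(meta_db_cache_hits_total[5m]) + rate(meta_db_cache_misses_total[5m]))
```

A sustained hit rate well below 1 under steady load suggests the cache is undersized for the working set.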

For large deployments, proxy groups reduce the load on object storage and improve read performance.

  • 2-4 proxy servers per group for up to 500 mount clients.
  • 4-8 proxy servers per group for 500-2000 mount clients.
  • Rendezvous hashing distributes blocks evenly across group members.
  • Each proxy server should have fast local SSD storage sized to the active working set.
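Rendezvous (highest-random-weight) hashing, mentioned above, can be sketched in a few lines. This is an illustration of the technique in Python, not the proxy's actual implementation; the proxy names are examples.

```python
import hashlib

def pick_proxy(block_id: str, proxies: list[str]) -> str:
    """Rendezvous hashing: score each proxy against the block ID and
    pick the highest score. Every client computes the same answer
    with no coordination or shared state."""
    def score(proxy: str) -> int:
        h = hashlib.sha256(f"{proxy}:{block_id}".encode())
        return int.from_bytes(h.digest()[:8], "big")
    return max(proxies, key=score)

proxies = ["proxy-1", "proxy-2", "proxy-3", "proxy-4"]
assignments = {b: pick_proxy(f"block-{b}", proxies) for b in range(16)}
```

A useful property for proxy groups: when one member is removed, only the blocks it owned are remapped, and every other block keeps its current proxy.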

For high-throughput workloads with many concurrent readers:

  • Ensure the default max_pages setting is used (do not set --noMaxPages).
  • The default --attrValid (3600 seconds) and --entryValid (1 second) values are appropriate for most workloads. Increase --entryValid for read-heavy workloads where directory structure rarely changes.
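For example, a read-heavy mount over a mostly static directory tree might raise --entryValid at mount time. The value here is illustrative, not a recommended default:

```sh
mount.flexfs start my-volume /mnt/flexfs \
  --entryValid 60
```

Longer entry caching means renames and deletes made by other clients can take up to that long to become visible, so keep it short on shared, frequently changing trees.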

For compute nodes with available local disk:

mount.flexfs start my-volume /mnt/flexfs \
--diskFolder /local-ssd/cache \
--diskQuota 80%

A local disk cache keeps repeated reads from reaching the metadata and proxy layers.

For hosts running many mount clients or a metadata server handling many sessions:

# Increase connection tracking and socket buffers
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=4096

For very large deployments, split workloads across multiple volumes with separate metadata servers. This eliminates the single metadata server as a bottleneck:

  • Volume A (team 1): meta-server-1, block-store-1
  • Volume B (team 2): meta-server-2, block-store-2

Each volume can use the same or different proxy groups.

For deploying mounts across many hosts, use fstab entries:

my-volume /mnt/flexfs flexfs _netdev,nofail 0 0

Combined with configuration management (Ansible, Puppet, Chef), credential initialization and fstab entries can be rolled out to thousands of hosts.
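As a sketch of the Ansible route, the ansible.posix.mount module can both write the fstab entry and mount the volume in one task. The volume and mount point names below are examples:

```yaml
# Hypothetical playbook task; assumes credentials are already initialized.
- name: Mount flexfs volume on all compute hosts
  ansible.posix.mount:
    src: my-volume
    path: /mnt/flexfs
    fstype: flexfs
    opts: _netdev,nofail
    state: mounted
```

With `state: mounted`, the task is idempotent: it adds the fstab line if missing and mounts only when not already mounted.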

The CSI driver automatically manages mount lifecycle in Kubernetes. For large clusters, deploy the CSI node DaemonSet on all worker nodes and use PersistentVolumeClaims for pod access.
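A minimal claim might look like the following; the StorageClass name and size are assumptions, so substitute whatever your CSI driver installation defines:

```yaml
# Hypothetical StorageClass name for illustration.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flexfs-pvc
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: flexfs
  resources:
    requests:
      storage: 100Gi
```

ReadWriteMany lets many pods across many nodes mount the same volume, which matches the shared-filesystem model described above.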

Monitor these key metrics across all metadata servers:

  • RPC operations per second: Tracks overall load.
  • RPC latency percentiles: Detects degradation.
  • Active sessions: Counts connected mount clients.
  • Volume size gauges: Capacity planning.

Set up Prometheus alerting for:

  • RPC latency p99 exceeding thresholds.
  • Session count approaching known limits.
  • Metadata server process restarts.
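A Prometheus alerting rule for the first item might be sketched as follows. The metric name and the 100ms threshold are assumptions; use the histogram your metadata server actually exports and a threshold derived from your baseline latency:

```yaml
# Sketch of an alerting rule; metric name and threshold are examples.
groups:
  - name: flexfs-metadata
    rules:
      - alert: MetaRpcLatencyHigh
        expr: histogram_quantile(0.99, rate(meta_rpc_duration_seconds_bucket[5m])) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Metadata RPC p99 latency above 100ms for 10 minutes"
```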