Capacity Planning
This guide helps you estimate storage requirements, configure quotas, and understand the flexFS billing model for capacity planning.
Storage components
FlexFS storage usage has three components:
| Component | Where stored | Sizing factors |
|---|---|---|
| Block data | Cloud object storage | Total file data size (after compression). Directly proportional to the data written. |
| Metadata | Metadata server local disk | Proportional to the number of files, directories, and their attributes. |
| Cache | Mount client / proxy local disk | Configurable via --diskQuota. Working set dependent. |
Block data estimates
Block data is the dominant storage cost. After compression, the actual storage used depends on the data type:
| Data type | Typical compression ratio (LZ4) | 1 TB raw data stored as |
|---|---|---|
| Text / source code | 3-5x | 200-330 GB |
| Genomics (BAM) | 1.1-1.3x | 770 GB - 910 GB |
| Compressed files (gzip, zstd) | 1.0x (no benefit) | ~1 TB |
| Binary / random data | 1.0x | ~1 TB |
| Parquet / columnar data | 1.5-2x | 500 GB - 670 GB |
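The last column of the table is the raw size divided by the compression ratio. A minimal sketch of that arithmetic follows, assuming decimal units (1 TB = 1000 GB). The helper names `stored_size_gb` and `stored_range_gb` are hypothetical and not part of flexFS.

```python
def stored_size_gb(raw_gb: float, compression_ratio: float) -> float:
    """Estimate stored size after compression: raw size / ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_gb / compression_ratio


def stored_range_gb(raw_gb: float, ratio_low: float, ratio_high: float):
    """Return (best case, worst case) stored size for a ratio range.

    The highest ratio gives the smallest stored size, and vice versa.
    """
    return (stored_size_gb(raw_gb, ratio_high),
            stored_size_gb(raw_gb, ratio_low))


# Example: 1 TB (1000 GB) of text / source code at a 3-5x ratio.
best, worst = stored_range_gb(1000, 3.0, 5.0)
print(f"Estimated stored size: {best:.0f}-{worst:.0f} GB")
```

Measure the ratio on a sample of your own data where possible. The ranges in the table are typical values, not guarantees.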
Retired blocks
When files are deleted or modified, old blocks are retired but may be retained for time-travel access during the retention period. Factor this into your storage estimates:
Total block storage = active blocks + retired blocks (within retention window)
Monitor retired blocks via the flexfs_meta_volume_blocks_retired metric.
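To turn the formula into a rough number, you need an estimate of how much data is retired during the retention window. The sketch below assumes a steady daily churn (data deleted or overwritten per day, measured after compression). That churn model is an illustrative assumption, not something flexFS defines, and the function name is hypothetical.

```python
def total_block_storage_gb(active_gb: float,
                           daily_churn_gb: float,
                           retention_days: float) -> float:
    """Total block storage = active blocks + retired blocks in the window.

    Assumption: churn is constant, so the retired data still retained is
    roughly daily churn multiplied by the retention period.
    """
    retired_gb = daily_churn_gb * retention_days
    return active_gb + retired_gb


# Example: 500 GB active, 10 GB/day rewritten or deleted, 7-day retention.
print(total_block_storage_gb(500, 10, 7))
```

Once the volume is in use, compare the estimate against the value reported by flexfs_meta_volume_blocks_retired.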
Metadata estimates
Metadata storage scales with the number of filesystem objects, not with file data size.
| Filesystem objects | Approximate metadata size |
|---|---|
| 1 million files | 1-2 GB |
| 10 million files | 10-20 GB |
| 100 million files | 100-200 GB |
| 1 billion files | 1-2 TB |
Factors that increase metadata size per file:
- Extended ACLs
- Extended attributes
- Long file names
- Deep directory nesting (more directory entries)
- Time-travel retention (historical versions of metadata)
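The table above works out to roughly 1-2 KB of metadata per filesystem object. A small sketch of that estimate follows, assuming decimal units. The per-object constants are derived from the table and the function name is hypothetical.

```python
# Derived from the table: 1 million files -> 1-2 GB, i.e. 1-2 KB per object.
BYTES_PER_OBJECT_LOW = 1_000
BYTES_PER_OBJECT_HIGH = 2_000
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes


def metadata_size_gb(object_count: int):
    """Return the (low, high) metadata size estimate in GB."""
    low = object_count * BYTES_PER_OBJECT_LOW / BYTES_PER_GB
    high = object_count * BYTES_PER_OBJECT_HIGH / BYTES_PER_GB
    return low, high


low, high = metadata_size_gb(10_000_000)
print(f"10 million objects: {low:.0f}-{high:.0f} GB of metadata")
```

If your files carry ACLs, extended attributes, or long names, plan toward the upper bound or beyond it.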
Metadata server disk sizing
Provision metadata server disk space at 2-3x the estimated metadata size to allow for:
- Database compaction overhead
- WAL (write-ahead log) files
- Growth headroom
Monitor disk usage with the flexfs_meta_db_disk_usage_bytes and flexfs_meta_db_folder_disk_capacity_bytes metrics.
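As a sketch, the 2-3x rule and a utilization check can be written as follows. The two metric values are passed in as plain numbers. How you fetch them from your monitoring system is outside this example, and the function names are hypothetical.

```python
def provisioned_disk_gb(metadata_gb: float, factor: float = 3.0) -> float:
    """Disk to provision for the metadata server (guide suggests 2-3x)."""
    return metadata_gb * factor


def disk_utilization(usage_bytes: float, capacity_bytes: float) -> float:
    """Fraction of the metadata disk in use.

    usage_bytes:    value of flexfs_meta_db_disk_usage_bytes
    capacity_bytes: value of flexfs_meta_db_folder_disk_capacity_bytes
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity must be positive")
    return usage_bytes / capacity_bytes


# Example: 20 GB estimated metadata, provisioned at 3x.
print(provisioned_disk_gb(20))                       # GB to provision
print(f"{disk_utilization(50e9, 200e9):.0%} used")   # sample metric values
```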
Cache sizing
Mount client disk cache
The disk cache holds recently accessed blocks on local disk. Size it based on your working set:
| Working set | Recommended --diskQuota |
|---|---|
| Small (< 10 GB active data) | 20-50 GB |
| Medium (10-100 GB active data) | 100-500 GB |
| Large (> 100 GB active data) | 500 GB - 1 TB+ |
Use NVMe SSDs for the cache folder when low latency is important.
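The sizing table can be expressed as a simple lookup, for example in a provisioning script. This is only a sketch of the tiers listed above. The function name is hypothetical, and the returned string is a recommendation range, not a value formatted for --diskQuota.

```python
def recommended_disk_quota(working_set_gb: float) -> str:
    """Map an estimated working set (GB) to the guide's recommended range."""
    if working_set_gb < 0:
        raise ValueError("working set cannot be negative")
    if working_set_gb < 10:
        return "20-50 GB"        # small working set
    if working_set_gb <= 100:
        return "100-500 GB"      # medium working set
    return "500 GB - 1 TB+"      # large working set


for ws in (5, 60, 400):
    print(f"{ws} GB working set -> {recommended_disk_quota(ws)}")
```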
Proxy cache
Proxy server caches are shared across all mount clients in the proxy group. Size them based on the combined working set of all clients:
Proxy cache size >= working set of all clients / number of proxy servers
Volume quotas
Enterprise volumes support quotas to limit resource usage:
| Quota type | Description |
|---|---|
| maxBlocks | Maximum number of active blocks. Limits total data volume. |
| maxInodes | Maximum number of inodes (files + directories). Limits total file count. |
| maxProxied | Maximum number of blocks that can be proxied. Limits proxy cache consumption. |
Quotas are set during volume creation or update via configure.flexfs. When a quota is reached, write operations that would exceed the limit will fail.
Monitor quota usage via the flexfs_meta_volume_blocks, flexfs_meta_volume_inodes, and flexfs_meta_volume_size_bytes metrics.
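Because writes fail once a quota is reached, it helps to alert before that point. The sketch below flags quotas whose usage crosses a threshold. It assumes the current block and inode counts (for example from flexfs_meta_volume_blocks and flexfs_meta_volume_inodes) are directly comparable to the configured limits. The 80% threshold and the function names are illustrative choices, not flexFS defaults.

```python
def quota_utilization(current: float, limit: float) -> float:
    """Fraction of a quota currently consumed."""
    if limit <= 0:
        raise ValueError("quota limit must be positive")
    return current / limit


def quotas_near_limit(usage: dict, limits: dict, threshold: float = 0.8) -> list:
    """Return the names of quotas at or above the threshold."""
    return [
        name
        for name, limit in limits.items()
        if quota_utilization(usage.get(name, 0), limit) >= threshold
    ]


# Hypothetical sample values keyed by quota name.
limits = {"maxBlocks": 1_000_000, "maxInodes": 5_000_000}
usage = {"maxBlocks": 900_000, "maxInodes": 1_200_000}
print(quotas_near_limit(usage, limits))
```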
Billing model
Enterprise flexFS usage is metered on a GB-month basis:
Monthly cost = sum(volume_size_bytes * hours_active) / (1 GB * hours_in_month)
Key points:
- Billing is based on the logical storage size of the volume.
- Proxy cache and mount client cache do not count toward billed storage.
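A short worked example of the metering formula follows. It computes the GB-month quantity only. Converting that to a price requires your contract rate, which this guide does not state. The sketch assumes decimal gigabytes and a 720-hour month, and the function name is hypothetical.

```python
BYTES_PER_GB = 10**9  # assumption: 1 GB = 10^9 bytes


def billable_gb_months(samples, hours_in_month: float) -> float:
    """Apply sum(volume_size_bytes * hours_active) / (1 GB * hours_in_month).

    samples: iterable of (volume_size_bytes, hours_active) pairs, one per
             period during which the volume held that logical size.
    """
    byte_hours = sum(size * hours for size, hours in samples)
    return byte_hours / (BYTES_PER_GB * hours_in_month)


# A 500 GB volume active all month plus a 1 TB volume active for half of it.
samples = [(500 * BYTES_PER_GB, 720), (1000 * BYTES_PER_GB, 360)]
print(billable_gb_months(samples, hours_in_month=720))
```

In this example each volume contributes the same amount: the larger volume is twice the size but active for half the time.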
Planning checklist
| Question | How to answer |
|---|---|
| How much data will I store? | Estimate total file sizes, apply compression ratio. |
| How many files will I have? | Count files and directories for metadata sizing. |
| What is my working set? | Identify the subset of data accessed frequently for cache sizing. |
| How long do I need time-travel? | Set retention period; longer retention = more metadata and retired block storage. |
| What are my write patterns? | High write rates need more dirty cache capacity and writeback tuning. |
Next steps
- Performance tuning: optimize caches and block size
- Backup and recovery
- Metrics reference: monitor usage metrics