dedup.flexfs
dedup.flexfs identifies duplicate files within a flexFS volume and optionally replaces them with hard links to reclaim storage. It fetches duplicate candidates from the metadata server’s /duplicates endpoint, then verifies them through checksum comparison and byte-for-byte validation before making any changes.
    dedup.flexfs [flags] [path]

If no path is specified, the current directory is used. The path must be within a flexFS mount.
| Flag | Type | Default | Description | Visibility |
|---|---|---|---|---|
| `--fix` | bool | false | Replace duplicates with hard links (requires root) | Public |
| `--limit` | uint64 | 0 | Maximum number of duplicate groups to return (0 = unlimited). Largest groups are returned first. | Public |
| `--maxBlocks` | uint64 | 0 | Maximum blocks filter (0 = no limit) | Public |
| `--maxSize` | uint64 | 0 | Maximum byte size filter (0 = no limit) | Public |
| `--minBlocks` | uint64 | 0 | Minimum blocks filter | Public |
| `--minSize` | uint64 | 0 | Minimum byte size filter | Public |
| `--noMetaSSL` | bool | false | Disable SSL for metadata server connections | Internal |
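
The size and block filters can be read as a simple predicate. The sketch below is only an illustration of the flag semantics described in the table: the `Filters` class and `passes` function are hypothetical names, the bounds are assumed to be inclusive (matching "at least" and "at most" in the examples), and it says nothing about where flexFS actually applies the filters.

```python
from dataclasses import dataclass


@dataclass
class Filters:
    """Hypothetical mirror of the --minSize/--maxSize/--minBlocks/--maxBlocks flags."""
    min_size: int = 0
    max_size: int = 0    # 0 = no upper limit
    min_blocks: int = 0
    max_blocks: int = 0  # 0 = no upper limit


def passes(size: int, blocks: int, f: Filters) -> bool:
    """Return True if a file of the given size and block count is within the filters."""
    if size < f.min_size or blocks < f.min_blocks:
        return False
    # A maximum of 0 disables the upper bound (assumed inclusive otherwise).
    if f.max_size and size > f.max_size:
        return False
    if f.max_blocks and blocks > f.max_blocks:
        return False
    return True
```

With all flags left at their defaults, every file passes, which matches the "0 = no limit" behavior in the table.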
How It Works
- Candidate Discovery: Queries the metadata server for files that share the same size and block count. All paths are returned, including multiple hard links to the same inode.
- Checksum Grouping: For groups of 3+ unique inodes, computes xxhash64 checksums concurrently to sub-group candidates. For pairs, this step is skipped.
- Byte Verification: Performs a byte-for-byte comparison of each candidate inode against the retained inode (once per unique inode, not per path).
- Retention Heuristic: The inode with the lowest birth time (oldest) is retained. Ties are broken by highest hard link count.
- Hard Link Replacement (with `--fix`): Atomically replaces all paths for each duplicate inode with hard links to the retained inode.
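
The steps above can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the tool's implementation: the `Inode` record and all function names are hypothetical, `hashlib.blake2b` stands in for xxhash64 (which is not in the Python standard library), checksums are computed sequentially rather than concurrently, and the link-then-rename technique in `replace_with_link` is one common way to get an atomic replacement, not a confirmed detail of dedup.flexfs.

```python
import hashlib
import os
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Inode:
    """Hypothetical candidate record: one entry per unique inode."""
    ino: int
    birth_time: float
    nlink: int
    paths: list  # every path that refers to this inode


def pick_retained(inodes):
    """Oldest birth time wins; ties go to the highest hard link count."""
    return min(inodes, key=lambda i: (i.birth_time, -i.nlink))


def checksum(path, chunk=1 << 20):
    """64-bit digest of a file (blake2b as a stand-in for xxhash64)."""
    h = hashlib.blake2b(digest_size=8)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.digest()


def same_bytes(a, b, chunk=1 << 20):
    """Byte-for-byte comparison of two regular files."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ba, bb = fa.read(chunk), fb.read(chunk)
            if ba != bb:
                return False
            if not ba:  # both files ended at the same point
                return True


def verify_group(inodes):
    """Return (retained, duplicates) pairs for one same-size candidate group."""
    if len(inodes) >= 3:
        # Sub-group by checksum so each inode is only compared to likely matches.
        buckets = defaultdict(list)
        for i in inodes:
            buckets[checksum(i.paths[0])].append(i)
        subgroups = [g for g in buckets.values() if len(g) > 1]
    else:
        subgroups = [inodes]  # pairs skip the checksum step
    results = []
    for group in subgroups:
        keep = pick_retained(group)
        # One comparison per unique inode, not per path.
        dups = [i for i in group
                if i is not keep and same_bytes(keep.paths[0], i.paths[0])]
        if dups:
            results.append((keep, dups))
    return results


def replace_with_link(target, path):
    """Swap `path` for a hard link to `target` via a temporary name (assumed approach)."""
    tmp = path + ".dedup-tmp"  # hypothetical temporary name
    os.link(target, tmp)
    os.replace(tmp, path)  # rename over the old path is atomic on POSIX
```

A caller would run `verify_group` on each candidate group and, only when fixing, call `replace_with_link(keep.paths[0], p)` for every path `p` of every verified duplicate.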
Examples
Scan a directory for duplicates (dry run):

    dedup.flexfs /mnt/flexfs/data

Fix duplicates by replacing them with hard links:

    sudo dedup.flexfs --fix /mnt/flexfs/data

Only scan files with at least 2 blocks and at most 100 blocks:

    dedup.flexfs --minBlocks 2 --maxBlocks 100 /mnt/flexfs/data

Warnings
Run without `--fix` first. Always perform a dry run to review which files will be deduplicated before applying changes.
Hard links share a single inode. When --fix replaces a duplicate with a hard link, the duplicate’s original ownership, permissions, and timestamps are replaced by those of the retained file. If two duplicate files had different owners or permissions, the replaced file will silently adopt the retained file’s metadata. Review the dry-run output to ensure this is acceptable, particularly in multi-user environments.
Hard links share data. After deduplication, all linked paths point to the same inode and the same data blocks. A write to any path modifies the data seen by all paths. Similarly, operations like chmod, chown, and truncate affect all linked paths. If independent copies are needed, the file must be copied (not linked) to a new path.
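
The shared-inode behavior described in these warnings is ordinary hard link semantics and can be observed on any POSIX filesystem. The following sketch uses temporary files with made-up names (`kept.bin`, `linked.bin`, `copy.bin`) and does not involve flexFS or dedup.flexfs at all. It shows that a write or `chmod` through one link is visible through the other, while a real copy stays independent.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as d:
    kept = os.path.join(d, "kept.bin")
    linked = os.path.join(d, "linked.bin")
    with open(kept, "wb") as f:
        f.write(b"original")

    # After linking, both names refer to the same inode.
    os.link(kept, linked)
    same_inode = os.stat(kept).st_ino == os.stat(linked).st_ino

    # A write through one path changes what the other path reads.
    with open(linked, "r+b") as f:
        f.write(b"CHANGED!")
    with open(kept, "rb") as f:
        seen_via_kept = f.read()

    # Permission changes apply to the inode, so they affect both paths.
    os.chmod(linked, 0o600)
    mode_via_kept = os.stat(kept).st_mode & 0o777

    # A copy gets its own inode and data, so later writes stay private.
    copy = os.path.join(d, "copy.bin")
    shutil.copy2(kept, copy)
    with open(copy, "r+b") as f:
        f.write(b"private!")
    with open(kept, "rb") as f:
        kept_after_copy_write = f.read()
```

If any application relies on two deduplicated paths diverging later, copying one of them back out (as in the last step) is the way to restore independence.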
Requirements
- The path must be within an active flexFS mount.
- `--fix` requires root privileges.
- `--fix` is not supported on time-travel mounts.
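
A wrapper script could check some of these requirements before invoking the tool. The sketch below is a hypothetical pre-flight helper, not part of flexFS: `enclosing_mount` and `preflight` are illustrative names, the check only locates the enclosing mount point without confirming that it is a flexFS mount, and it makes no attempt to detect time-travel mounts.

```python
import os


def enclosing_mount(path):
    """Walk upward from path until a mount point is found ("/" always is one)."""
    p = os.path.realpath(path)
    while not os.path.ismount(p):
        p = os.path.dirname(p)
    return p


def preflight(path, fix, euid=None):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    euid = os.geteuid() if euid is None else euid
    if not os.path.exists(path):
        problems.append("path does not exist: " + path)
    if fix and euid != 0:
        problems.append("--fix requires root privileges")
    return problems
```

The `euid` parameter exists only so the check can be exercised without actually running as root.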