A fast CLI tool that deduplicates files by content using SHA-256 hashing with concurrent processing.
- ✅ Content-based deduplication (not name-based)
- ✅ Concurrent processing with configurable workers
- ✅ Whitelist or blacklist file extensions
- ✅ Preserves file permissions
- ✅ Handles filename collisions automatically
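
As an illustration of content-based deduplication, here is a minimal sketch, not the tool's actual source. It assumes a file is identified only by the SHA-256 digest of its bytes. The `entry` type and the `hashReader` and `firstOccurrences` functions are hypothetical names; in-memory strings stand in for files so the example runs without touching the disk.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// entry is a hypothetical in-memory stand-in for a file on disk.
type entry struct {
	name    string
	content string
}

// hashReader streams r through SHA-256 and returns the hex digest.
// An *os.File satisfies io.Reader, so real files could be hashed the same way.
func hashReader(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// firstOccurrences returns, in input order, the names of entries whose
// content has not been seen before. Names play no part in the comparison.
func firstOccurrences(entries []entry) ([]string, error) {
	seen := make(map[string]bool)
	var unique []string
	for _, e := range entries {
		digest, err := hashReader(strings.NewReader(e.content))
		if err != nil {
			return nil, err
		}
		if seen[digest] {
			continue // same bytes were already kept under another name
		}
		seen[digest] = true
		unique = append(unique, e.name)
	}
	return unique, nil
}

func main() {
	unique, err := firstOccurrences([]entry{
		{"a.txt", "hello"},
		{"copy-of-a.txt", "hello"},
		{"b.txt", "world"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(unique) // [a.txt b.txt]
}
```

Because only the digest is compared, `copy-of-a.txt` is dropped even though its name differs from `a.txt`.
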

Build the binary:

    go build -o sorter

Scan a directory and get all file extensions in whitelist format:

    ./sorter extensions -input /path/to/source

Output example:

    jpg,png,pdf,txt,doc

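
The extension scan can be pictured roughly as follows. This is a sketch under assumptions, not the tool's implementation: it assumes extensions are lowercased, deduplicated, and sorted before being joined with commas. `extensionList` and `scanExtensions` are hypothetical names.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"sort"
	"strings"
)

// extensionList returns the unique, lowercased extensions of the given
// file names as a sorted, comma-separated list. Names without an
// extension are ignored.
func extensionList(names []string) string {
	seen := make(map[string]bool)
	for _, name := range names {
		ext := strings.ToLower(strings.TrimPrefix(filepath.Ext(name), "."))
		if ext != "" {
			seen[ext] = true
		}
	}
	exts := make([]string, 0, len(seen))
	for ext := range seen {
		exts = append(exts, ext)
	}
	sort.Strings(exts)
	return strings.Join(exts, ",")
}

// scanExtensions walks root recursively and lists the extensions of
// every regular entry it finds.
func scanExtensions(root string) (string, error) {
	var names []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			names = append(names, d.Name())
		}
		return nil
	})
	if err != nil {
		return "", err
	}
	return extensionList(names), nil
}

func main() {
	fmt.Println(extensionList([]string{"a.JPG", "b.png", "c.jpg", "notes.txt", "README"}))
	// jpg,png,txt
}
```

The result has the same shape as the value expected by `-whitelist`, so it can be pasted into a later command.
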

Copy unique files from the source to the destination:

    ./sorter -input /path/to/source -output /path/to/destination

Include only specific extensions:

    ./sorter -input /path/to/source -output /path/to/destination -whitelist jpg,png,pdf

Exclude specific extensions:

    ./sorter -input /path/to/source -output /path/to/destination -blacklist tmp,log,cache

Set the number of concurrent workers:

    ./sorter -input /path/to/source -output /path/to/destination -workers 16

| Argument | Required | Description |
|---|---|---|
| -input | Yes | Source directory path |
| -output | Yes | Destination directory path |
| -whitelist | No | Comma-separated extensions to include (e.g., jpg,png,pdf) |
| -blacklist | No | Comma-separated extensions to exclude (e.g., tmp,log) |
| -workers | No | Number of concurrent workers (default: 8) |

Note: You cannot specify both -whitelist and -blacklist at the same time.
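
The mutual-exclusion rule and the extension filter could look something like the sketch below. It is illustrative only: `parseFilter` and `toSet` are hypothetical names, matching is assumed to be case-insensitive, and only the `-whitelist` and `-blacklist` flags are defined here, so the other arguments from the table are not accepted by this example.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// toSet turns "jpg, PNG" into {"jpg": true, "png": true}.
func toSet(list string) map[string]bool {
	set := make(map[string]bool)
	for _, ext := range strings.Split(list, ",") {
		ext = strings.ToLower(strings.TrimPrefix(strings.TrimSpace(ext), "."))
		if ext != "" {
			set[ext] = true
		}
	}
	return set
}

// parseFilter parses the two filter flags and returns a predicate that
// reports whether a file name should be processed. It rejects the
// combination of both flags.
func parseFilter(args []string) (func(name string) bool, error) {
	fs := flag.NewFlagSet("sorter", flag.ContinueOnError)
	whitelist := fs.String("whitelist", "", "comma-separated extensions to include")
	blacklist := fs.String("blacklist", "", "comma-separated extensions to exclude")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *whitelist != "" && *blacklist != "" {
		return nil, errors.New("cannot specify both -whitelist and -blacklist")
	}
	allow, deny := toSet(*whitelist), toSet(*blacklist)
	return func(name string) bool {
		ext := strings.ToLower(strings.TrimPrefix(filepath.Ext(name), "."))
		if len(allow) > 0 {
			return allow[ext] // whitelist mode: keep listed extensions only
		}
		return !deny[ext] // blacklist mode, or no filter at all
	}, nil
}

func main() {
	keep, err := parseFilter(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	for _, name := range []string{"photo.jpg", "debug.log", "report.pdf"} {
		fmt.Println(name, keep(name))
	}
}
```

With no flags the predicate keeps every file; with a whitelist it keeps only the listed extensions; with a blacklist it keeps everything except them.
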

Discover available file types:

    ./sorter extensions -input /mnt/external
    # Output: avi,doc,docx,jpg,mov,mp3,mp4,pdf,png,txt

Copy only images:

    ./sorter -input /mnt/external -output ~/unique_files -whitelist jpg,jpeg,png,gif,bmp

Copy everything except system files:

    ./sorter -input /mnt/external -output ~/unique_files -blacklist tmp,log,cache,sys

Process with more workers (32 instead of the default 8):

    ./sorter -input /mnt/external -output ~/unique_files -workers 32
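
To show what a configurable worker count means in practice, here is a rough sketch of a fixed-size worker pool that hashes inputs concurrently. It is an assumption about the general technique, not the tool's code: `hashAll` is a hypothetical function, and byte slices stand in for file contents.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// hashAll computes the SHA-256 hex digest of every input using a fixed
// number of worker goroutines. digests[i] corresponds to inputs[i].
func hashAll(inputs [][]byte, workers int) []string {
	if workers < 1 {
		workers = 1
	}
	digests := make([]string, len(inputs))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is received by exactly one worker, so the
			// writes to digests never overlap.
			for i := range jobs {
				sum := sha256.Sum256(inputs[i])
				digests[i] = hex.EncodeToString(sum[:])
			}
		}()
	}

	for i := range inputs {
		jobs <- i
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return digests
}

func main() {
	digests := hashAll([][]byte{[]byte("hello"), []byte("world"), []byte("hello")}, 8)
	fmt.Println(digests[0] == digests[2], digests[0] == digests[1]) // true false
}
```

Raising the worker count helps only while disk and CPU can keep up; beyond that point extra workers mostly wait on I/O, so the best value depends on the hardware.
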