skills/tar-compression/SKILL.md
Archive and compression commands — tar, gzip, zip, zstd, xz, bzip2, 7z, and friends. Use when user mentions "tar", "gzip", "zip", "unzip", "compress", "extract", "archive", "bzip2", "xz", "zstd", "7z", "7zip", "rar", "pigz", "pbzip2", ".tar.gz", ".tgz", ".tar.xz", ".tar.zst", ".tar.bz2", "untar", "decompress", "incremental backup", "split archive", "parallel compression", "disk image", "dd", or working with compressed files.
tar -cvf archive.tar dir/ # create .tar (no compression)
tar -czvf archive.tar.gz dir/ # create .tar.gz (gzip)
tar -cjvf archive.tar.bz2 dir/ # create .tar.bz2 (bzip2)
tar -cJvf archive.tar.xz dir/ # create .tar.xz (xz)
tar --zstd -cvf archive.tar.zst dir/ # create .tar.zst (zstd)
tar -xvf archive.tar # extract (auto-detects compression)
tar -xzvf archive.tar.gz # explicit gzip
tar -xjvf archive.tar.bz2 # explicit bzip2
tar -xJvf archive.tar.xz # explicit xz
Modern tar auto-detects compression on extraction, so tar -xvf works for all formats.
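As a quick check of the create/list/extract cycle, the sketch below builds a throwaway tree in a temporary directory, archives it with gzip, and extracts it without naming the compression. The directory layout and file names are made up for illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample tree
mkdir -p "$work/src/dir" "$work/out"
printf 'hello\n' > "$work/src/dir/a.txt"

# Create a gzip-compressed archive; -C keeps stored paths relative to src/
tar -czf "$work/archive.tar.gz" -C "$work/src" dir

# List, then extract without -z: tar detects gzip when reading
tar -tf "$work/archive.tar.gz"
tar -xf "$work/archive.tar.gz" -C "$work/out"

cat "$work/out/dir/a.txt"
```

The same `-xf` line should work unchanged if the archive is swapped for a `.tar.bz2` or `.tar.xz`, provided the matching decompressor is installed.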
| Flag | Meaning |
|------|---------|
| -c | Create archive |
| -x | Extract archive |
| -t | List contents (don't extract) |
| -r | Append files to existing archive (uncompressed .tar only) |
| -u | Update — append only files newer than what's in archive |
| -v | Verbose (show files) |
| -f | Archive filename follows; must be the last flag in a bundled group |
| -z | gzip compression |
| -j | bzip2 compression |
| -J | xz compression |
| --zstd | zstd compression |
| -C | Change to directory before operating |
| -p | Preserve permissions |
| -h | Follow symlinks (archive the target, not the link) |
-f must be the last flag before the archive name. tar -cfv archive.tar dir/ goes wrong: tar takes v as the archive name and treats archive.tar as a file to add.
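The flag-order pitfall is easy to reproduce in a scratch directory. This sketch assumes a tar that accepts bundled short options the way GNU tar and bsdtar do; the file names are illustrative.

```shell
#!/bin/sh
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir dir && echo data > dir/file.txt

# Wrong order: -f consumes "v" as the archive name, and archive.tar is
# treated as an input file that does not exist, so tar reports an error.
tar -cfv archive.tar dir/ 2>/dev/null || echo "tar exited with an error"

# A stray archive named "v" is left behind and archive.tar was never created
ls

# Right order: f comes last, directly before the archive name
rm -f v
tar -cvf archive.tar dir/
```

Writing the flags separately (`tar -c -v -f archive.tar dir/`) avoids the problem entirely, at the cost of a few extra characters.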
tar -rvf archive.tar newfile.txt # append (uncompressed .tar only)
tar -uvf archive.tar dir/ # update only newer files
# Put exclude options before the paths; not every tar honors them afterwards
tar -czvf archive.tar.gz --exclude='*.log' --exclude='.git' dir/
tar -czvf archive.tar.gz --exclude-vcs dir/ # skip .git, .svn, .hg (GNU tar)
tar -czvf archive.tar.gz --exclude-vcs-ignores dir/ # also honor .gitignore (GNU tar)
tar -czvf archive.tar.gz -X exclude.txt dir/ # patterns from file
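# A minimal way to confirm that exclude patterns did what was intended is to
# list the archive afterwards. The tree below is a hypothetical example.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p dir/.git dir/src
echo code  > dir/src/main.c
echo noise > dir/debug.log
echo ref   > dir/.git/HEAD

# Exclude options are given before the directory being archived
tar -czf archive.tar.gz --exclude='*.log' --exclude='.git' dir/

# The listing should contain dir/src/main.c but no .log or .git entries
tar -tzf archive.tar.gz
```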
# Inspect without extracting
tar -tzvf archive.tar.gz # list all files with sizes
tar -tzvf archive.tar.gz | grep pattern # find specific files
tar -tzvf archive.tar.gz --wildcards '*.conf' # list matching files (GNU tar)
# Extract specific files or directories
tar -xzvf archive.tar.gz path/to/file.txt # extract one file
tar -xzvf archive.tar.gz dir/subdir/ # extract one directory
tar -xzvf archive.tar.gz --wildcards '*.conf' # extract by pattern (GNU tar)
tar -xzvf archive.tar.gz -C /target/dir/ # extract to specific directory
tar -xzvf archive.tar.gz --strip-components=1 # strip top-level directory
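# The effect of --strip-components is easiest to see on a release-style
# archive with a single top-level directory. The project name and paths in
# this sketch are invented for illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p project-1.0/src target
echo 'int main(void){return 0;}' > project-1.0/src/main.c
tar -czf project.tar.gz project-1.0

# Without --strip-components the file would land at
# target/project-1.0/src/main.c. Dropping the first path component
# places it at target/src/main.c instead.
tar -xzf project.tar.gz -C target --strip-components=1

find target -type f
```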
# Full preservation (system backups, run as root)
# Ownership and modes are always recorded at creation; --same-owner only applies when extracting
tar -cpzvf archive.tar.gz dir/
# Follow symlinks instead of storing links
tar -chzvf archive.tar.gz dir/
# Restore with ownership (requires root)
tar -xpzvf archive.tar.gz --same-owner -C /restore/
# Compare archive to filesystem (check what changed)
tar -dvf archive.tar dir/
When run as root, both GNU tar and BSD tar restore ownership by default; passing --same-owner just makes that explicit. Non-root users always extract files as themselves.
| Tool | Flag | Ext | Speed | Ratio | Parallel tool |
|-------|------|-----------|-----------|--------|---------------|
| gzip | -z | .tar.gz | Fast | Good | pigz |
| bzip2 | -j | .tar.bz2 | Slow | Better | pbzip2 |
| xz | -J | .tar.xz | Very slow | Best | pxz / pixz |
| zstd | --zstd | .tar.zst | Very fast | Better | built-in -T |
| lz4 | --use-compress-program=lz4 | .tar.lz4 | Fastest | Lower | built-in -T (lz4 1.10+) |
Rough benchmarks (1 GB mixed data, single core):
| Tool | Compress | Decompress | Compressed size |
|---------|----------|------------|-----------------|
| lz4 | ~2s | ~0.5s | ~55% of original |
| zstd -1 | ~3s | ~1s | ~42% |
| gzip -6 | ~12s | ~3s | ~36% |
| zstd -19 | ~90s | ~1s | ~30% |
| xz -6 | ~120s | ~8s | ~28% |
Rule of thumb: quick backup → gzip/zstd. Source distribution → xz. General purpose → zstd. Real-time → lz4. bzip2 → legacy, skip it.
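Benchmarks like these depend heavily on the data, so it can be worth measuring on a representative file. The sketch below compares compressed size only (not speed) for whichever of the listed tools are installed. The generated log lines are a stand-in; substitute a file typical of your workload.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in sample: repetitive log-like text (an assumption, not real data)
sample="$work/sample.txt"
i=0
while [ "$i" -lt 20000 ]; do
  echo "2026-04-10 12:00:00 INFO request id=$i status=200"
  i=$((i + 1))
done > "$sample"

orig=$(wc -c < "$sample")
echo "original: $orig bytes"

# Try each compressor that is installed; all of them accept -c (write to stdout)
for tool in gzip bzip2 xz zstd lz4; do
  command -v "$tool" >/dev/null 2>&1 || continue
  size=$("$tool" -c "$sample" | wc -c)
  echo "$tool: $size bytes ($((size * 100 / orig))% of original)"
done
```

Prefixing the compression command with `time` gives a rough speed comparison as well, though a single run on a small file is noisy.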
gzip file.txt # compress to file.txt.gz, removes original
gzip -k file.txt # keep original
gunzip file.txt.gz # decompress
gzip -9 file.txt # max compression (1=fast, 9=best)
gzip -l file.txt.gz # show compression ratio
zcat file.txt.gz # decompress to stdout
zstd file.txt # compress -> file.txt.zst (default level 3)
zstd -d file.txt.zst # decompress
zstd -19 file.txt # max standard compression (1-19)
zstd --ultra -22 file.txt # ultra mode (20-22, needs --ultra)
zstd --fast file.txt # speed over ratio
zstd -T0 file.txt # use all CPU cores
zstd --long file.txt # larger window for better ratio on large files
zstd --train samples/* -o mydict.zst # train from samples
zstd -D mydict.zst file.txt -o file.txt.zst # compress with dictionary
zstd -d -D mydict.zst file.txt.zst # decompress with dictionary
A trained dictionary can improve the ratio 2-5x on small, structurally similar files (logs, JSON, configs).
mysqldump mydb | zstd -T0 > dump.sql.zst
zstd -d < dump.sql.zst | mysql mydb
tar -cf - dir/ | pv | zstd -T0 > archive.tar.zst
# pigz — parallel gzip (drop-in replacement)
tar -cf - dir/ | pigz -p 8 > archive.tar.gz
pigz -d archive.tar.gz
# pbzip2 — parallel bzip2
tar -cf - dir/ | pbzip2 -p8 > archive.tar.bz2
# pxz — parallel xz
tar -cf - dir/ | pxz -T 8 > archive.tar.xz
# tar integration via --use-compress-program
tar -cf archive.tar.gz --use-compress-program='pigz -9' dir/
tar -xf archive.tar.gz --use-compress-program='pigz -d' -C /target/
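# Scripts that may run on machines without pigz can fall back to plain gzip.
# This is a sketch of one way to do that; the variable name gz and the
# sample tree are illustrative.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir dir && echo data > dir/file.txt

# Prefer pigz when installed; otherwise use single-threaded gzip
if command -v pigz >/dev/null 2>&1; then
  gz=pigz
else
  gz=gzip
fi
echo "compressing with $gz"

# Piping through the compressor works the same with GNU and BSD tar
tar -cf - dir/ | "$gz" > archive.tar.gz

# The output is ordinary gzip either way, so any tar can read it back
tar -tzf archive.tar.gz
```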
zip archive.zip file1 file2 # create
zip -r archive.zip dir/ # recursive (directories)
zip -r archive.zip dir/ -x '*.log' # exclude pattern
unzip archive.zip # extract to current dir
unzip archive.zip -d /target/dir/ # extract to specific dir
unzip archive.zip file.txt # extract one file
unzip -o archive.zip # overwrite without prompting
unzip -l archive.zip # list files and sizes
zipinfo archive.zip # detailed info
zip -u archive.zip updated-file.txt # update changed files in archive
zip -e archive.zip file1 file2 # prompt for password
unzip -P 'mypass' archive.zip # extract with password
zip's encryption (ZipCrypto) is weak. Use 7z with AES-256 for real security.
zip -r -s 100m archive.zip dir/ # split at 100MB chunks
zip -s 0 archive.zip --out combined.zip # merge splits before extracting
unzip combined.zip
7z a archive.7z dir/ # create .7z
7z a archive.7z dir/ -mx=9 # ultra compression
7z a archive.7z dir/ -ms=on # solid mode (better compression)
7z a archive.7z dir/ -p'Pass' -mhe=on # AES-256 encrypt contents + filenames
7z a -v100m archive.7z dir/ # split into 100MB volumes
7z x archive.7z # extract preserving structure
7z e archive.7z # extract flat (no dirs)
7z x archive.7z -o/target/dir/ # extract to directory
7z l archive.7z # list contents
7z t archive.7z # test integrity
7z reads/writes: .7z, .zip, .tar, .gz, .bz2, .xz. Extracts .rar.
# Local to remote
tar -czvf - dir/ | ssh user@host 'tar -xzvf - -C /target/'
# Remote to local
ssh user@host 'tar -czvf - /remote/dir/' | tar -xzvf - -C /local/dir/
# With zstd for faster transfer
tar -cf - dir/ | zstd -T0 | ssh user@host 'zstd -d | tar -xf - -C /target/'
# With progress and bandwidth limit
tar -cf - dir/ | pv -L 10m | ssh user@host 'tar -xf - -C /target/'
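# The SSH one-liners are two tar processes joined by a pipe. The sketch below
# runs the same pattern locally, without the network hop, so it can be tried
# safely before pointing it at a real host. Directory names are hypothetical.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/src/sub" "$work/dst"
echo one > "$work/src/a.txt"
echo two > "$work/src/sub/b.txt"

# The left tar writes the archive to stdout (-f -); the right one reads stdin
tar -cf - -C "$work/src" . | tar -xf - -C "$work/dst"

# diff -r exits 0 when the two trees have identical contents
diff -r "$work/src" "$work/dst" && echo "trees match"
```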
# Level 0: full backup (creates snapshot file)
tar -czvf backup-full.tar.gz \
--listed-incremental=/var/backups/snapshot.snar dir/
# Level 1: incremental (only changes since last backup)
tar -czvf backup-inc-$(date +%Y%m%d).tar.gz \
--listed-incremental=/var/backups/snapshot.snar dir/
# Restore: apply full, then each incremental in order
tar -xzvf backup-full.tar.gz -C /restore/ --listed-incremental=/dev/null
tar -xzvf backup-inc-20260410.tar.gz -C /restore/ --listed-incremental=/dev/null
--listed-incremental=/dev/null during restore tells tar to extract everything. GNU tar only.
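To see the full/incremental cycle end to end, the following sketch runs a level 0 and a level 1 backup on a scratch tree and then restores both. It assumes GNU tar is available as `tar` or `gtar` and skips itself otherwise; the file names are invented.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# --listed-incremental is GNU-only; Homebrew installs GNU tar as gtar
TAR=tar
if command -v gtar >/dev/null 2>&1; then TAR=gtar; fi

if "$TAR" --version 2>/dev/null | grep -q 'GNU tar'; then
  mkdir dir restore
  echo v1 > dir/keep.txt
  echo v1 > dir/gone.txt

  # Level 0: the snapshot file does not exist yet, so everything is archived
  "$TAR" -czf full.tar.gz --listed-incremental=snapshot.snar dir/

  # Change the tree: add one file, delete another
  sleep 1   # make sure the changes are clearly newer than the snapshot
  echo v1 > dir/new.txt
  rm dir/gone.txt

  # Level 1: same snapshot file, so only the changes are recorded
  "$TAR" -czf inc1.tar.gz --listed-incremental=snapshot.snar dir/

  # Restore in order; the incremental also removes files deleted since level 0
  "$TAR" -xzf full.tar.gz -C restore --listed-incremental=/dev/null
  "$TAR" -xzf inc1.tar.gz -C restore --listed-incremental=/dev/null

  ls restore/dir
else
  echo "GNU tar not found; skipping"
fi
```

Because each incremental run updates the snapshot file, keeping a copy of the level 0 `.snar` lets later level 1 backups all be taken relative to the full backup rather than to each other.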
tar -czvf - dir/ | split -b 100m - archive.tar.gz.part-
cat archive.tar.gz.part-* | tar -xzvf -
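# A small round trip shows that the split parts reassemble byte for byte.
# This sketch uses 100 KB parts and random data so that several parts are
# produced quickly; the sizes and names are arbitrary.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir dir restore
# About 300 KB of incompressible data so the archive spans several parts
head -c 300000 /dev/urandom > dir/blob.bin

# Split the compressed stream into 100 KB pieces (suffixes aa, ab, ac, ...)
tar -czf - dir/ | split -b 100k - archive.tar.gz.part-
ls archive.tar.gz.part-*

# The shell expands the glob in sorted order, which matches the split order
cat archive.tar.gz.part-* | tar -xzf - -C restore

cmp dir/blob.bin restore/dir/blob.bin && echo "round trip OK"
```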
| Feature | GNU tar (Linux) | BSD tar (macOS) |
|---------|----------------|-----------------|
| --wildcards | Required for patterns | Default behavior |
| --exclude-vcs-ignores | Supported | Not supported |
| --zstd | Supported (1.31+) | Newer bsdtar versions only; otherwise pipe through zstd |
| --listed-incremental | Supported | Not supported |
| Extended attributes | --xattrs | Stored by default |
# macOS: install GNU tar for full feature set
brew install gnu-tar # use as 'gtar'
# macOS: zstd with BSD tar
tar -cf - dir/ | zstd > archive.tar.zst
# macOS: avoid resource fork ._* files
COPYFILE_DISABLE=1 tar -czvf archive.tar.gz dir/
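# Portable scripts can check which tar they are running before using
# GNU-only options. The helper below is a hypothetical sketch; it assumes the
# first line of `tar --version` names the implementation, which holds for
# GNU tar and bsdtar.

```shell
#!/bin/sh
# tar_flavor: print gnu, bsd, or unknown for the tar found on PATH
tar_flavor() {
  case "$(tar --version 2>/dev/null | head -n 1)" in
    *"GNU tar"*) echo gnu ;;
    *bsdtar*)    echo bsd ;;
    *)           echo unknown ;;
  esac
}

flavor=$(tar_flavor)
echo "tar flavor: $flavor"

if [ "$flavor" = gnu ]; then
  echo "--listed-incremental and --exclude-vcs-ignores are available"
else
  echo "GNU-only options may be missing; consider installing gtar"
fi
```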
# Logs (repetitive text, compresses 90%+)
gzip -9 app.log
find /var/log -name '*.log' -mtime +7 -exec gzip {} \;
zstd --train logs/*.log -o log-dict.zst && zstd -D log-dict.zst -19 app.log
# Database dumps
mysqldump mydb | zstd -T0 > mydb.sql.zst
pg_dump mydb | gzip -9 > mydb.sql.gz
pg_dump -Fc mydb > mydb.dump # pg custom format (built-in compression)
# Source code distribution
tar -cJvf project.tar.xz --exclude-vcs project/
git archive --format=tar.gz HEAD > project.tar.gz
# Disk images
dd if=/dev/sda bs=4M status=progress | zstd -T0 > disk.img.zst
zstd -d < disk.img.zst | dd of=/dev/sda bs=4M status=progress
qemu-img convert -c -O qcow2 disk.raw disk.qcow2
# Backup with timestamp
tar -czvf "backup-$(date +%Y%m%d-%H%M%S).tar.gz" /path/to/dir/
# Dry run — see what would be extracted
tar -tzvf archive.tar.gz | head -20
# Find largest files in archive
tar -tzvf archive.tar.gz | sort -k3 -n -r | head -20 # GNU tar: size is field 3 (use -k5 with BSD tar)
# Create archive from file list
tar -czvf archive.tar.gz -T filelist.txt
# Create archive excluding large files
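# -type f is required: without it find also prints directories, and tar then archives them recursively
find dir/ -type f -size -10M -print0 | tar -czvf small-files.tar.gz --null -T -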
# Verify archive integrity
gzip -t archive.tar.gz && echo "OK" || echo "CORRUPT"
xz -t archive.tar.xz
zstd -t archive.tar.zst
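# The per-format checks can be wrapped in one function that picks the test
# by file extension. verify_archive is a hypothetical helper name, and the
# truncated file below merely simulates an interrupted download.

```shell
#!/bin/sh
# verify_archive FILE: run the matching integrity test, return its exit status
verify_archive() {
  case "$1" in
    *.tar.gz|*.tgz) gzip -t "$1" ;;
    *.tar.bz2)      bzip2 -t "$1" ;;
    *.tar.xz)       xz -t "$1" ;;
    *.tar.zst)      zstd -t "$1" ;;
    *.tar)          tar -tf "$1" >/dev/null ;;
    *)              echo "unknown archive type: $1" >&2; return 2 ;;
  esac
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/dir" && echo data > "$work/dir/file.txt"
tar -czf "$work/good.tar.gz" -C "$work" dir

# Keep only the first 20 bytes to produce a damaged copy
head -c 20 "$work/good.tar.gz" > "$work/bad.tar.gz"

for f in "$work/good.tar.gz" "$work/bad.tar.gz"; do
  if verify_archive "$f" 2>/dev/null; then
    echo "OK      ${f##*/}"
  else
    echo "CORRUPT ${f##*/}"
  fi
done
```

These tests only validate the compressed stream. Listing the archive with `tar -tf` afterwards additionally checks that the tar structure inside is readable.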