plugins/bash-master/skills/parallel-processing-patterns/SKILL.md
Parallel and concurrent processing patterns in bash including GNU Parallel, xargs -P, job pools, and async patterns (2025). PROACTIVELY activate for: (1) running tasks in parallel from bash, (2) xargs -P for parallel command execution, (3) GNU parallel for complex distribution, (4) background jobs with & and wait, (5) implementing a concurrency cap (job pool of N workers), (6) collecting exit codes from parallel children, (7) preventing race conditions in shared output files, (8) parallel file processing across many cores. Provides: xargs -P recipes, GNU parallel quick reference, job-pool implementations in pure bash, exit-code aggregation patterns, and FIFO-based work queues.
npx skillsauth add JosiahSiegel/claude-plugin-marketplace parallel-processing-patternsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
MANDATORY: Always Use Backslashes on Windows for File Paths
When using Edit or Write tools on Windows, you MUST use backslashes (\) in file paths, NOT forward slashes (/).
Comprehensive guide to parallel and concurrent execution in bash, covering GNU Parallel, xargs parallelization, job control, worker pools, and modern async patterns for maximum performance.
# Debian/Ubuntu
sudo apt-get install parallel
# macOS
brew install parallel
# From source
wget https://ftp.gnu.org/gnu/parallel/parallel-latest.tar.bz2
tar -xjf parallel-latest.tar.bz2
cd parallel-*
./configure && make && sudo make install
#!/usr/bin/env bash
set -euo pipefail
# Process multiple files in parallel
parallel gzip ::: *.txt
# Equivalent to:
# for f in *.txt; do gzip "$f"; done
# But runs in parallel!
# Using find with parallel
find . -name "*.jpg" | parallel convert {} -resize 50% resized/{}
# Specify number of jobs
parallel -j 8 process_file ::: *.dat
# From stdin
cat urls.txt | parallel -j 10 wget -q
# Multiple inputs
parallel echo ::: A B C ::: 1 2 3
# Output: A 1, A 2, A 3, B 1, B 2, B 3, C 1, C 2, C 3
# Paired inputs with :::+
parallel echo ::: A B C :::+ 1 2 3
# Output: A 1, B 2, C 3
#!/usr/bin/env bash
set -euo pipefail
# Input from file
parallel -a input.txt process_line
# Multiple input files
parallel -a file1.txt -a file2.txt 'echo {1} {2}'
# Column-based input
cat data.tsv | parallel --colsep '\t' 'echo Name: {1}, Value: {2}'
# Named columns
cat data.csv | parallel --header : --colsep ',' 'echo {name}: {value}'
# Null-delimited for safety with special characters
find . -name "*.txt" -print0 | parallel -0 wc -l
# Line-based chunking
cat huge_file.txt | parallel --pipe -N1000 'wc -l'
#!/usr/bin/env bash
set -euo pipefail
# {} - Full input
parallel echo 'Processing: {}' ::: file1.txt file2.txt
# {.} - Remove extension
parallel echo '{.}' ::: file.txt file.csv
# Output: file, file
# {/} - Basename
parallel echo '{/}' ::: /path/to/file.txt
# Output: file.txt
# {//} - Directory path
parallel echo '{//}' ::: /path/to/file.txt
# Output: /path/to
# {/.} - Basename without extension
parallel echo '{/.}' ::: /path/to/file.txt
# Output: file
# {#} - Job number (1-based)
parallel echo 'Job {#}: {}' ::: A B C
# {%} - Slot number (recycled job slot)
parallel -j 2 'echo "Slot {%}: {}"' ::: A B C D E
# Combined
parallel 'convert {} -resize 50% {//}/thumb_{/.}.jpg' ::: *.png
#!/usr/bin/env bash
set -euo pipefail
# Show progress bar
parallel --bar process_item ::: {1..100}
# Progress with ETA
parallel --progress process_item ::: {1..100}
# Verbose output
parallel --verbose gzip ::: *.txt
# Log to file
parallel --joblog jobs.log gzip ::: *.txt
# Resume from where it left off (skip completed jobs)
parallel --joblog jobs.log --resume gzip ::: *.txt
# Results logging
parallel --results results_dir 'echo {1} + {2}' ::: 1 2 3 ::: 4 5 6
# Creates: results_dir/1/4/stdout, results_dir/1/4/stderr, etc.
#!/usr/bin/env bash
set -euo pipefail
# CPU-based parallelism (number of cores)
parallel -j "$(nproc)" process_item ::: {1..1000}
# Leave some cores free
parallel -j '-2' process_item ::: {1..1000} # nproc - 2
# Percentage of cores
parallel -j '50%' process_item ::: {1..1000}
# Load-based throttling
parallel --load 80% process_item ::: {1..1000}
# Memory-based throttling
parallel --memfree 2G process_item ::: {1..1000}
# Rate limiting (max jobs per second)
parallel -j 4 --delay 0.5 wget ::: url1 url2 url3 url4
# Timeout per job
parallel --timeout 60 long_process ::: {1..100}
# Retry failed jobs
parallel --retries 3 flaky_process ::: {1..100}
#!/usr/bin/env bash
set -euo pipefail
# Run on multiple servers
parallel --sshloginfile servers.txt process_item ::: {1..1000}
# servers.txt format:
# 4/server1.example.com (4 jobs on server1)
# 8/server2.example.com (8 jobs on server2)
# : (local machine)
# Transfer files before execution
parallel --sshloginfile servers.txt --transferfile {} process {} ::: *.dat
# Return results
parallel --sshloginfile servers.txt --return {.}.result process {} ::: *.dat
# Cleanup after transfer
parallel --sshloginfile servers.txt --transfer --return {.}.out --cleanup \
'process {} > {.}.out' ::: *.dat
# Environment variables
export MY_VAR="value"
parallel --env MY_VAR --sshloginfile servers.txt 'echo $MY_VAR' ::: A B C
#!/usr/bin/env bash
set -euo pipefail
# Pipe mode - distribute stdin across workers
cat huge_file.txt | parallel --pipe -N1000 'sort | uniq -c'
# Block size for pipe mode
cat data.bin | parallel --pipe --block 10M 'process_chunk'
# Keep order of output
parallel --keep-order 'sleep $((RANDOM % 3)); echo {}' ::: A B C D E
# Group output (don't mix output from different jobs)
parallel --group 'for i in 1 2 3; do echo "Job {}: line $i"; done' ::: A B C
# Tag output with job identifier
parallel --tag 'echo "output from {}"' ::: A B C
# Sequence output (output as they complete, but grouped)
parallel --ungroup 'echo "Starting {}"; sleep 1; echo "Done {}"' ::: A B C
#!/usr/bin/env bash
set -euo pipefail
# -P for parallel jobs
find . -name "*.txt" | xargs -P 4 -I {} gzip {}
# -n for items per command
echo {1..100} | xargs -n 10 -P 4 echo "Batch:"
# Null-delimited for safety
find . -name "*.txt" -print0 | xargs -0 -P 4 -I {} process {}
# Multiple arguments per process
cat urls.txt | xargs -P 10 -n 5 wget -q
# Limit max total arguments
echo {1..1000} | xargs -P 4 --max-args=50 echo
#!/usr/bin/env bash
set -euo pipefail
# Use sh -c for complex commands
find . -name "*.jpg" -print0 | \
xargs -0 -P 4 -I {} sh -c 'convert "$1" -resize 50% "thumb_$(basename "$1")"' _ {}
# Multiple placeholders
paste file1.txt file2.txt | \
xargs -P 4 -n 2 sh -c 'diff "$1" "$2" > "diff_$(basename "$1" .txt).patch"' _
# Process in batches
find . -name "*.log" -print0 | \
xargs -0 -P 4 -n 100 tar -czvf logs_batch.tar.gz
# With failure handling
find . -name "*.dat" -print0 | \
xargs -0 -P 4 -I {} sh -c 'process "$1" || echo "Failed: $1" >> failures.log' _ {}
#!/usr/bin/env bash
set -euo pipefail
# Track background jobs
declare -a PIDS=()
# Start jobs
for item in {1..10}; do
process_item "$item" &
PIDS+=($!)
done
# Wait for all
for pid in "${PIDS[@]}"; do
wait "$pid"
done
echo "All jobs complete"
# Or wait for any to complete
wait -n # Bash 4.3+
echo "At least one job complete"
#!/usr/bin/env bash
set -euo pipefail
# Maximum concurrent jobs
MAX_JOBS=4
# Simple semaphore using a counter
job_count=0
run_with_limit() {
local cmd=("$@")
# Wait if at limit
while ((job_count >= MAX_JOBS)); do
wait -n 2>/dev/null || true
((job_count--))
done
# Start new job
"${cmd[@]}" &
((job_count++))
}
# Usage
for item in {1..20}; do
run_with_limit process_item "$item"
done
# Wait for remaining
wait
#!/usr/bin/env bash
set -euo pipefail
MAX_JOBS=4
JOB_FIFO="/tmp/job_pool_$$"
# Create job slots
mkfifo "$JOB_FIFO"
trap 'rm -f "$JOB_FIFO"' EXIT
# Initialize slots
exec 3<>"$JOB_FIFO"
for ((i=0; i<MAX_JOBS; i++)); do
echo >&3
done
# Run with slot
run_with_slot() {
local cmd=("$@")
read -u 3 # Acquire slot (blocks if none available)
{
"${cmd[@]}"
echo >&3 # Release slot
} &
}
# Usage
for item in {1..20}; do
run_with_slot process_item "$item"
done
wait
exec 3>&-
#!/usr/bin/env bash
set -euo pipefail
WORK_QUEUE="/tmp/work_queue_$$"
RESULT_QUEUE="/tmp/result_queue_$$"
NUM_WORKERS=4
mkfifo "$WORK_QUEUE" "$RESULT_QUEUE"
trap 'rm -f "$WORK_QUEUE" "$RESULT_QUEUE"' EXIT
# Worker function
worker() {
local id="$1"
while read -r task; do
[[ "$task" == "STOP" ]] && break
# Process task
local result
result=$(process_task "$task" 2>&1)
echo "RESULT:$id:$task:$result"
done
}
# Start workers
for ((i=0; i<NUM_WORKERS; i++)); do
worker "$i" < "$WORK_QUEUE" > "$RESULT_QUEUE" &
done
# Result collector (background)
collect_results() {
while read -r line; do
[[ "$line" == "DONE" ]] && break
echo "$line" >> results.txt
done < "$RESULT_QUEUE"
} &
COLLECTOR_PID=$!
# Producer - send work
{
for task in "${TASKS[@]}"; do
echo "$task"
done
# Stop signals for workers
for ((i=0; i<NUM_WORKERS; i++)); do
echo "STOP"
done
} > "$WORK_QUEUE"
# Signal end of results
wait # Wait for workers
echo "DONE" > "$RESULT_QUEUE"
wait "$COLLECTOR_PID"
Longer examples for PID collection, wait -n, bounded concurrency, fail-fast job pools, progress reporting, signal-safe cleanup, batch sizing, CPU-count tuning, memory-aware concurrency, and benchmarking live in references/modern-async-patterns.md and references/performance-optimization.md. Load those references for complex orchestration or tuning beyond the core GNU Parallel / xargs / job-control patterns.
#!/usr/bin/env bash
set -euo pipefail
# Track failures
declare -A FAILURES
parallel_with_retry() {
local max_retries=3
local items=("$@")
for item in "${items[@]}"; do
local retries=0
local success=false
while ((retries < max_retries)) && ! $success; do
if process_item "$item"; then
success=true
else
((retries++))
echo "Retry $retries for $item" >&2
sleep $((retries * 2)) # Exponential backoff
fi
done
if ! $success; then
FAILURES["$item"]="Failed after $max_retries retries"
fi
done &
wait
}
# Report failures
report_failures() {
if ((${#FAILURES[@]} > 0)); then
echo "Failures:" >&2
for item in "${!FAILURES[@]}"; do
echo " $item: ${FAILURES[$item]}" >&2
done
return 1
fi
}
#!/usr/bin/env bash
set -euo pipefail
# Global cancellation flag
CANCELLED=false
declare -a WORKER_PIDS=()
cancel_all() {
CANCELLED=true
for pid in "${WORKER_PIDS[@]}"; do
kill "$pid" 2>/dev/null || true
done
}
trap cancel_all SIGINT SIGTERM
cancellable_worker() {
local id="$1"
while ! $CANCELLED; do
# Check for work
if work=$(get_next_work); then
process_work "$work"
else
sleep 0.1
fi
done
}
# Start workers
for ((i=0; i<NUM_WORKERS; i++)); do
cancellable_worker "$i" &
WORKER_PIDS+=($!)
done
# Wait with interrupt support
wait || true
Master parallel processing for efficient multi-core utilization and faster script execution.
development
This skill should be used when the user asks to train, debug, scale, or improve ML models. PROACTIVELY activate for: (1) PyTorch, TensorFlow/Keras, JAX, Flax, Hugging Face Trainer/Accelerate training loops, (2) distributed training, DDP/FSDP/DeepSpeed, TPU/GPU setup, (3) mixed precision AMP/bf16, gradient accumulation, checkpointing, seeding, (4) overfitting, imbalance, loss functions, regularization, LR schedules, warmup, (5) memory optimization, gradient checkpointing, offloading, quantization-aware training. Provides: reproducible training best practices across deep learning and classical ML.
development
This skill should be used when the user asks to productionize, track, version, govern, monitor, or automate ML systems. PROACTIVELY activate for: (1) MLflow, Weights & Biases, Neptune, Comet, ClearML experiment tracking, (2) model registry, model versioning, artifact lineage, reproducibility, (3) Kubeflow, SageMaker Pipelines, Vertex AI Pipelines, Azure ML pipelines, Databricks workflows, (4) CI/CD, continuous training/evaluation, A/B tests, canary/shadow deployments, (5) drift detection, model monitoring, data validation, responsible AI governance. Provides: end-to-end MLOps architecture and operational safeguards.
development
This skill should be used when the user asks to optimize, export, serve, compress, or accelerate ML inference. PROACTIVELY activate for: (1) latency, throughput, p95/p99, batching, concurrency, KV cache, memory, or cost issues, (2) quantization INT8/INT4, GPTQ, AWQ, bitsandbytes, pruning, sparsity, distillation, (3) ONNX export, ONNX Runtime, TensorRT, TorchScript, torch.compile, XLA, OpenVINO, Core ML, TFLite, (4) Triton, TorchServe, TF Serving, BentoML, Seldon, KServe configuration, (5) edge deployment, CPU/GPU/TPU/Inferentia serving. Provides: hardware-aware inference optimization and safe benchmarking.
testing
This skill should be used when the user asks to tune hyperparameters, run sweeps, optimize search spaces, or use AutoML. PROACTIVELY activate for: (1) Optuna, Ray Tune, FLAML, AutoGluon, Hyperopt, Nevergrad, KerasTuner, W&B sweeps, (2) grid search, random search, Bayesian optimization, TPE, Gaussian processes, evolutionary search, (3) ASHA, Hyperband, successive halving, multi-fidelity optimization, population-based training, (4) learning-rate finder, batch-size search, early stopping, pruning, (5) reproducible sweep design and experiment analysis. Provides: budget-aware hyperparameter search strategy.