skills/simulation-workflow/simulation-orchestrator/SKILL.md
Orchestrate multi-simulation campaigns — generate parameter sweep configurations (grid, linspace, or Latin Hypercube sampling), initialize and track batch job campaigns, monitor job completion status, and aggregate results with summary statistics across all runs. Use when running a parameter study across dt, kappa, or other simulation inputs, managing dozens or hundreds of simulation configurations, combining outputs from completed batch runs to find the best result, or automating the generate-run-collect workflow for systematic studies, even if the user only says "I need to try many parameter combinations" or "how do I organize a sweep."
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add HeshamFS/materials-simulation-skills simulation-orchestrator
Provide tools to manage multi-simulation campaigns: generate parameter sweeps, track job execution status, and aggregate results from completed runs.
Before running orchestration scripts, collect from the user:
| Input | Description | Example |
|-------|-------------|---------|
| Base config | Template simulation configuration | base_config.json |
| Parameter ranges | Parameters to sweep with bounds | dt:[1e-4,1e-2],kappa:[0.1,1.0] |
| Sweep method | How to sample parameter space | grid, lhs, linspace |
| Output directory | Where to store campaign files | ./campaign_001 |
| Simulation command | Command to run each simulation | python sim.py --config {config} |
Need every combination (full factorial)?
├── YES → Use grid (warning: exponential growth with parameters)
└── NO → Is space-filling coverage needed?
    ├── YES → Use lhs (Latin Hypercube Sampling)
    └── NO → Use linspace for uniform sampling per parameter
| Method | Best For | Sample Count |
|--------|----------|--------------|
| grid | Low dimensions (1-3), need exact corners | n^d (exponential) |
| linspace | 1D sweeps, uniform spacing | n per parameter |
| lhs | High dimensions, space-filling | user-specified budget |
| Parameters | Grid Points Each | Total Runs | Recommendation |
|------------|------------------|------------|----------------|
| 1 | 10 | 10 | Grid is fine |
| 2 | 10 | 100 | Grid acceptable |
| 3 | 10 | 1,000 | Consider LHS |
| 4+ | 10 | 10,000+ | Use LHS or DOE |
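The run counts above follow from multiplying the number of grid points per parameter. A minimal sketch of that arithmetic (the helper name `grid_run_count` is hypothetical and not part of the skill's scripts):

```python
import math


def grid_run_count(points_per_param):
    """Total runs in a full-factorial grid: product of per-parameter point counts."""
    return math.prod(points_per_param)


# Ten points per parameter, for one to four parameters.
for n_params in range(1, 5):
    print(n_params, grid_run_count([10] * n_params))
```

Checking this number before generating configs is a cheap way to decide whether a grid is still affordable or a fixed LHS budget is the better choice.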
| Script | Output Fields |
|--------|---------------|
| scripts/sweep_generator.py | configs, parameter_space, sweep_method, total_runs |
| scripts/campaign_manager.py | campaign_id, status, jobs, progress |
| scripts/job_tracker.py | job_id, status, start_time, end_time, exit_code |
| scripts/result_aggregator.py | summary, statistics, best_run, failed_runs |
Create configurations for all parameter combinations:
python3 scripts/sweep_generator.py \
--base-config base_config.json \
--params "dt:1e-4:1e-2:5,kappa:0.1:1.0:3" \
--method linspace \
--output-dir ./campaign_001 \
--json
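The `--params` string packs several `name:min:max:count` entries into one argument. How `sweep_generator.py` parses it internally is not shown here, so the following is only a sketch under the assumption that entries are comma-separated and that bounds must be finite and counts positive; `parse_param_spec` is a hypothetical name.

```python
import math


def parse_param_spec(spec):
    """Parse comma-separated 'name:min:max[:count]' entries.

    Returns a list of (name, low, high, count) tuples; count is None
    when an entry uses the shorter name:min:max form.
    """
    parsed = []
    for entry in spec.split(","):
        fields = entry.strip().split(":")
        if len(fields) not in (3, 4):
            raise ValueError(f"expected name:min:max[:count], got {entry!r}")
        name = fields[0]
        low, high = float(fields[1]), float(fields[2])
        if not (math.isfinite(low) and math.isfinite(high)):
            raise ValueError(f"bounds must be finite in {entry!r}")
        count = None
        if len(fields) == 4:
            count = int(fields[3])  # raises ValueError for non-integer text
            if count <= 0:
                raise ValueError(f"count must be a positive integer in {entry!r}")
        parsed.append((name, low, high, count))
    return parsed


print(parse_param_spec("dt:1e-4:1e-2:5,kappa:0.1:1.0:3"))
```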
Create campaign tracking structure:
python3 scripts/campaign_manager.py \
--action init \
--config-dir ./campaign_001 \
--command "python sim.py --config {config}" \
--json
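The `{config}` placeholder in the command template is replaced with the path of each generated config. One way this substitution might be done safely is sketched below; `build_argv` and the example file name are hypothetical, and the real `campaign_manager.py` may work differently.

```python
import shlex


def build_argv(template, config_path):
    """Substitute {config} into the template and return an argument list."""
    if "{config}" not in template:
        raise ValueError("command template must contain {config}")
    # Quote the path first so spaces or quotes in it survive the split below.
    command = template.replace("{config}", shlex.quote(config_path))
    return shlex.split(command)


argv = build_argv("python sim.py --config {config}", "./campaign_001/run 0001.json")
print(argv)
```

The resulting list can be handed to `subprocess.run(argv)` directly, so no shell is involved in launching the simulation.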
Monitor running jobs:
python3 scripts/job_tracker.py \
--campaign-dir ./campaign_001 \
--update \
--json
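The tracker reports per-job fields such as `job_id`, `status`, and `exit_code`. As an illustration of turning those records into a progress figure, here is a small sketch; the status names (`pending`, `running`, `completed`, `failed`) and the `summarize_jobs` helper are assumptions rather than the script's documented vocabulary.

```python
from collections import Counter


def summarize_jobs(jobs):
    """Count jobs per status and report the fraction that has finished."""
    counts = Counter(job["status"] for job in jobs)
    finished = counts["completed"] + counts["failed"]
    progress = finished / len(jobs) if jobs else 0.0
    return {"counts": dict(counts), "progress": progress}


jobs = [
    {"job_id": "job_0001", "status": "completed", "exit_code": 0},
    {"job_id": "job_0002", "status": "failed", "exit_code": 1},
    {"job_id": "job_0003", "status": "running", "exit_code": None},
    {"job_id": "job_0004", "status": "pending", "exit_code": None},
]
print(summarize_jobs(jobs))
```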
Combine results from completed runs:
python3 scripts/result_aggregator.py \
--campaign-dir ./campaign_001 \
--metric objective_value \
--json
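To make the aggregation step concrete, the sketch below computes summary statistics for one metric and picks a best run. It assumes that a lower metric value is better and that each result is a flat dict; both are illustrative assumptions, as are the helper names. The numeric check mirrors the rule that booleans, NaN, and Inf are rejected.

```python
import math
import statistics


def is_valid_metric(value):
    """Accept only finite int/float values; bool is rejected explicitly."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return math.isfinite(value)


def aggregate(results, metric):
    """Summarize a metric over usable runs and pick the lowest-valued run."""
    valid = [r for r in results if is_valid_metric(r.get(metric))]
    if not valid:
        raise ValueError("no completed runs with a usable metric value")
    values = [r[metric] for r in valid]
    return {
        "statistics": {
            "count": len(values),
            "mean": statistics.fmean(values),
            "min": min(values),
            "max": max(values),
        },
        "best_run": min(valid, key=lambda r: r[metric]),
    }


runs = [
    {"job_id": "job_0001", "objective_value": 0.42},
    {"job_id": "job_0002", "objective_value": 0.17},
    {"job_id": "job_0003", "objective_value": float("nan")},
    {"job_id": "job_0004", "objective_value": True},
]
summary = aggregate(runs, "objective_value")
print(summary["best_run"]["job_id"], summary["statistics"]["count"])
```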
# Generate 5x3=15 runs varying dt (5 values) and kappa (3 values)
python3 scripts/sweep_generator.py \
--base-config sim.json \
--params "dt:1e-4:1e-2:5,kappa:0.1:1.0:3" \
--method linspace \
--output-dir ./sweep_001 \
--json
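The 5x3=15 count comes from combining every dt value with every kappa value. A rough sketch of that expansion, assuming linearly spaced values and a flat base config (the `steps` key, `linspace`, and `expand_sweep` are hypothetical names used only for illustration):

```python
import itertools


def linspace(low, high, count):
    """Return `count` evenly spaced values from low to high inclusive."""
    if count == 1:
        return [low]
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def expand_sweep(base_config, param_values):
    """Yield one config dict per combination of the per-parameter values."""
    names = list(param_values)
    for combo in itertools.product(*(param_values[n] for n in names)):
        config = dict(base_config)  # shallow copy; the base stays untouched
        config.update(zip(names, combo))
        yield config


values = {
    "dt": linspace(1e-4, 1e-2, 5),
    "kappa": linspace(0.1, 1.0, 3),
}
configs = list(expand_sweep({"steps": 1000}, values))
print(len(configs))  # 15
```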
# Generate LHS samples for 4 parameters with budget of 20 runs
python3 scripts/sweep_generator.py \
--base-config sim.json \
--params "dt:1e-4:1e-2,kappa:0.1:1.0,M:1e-6:1e-4,W:0.5:2.0" \
--method lhs \
--samples 20 \
--output-dir ./lhs_001 \
--json
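Latin Hypercube Sampling splits each parameter range into as many equal strata as there are samples and places exactly one sample in each stratum, which is why 20 runs can still cover four parameters evenly. The following standard-library sketch illustrates the idea; it is not the implementation used by `sweep_generator.py`, and `latin_hypercube` is a hypothetical helper.

```python
import random


def latin_hypercube(bounds, n_samples, seed=None):
    """Draw n_samples points with one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each of the n equal-width strata of [0, 1).
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(unit)  # decouple the stratum order across parameters
        columns[name] = [low + u * (high - low) for u in unit]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


bounds = {"dt": (1e-4, 1e-2), "kappa": (0.1, 1.0), "M": (1e-6, 1e-4), "W": (0.5, 2.0)}
samples = latin_hypercube(bounds, n_samples=20, seed=0)
print(len(samples))  # 20
```

Where SciPy is available, `scipy.stats.qmc.LatinHypercube` provides a library implementation of the same sampling scheme.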
# Check campaign status
python3 scripts/campaign_manager.py \
--action status \
--config-dir ./sweep_001 \
--json
# Get summary statistics from completed runs
python3 scripts/result_aggregator.py \
--campaign-dir ./sweep_001 \
--metric final_energy \
--json
User: I want to run a parameter sweep on dt and kappa for my phase-field simulation. I want to try 5 values of dt between 1e-4 and 1e-2, and 4 values of kappa between 0.1 and 1.0.
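Agent workflow: generate the 5 x 4 = 20 configurations, initialize the campaign, then aggregate the results once the runs have completed: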
python3 scripts/sweep_generator.py \
--base-config simulation.json \
--params "dt:1e-4:1e-2:5,kappa:0.1:1.0:4" \
--method linspace \
--output-dir ./dt_kappa_sweep \
--json
python3 scripts/campaign_manager.py \
--action init \
--config-dir ./dt_kappa_sweep \
--command "python phase_field.py --config {config}" \
--json
python3 scripts/result_aggregator.py \
--campaign-dir ./dt_kappa_sweep \
--metric interface_width \
--json
| Error | Cause | Resolution |
|-------|-------|------------|
| Base config not found | Invalid file path | Verify base config file exists |
| Invalid parameter format | Malformed param string | Use format name:min:max:count or name:min:max |
| Output directory exists | Would overwrite | Use --force or choose new directory |
| No completed jobs | No results to aggregate | Wait for jobs to complete or check for failures |
| Metric not found | Result files missing field | Verify metric name in result JSON |
The simulation-orchestrator works with other simulation-workflow skills:
parameter-optimization          simulation-orchestrator
          │                                │
          │  DOE samples ─────────────────>│  Generate configs
          │                                │
          │                                │  Run simulations
          │                                │
          │<───────────────────────────────│  Aggregate results
          │                                │
          │  Sensitivity analysis          │
          │  Optimizer selection           │
Typical combined workflow:

1. Use `parameter-optimization/doe_generator.py` to get sample points
2. Use `simulation-orchestrator/sweep_generator.py` to create configs
3. Use `simulation-orchestrator/result_aggregator.py` to collect results
4. Use `parameter-optimization/sensitivity_summary.py` to analyze

Input validation and safety:

- Keys must match `[a-zA-Z_][a-zA-Z0-9_.]*` to prevent traversal or injection via crafted keys
- `campaign_manager.py` validates command templates to reject shell chaining operators (`;`, `|`, `&`, backticks, `$`)
- `--params` format strings are parsed and validated (`name:min:max:count` with finite numeric bounds and positive integer counts)
- `--method` is validated against a fixed allowlist (`grid`, `linspace`, `lhs`)
- `--samples` is validated as a positive integer with an upper bound
- `--action` is validated against a fixed allowlist (`init`, `status`)
- `sweep_generator.py` reads a single base config file (JSON) specified by `--base-config` and writes generated configs to `--output-dir`
- `result_aggregator.py` enforces a 10 MB file-size limit per result file, maximum JSON nesting depth, and strict numeric type checking (rejects bool, NaN, Inf)
- Quoting uses `shlex.quote()`
- `allowed-tools` excludes Bash to prevent the agent from executing arbitrary commands when processing untrusted simulation outputs
- No `eval()`, `exec()`, or dynamic code generation
- Subprocesses are launched without a shell (no `shell=True`)

References:

- `references/campaign_patterns.md` - Common campaign structures
- `references/sweep_strategies.md` - Parameter sweep design guidance
- `references/aggregation_methods.md` - Result aggregation techniques
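The key pattern and the command-template check mentioned above could look roughly like the sketch below. The function names are hypothetical, and the actual scripts may apply additional rules beyond what is shown.

```python
import re

# Pattern quoted in the notes above; the whole key must match it.
KEY_PATTERN = re.compile(r"[a-zA-Z_][a-zA-Z0-9_.]*")
# Shell chaining characters listed above: ; | & backtick $
FORBIDDEN_IN_COMMAND = (";", "|", "&", "`", "$")


def is_safe_key(key):
    """True when the whole key matches the allowed identifier pattern."""
    return KEY_PATTERN.fullmatch(key) is not None


def is_safe_command_template(template):
    """True when the template contains none of the shell chaining characters."""
    return not any(token in template for token in FORBIDDEN_IN_COMMAND)


print(is_safe_key("solver.dt"), is_safe_key("../etc/passwd"))
print(is_safe_command_template("python sim.py --config {config}"))
print(is_safe_command_template("python sim.py; rm -rf ."))
```

Using `fullmatch` rather than `match` matters here: a key that merely starts with a valid identifier but continues with `/` or other characters is rejected.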