skills/simulation-workflow/simulation-validator/SKILL.md
Validate simulations across three stages — run pre-flight checks on configuration files (parameter ranges, required fields, disk space), monitor runtime logs for residual growth, NaN/Inf, and adaptive dt collapse, and perform post-flight validation of results (physical bounds, mass/energy conservation, convergence). Diagnose failed simulations with probable-cause analysis and recommended fixes. Use when preparing to launch a simulation, checking whether a running job is healthy, verifying that finished results are trustworthy, or debugging a crash or blow-up, even if the user only says "my simulation crashed" or "can I trust these results."
Install this skill globally with one command: `npx skillsauth add HeshamFS/materials-simulation-skills simulation-validator`. Works with Claude Code, Cursor, and Windsurf.
Provide a three-stage validation protocol: pre-flight checks, runtime monitoring, and post-flight validation for materials simulations.
Before running validation scripts, collect from the user:
| Input | Description | Example |
|-------|-------------|---------|
| Config file | Simulation configuration (JSON/YAML) | simulation.json |
| Log file | Runtime output log | simulation.log |
| Metrics file | Post-run metrics (JSON) | results.json |
| Required params | Parameters that must exist | dt,dx,kappa |
| Valid ranges | Parameter bounds | dt:1e-6:1e-2 |
Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│ └── BLOCK status? → Fix issues, do NOT run simulation
│ └── WARN status? → Review warnings, document if accepted
│ └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│ └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│ └── Failed checks? → Do NOT use results
│ → Run failure_diagnoser.py
│ └── All passed? → Results are valid

| Metric | Conservative | Standard | Relaxed |
|--------|--------------|----------|---------|
| Mass tolerance | 1e-6 | 1e-3 | 1e-2 |
| Residual growth | 2x | 10x | 100x |
| dt reduction | 10x | 100x | 1000x |

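The presets in the table can be kept in one place and turned into command-line flags. The following is a minimal sketch, assuming the three rows correspond to the `--mass-tol`, `--residual-growth`, and `--dt-drop` flags used elsewhere in this document. The `PRESETS` dictionary and the `flags_for` helper are hypothetical and are not part of the skill's scripts.

```python
# Hypothetical helper (not part of the skill): threshold presets from the
# table, keyed by how strict the validation should be.
PRESETS = {
    "conservative": {"mass_tol": 1e-6, "residual_growth": 2.0, "dt_drop": 10.0},
    "standard": {"mass_tol": 1e-3, "residual_growth": 10.0, "dt_drop": 100.0},
    "relaxed": {"mass_tol": 1e-2, "residual_growth": 100.0, "dt_drop": 1000.0},
}


def flags_for(preset: str) -> dict[str, list[str]]:
    """Return the CLI flags for each script under the named preset."""
    try:
        values = PRESETS[preset]
    except KeyError:
        raise ValueError(f"unknown preset: {preset!r}") from None
    return {
        # Assumed mapping: residual growth and dt reduction go to the monitor.
        "runtime_monitor": [
            "--residual-growth", str(values["residual_growth"]),
            "--dt-drop", str(values["dt_drop"]),
        ],
        # Assumed mapping: mass tolerance goes to the result validator.
        "result_validator": ["--mass-tol", str(values["mass_tol"])],
    }
```

Keeping the presets in a single table-like structure makes it harder for the monitor and the validator to drift onto different strictness levels during the same campaign.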
| Script | Output Fields |
|--------|---------------|
| scripts/preflight_checker.py | report.status, report.blockers, report.warnings |
| scripts/runtime_monitor.py | alerts, residual_stats, dt_stats |
| scripts/result_validator.py | checks, confidence_score, failed_checks |
| scripts/failure_diagnoser.py | probable_causes, recommended_fixes |
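The output fields above can drive a simple gate before launching a run. This is a sketch only: it assumes the `--json` output of `preflight_checker.py` is an object with a top-level `report` key holding `status`, `blockers`, and `warnings`, as the dotted field names suggest. The exact JSON shape and the `gate_on_preflight` helper are assumptions, not something the document confirms.

```python
import json


def gate_on_preflight(raw_json: str) -> str:
    """Turn a pre-flight JSON report into a next-step message.

    Assumed shape: {"report": {"status": ..., "blockers": [...], "warnings": [...]}}
    """
    report = json.loads(raw_json)["report"]
    status = report["status"]
    if status == "BLOCK":
        # Critical issues: the simulation must not be started.
        return "fix before running: " + "; ".join(report.get("blockers", []))
    if status == "WARN":
        # Non-critical issues: review and document any that are accepted.
        return "review and document: " + "; ".join(report.get("warnings", []))
    if status == "PASS":
        return "proceed"
    raise ValueError(f"unexpected status: {status!r}")


# Illustrative payload, written by hand for this example.
sample = '{"report": {"status": "BLOCK", "blockers": ["dt out of range"], "warnings": []}}'
message = gate_on_preflight(sample)
```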
Run `scripts/preflight_checker.py --config simulation.json`. With explicit required parameters, ranges, and disk-space limit:
python3 scripts/preflight_checker.py \
--config simulation.json \
--required dt,dx,kappa \
--ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
--min-free-gb 1.0 \
--json
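The `--ranges` value in the command above uses a `name:min:max` format with comma-separated entries. The sketch below shows one way such a string could be parsed and applied to a loaded config. It is an illustration of the format, not the code inside `preflight_checker.py`; the `parse_ranges` and `out_of_range` names are hypothetical.

```python
import math


def parse_ranges(spec: str) -> dict[str, tuple[float, float]]:
    """Parse 'dt:1e-6:1e-2,dx:1e-4:1e-1' into {name: (min, max)}."""
    ranges = {}
    for entry in spec.split(","):
        parts = entry.strip().split(":")
        if len(parts) != 3:
            raise ValueError(f"expected name:min:max, got {entry!r}")
        name, low_text, high_text = parts
        low, high = float(low_text), float(high_text)
        # Reject inf/nan bounds and inverted intervals.
        if not (math.isfinite(low) and math.isfinite(high)):
            raise ValueError(f"bounds must be finite in {entry!r}")
        if low > high:
            raise ValueError(f"min exceeds max in {entry!r}")
        ranges[name] = (low, high)
    return ranges


def out_of_range(config: dict, ranges: dict[str, tuple[float, float]]) -> list[str]:
    """Return the names of parameters that are present but outside their bounds."""
    return [
        name
        for name, (low, high) in ranges.items()
        if name in config and not (low <= config[name] <= high)
    ]
```

For example, `out_of_range({"dt": 0.1, "dx": 0.01}, parse_ranges("dt:1e-6:1e-2,dx:1e-4:1e-1"))` flags only `dt`, since 0.1 exceeds the upper bound of 1e-2.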
Run `scripts/runtime_monitor.py --log simulation.log` periodically. With explicit thresholds:
python3 scripts/runtime_monitor.py \
--log simulation.log \
--residual-growth 10.0 \
--dt-drop 100.0 \
--json
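To make the two thresholds concrete, here is a sketch of how residual growth and dt collapse could be detected from values already extracted from a log. It assumes "residual growth" means the latest residual divided by the smallest residual seen, and "dt drop" means the largest dt seen divided by the latest dt. Both definitions, and the `check_series` function, are assumptions for illustration; `runtime_monitor.py` may define them differently.

```python
import math
from itertools import chain


def check_series(residuals, dts, residual_growth=10.0, dt_drop=100.0):
    """Return alert strings for non-finite values, residual growth, and dt collapse."""
    alerts = []

    # Any NaN or Inf in either series is treated as an immediate alert.
    if any(not math.isfinite(x) for x in chain(residuals, dts)):
        alerts.append("non-finite value")

    # Assumed definition: latest residual relative to the smallest one so far.
    good_res = [r for r in residuals if math.isfinite(r) and r > 0]
    if good_res and good_res[-1] / min(good_res) > residual_growth:
        alerts.append("residual growth")

    # Assumed definition: largest dt so far relative to the latest one.
    good_dt = [d for d in dts if math.isfinite(d) and d > 0]
    if good_dt and max(good_dt) / good_dt[-1] > dt_drop:
        alerts.append("dt collapse")

    return alerts


alerts = check_series([1.0, 0.5, 0.1, 2.0], [1e-3, 1e-3, 1e-6])
```

In this example the residual rises from 0.1 to 2.0 (a factor of 20) and dt falls from 1e-3 to 1e-6 (a factor of 1000), so both alerts fire under the standard thresholds.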
Run `scripts/result_validator.py --metrics results.json`. With explicit bounds and mass tolerance:
python3 scripts/result_validator.py \
--metrics results.json \
--bound-min 0.0 \
--bound-max 1.0 \
--mass-tol 1e-3 \
--json
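The checks behind the flags above can be pictured as follows. This sketch assumes a metrics dictionary with the hypothetical keys `field_min`, `field_max`, `mass_initial`, and `mass_final`, treats `--mass-tol` as a relative drift tolerance, and computes the confidence score as the fraction of checks passed. None of these details are confirmed by the document; the real `result_validator.py` and metrics file may differ.

```python
def validate(metrics, bound_min=0.0, bound_max=1.0, mass_tol=1e-3):
    """Run bound and mass-conservation checks on a metrics dictionary.

    All metric key names here are hypothetical placeholders.
    """
    mass_drift = abs(metrics["mass_final"] - metrics["mass_initial"])
    checks = {
        "lower_bound": metrics["field_min"] >= bound_min,
        "upper_bound": metrics["field_max"] <= bound_max,
        # Assumed: tolerance is relative to the initial mass.
        "mass_conserved": mass_drift <= mass_tol * abs(metrics["mass_initial"]),
    }
    failed = [name for name, passed in checks.items() if not passed]
    return {
        "checks": checks,
        "failed_checks": failed,
        # Assumed: score is the fraction of checks that passed.
        "confidence_score": (len(checks) - len(failed)) / len(checks),
    }


result = validate(
    {"field_min": 0.0, "field_max": 1.02, "mass_initial": 100.0, "mass_final": 100.05}
)
```

Here the field overshoots the upper bound of 1.0 while mass drift stays within 0.1% of the initial mass, so only `upper_bound` fails.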
When validation fails:
python3 scripts/failure_diagnoser.py --log simulation.log --json
User: My phase field simulation crashed after 1000 steps. Can you help me figure out why?
Agent workflow:
python3 scripts/failure_diagnoser.py --log simulation.log --json
python3 scripts/runtime_monitor.py --log simulation.log --json
| Error | Cause | Resolution |
|-------|-------|------------|
| Config not found | File path invalid | Verify config path exists |
| Non-numeric value | Parameter is not a number | Fix config file format |
| out of range | Parameter outside bounds | Adjust parameter or bounds |
| Output directory not writable | Permission issue | Check directory permissions |
| Insufficient disk space | Disk nearly full | Free up space or reduce output |

| Status | Meaning | Action |
|--------|---------|--------|
| PASS | All checks passed | Proceed with confidence |
| WARN | Non-critical issues found | Review and document |
| BLOCK | Critical issues found | Must fix before proceeding |


| Score | Meaning |
|-------|---------|
| 1.0 | All validation checks passed |
| 0.75+ | Most checks passed, minor issues |
| 0.5-0.75 | Significant issues, review carefully |
| < 0.5 | Major problems, do not trust results |

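The score bands can be applied mechanically once a `confidence_score` is available. The sketch below is one reading of the table: it assumes scores lie in the range 0 to 1 and that a score of exactly 0.75 belongs to the "0.75+" band, which the table leaves ambiguous. The `interpret_score` function is hypothetical.

```python
def interpret_score(score: float) -> str:
    """Map a confidence score in [0, 1] to the meaning given in the table."""
    # The chained comparison is False for NaN, so NaN is rejected here too.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0, 1], got {score!r}")
    if score == 1.0:
        return "All validation checks passed"
    if score >= 0.75:  # assumed: 0.75 itself falls in the upper band
        return "Most checks passed, minor issues"
    if score >= 0.5:
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"
```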

| Pattern in Log | Likely Cause | Recommended Fix |
|----------------|--------------|-----------------|
| NaN, Inf, overflow | Numerical instability | Reduce dt, increase damping |
| max iterations, did not converge | Solver failure | Tune preconditioner, tolerances |
| out of memory | Memory exhaustion | Reduce mesh, enable out-of-core |
| dt reduced | Adaptive stepping triggered | May be okay if controlled |

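The pattern table lends itself to a small lookup over the log text. The following sketch uses regular expressions written for this example; they are not the patterns from `references/log_patterns.md`, and the `SIGNATURES` list and `diagnose` function are hypothetical. The returned keys mirror the `probable_causes` and `recommended_fixes` fields listed for `failure_diagnoser.py`.

```python
import re

# Illustrative signatures taken from the table; the regexes are assumptions.
SIGNATURES = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]


def diagnose(log_text: str) -> dict[str, list[str]]:
    """Collect the causes and fixes for every signature found in the log."""
    causes, fixes = [], []
    for pattern, cause, fix in SIGNATURES:
        if pattern.search(log_text):
            causes.append(cause)
            fixes.append(fix)
    return {"probable_causes": causes, "recommended_fixes": fixes}


diagnosis = diagnose("step 1000: residual = NaN\nINFO: dt reduced to 1e-9\n")
```

The word boundaries around `nan`, `inf`, and `overflow` keep ordinary log prefixes such as `INFO` from being mistaken for an `Inf` value.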

- `--required` parameter names are validated against a safe-character allowlist
- `--ranges` entries are parsed as `name:min:max` with finite numeric bounds enforced
- `--min-free-gb` is validated as a finite positive number
- `--residual-growth` and `--dt-drop` thresholds are validated as finite positive numbers
- `--bound-min`, `--bound-max`, and `--mass-tol` are validated as finite numbers with `bound-max > bound-min`
- `preflight_checker.py` reads a single user-specified config file (JSON/YAML) and checks disk space on the output directory
- `runtime_monitor.py` reads a single log file specified by `--log`; log files are size-limited (500 MB max) before parsing
- `result_validator.py` reads a single metrics file (JSON) specified by `--metrics`
- `failure_diagnoser.py` reads a single log file specified by `--log`
- Only the skill's own scripts (`preflight_checker.py`, `runtime_monitor.py`, `result_validator.py`, `failure_diagnoser.py`) are run, with explicit argument lists
- No `eval()`, `exec()`, or dynamic code generation
- No shell invocation (`shell=True`)

- `references/validation_protocol.md` - Detailed checklist and criteria
- `references/log_patterns.md` - Common failure signatures and regex patterns