HeshamFS/simulation-validator

Name: simulation-validator
Author: HeshamFS

skills/simulation-workflow/simulation-validator/SKILL.md

npx skillsauth add HeshamFS/materials-simulation-skills simulation-validator

Simulation Validator

Goal

Provide a three-stage validation protocol: pre-flight checks, runtime monitoring, and post-flight validation for materials simulations.

Requirements

Python 3.10+
No external dependencies (uses Python standard library only)
Works on Linux, macOS, and Windows

Inputs to Gather

Before running validation scripts, collect from the user:

| Input | Description | Example |
|-------|-------------|---------|
| Config file | Simulation configuration (JSON/YAML) | simulation.json |
| Log file | Runtime output log | simulation.log |
| Metrics file | Post-run metrics (JSON) | results.json |
| Required params | Parameters that must exist | dt,dx,kappa |
| Valid ranges | Parameter bounds | dt:1e-6:1e-2 |

Decision Guidance

When to Run Each Stage

Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│         └── BLOCK status? → Fix issues, do NOT run simulation
│         └── WARN status? → Review warnings, document if accepted
│         └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│         └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│         └── Failed checks? → Do NOT use results
│                            → Run failure_diagnoser.py
│         └── All passed? → Results are valid

Choosing Validation Thresholds

| Metric | Conservative | Standard | Relaxed |
|--------|--------------|----------|---------|
| Mass tolerance | 1e-6 | 1e-3 | 1e-2 |
| Residual growth | 2x | 10x | 100x |
| dt reduction | 10x | 100x | 1000x |
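
The presets in this table can be kept in one place and turned into explicit argument lists. The sketch below assumes that mass tolerance, residual growth, and dt reduction correspond to the --mass-tol, --residual-growth, and --dt-drop flags (the Standard column matches the values in the stage commands). The PRESETS dict and both helper functions are illustrative names, not part of the skill.

```python
# Illustrative presets copied from the threshold table; names are hypothetical.
PRESETS = {
    "conservative": {"mass_tol": 1e-6, "residual_growth": 2.0, "dt_drop": 10.0},
    "standard": {"mass_tol": 1e-3, "residual_growth": 10.0, "dt_drop": 100.0},
    "relaxed": {"mass_tol": 1e-2, "residual_growth": 100.0, "dt_drop": 1000.0},
}


def monitor_args(preset: str, log_path: str) -> list[str]:
    """Build an explicit argument list for runtime_monitor.py."""
    p = PRESETS[preset]
    return [
        "python3", "scripts/runtime_monitor.py",
        "--log", log_path,
        "--residual-growth", str(p["residual_growth"]),
        "--dt-drop", str(p["dt_drop"]),
        "--json",
    ]


def validator_args(preset: str, metrics_path: str) -> list[str]:
    """Build an explicit argument list for result_validator.py.

    Physical bounds (--bound-min/--bound-max) are problem-specific,
    so they are left for the caller to append.
    """
    p = PRESETS[preset]
    return [
        "python3", "scripts/result_validator.py",
        "--metrics", metrics_path,
        "--mass-tol", str(p["mass_tol"]),
        "--json",
    ]
```

Either list can be passed directly to subprocess.run without a shell.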

Script Outputs (JSON Fields)

| Script | Output Fields |
|--------|---------------|
| scripts/preflight_checker.py | report.status, report.blockers, report.warnings |
| scripts/runtime_monitor.py | alerts, residual_stats, dt_stats |
| scripts/result_validator.py | checks, confidence_score, failed_checks |
| scripts/failure_diagnoser.py | probable_causes, recommended_fixes |

Three-Stage Validation Protocol

Stage 1: Pre-flight (Before Simulation)

Run scripts/preflight_checker.py --config simulation.json
BLOCK status: Stop immediately, fix all blocker issues
WARN status: Review warnings, document accepted risks
PASS status: Proceed to simulation

python3 scripts/preflight_checker.py \
    --config simulation.json \
    --required dt,dx,kappa \
    --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
    --min-free-gb 1.0 \
    --json
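
A minimal sketch of how the three statuses could be turned into a next action once the JSON output has been captured. It assumes the output nests status, blockers, and warnings under a top-level "report" key, as the field names report.status, report.blockers, and report.warnings suggest; the function name next_step is hypothetical.

```python
import json


def next_step(preflight_json: str) -> str:
    """Map preflight output to the action listed for each status."""
    # Assumed shape: {"report": {"status": ..., "blockers": [...], "warnings": [...]}}
    report = json.loads(preflight_json).get("report", {})
    status = str(report.get("status", "")).upper()
    if status == "BLOCK":
        blockers = report.get("blockers", [])
        return f"stop: fix {len(blockers)} blocker(s) before running"
    if status == "WARN":
        warnings = report.get("warnings", [])
        return f"review: document {len(warnings)} accepted warning(s)"
    if status == "PASS":
        return "proceed: run the simulation"
    # A missing or unknown status is treated like a blocker to stay safe.
    return "stop: unrecognized preflight status"
```

Treating an unrecognized status as a stop keeps the gate closed when the output format is not what the caller expected.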

Stage 2: Runtime (During Simulation)

Run scripts/runtime_monitor.py --log simulation.log periodically
Configure alert thresholds based on problem type
Stop simulation if critical alerts appear

python3 scripts/runtime_monitor.py \
    --log simulation.log \
    --residual-growth 10.0 \
    --dt-drop 100.0 \
    --json
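
To make the two thresholds concrete, here is one plausible reading of them, written as a sketch. It assumes residual growth is the latest residual divided by the smallest residual seen so far, and dt drop is the largest time step divided by the latest one. How runtime_monitor.py actually computes its alerts is not described here, and check_series is a hypothetical name.

```python
import math


def check_series(residuals, dts, residual_growth=10.0, dt_drop=100.0):
    """Return alert strings for a residual history and a time-step history."""
    alerts = []

    # Non-finite residuals are reported separately and excluded from ratios.
    finite = [r for r in residuals if math.isfinite(r)]
    if len(finite) != len(residuals):
        alerts.append("non-finite residual (NaN or Inf) in log")

    # Assumed definition: latest residual relative to the smallest seen.
    if finite and min(finite) > 0:
        growth = finite[-1] / min(finite)
        if growth > residual_growth:
            alerts.append(f"residual grew {growth:.3g}x (limit {residual_growth:g}x)")

    # Assumed definition: largest time step relative to the latest one.
    positive_dts = [dt for dt in dts if math.isfinite(dt) and dt > 0]
    if positive_dts:
        drop = max(positive_dts) / positive_dts[-1]
        if drop > dt_drop:
            alerts.append(f"dt dropped {drop:.3g}x (limit {dt_drop:g}x)")

    return alerts
```

With the default thresholds, a residual history of 1.0, 0.5, 8.0 and a time step that falls from 1e-3 to 1e-6 would raise both alerts.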

Stage 3: Post-flight (After Simulation)

Run scripts/result_validator.py --metrics results.json
All checks PASS: Results are valid for analysis
Any check FAIL: Do NOT use results, diagnose failure

python3 scripts/result_validator.py \
    --metrics results.json \
    --bound-min 0.0 \
    --bound-max 1.0 \
    --mass-tol 1e-3 \
    --json
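
The checks named by these flags can be sketched as follows. The metric keys (min_value, max_value, mass_initial, mass_final) are hypothetical, and computing confidence_score as the fraction of passed checks is an assumption; only the output field names checks, failed_checks, and confidence_score come from the skill's documentation.

```python
def validate_metrics(metrics, bound_min=0.0, bound_max=1.0, mass_tol=1e-3):
    """Run bound and mass-conservation checks on a metrics dict.

    The input keys are hypothetical; adapt them to whatever
    results.json actually contains.
    """
    checks = {
        "lower_bound": metrics["min_value"] >= bound_min,
        "upper_bound": metrics["max_value"] <= bound_max,
    }

    m0, m1 = metrics["mass_initial"], metrics["mass_final"]
    # Relative drift; fall back to absolute drift when the initial mass is zero.
    drift = abs(m1 - m0) / abs(m0) if m0 != 0 else abs(m1 - m0)
    checks["mass_conservation"] = drift <= mass_tol

    failed = [name for name, ok in checks.items() if not ok]
    # Assumption: confidence is the fraction of checks that passed.
    score = (len(checks) - len(failed)) / len(checks)
    return {"checks": checks, "failed_checks": failed, "confidence_score": score}
```

Under these assumptions a single failed check out of three gives a score of about 0.67, which falls in the "review carefully" band.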

Failure Diagnosis

When validation fails:

python3 scripts/failure_diagnoser.py --log simulation.log --json

Conversational Workflow Example

User: My phase field simulation crashed after 1000 steps. Can you help me figure out why?

Agent workflow:

First, check the log for obvious errors:

python3 scripts/failure_diagnoser.py --log simulation.log --json

If diagnosis suggests numerical blow-up, check runtime stats:

python3 scripts/runtime_monitor.py --log simulation.log --json

Recommend fixes based on findings:
- If residual grew rapidly → reduce time step
- If dt collapsed → check stability conditions
- If NaN detected → check initial conditions

Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| Config not found | File path invalid | Verify config path exists |
| Non-numeric value | Parameter is not a number | Fix config file format |
| out of range | Parameter outside bounds | Adjust parameter or bounds |
| Output directory not writable | Permission issue | Check directory permissions |
| Insufficient disk space | Disk nearly full | Free up space or reduce output |

Interpretation Guidance

Status Meanings

| Status | Meaning | Action |
|--------|---------|--------|
| PASS | All checks passed | Proceed with confidence |
| WARN | Non-critical issues found | Review and document |
| BLOCK | Critical issues found | Must fix before proceeding |

Confidence Score Interpretation

| Score | Meaning |
|-------|---------|
| 1.0 | All validation checks passed |
| 0.75+ | Most checks passed, minor issues |
| 0.5-0.75 | Significant issues, review carefully |
| < 0.5 | Major problems, do not trust results |
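
The bands overlap at their edges as written (0.75 appears in two rows), so any code that consumes confidence_score has to pick a convention. The sketch below assumes each band includes its lower bound and excludes its upper bound; interpret_confidence is a hypothetical helper.

```python
def interpret_confidence(score: float) -> str:
    """Map a confidence score in [0, 1] to the meaning from the table."""
    # NaN fails this comparison too, so it is rejected along with out-of-range values.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence score must be in [0, 1], got {score!r}")
    if score == 1.0:
        return "All validation checks passed"
    if score >= 0.75:  # assumed band: 0.75 <= score < 1.0
        return "Most checks passed, minor issues"
    if score >= 0.5:  # assumed band: 0.5 <= score < 0.75
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"
```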

Common Failure Patterns

| Pattern in Log | Likely Cause | Recommended Fix |
|----------------|--------------|-----------------|
| NaN, Inf, overflow | Numerical instability | Reduce dt, increase damping |
| max iterations, did not converge | Solver failure | Tune preconditioner, tolerances |
| out of memory | Memory exhaustion | Reduce mesh, enable out-of-core |
| dt reduced | Adaptive stepping triggered | May be okay if controlled |
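
As an illustration of how this table could drive a diagnosis, the sketch below matches log lines against pre-compiled patterns built from the first column. The regular expressions are assumptions written for this example, not the ones shipped in failure_diagnoser.py or references/log_patterns.md; only the output keys probable_causes and recommended_fixes come from the skill's documentation.

```python
import re

# Illustrative patterns derived from the table; the real ones may differ.
PATTERNS = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]


def diagnose(lines):
    """Collect each matching cause once, in order of first appearance."""
    causes, fixes = [], []
    for line in lines:
        for pattern, cause, fix in PATTERNS:
            if pattern.search(line) and cause not in causes:
                causes.append(cause)
                fixes.append(fix)
    return {"probable_causes": causes, "recommended_fixes": fixes}
```

The word boundaries around nan and inf keep an ordinary "INFO" log prefix from being reported as a numerical blow-up.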

Security

Input Validation

Config file paths are validated for existence before parsing; non-existent paths produce clear errors
--required parameter names are validated against a safe-character allowlist
--ranges entries are parsed as name:min:max with finite numeric bounds enforced
--min-free-gb is validated as a finite positive number
--residual-growth and --dt-drop thresholds are validated as finite positive numbers
--bound-min, --bound-max, and --mass-tol are validated as finite numbers with bound-max > bound-min
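
The rules above for --ranges can be sketched as a small parser. The allowlist pattern for parameter names and the min <= max check are assumptions made for this example; the skill's scripts may enforce different details, and parse_ranges is a hypothetical name.

```python
import math
import re

# Assumed allowlist: identifiers made of letters, digits, and underscores.
_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def parse_ranges(spec: str) -> dict[str, tuple[float, float]]:
    """Parse 'name:min:max,name:min:max' into {name: (min, max)}."""
    ranges = {}
    for entry in spec.split(","):
        parts = entry.strip().split(":")
        if len(parts) != 3:
            raise ValueError(f"expected name:min:max, got {entry!r}")
        name, lo_text, hi_text = parts
        if not _NAME_RE.fullmatch(name):
            raise ValueError(f"unsafe parameter name: {name!r}")
        try:
            lo, hi = float(lo_text), float(hi_text)
        except ValueError:
            raise ValueError(f"non-numeric bound in {entry!r}") from None
        # float() accepts 'inf' and 'nan', so finiteness is checked explicitly.
        if not (math.isfinite(lo) and math.isfinite(hi)):
            raise ValueError(f"bounds must be finite in {entry!r}")
        if lo > hi:  # assumption: an inverted range is rejected
            raise ValueError(f"min exceeds max in {entry!r}")
        ranges[name] = (lo, hi)
    return ranges
```

The explicit finiteness check matters because Python's float() parses "inf" and "nan" without raising.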

File Access

preflight_checker.py reads a single user-specified config file (JSON/YAML) and checks disk space on the output directory
runtime_monitor.py reads a single log file specified by --log; log files are size-limited (500 MB max) before parsing
result_validator.py reads a single metrics file (JSON) specified by --metrics
failure_diagnoser.py reads a single log file specified by --log
No scripts write to the filesystem; all output goes to stdout

Tool Restrictions

Read: Used to inspect script source, references, config files, and simulation logs
Bash: Used to execute the four Python validation scripts (preflight_checker.py, runtime_monitor.py, result_validator.py, failure_diagnoser.py) with explicit argument lists
Write: Used by the agent to save validation reports (the scripts themselves only print to stdout); writes are scoped to the user's working directory
Grep/Glob: Used to locate log files, config files, and search references

Safety Measures

No eval(), exec(), or dynamic code generation
All subprocess calls use explicit argument lists (no shell=True)
Log parsing uses pre-compiled regex patterns; user-supplied patterns are not accepted (patterns are hardcoded)
Phase names and diagnostic strings extracted from logs are sanitized (truncated, control characters stripped) before inclusion in output
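
A minimal sketch of the sanitization step described in the last item. The 200-character limit and the function name sanitize are assumptions; the skill states only that strings are truncated and stripped of control characters.

```python
import unicodedata


def sanitize(text: str, max_len: int = 200) -> str:
    """Drop control characters, then truncate to max_len characters."""
    # Unicode category "Cc" covers ASCII control codes such as ESC, tab, and newline.
    cleaned = "".join(ch for ch in text if unicodedata.category(ch) != "Cc")
    return cleaned[:max_len]
```

Stripping before truncating means the length limit applies to the characters that actually reach the output.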

Limitations

Not a real-time monitor: Scripts analyze the log as it exists when they run; they do not stream output or watch the simulation process
Regex-based: Log parsing depends on pattern matching; may miss unusual formats
No automatic fixes: Scripts diagnose but don't modify simulations

References

references/validation_protocol.md - Detailed checklist and criteria
references/log_patterns.md - Common failure signatures and regex patterns

Version History

v1.1.0 (2024-12-24): Enhanced documentation, decision guidance, Windows compatibility
v1.0.0: Initial release with 4 validation scripts

HeshamFS/simulation-validator

skills/simulation-workflow/simulation-validator/SKILL.md

Validate simulations across three stages — run pre-flight checks on configuration files (parameter ranges, required fields, disk space), monitor runtime logs for residual growth, NaN/Inf, and adaptive dt collapse, and perform post-flight validation of results (physical bounds, mass/energy conservation, convergence). Diagnose failed simulations with probable-cause analysis and recommended fixes. Use when preparing to launch a simulation, checking whether a running job is healthy, verifying that finished results are trustworthy, or debugging a crash or blow-up, even if the user only says "my simulation crashed" or "can I trust these results."

39 stars

development

Updated May 19, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add HeshamFS/materials-simulation-skills simulation-validator

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| VirusTotal | Multi-engine malware detection | Error | 70% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 19, 2026, 5:33 AM (146.2 s, 9 files scanned)

SKILL.md

name: simulation-validator
description: >
allowed-tools: Read, Bash, Write, Grep, Glob
author: HeshamFS
version: 1.1.0
security_tier: high
security_reviewed: true
eval_cases: 5
last_reviewed: 2026-03-26

Related Skills

HeshamFS/benchmark-and-mms-planner (development)

Plan verification and validation campaigns for simulation codes using manufactured solutions, canonical benchmark problems, grid/time refinement, uncertainty propagation, and pass/fail acceptance criteria. Use when an agent needs to prove a solver, model, or result is trustworthy rather than only plausible.

39 stars, SKILL.md, updated May 19, 2026

HeshamFS/workflow-engine-mapper (testing)

Map computational materials tasks onto workflow engines such as atomate2, jobflow, AiiDA, pyiron, or a simple one-off script. Use when deciding how to structure a reproducible campaign, DAG, restart strategy, provenance record, storage layout, or migration path from ad hoc scripts to managed workflows.

39 stars, SKILL.md, updated May 19, 2026

HeshamFS/md-analysis-planner (development)

Plan molecular dynamics post-processing for materials simulations, including RDF, MSD and diffusion, VACF/VDOS, coordination numbers, bond-angle distributions, stress-strain curves, equilibration detection, PBC unwrapping, and trajectory format choices. Use before writing MD analysis scripts or trusting trajectory-derived results.

39 stars, SKILL.md, updated May 19, 2026

HeshamFS/simulation-failure-triage (development)

Triage cross-code simulation failures and propose safe retry ladders for nonconvergence, NaN/Inf, exploding energies, unstable timesteps, pressure blow-up, missing potentials, bad pseudopotentials, corrupted output, and incomplete runs. Use when an agent sees a failed or suspicious materials simulation and needs a defensible first response.

39 stars, SKILL.md, updated May 19, 2026

Install manually

# Clone the repo
git clone https://github.com/HeshamFS/materials-simulation-skills.git

# Copy into Claude Code skills folder (global)
cp -r materials-simulation-skills/skills/simulation-workflow/simulation-validator ~/.claude/skills/

Repository

HeshamFS/materials-simulation-skills

39 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT