skills/simulation-workflow/performance-profiling/SKILL.md
Identify computational bottlenecks, analyze parallel scaling, estimate memory requirements, and generate optimization recommendations for materials simulations — parse timing logs to find dominant phases (solver, assembly, I/O), evaluate strong and weak scaling efficiency, profile memory from mesh and field parameters, and detect bottlenecks with actionable fix suggestions. Use when a simulation is running slower than expected, investigating MPI scaling efficiency, planning HPC resource allocation, deciding whether to tune the preconditioner or reduce I/O frequency, or estimating if a problem fits in available RAM, even if the user only says "my simulation is too slow" or "how many nodes do I need."
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add HeshamFS/materials-simulation-skills performance-profiling
Provide tools to analyze simulation performance, identify bottlenecks, and recommend optimization strategies for computational materials science simulations.
Before running profiling scripts, collect from the user:
| Input | Description | Example |
|-------|-------------|---------|
| Simulation log | Log file with timing information | simulation.log |
| Scaling data | JSON with multi-run performance data | scaling_data.json |
| Simulation parameters | JSON with mesh, fields, solver config | params.json |
| Available memory | System memory in GB (optional) | 16.0 |
Need to identify slow phases?
├── YES → Use timing_analyzer.py
│ └── Parse simulation logs for timing data
│
Need to understand parallel performance?
├── YES → Use scaling_analyzer.py
│ └── Analyze strong or weak scaling efficiency
│
Need to estimate memory requirements?
├── YES → Use memory_profiler.py
│ └── Estimate memory from problem parameters
│
Need optimization recommendations?
└── YES → Use bottleneck_detector.py
└── Combine analyses and get actionable advice
| Metric | Good | Acceptable | Poor |
|--------|------|------------|------|
| Phase dominance | <30% | 30-50% | >50% |
| Parallel efficiency | >0.80 | 0.70-0.80 | <0.70 |
| Memory usage | <60% | 60-80% | >80% |
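The thresholds in the table above can be applied programmatically. The following is a minimal sketch, not code from the skill's scripts: the function names are hypothetical, and the handling of exact boundary values (for example, whether 30% counts as good or acceptable) is an assumption, since the table does not say.

```python
def classify_phase_dominance(fraction: float) -> str:
    """Rate the share of total runtime taken by the largest phase (0.0-1.0)."""
    if fraction < 0.30:
        return "good"
    if fraction <= 0.50:  # assumption: boundaries fall in "acceptable"
        return "acceptable"
    return "poor"


def classify_parallel_efficiency(efficiency: float) -> str:
    """Rate a parallel efficiency value (1.0 means ideal scaling)."""
    if efficiency > 0.80:
        return "good"
    if efficiency >= 0.70:  # assumption: boundaries fall in "acceptable"
        return "acceptable"
    return "poor"


def classify_memory_usage(fraction: float) -> str:
    """Rate estimated memory as a fraction of available memory (0.0-1.0+)."""
    if fraction < 0.60:
        return "good"
    if fraction <= 0.80:  # assumption: boundaries fall in "acceptable"
        return "acceptable"
    return "poor"


if __name__ == "__main__":
    print(classify_phase_dominance(0.65))      # solver taking 65% of runtime
    print(classify_parallel_efficiency(0.75))
    print(classify_memory_usage(0.25))
```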
| Script | Key Outputs |
|--------|-------------|
| timing_analyzer.py | timing_data.phases, timing_data.slowest_phase, timing_data.total_time |
| scaling_analyzer.py | scaling_analysis.results, scaling_analysis.efficiency_threshold_processors |
| memory_profiler.py | memory_profile.total_memory_gb, memory_profile.per_process_gb, memory_profile.warnings |
| bottleneck_detector.py | bottlenecks, recommendations |
# Basic timing analysis
python3 scripts/timing_analyzer.py \
--log simulation.log \
--json
# Custom timing pattern
python3 scripts/timing_analyzer.py \
--log simulation.log \
--pattern 'Step\s+(\w+)\s+took\s+([\d.]+)s' \
--json
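To illustrate what the custom pattern above captures, here is a small sketch that applies the same regular expression to a made-up log excerpt. The log lines, the function name `parse_phase_times`, and the choice to sum repeated phases are assumptions for illustration; timing_analyzer.py may aggregate differently.

```python
import re
from collections import defaultdict

# Same regex as the --pattern example: group 1 is the phase name,
# group 2 is the elapsed time in seconds.
PATTERN = re.compile(r"Step\s+(\w+)\s+took\s+([\d.]+)s")


def parse_phase_times(log_text: str) -> dict:
    """Sum the time reported for each phase across all matching lines."""
    totals = defaultdict(float)
    for match in PATTERN.finditer(log_text):
        phase, seconds = match.group(1), float(match.group(2))
        totals[phase] += seconds
    return dict(totals)


# Hypothetical log excerpt, not taken from a real simulation.
sample_log = """\
Step assembly took 1.5s
Step solver took 6.0s
Step output took 0.5s
Step solver took 2.0s
"""

totals = parse_phase_times(sample_log)
total_time = sum(totals.values())
slowest_phase = max(totals, key=totals.get)

for phase, seconds in totals.items():
    print(f"{phase}: {seconds:.1f}s ({100 * seconds / total_time:.0f}%)")
print("slowest phase:", slowest_phase)
```

In this sample the solver accounts for 8.0 s of 10.0 s, which would count as solver-dominated under the >70% rule of thumb.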
# Strong scaling (fixed problem size)
python3 scripts/scaling_analyzer.py \
--data scaling_data.json \
--type strong \
--json
# Weak scaling (constant work per processor)
python3 scripts/scaling_analyzer.py \
--data scaling_data.json \
--type weak \
--json
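For reference, the two efficiency measures can be computed by hand using the standard textbook definitions. The sketch below assumes those definitions and a simple list of `(processors, wall_time)` pairs; the function names and input layout are hypothetical and do not describe the schema of scaling_data.json or the internals of scaling_analyzer.py.

```python
def strong_scaling_efficiency(runs):
    """Efficiency for a fixed total problem size.

    runs: list of (processors, wall_time_seconds) pairs.
    Efficiency is measured against the run with the fewest processors:
        E(p) = (T_base * p_base) / (T_p * p)
    """
    if len(runs) < 2:
        raise ValueError("At least 2 runs required")
    ordered = sorted(runs)
    p_base, t_base = ordered[0]
    return {p: (t_base * p_base) / (t * p) for p, t in ordered}


def weak_scaling_efficiency(runs):
    """Efficiency for constant work per processor.

    With ideal weak scaling the wall time stays constant, so
        E(p) = T_base / T_p
    """
    if len(runs) < 2:
        raise ValueError("At least 2 runs required")
    ordered = sorted(runs)
    _, t_base = ordered[0]
    return {p: t_base / t for p, t in ordered}


# Hypothetical timings, for illustration only.
strong = strong_scaling_efficiency([(1, 100.0), (2, 55.0), (4, 30.0), (8, 20.0)])
weak = weak_scaling_efficiency([(1, 100.0), (4, 125.0)])

for p, eff in strong.items():
    print(f"strong scaling, {p} processors: efficiency {eff:.3f}")
for p, eff in weak.items():
    print(f"weak scaling, {p} processors: efficiency {eff:.3f}")
```

With these made-up numbers, strong-scaling efficiency drops to 0.625 at 8 processors, which falls in the poor range and would suggest checking communication overhead or load balance.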
# Estimate memory requirements
python3 scripts/memory_profiler.py \
--params simulation_params.json \
--available-gb 16.0 \
--json
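A back-of-the-envelope version of this estimate is easy to reproduce. The sketch below assumes a structured mesh where every field stores one double-precision value per grid point; it ignores solver workspace, matrices, and ghost cells, so it should be read as a lower bound. The function name and parameters are hypothetical and are not the interface of memory_profiler.py.

```python
import math


def estimate_field_memory_gb(mesh_dims, n_fields, bytes_per_value=8, n_processes=1):
    """Estimate field storage in GiB for a structured mesh.

    mesh_dims: grid points along each axis, e.g. (nx, ny, nz)
    n_fields: number of field variables stored per grid point
    bytes_per_value: 8 for double precision (assumption)
    n_processes: number of processes the mesh is split evenly across
    Returns (total_gb, per_process_gb).
    """
    n_points = math.prod(mesh_dims)
    total_bytes = n_points * n_fields * bytes_per_value
    total_gb = total_bytes / 1024**3
    return total_gb, total_gb / n_processes


total_gb, per_process_gb = estimate_field_memory_gb((512, 512, 512), n_fields=4, n_processes=8)
available_gb = 16.0

print(f"total: {total_gb:.2f} GB, per process: {per_process_gb:.2f} GB")
print(f"usage: {100 * total_gb / available_gb:.0f}% of {available_gb} GB available")
```

A 512 x 512 x 512 mesh with 4 double-precision fields needs 4 GB for field storage alone, or 25% of a 16 GB budget.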
# Detect bottlenecks from timing only
python3 scripts/bottleneck_detector.py \
--timing timing_results.json \
--json
# Comprehensive analysis with all inputs
python3 scripts/bottleneck_detector.py \
--timing timing_results.json \
--scaling scaling_results.json \
--memory memory_results.json \
--json
User: My simulation is taking too long. Can you help me identify what's slow?
Agent workflow:
python3 scripts/timing_analyzer.py --log simulation.log --json
python3 scripts/scaling_analyzer.py --data scaling.json --type strong --json
python3 scripts/bottleneck_detector.py --timing timing.json --scaling scaling.json --json
| Scenario | Meaning | Action |
|----------|---------|--------|
| Solver >70% | Solver-dominated | Tune preconditioner, check tolerance |
| Assembly >50% | Assembly-dominated | Cache matrices, vectorize, parallelize |
| I/O >30% | I/O-dominated | Reduce frequency, use parallel I/O |
| Balanced (<30% each) | Well-balanced | Look for algorithmic improvements |

| Efficiency | Meaning | Action |
|------------|---------|--------|
| >0.80 | Excellent scaling | Continue scaling up |
| 0.70-0.80 | Good scaling | Monitor at larger scales |
| 0.50-0.70 | Poor scaling | Investigate communication/load balance |
| <0.50 | Very poor scaling | Reduce processor count or redesign |

| Usage | Meaning | Action |
|-------|---------|--------|
| <60% available | Safe | No action needed |
| 60-80% available | Moderate | Monitor, consider optimization |
| >80% available | High | Reduce resolution or increase processors |
| >100% available | Exceeds capacity | Must reduce problem size |
| Error | Cause | Resolution |
|-------|-------|------------|
| Log file not found | Invalid path | Verify log file path |
| No timing data found | Pattern mismatch | Provide custom pattern with --pattern |
| At least 2 runs required | Insufficient data | Provide more scaling runs |
| Missing required parameters | Incomplete params | Add mesh and fields to params file |
- --pattern regex values are validated for length (500 chars max) and rejected if they contain constructs prone to catastrophic backtracking (ReDoS)
- available_gb is validated as a positive finite number; mesh dimensions and field parameters are validated as positive integers
- --type (scaling type) is validated against a fixed allowlist (strong, weak)
- timing_analyzer.py reads a single log file specified by --log; log files larger than 500 MB are rejected before parsing
- scaling_analyzer.py, memory_profiler.py, and bottleneck_detector.py read JSON files capped at 100 MB
- allowed-tools excludes Bash to prevent the agent from executing arbitrary commands when processing untrusted simulation logs or result files
- No eval(), exec(), or dynamic code generation
- No shell invocation (shell=True)

- references/profiling_guide.md - Profiling concepts and interpretation
- references/optimization_strategies.md - Detailed optimization approaches
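The input checks described in the security notes above could look roughly like the following. This is a sketch under stated assumptions, not the scripts' actual code: the function names are hypothetical, and the nested-quantifier test is a simple heuristic that catches common ReDoS shapes such as `(a+)+` but is not a complete detector.

```python
import math
import re

MAX_PATTERN_LENGTH = 500                   # limit stated in the notes above
ALLOWED_SCALING_TYPES = {"strong", "weak"}  # allowlist stated in the notes above

# Heuristic (assumption): flag an innermost group that contains a quantifier
# and is itself followed by a quantifier, e.g. (a+)+ or (.*)*
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")


def validate_pattern(pattern: str) -> str:
    """Reject overly long or backtracking-prone regular expressions."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern exceeds 500 characters")
    if NESTED_QUANTIFIER.search(pattern):
        raise ValueError("pattern contains nested quantifiers")
    re.compile(pattern)  # raises re.error if the syntax is invalid
    return pattern


def validate_available_gb(value) -> float:
    """Accept only positive, finite numbers."""
    number = float(value)
    if not math.isfinite(number) or number <= 0:
        raise ValueError("available_gb must be a positive finite number")
    return number


def validate_scaling_type(value: str) -> str:
    """Accept only values from the fixed allowlist."""
    if value not in ALLOWED_SCALING_TYPES:
        raise ValueError("type must be 'strong' or 'weak'")
    return value


if __name__ == "__main__":
    print(validate_pattern(r"Step\s+(\w+)\s+took\s+([\d.]+)s"))
    print(validate_available_gb("16.0"))
    print(validate_scaling_type("strong"))
```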