src/autoskillit/skills_extended/exp-lens-variance-stability/SKILL.md
Create a variance analysis profile assessing whether signals exceed noise and whether results are stable across random seeds. Stability lens answering "Is the signal larger than the noise?"
**Philosophical Mode:** Stability
**Primary Question:** "Is the signal larger than the noise?"
**Focus:** Run-to-Run Variance, Seed Sensitivity, Nondeterminism Sources, Confidence Intervals, Noise Floor
/autoskillit:exp-lens-variance-stability [context_path] [experiment_plan_path]
/autoskillit:exp-lens-variance-stability or /autoskillit:make-experiment-diag variance
NEVER:
{{AUTOSKILLIT_TEMP}}/exp-lens-variance-stability/
run_in_background: true is prohibited)
ALWAYS:
Count the actual number of independent runs — single-run results must be flagged prominently
Assess whether claimed improvements exceed the observed standard deviation
Identify all sources of nondeterminism, not just random seeds
Report when confidence intervals are absent — absence is a finding, not an omission
BEFORE creating any diagram, LOAD the /autoskillit:mermaid skill using the Skill tool - this is MANDATORY
If the Skill tool cannot be used (disable-model-invocation) or refuses this invocation, do NOT proceed with diagram creation. Abort this step and omit the diagram from output.
Write output to {{AUTOSKILLIT_TEMP}}/exp-lens-variance-stability/exp_diag_variance_stability_{YYYY-MM-DD_HHMMSS}.md
After writing the file, emit the structured output token as literal plain text with no markdown formatting on the token name (the adjudicator performs a regex match):
diagram_path = /absolute/path/to/{{AUTOSKILLIT_TEMP}}/exp-lens-variance-stability/exp_diag_variance_stability_{...}.md
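The output path and token described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the helper names `diagram_output_path` and `output_token` are hypothetical, and `temp_dir` stands in for whatever `{{AUTOSKILLIT_TEMP}}` resolves to at runtime.

```python
from datetime import datetime
from pathlib import Path

LENS = "exp-lens-variance-stability"


def diagram_output_path(temp_dir: str, now: datetime) -> Path:
    """Build the timestamped report path under the lens's temp directory.

    `temp_dir` is an assumed stand-in for the resolved {{AUTOSKILLIT_TEMP}} value.
    """
    stamp = now.strftime("%Y-%m-%d_%H%M%S")
    # resolve() makes the path absolute, as the token line requires.
    return Path(temp_dir).resolve() / LENS / f"exp_diag_variance_stability_{stamp}.md"


def output_token(path: Path) -> str:
    """Return the token line as plain text, with no markdown around the name."""
    return f"diagram_path = {path}"


path = diagram_output_path("/tmp/autoskillit", datetime(2025, 1, 2, 3, 4, 5))
print(output_token(path))
```

Keeping the token on a single unformatted line matters because the adjudicator matches it with a regex; bold or backticks around `diagram_path` would likely break the match.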
If positional arg 1 (context_path) is provided and the file exists, read it to obtain IV/DV tables, H0/H1 hypotheses, controlled variables, and success criteria. If positional arg 2 (experiment_plan_path) is provided and exists, read the experiment plan for full methodology. Use this structured context as the foundation for Steps 1-5; skip the CWD exploration for these fields if the context file supplies them.
Spawn Explore subagents to investigate:
Random Seed Management
Nondeterminism Sources
Multiple Run Protocol
Variance Reporting
Signal-to-Noise Assessment
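As a rough illustration of the seed-management investigation, the sketch below scans source text for common seeding calls. The pattern list and the `find_seed_calls` helper are assumptions made for this example; a real investigation would also cover framework-specific seeding, data-loader workers, and hardware-level nondeterminism that no regex can detect.

```python
import re

# Assumed, non-exhaustive patterns for well-known seeding calls.
SEED_PATTERNS = {
    # The lookbehind avoids double-counting np.random.seed(...) as stdlib random.
    "python random": re.compile(r"(?<![\w.])random\.seed\("),
    "numpy": re.compile(r"\b(?:np|numpy)\.random\.(?:seed|default_rng)\("),
    "torch": re.compile(r"\btorch\.(?:cuda\.)?manual_seed(?:_all)?\("),
}


def find_seed_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, library) for each line containing a seeding call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for library, pattern in SEED_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, library))
    return hits


sample = (
    "import numpy as np\n"
    "np.random.seed(0)\n"
    "random.seed(1)\n"
    "torch.manual_seed(2)\n"
    "x = 3\n"
)
for lineno, library in find_seed_calls(sample):
    print(f"line {lineno}: {library}")
```

An empty result is itself worth reporting: code that never sets a seed cannot claim seed-controlled runs.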
For each reported result:
Build the variance profile.
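One way to fill a variance profile row from raw per-run results is sketched below. The names `RunSummary`, `summarize`, and `profile_row` are hypothetical, and the interval uses a normal approximation (z = 1.96) as a simplifying assumption; with only a handful of runs, a Student t critical value would give a wider and more honest interval.

```python
import math
import statistics
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RunSummary:
    n: int
    mean: float
    std: Optional[float]                # None when only one run exists
    ci: Optional[Tuple[float, float]]   # None when no interval can be computed


def summarize(runs: list[float], z: float = 1.96) -> RunSummary:
    """Summarize independent runs; single-run results get no std or CI."""
    n = len(runs)
    mean = statistics.fmean(runs)
    if n < 2:
        return RunSummary(n, mean, None, None)
    std = statistics.stdev(runs)        # sample standard deviation (n - 1)
    half = z * std / math.sqrt(n)       # normal-approximation half-width
    return RunSummary(n, mean, std, (mean - half, mean + half))


def profile_row(name: str, s: RunSummary, verdict: str = "Unclear") -> str:
    """Render one row of the variance profile table."""
    std = f"{s.std:.4f}" if s.std is not None else "n/a (single run)"
    ci = f"[{s.ci[0]:.4f}, {s.ci[1]:.4f}]" if s.ci else "Not reported"
    return f"| {name} | {s.n} | {s.mean:.4f} | {std} | {ci} | {verdict} |"


print(profile_row("baseline", summarize([0.81, 0.79, 0.80, 0.82, 0.78])))
print(profile_row("single-seed", summarize([0.90])))
```

The single-run branch deliberately produces "Not reported" rather than a zero-width interval, so the missing variance shows up in the table as a finding.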
CRITICAL — Analyze Signal vs Noise: For every claimed improvement:
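A simple version of this comparison is sketched below, under stated assumptions: the function name `signal_vs_noise` is hypothetical, "noise" is taken as the larger of the two sample standard deviations, and a ratio above 1.0 counts as signal exceeding noise. A stricter analysis would add a proper statistical test rather than rely on this threshold.

```python
import math
import statistics


def signal_vs_noise(baseline: list[float], treatment: list[float]) -> dict:
    """Compare a claimed improvement against run-to-run standard deviation."""
    # Without at least two runs per arm there is no variance estimate at all.
    if len(baseline) < 2 or len(treatment) < 2:
        return {"verdict": "Unclear", "delta": None, "noise": None, "ratio": None}

    delta = statistics.fmean(treatment) - statistics.fmean(baseline)
    # Assumed noise definition: the larger of the two sample std deviations.
    noise = max(statistics.stdev(baseline), statistics.stdev(treatment))

    if noise == 0.0:
        ratio = math.inf if delta != 0 else 0.0
    else:
        ratio = abs(delta) / noise

    verdict = "Yes" if ratio > 1.0 else "No"
    return {"verdict": verdict, "delta": delta, "noise": noise, "ratio": ratio}


clear = signal_vs_noise([0.70, 0.72, 0.71], [0.80, 0.79, 0.81])
noisy = signal_vs_noise([0.70, 0.74, 0.72], [0.73, 0.71, 0.75])
single = signal_vs_noise([0.70], [0.80])
print(clear["verdict"], noisy["verdict"], single["verdict"])
```

The verdict strings mirror the `Yes/No/Unclear` values used in the "Signal > Noise?" column of the variance profile table.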
Use the mermaid skill conventions to create a stochasticity diagram with:
Direction: TB (nondeterminism sources flow down through aggregation to reported results)
Subgraphs:
Node Styling:
- `stateNode` class: Nondeterminism sources
- `handler` class: Aggregation methods
- `output` class: Reported results
- `gap` class: Unreported variance or single-run results
- `detector` class: Confidence intervals and statistical tests
- `cli` class: Seed management

Write the diagram to: {{AUTOSKILLIT_TEMP}}/exp-lens-variance-stability/exp_diag_variance_stability_{YYYY-MM-DD_HHMMSS}.md (relative to the current working directory)
# Variance Stability Analysis: {Experiment Name}
**Lens:** Variance Stability (Stability)
**Question:** Is the signal larger than the noise?
**Date:** {YYYY-MM-DD}
**Scope:** {What was analyzed}
## Variance Profile
| Experiment | N Runs | Mean | Std | CI | Signal > Noise? |
|------------|--------|------|-----|----|-----------------|
| {experiment} | {n} | {mean} | {std} | {CI or "Not reported"} | {Yes/No/Unclear} |
## Nondeterminism Inventory
| Source | Type | Controlled? | Impact |
|--------|------|-------------|--------|
| {source} | {seed/hardware/async/etc} | {Yes/No/Partial} | {Low/Medium/High} |
## Stochasticity Diagram
```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
graph TB
%% CLASS DEFINITIONS %%
classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;
subgraph NDSources ["NONDETERMINISM SOURCES"]
direction TB
SEED["Random Seed<br/>━━━━━━━━━━<br/>Controlled via<br/>seed management"]
HW["Hardware Variance<br/>━━━━━━━━━━<br/>GPU/CPU ordering<br/>differences"]
ASYNC["Async Operations<br/>━━━━━━━━━━<br/>Thread/process<br/>race conditions"]
end
subgraph Aggregation ["VARIANCE AGGREGATION"]
direction TB
MULTI["Multiple Runs<br/>━━━━━━━━━━<br/>N independent<br/>repetitions"]
CI["Confidence Interval<br/>━━━━━━━━━━<br/>Statistical bounds<br/>on estimates"]
end
subgraph Results ["REPORTED RESULTS"]
direction TB
RESULT["Reported Result<br/>━━━━━━━━━━<br/>Mean ± std<br/>with CI"]
SINGLE["Single-Run Result<br/>━━━━━━━━━━<br/>No variance<br/>reported"]
end
SEED --> MULTI
HW --> MULTI
ASYNC --> MULTI
MULTI --> CI
CI --> RESULT
MULTI --> SINGLE
%% CLASS ASSIGNMENTS %%
class SEED cli;
class HW,ASYNC stateNode;
class MULTI handler;
class CI detector;
class RESULT output;
class SINGLE gap;
```
| Seed | Run Result | Rank Among Methods |
|------|-----------|-------------------|
| {seed} | {result} | {rank} |
---
## Pre-Diagram Checklist
Before creating the diagram, verify:
- [ ] LOADED `/autoskillit:mermaid` skill using the Skill tool
- [ ] Using ONLY classDef styles from the mermaid skill (no invented colors)
- [ ] Diagram will include a color legend table
---
## Related Skills
- `/autoskillit:make-experiment-diag` - Parent skill for lens selection
- `/autoskillit:mermaid` - MUST BE LOADED before creating diagram
- `/autoskillit:exp-lens-reproducibility-artifacts` - For environment and artifact reproducibility
- `/autoskillit:exp-lens-error-budget` - For systematic error and bias analysis