
talont-org/exp-lens-severity-testing

Name: exp-lens-severity-testing
Author: talont-org

src/autoskillit/skills_extended/exp-lens-severity-testing/SKILL.md

npx skillsauth add talont-org/autoskillit exp-lens-severity-testing


Severity Testing Experimental Design Lens

Philosophical Mode: Falsificationist
Primary Question: "Would this design have caught the error?"
Focus: Adversarial Cases, Negative Controls, Falsification Tests, Easy-Pass Detection, Confirmatory Theater

Arguments

/autoskillit:exp-lens-severity-testing [context_path] [experiment_plan_path]

context_path (optional positional arg 1) — Absolute path to a lens context file containing IV/DV tables, H0/H1 hypotheses, controlled variables, and success criteria. If provided, read this file before beginning analysis to obtain structured context. If omitted, discover context by exploring the CWD.
experiment_plan_path (optional positional arg 2) — Absolute path to the full experiment plan. If provided, read for complete experimental methodology and design. If omitted, locate the experiment plan by exploring the CWD.
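
The argument handling above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: the function name `resolve_inputs` and the convention that `None` means "discover by exploring the CWD" are assumptions made for the example.

```python
import sys
from pathlib import Path
from typing import Optional, Sequence


def resolve_inputs(argv: Sequence[str]) -> dict[str, Optional[Path]]:
    """Map up to two positional args to paths; None means 'discover in the CWD'."""
    names = ("context_path", "experiment_plan_path")
    resolved: dict[str, Optional[Path]] = {name: None for name in names}
    for name, raw in zip(names, argv):
        candidate = Path(raw)
        # Accept only absolute paths that point at an existing file.
        if candidate.is_absolute() and candidate.is_file():
            resolved[name] = candidate
    return resolved


if __name__ == "__main__":
    for name, path in resolve_inputs(sys.argv[1:]).items():
        print(f"{name}: {path if path else 'not provided, explore the CWD'}")
```

Any argument that is missing, relative, or points at a nonexistent file falls back to `None`, which mirrors the "if omitted, explore the CWD" behavior described above.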

When to Use

Evaluating whether positive results are meaningful or trivially achievable
Checking for adversarial robustness of experimental conclusions
User invokes /autoskillit:exp-lens-severity-testing or /autoskillit:make-experiment-diag severity

Critical Constraints

NEVER:

Modify any source code files
Accept a "pass" result without asking what a false result would have looked like under this design
Create files outside {{AUTOSKILLIT_TEMP}}/exp-lens-severity-testing/
Run subagents in the background (run_in_background: true is prohibited)
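
The file-location constraint can be checked mechanically before any write. The sketch below is one possible guard, assuming the `{{AUTOSKILLIT_TEMP}}` placeholder has already been expanded to a real directory; the helper name `is_inside` is hypothetical.

```python
from pathlib import Path


def is_inside(candidate: Path, allowed_root: Path) -> bool:
    """Return True if candidate resolves to allowed_root or a location under it."""
    root = allowed_root.resolve()
    target = candidate.resolve()
    # is_relative_to (Python 3.9+) compares whole path components, so a sibling
    # such as ".../exp-lens-severity-testing-other" is not mistaken for a child.
    return target.is_relative_to(root)
```

Resolving both paths first normalizes `..` segments, so a path that climbs out of the allowed directory is rejected even if it starts inside it.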

ALWAYS:

For every positive claim, identify what error the test was capable of detecting
Inventory negative controls and sanity checks explicitly — their absence is a finding
Rate severity before reporting conclusions, not after
Flag confirmatory theater: experiments designed to confirm rather than risk refutation
BEFORE creating any diagram, LOAD the /autoskillit:mermaid skill using the Skill tool - this is MANDATORY
If the Skill tool cannot be used (disable-model-invocation) or refuses this invocation, do NOT proceed with diagram creation. Abort this step and omit the diagram from output.
Write output to {{AUTOSKILLIT_TEMP}}/exp-lens-severity-testing/exp_diag_severity_testing_{YYYY-MM-DD_HHMMSS}.md
After writing the file, emit the structured output token as literal plain text with no markdown formatting on the token name (the adjudicator performs a regex match):
```
diagram_path = /absolute/path/to/{{AUTOSKILLIT_TEMP}}/exp-lens-severity-testing/exp_diag_severity_testing_{...}.md
```
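
As a rough sketch of the output convention above, the following builds the timestamped file name and the plain-text token line. The function names, the `temp_root` parameter (standing in for the expanded `{{AUTOSKILLIT_TEMP}}`), and the use of local time for the timestamp are assumptions for illustration.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def build_output_path(temp_root: Path, now: Optional[datetime] = None) -> Path:
    """Return the timestamped report path under the lens's output directory."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H%M%S")
    return temp_root / "exp-lens-severity-testing" / f"exp_diag_severity_testing_{stamp}.md"


def output_token(report_path: Path) -> str:
    """Format the token line as plain text, with no markdown around the token name."""
    return f"diagram_path = {report_path.absolute()}"
```

Keeping the token formatting in one place makes it harder to accidentally wrap `diagram_path` in backticks or bold, which would defeat the adjudicator's regex match.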

Analysis Workflow

Step 0: Parse optional arguments

If positional arg 1 (context_path) is provided and the file exists, read it to obtain IV/DV tables, H0/H1 hypotheses, controlled variables, and success criteria. If positional arg 2 (experiment_plan_path) is provided and exists, read the experiment plan for full methodology. Use this structured context as the foundation for Steps 1-5; skip the CWD exploration for these fields if the context file supplies them.

Step 1: Launch Parallel Exploration Subagents

Spawn Explore subagents to investigate:

Positive Results Claimed

Find all conclusions and positive claims in the experiment
Look for: demonstrates, improves, outperforms, achieves, shows, confirms, validates

Negative Controls & Sanity Checks

Find negative controls, baselines, and sanity check tests
Look for: negative_control, sanity, ablation, degenerate, trivial, null, random

Adversarial Conditions

Find adversarial or stress-test conditions applied
Look for: adversarial, attack, stress, perturbation, corruption, noise, edge_case

Alternative Explanations Tested

Find whether alternative explanations were examined
Look for: alternative, confound, artifact, spurious, coincidence, luck

Prediction Specificity

Find how specific the predictions were before seeing data
Look for: prediction, hypothesis, preregistered, expected, prior
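
The five keyword lists above lend themselves to a simple first-pass scan. The sketch below is only an illustration of that idea: the topic keys and the `scan` function are hypothetical names, and prefix matching is deliberately loose (for example, `prior` also matches "priority"), so hits are leads for a subagent to read, not findings.

```python
import re

# Keyword groups copied from the five exploration topics above.
KEYWORDS = {
    "positive_results": ["demonstrates", "improves", "outperforms", "achieves",
                         "shows", "confirms", "validates"],
    "negative_controls": ["negative_control", "sanity", "ablation", "degenerate",
                          "trivial", "null", "random"],
    "adversarial": ["adversarial", "attack", "stress", "perturbation",
                    "corruption", "noise", "edge_case"],
    "alternatives": ["alternative", "confound", "artifact", "spurious",
                     "coincidence", "luck"],
    "prediction_specificity": ["prediction", "hypothesis", "preregistered",
                               "expected", "prior"],
}


def scan(text: str) -> dict[str, list[str]]:
    """Return, per topic, the keywords that occur in text (case-insensitive)."""
    hits: dict[str, list[str]] = {}
    for topic, words in KEYWORDS.items():
        # \b anchors the start of the word only, so "confound" also matches "confounds".
        hits[topic] = [w for w in words
                       if re.search(rf"\b{re.escape(w)}", text, re.IGNORECASE)]
    return hits
```

An empty list for `negative_controls` or `adversarial` is worth recording in its own right, since the constraints above treat the absence of controls as a finding.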

Step 2: Assess Severity for Each Claim

For each claim:

What error was the test capable of detecting?
What would a false positive result have looked like under this design?
Were negative controls or sanity checks included?
Were adversarial conditions tested?
Is the test informative (would a bad result look different from a good result)?

Step 3: Rate Severity and Identify Gaps

Severity ratings: HIGH / MEDIUM / LOW
Flag confirmatory theater when the design is structured to confirm rather than risk refutation.
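
The skill does not define how the Step 2 answers map to a rating, so the rubric below is purely hypothetical: it records the Step 2 questions per claim and buckets the result by counting safeguards. The class, field names, and thresholds are all assumptions made to show one way the assessment could be structured.

```python
from dataclasses import dataclass


@dataclass
class ClaimAssessment:
    claim: str
    detectable_error: str             # error the test could have caught ("" if none identified)
    has_negative_control: bool
    has_adversarial_condition: bool
    bad_result_distinguishable: bool  # would a bad result look different from a good one?


def rate_severity(a: ClaimAssessment) -> str:
    """Hypothetical rubric: an uninformative test is LOW; otherwise count safeguards."""
    if not a.detectable_error or not a.bad_result_distinguishable:
        return "LOW"  # the test could not have failed informatively
    safeguards = int(a.has_negative_control) + int(a.has_adversarial_condition)
    return "HIGH" if safeguards == 2 else "MEDIUM" if safeguards == 1 else "LOW"


def is_confirmatory_theater(a: ClaimAssessment) -> bool:
    """Flag designs where no outcome could have counted against the claim."""
    return not a.detectable_error and not a.bad_result_distinguishable
```

Computing the rating from the recorded answers, before any conclusion is written, is one way to honor the "rate severity before reporting conclusions" constraint.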

Step 4: Create Optional Severity-Flow Diagram

Show Claims → HIGH/MEDIUM/LOW severity tests → Severity verdicts.
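
One way to produce the flow described above is to generate the Mermaid source from the rated claims. The sketch below assumes each row is a hypothetical `(claim, level, verdict)` tuple and emits only nodes and edges; `classDef` styling and the color legend are intentionally left out, because those must come from the /autoskillit:mermaid skill.

```python
def severity_flowchart(rows: list[tuple[str, str, str]]) -> str:
    """Build Mermaid flowchart text: claim -> severity test -> verdict, one chain per row."""
    lines = ["flowchart LR"]
    for i, (claim, level, verdict) in enumerate(rows, start=1):
        if level not in ("HIGH", "MEDIUM", "LOW"):
            raise ValueError(f"unknown severity level: {level}")
        # Double quotes would terminate a Mermaid label, so swap them for single quotes.
        c, v = (s.replace('"', "'") for s in (claim, verdict))
        lines.append(f'    C{i}["{c}"] --> T{i}["{level} severity test"] --> V{i}["{v}"]')
    return "\n".join(lines)
```

Giving each claim its own chain keeps the verdict visibly attached to the claim it belongs to, at the cost of repeating the severity node per row.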

Step 5: Write Output

Write the analysis to: {{AUTOSKILLIT_TEMP}}/exp-lens-severity-testing/exp_diag_severity_testing_{YYYY-MM-DD_HHMMSS}.md (relative to the current working directory)

Pre-Diagram Checklist

Before creating the diagram, verify:

[ ] LOADED /autoskillit:mermaid skill using the Skill tool
[ ] Using ONLY classDef styles from the mermaid skill (no invented colors)
[ ] Diagram will include a color legend table

Related Skills

/autoskillit:make-experiment-diag - Parent skill
/autoskillit:mermaid - MUST BE LOADED before creating diagram
/autoskillit:exp-lens-error-budget
/autoskillit:exp-lens-validity-threats


Analyze severity of experimental tests — adversarial cases, negative controls, falsification tests, easy-pass detection, and confirmatory theater. Falsificationist lens answering "Would this design have caught the error?"

2 stars

testing

Updated May 9, 2026

npx skillsauth add talont-org/autoskillit exp-lens-severity-testing

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 9, 2026, 5:18 AM (188.6s, 1 file scanned)

SKILL.md

name: exp-lens-severity-testing
categories: [exp-lens]
activate_deps: [mermaid]
description: Analyze severity of experimental tests: adversarial cases, negative controls, falsification tests, easy-pass detection, and confirmatory theater. Falsificationist lens answering "Would this design have caught the error?"
- matcher: *
- type: command
command: echo 'Severity Testing Lens - Analyzing adversarial robustness of experimental conclusions...'
once: true




Install manually


# Clone the repo
git clone https://github.com/talont-org/autoskillit.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r autoskillit/src/autoskillit/skills_extended/exp-lens-severity-testing ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: talont-org/autoskillit (2 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT