skills/experiment-evidence-router/SKILL.md
Route ML experiment planning, execution, debugging, result interpretation, and evidence packaging tasks to the correct skill. Use this when the task involves experiments, compute, results, or evidence — instead of guessing between run-experiment, run-status-monitor, experiment-debugger, result-diagnosis, research-results-auditor, statistical-analysis-planner, or paper packaging skills. Do not solve the task directly.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
`npx skillsauth add a-green-hand-jack/ml-research-skills experiment-evidence-router`
You are a router. Do not solve the experiment task directly.
Your job: classify the task → read the route table → select one child skill → hand off.
- Compare `git rev-parse --git-common-dir` vs `--show-toplevel`.
- If `memory/BRIEFING.md` exists, read it for active phase and worktree context.

| Bucket | Key signals | Route to |
|---|---|---|
| planning | design experiment, ablation plan, hypothesis, baselines, metrics, controls | experiment-design-planner |
| baseline-fairness | are baselines fair, SOTA current, reviewer will object to comparison | baseline-selection-audit |
| compute | GPU hours, budget, smoke test sizing, how long will it take | compute-budget-planner |
| data | dataset, split, contamination, preprocessing, train/val/test protocol | data-pipeline-manager |
| launch | submit new job, create run, SLURM/RunAI/local script, job file | run-experiment |
| status | existing job, queued, stuck, running, finished, ContainerCreating | run-status-monitor |
| eng-failure | NaN, OOM, crash, wrong metrics, slow training, reproducibility failure | experiment-debugger |
| sci-surprise | valid result but negative, surprising, ambiguous, seeds vary, baselines winning | result-diagnosis |
| claim-audit | confound, claim-drift, protocol integrity, attribution, lock claim into paper | research-results-auditor |
| statistics | significance test, p-value, confidence interval, effect size, seed variance | statistical-analysis-planner |
| pivot | direction change, consistent multi-cycle failure, narrow scope, kill project | project-pivot-planner |
| packaging | evidence board, tables, figures, provenance, experiment report | paper-result-asset-builder or experiment-report-writer |

- See `references/contrastive-routing.md`.
- Do not use `result-diagnosis` as a catch-all.
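
As a rough illustration of the classify → look up → select one child flow, the route table can be treated as plain data. The sketch below is hypothetical and not part of the skill: the `Route` class, the `route` function, and the naive substring matching are assumptions made for illustration, and only a few rows of the table are copied in, with shortened signal lists. The skill itself relies on the model's classification rather than keyword matching.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    bucket: str
    signals: Tuple[str, ...]  # lowercase key signals, shortened from the table
    child: str                # the skill to hand off to


# Hypothetical subset of the route table; the other rows follow the same shape.
ROUTES = (
    Route("compute", ("gpu hours", "budget", "smoke test"), "compute-budget-planner"),
    Route("launch", ("submit", "slurm", "runai", "job file"), "run-experiment"),
    Route("status", ("queued", "stuck", "running", "containercreating"), "run-status-monitor"),
    Route("eng-failure", ("nan", "oom", "crash", "slow training"), "experiment-debugger"),
    Route("statistics", ("p-value", "confidence interval", "effect size"), "statistical-analysis-planner"),
)


def route(task: str) -> Optional[str]:
    """Return the one child skill whose signals match the task.

    Returns None when zero or several buckets match, so the caller never
    falls back to a default skill by accident.
    """
    text = task.lower()
    # Naive substring matching (an assumption of this sketch, not the skill's method).
    hits = [r for r in ROUTES if any(signal in text for signal in r.signals)]
    return hits[0].child if len(hits) == 1 else None


if __name__ == "__main__":
    print(route("Training crashed with OOM after 2k steps"))
    print(route("My SLURM job is stuck in the queue"))
```

Returning `None` on a tie, instead of picking the first match or a default, mirrors the rule that exactly one child skill is selected and that no bucket serves as a catch-all.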