skills/experiment-design/SKILL.md
Use this skill when designing ML/AI experiments, evaluation protocols, or research benchmarks. Guides hypothesis specification, baseline selection, metric choice, and experimental controls to ensure results are valid and reproducible.
npx skillsauth add aviskaar/open-org experiment-design
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Design rigorous machine learning experiments that produce credible, reproducible results.
Work through each section before writing any code.
State the hypothesis as a falsifiable claim:
"We claim that [method X] achieves [metric Y] on [dataset Z] because [mechanism]."
If the hypothesis is vague, help the user sharpen it before proceeding.
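One way to check a hypothesis against the template above is to keep its four slots as separate fields and flag any that are still empty. This is a minimal sketch; the `Hypothesis` class and its field names are hypothetical and not part of the skill.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Hypothesis:
    """The four slots of the claim template (hypothetical helper)."""
    method: str
    metric: str
    dataset: str
    mechanism: str

    def missing_parts(self) -> list[str]:
        # Names of template slots that are still blank.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def statement(self) -> str:
        # Render the claim in the template's wording.
        return (
            f"We claim that {self.method} achieves {self.metric} "
            f"on {self.dataset} because {self.mechanism}."
        )
```

An empty `mechanism` is a common sign that the claim is not yet falsifiable in a useful way, so a non-empty `missing_parts()` result is a cue to sharpen the hypothesis before moving on.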
Select baselines at three levels:
Justify each choice. Avoid strawman baselines.
State the hardware, estimated runtime, and number of seeds. This enables reproducibility and contextualizes cost.
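The compute statement can be recorded as a small structure so the total cost follows from the per-run estimate and the number of seeds. The sketch below is illustrative only; `ComputeBudget` and its fields are hypothetical names, and any values passed in are placeholders the user supplies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeBudget:
    """Hardware, per-run runtime, and seeds for one experiment (hypothetical helper)."""
    hardware: str            # free-text description of the hardware
    hours_per_run: float     # estimated wall-clock hours for one run
    num_configs: int         # method, baseline, and ablation configurations
    seeds: tuple[int, ...]   # every configuration is run once per seed

    def total_runs(self) -> int:
        return self.num_configs * len(self.seeds)

    def total_hours(self) -> float:
        return self.total_runs() * self.hours_per_run
```

Listing the seeds explicitly, rather than only their count, lets someone else rerun exactly the same set.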
Design ablations that isolate each component's contribution. Each ablation should remove or replace exactly one thing.
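The "exactly one thing" rule can be enforced mechanically by deriving each ablation from the full configuration. The following is a sketch that assumes components are simple on/off flags; `single_ablations` and the component names used with it are hypothetical. Ablations that replace a component rather than remove it would need a richer configuration than booleans.

```python
def single_ablations(full_config: dict[str, bool]) -> dict[str, dict[str, bool]]:
    """Return one variant per enabled component, with only that component disabled."""
    variants: dict[str, dict[str, bool]] = {}
    for name, enabled in full_config.items():
        if not enabled:
            continue  # nothing to remove for a component that is already off
        variant = dict(full_config)  # copy so the full config is untouched
        variant[name] = False
        variants[f"no_{name}"] = variant
    return variants
```

Because every variant differs from the full configuration in a single flag, any change in the metric can be attributed to that component alone.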
Identify at least two ways the experiment could give misleading results, and how to detect or mitigate them.
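Two failure modes that often apply are train/test leakage and a gain that is no larger than seed-to-seed noise. The checks below are a rough sketch of how each might be detected; the function names are hypothetical, and the `k = 2.0` threshold is a crude heuristic chosen for illustration, not a significance test.

```python
import statistics
from collections.abc import Iterable, Sequence


def leaked_examples(train_ids: Iterable, test_ids: Iterable) -> set:
    """Example IDs that appear in both the training and the test split."""
    return set(train_ids) & set(test_ids)


def gain_exceeds_seed_noise(
    baseline_scores: Sequence[float],
    method_scores: Sequence[float],
    k: float = 2.0,
) -> bool:
    """True if the mean gain is larger than k times the larger per-seed std dev.

    Each sequence holds one score per seed and needs at least two entries.
    """
    gain = statistics.mean(method_scores) - statistics.mean(baseline_scores)
    noise = max(statistics.stdev(baseline_scores), statistics.stdev(method_scores))
    return gain > k * noise
```

A non-empty result from `leaked_examples`, or `False` from `gain_exceeds_seed_noise`, suggests the headline number should not be trusted until the issue is investigated.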
Produce a structured experiment plan as a markdown document with all sections above filled in. Highlight any section where the user needs to make a decision before proceeding.
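The plan document can be assembled from the filled-in sections, with unfilled ones flagged for a decision. This is a minimal sketch; `render_plan`, the section titles passed to it, and the `DECISION NEEDED` marker are all hypothetical conventions rather than a format the skill prescribes.

```python
def render_plan(sections: dict[str, str]) -> str:
    """Render sections as a markdown plan, flagging any section left empty."""
    lines = ["# Experiment plan", ""]
    for title, body in sections.items():
        lines.append(f"## {title}")
        # An empty body means the user still has to decide something here.
        lines.append(body.strip() or "**DECISION NEEDED**")
        lines.append("")
    return "\n".join(lines)
```

Passing the sections in the order they were worked through keeps the plan aligned with the steps above.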