src/autoskillit/skills_extended/exp-lens-comparator-construction/SKILL.md
Create Comparator Construction experimental design analysis assessing whether baselines and controls are fair and relevant. Counterfactual lens answering "Is the comparator fair and relevant?"
npx skillsauth add talont-org/autoskillit exp-lens-comparator-construction
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
**Philosophical Mode:** Counterfactual
**Primary Question:** "Is the comparator fair and relevant?"
**Focus:** Baseline Choice, Control Realism, Version Matching, Effort Symmetry, Baseline Drift
/autoskillit:exp-lens-comparator-construction [context_path] [experiment_plan_path]
/autoskillit:exp-lens-comparator-construction or /autoskillit:make-experiment-diag comparator
NEVER:
- Write files outside `{{AUTOSKILLIT_TEMP}}/exp-lens-comparator-construction/`
- Run work in the background (`run_in_background: true` is prohibited)
ALWAYS:
- Build a fairness matrix covering all treatment-vs-comparator pairs
- Check for confounding differences in implementation, tuning, data access, and compute
- Assess whether each comparator is the best available alternative at the time of the experiment
- Identify temporal drift in baseline relevance
- BEFORE creating any diagram, LOAD the /autoskillit:mermaid skill using the Skill tool - this is MANDATORY
- If the Skill tool cannot be used (disable-model-invocation) or refuses this invocation, do NOT proceed with diagram creation. Abort this step and omit the diagram from output.
- Write output to {{AUTOSKILLIT_TEMP}}/exp-lens-comparator-construction/exp_diag_comparator_construction_{YYYY-MM-DD_HHMMSS}.md
After writing the file, emit the structured output token as literal plain text with no markdown formatting on the token name (the adjudicator performs a regex match):
diagram_path = /absolute/path/to/{{AUTOSKILLIT_TEMP}}/exp-lens-comparator-construction/exp_diag_comparator_construction_{...}.md
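The naming and token conventions above can be sketched in Python. This is a minimal illustration under stated assumptions, not the skill's own implementation: `build_output_path`, `emit_token`, and `TOKEN_RE` are hypothetical names, `temp_root` stands in for the resolved value of `{{AUTOSKILLIT_TEMP}}`, and the regex is a guess, since the adjudicator's actual pattern is not given here.

```python
import re
from datetime import datetime
from pathlib import Path

SKILL_DIR = "exp-lens-comparator-construction"
FILE_PREFIX = "exp_diag_comparator_construction_"

# Hypothetical pattern: the adjudicator's real regex is not specified in this document.
TOKEN_RE = re.compile(r"^diagram_path = (/.+\.md)$", re.MULTILINE)


def build_output_path(temp_root: Path, now: datetime) -> Path:
    """Return the absolute report path using the {YYYY-MM-DD_HHMMSS} stamp."""
    stamp = now.strftime("%Y-%m-%d_%H%M%S")
    # resolve() turns a CWD-relative temp root into an absolute path.
    return (temp_root / SKILL_DIR / f"{FILE_PREFIX}{stamp}.md").resolve()


def emit_token(path: Path) -> str:
    """Format the token as plain text: no backticks, bold, or other markup."""
    return f"diagram_path = {path}"
```

Calling `emit_token(build_output_path(root, datetime.now()))` yields a single plain-text line; keeping the token free of markdown is what lets a line-anchored pattern such as the assumed `TOKEN_RE` match it.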
If positional arg 1 (context_path) is provided and the file exists, read it to obtain IV/DV tables, H0/H1 hypotheses, controlled variables, and success criteria. If positional arg 2 (experiment_plan_path) is provided and exists, read the experiment plan for full methodology. Use this structured context as the foundation for Steps 1-5; skip the CWD exploration for these fields if the context file supplies them.
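The optional-argument handling described above might look like the following sketch. The helper names (`read_if_present`, `load_inputs`) and the returned dictionary keys are hypothetical, and the single `needs_cwd_exploration` flag is a coarse simplification: the text above skips exploration per field, only for fields the context file actually supplies.

```python
from pathlib import Path
from typing import Optional


def read_if_present(path_arg: Optional[str]) -> Optional[str]:
    """Return the file text only when the argument was given and the file exists."""
    if not path_arg:
        return None
    path = Path(path_arg)
    return path.read_text(encoding="utf-8") if path.is_file() else None


def load_inputs(args: list[str]) -> dict:
    """Map positional args 1 and 2 to context and experiment-plan text."""
    context = read_if_present(args[0] if len(args) > 0 else None)
    plan = read_if_present(args[1] if len(args) > 1 else None)
    return {
        "context": context,            # IV/DV tables, H0/H1, controls, success criteria
        "experiment_plan": plan,       # full methodology, if supplied
        # Simplification: explore the CWD only when no context file was readable.
        "needs_cwd_exploration": context is None,
    }
```

A missing or nonexistent path is treated the same as an omitted argument, so the analysis falls back to exploring the working directory rather than failing.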
Spawn Explore subagents to investigate:
- Baseline/Control Definitions
- Implementation Parity
- Version & Environment Match
- Tuning Protocol Symmetry
- Temporal Baseline Drift
For each comparator, assess:
CRITICAL — Analyze Counterfactual Quality: For each treatment-vs-comparator pair:
Build a fairness matrix with rows = comparators, columns = fairness dimensions.
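One way to assemble such a matrix is sketched below. The column names are taken from the Fairness Matrix template later in this document; everything else is an assumption for illustration: the `ComparatorAssessment` class, the `render_fairness_matrix` and `flagged` helpers, and the `Unknown` cell for dimensions that were not assessed (the template itself lists only Yes / No).

```python
from dataclasses import dataclass, field

# Column names follow the Fairness Matrix template in this document.
DIMENSIONS = [
    "Best Available?",
    "Equal Effort?",
    "Same Env?",
    "Symmetric Tuning?",
    "Temporally Current?",
]


@dataclass
class ComparatorAssessment:
    name: str
    # dimension -> True (Yes) / False (No); a missing key renders as "Unknown" (assumption).
    verdicts: dict = field(default_factory=dict)


def render_fairness_matrix(rows):
    """Render rows = comparators, columns = fairness dimensions as a markdown table."""
    header = "| Comparator | " + " | ".join(DIMENSIONS) + " |"
    divider = "|" + "---|" * (len(DIMENSIONS) + 1)
    lines = [header, divider]
    for row in rows:
        cells = []
        for dim in DIMENSIONS:
            verdict = row.verdicts.get(dim)
            cells.append("Unknown" if verdict is None else ("Yes" if verdict else "No"))
        lines.append(f"| {row.name} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


def flagged(rows):
    """Return (comparator, dimension) pairs with an explicit No verdict."""
    return [(r.name, d) for r in rows for d in DIMENSIONS if r.verdicts.get(d) is False]
```

Each pair returned by `flagged` is a candidate asymmetry to document with an impact assessment and a remediation; unassessed cells are left out on purpose, so that a missing verdict is not mistaken for an unfair comparison.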
If a diagram adds value, create a simplified flowchart. This is OPTIONAL for this hybrid lens — the tables are the primary output.
Direction: LR (treatment and comparator flow in parallel toward evaluation)
Subgraphs: "PROPOSED METHOD", "COMPARATOR(S)", "SHARED EVALUATION"
Node Styling:
- `cli` class: proposed method nodes
- `phase` class: comparator method nodes
- `handler` class: shared evaluation pipeline nodes
- `output` class: results nodes
- `gap` class: asymmetries flagged
- `detector` class: parity checks

Write the analysis to: {{AUTOSKILLIT_TEMP}}/exp-lens-comparator-construction/exp_diag_comparator_construction_{YYYY-MM-DD_HHMMSS}.md (relative to the current working directory)
# Comparator Construction Analysis: {Experiment Name}
**Lens:** Comparator Construction (Counterfactual)
**Question:** Is the comparator fair and relevant?
**Date:** {YYYY-MM-DD}
**Scope:** {What was analyzed}
## Comparator Inventory
| Comparator | Source | Reimplemented? | Same Environment? | Same Tuning Budget? |
|------------|--------|---------------|-------------------|---------------------|
| {name} | {paper/repo} | Yes / No / Partial | Yes / No | Yes / No / Unknown |
## Fairness Matrix
| Comparator | Best Available? | Equal Effort? | Same Env? | Symmetric Tuning? | Temporally Current? |
|------------|----------------|--------------|-----------|-------------------|---------------------|
| {name} | Yes / No | Yes / No | Yes / No | Yes / No | Yes / No |
## Comparison Diagram (Optional)
```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart LR
%% CLASS DEFINITIONS %%
classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
classDef phase fill:#6a1b9a,stroke:#ba68c8,stroke-width:2px,color:#fff;
classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;
classDef integration fill:#c62828,stroke:#ef9a9a,stroke-width:2px,color:#fff;
subgraph Proposed ["PROPOSED METHOD"]
METHOD["Proposed Method<br/>━━━━━━━━━━<br/>{method name}"]
end
subgraph Comparators ["COMPARATOR(S)"]
COMP1["Comparator 1<br/>━━━━━━━━━━<br/>{name}"]
COMP2["Comparator 2<br/>━━━━━━━━━━<br/>{name}"]
end
subgraph Evaluation ["SHARED EVALUATION"]
EVAL["Evaluation Pipeline<br/>━━━━━━━━━━<br/>{dataset/benchmark}"]
RESULTS["Results<br/>━━━━━━━━━━<br/>{metrics reported}"]
PARITY["Parity Check<br/>━━━━━━━━━━<br/>{asymmetry found}"]
ASYM["Asymmetry<br/>━━━━━━━━━━<br/>{description}"]
end
METHOD -->|"evaluated on"| EVAL
COMP1 -->|"evaluated on"| EVAL
COMP2 -->|"evaluated on"| EVAL
EVAL --> RESULTS
RESULTS --> PARITY
PARITY -.->|"flagged"| ASYM
class METHOD cli;
class COMP1,COMP2 phase;
class EVAL handler;
class RESULTS output;
class PARITY detector;
class ASYM gap;
```

Color Legend:

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Proposed Method | The method being evaluated |
| Purple | Comparators | Baselines and controls |
| Orange | Evaluation | Shared evaluation pipeline |
| Dark Teal | Results | Reported outcomes |
| Red | Parity Checks | Fairness verification points |
| Yellow | Asymmetries | Flagged unfair differences |

| # | Asymmetry | Affects | Impact Assessment | Remediation |
|---|-----------|---------|-------------------|-------------|
| 1 | {description} | {comparator(s)} | High / Medium / Low | {how to fix} |
---
## Pre-Diagram Checklist
Before creating the diagram, verify:
- [ ] LOADED `/autoskillit:mermaid` skill using the Skill tool
- [ ] Using ONLY classDef styles from the mermaid skill (no invented colors)
- [ ] Diagram will include a color legend table
---
## Related Skills
- `/autoskillit:make-experiment-diag` - Parent skill for lens selection
- `/autoskillit:mermaid` - MUST BE LOADED before creating diagram
- `/autoskillit:exp-lens-estimand-clarity` - For clarifying what the comparison is measuring
- `/autoskillit:exp-lens-fair-comparison` - For deeper analysis of evaluation protocol fairness