src/autoskillit/skills_extended/vis-lens-methodology-norms/SKILL.md
Create Methodology Norms visualization planning spec showing ML sub-area mandatory figures, community conventions, and coverage gaps. Methodology-Normative lens answering "Which domain-specific figures are expected by reviewers?"
npx skillsauth add talont-org/autoskillit vis-lens-methodology-norms
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
**Philosophical Mode:** Domain-Normative
**Primary Question:** "Which domain-specific figures are expected by reviewers?"
**Focus:** ML Sub-Area Mandatory Figures, Community Conventions, Coverage Gap Analysis
/autoskillit:vis-lens-methodology-norms [context_path] [experiment_plan_path]
If the context file contains a `## Methodology Tradition` section with a `tradition_slug` field, use that
tradition directly instead of auto-detecting the ML sub-area in Step 0.
The tradition slug maps to a bundled tradition YAML in
`recipes/methodology-traditions/{slug}.yaml`, which defines mandatory figures,
anti-patterns, and community norms for that research methodology.

`/autoskillit:vis-lens-methodology-norms`

| ML Sub-Area | Mandatory Figures | Community Norm Source |
|-------------|-------------------|----------------------|
| Supervised Classification | Confusion matrix, precision-recall curve, ROC-AUC, learning curves | NeurIPS/ICML reviewer norms |
| NLP | Per-task accuracy table, error analysis examples, attention/saliency (if applicable) | ACL anthology norms |
| Computer Vision | Sample predictions grid, failure case gallery, per-class mAP bar | CVPR/ECCV norms |
| Reinforcement Learning | Episode reward curve (mean ± std across seeds), sample efficiency curve | NeurIPS RL track norms |
| Generative Models | Sample grids (unconditional + conditional), FID/IS table, failure modes | NeurIPS/ICLR generative norms |
| Foundation Models | Few-shot performance scaling curve, task contamination analysis, ablation table | LLM paper norms (BIG-bench style) |
| Agentic Systems | Task success rate bar (± CI), step-level trace examples, tool use breakdown | Emerging norm (2023–2025) |
| Time-Series | Forecasting horizon curve, decomposition plots, residual ACF | ICLR/NeurIPS temporal norms |

This lens currently covers 8 ML sub-areas. Future domain-specific variants (e.g.,
vis-lens-methodology-norms-cv, vis-lens-methodology-norms-rl) may extend this catalog with
venue-specific norms, additional mandatory figure types, or sub-area-specific anti-pattern
overlays. The base lens should remain general enough to bootstrap any sub-area.
When multiple methodology traditions match a research plan, disambiguation rules are applied sequentially to determine the primary tradition and accumulate union rule sets.
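- Rules are evaluated in order; each matching rule sets `primary_tradition` and may add union rules
- Overlaps are then evaluated and add to `applied_union_rules`; if no rule set primary, the first matching overlap's `primary_tradition` is used
- If neither sets a primary, the tradition selected by `priority` number from the candidate set becomes primary

| Order | Rule Name | Trigger Conditions | Resolution | Union Rules |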
|-------|-----------|-------------------|------------|-------------|
| 1 | prisma_dominance | systematic_synthesis + any other tradition | primary = systematic_synthesis | — |
| 1 (exception) | prisma_dominance + prediction_model_validation | systematic_synthesis + prediction_model_validation | primary = systematic_synthesis | +TRIPOD_SRMA |
| 2 | rct_economic_union | controlled_intervention + economic_evaluation | primary = controlled_intervention | +CHEERS_union |
| 3 | arrive_supersedes_consort | animal_preclinical + controlled_intervention | primary = animal_preclinical | — |
| 4 | benchmarking_prediction_nested | method_comparison_benchmarking + prediction_model_validation | primary = method_comparison_benchmarking | +TRIPOD_nested |

Overlaps, evaluated after the rules:

| Order | Overlap Name | Trigger Conditions | Primary if No Rule | Union Rules |
|-------|--------------|-------------------|-------------------|-------------|
| 5 | tripod_consort_union | prediction_model_validation + controlled_intervention | controlled_intervention | +TRIPOD_union |
| 6 | strobe_prisma_moose | observational_correlational + systematic_synthesis | systematic_synthesis | +MOOSE_override |
| 7 | odd_controlled_nesting | simulation_modeling_tradition + controlled_intervention | simulation_modeling_tradition | +controlled_intervention_secondary |
| 8 | benchmarking_prisma_separation | method_comparison_benchmarking + systematic_synthesis | method_comparison_benchmarking | +PRISMA_curation_phase |
| 9 | srqr_consort_parallel | qualitative_interpretive_tradition + controlled_intervention | controlled_intervention | +SRQR_parallel |
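
The sequential evaluation in the two tables above can be sketched as follows. This is a minimal illustration, not the bundled `disambiguate` implementation: `DisambiguationResult`, `disambiguate_sketch`, and the `RULES`/`OVERLAPS` tuples are hypothetical names, only a subset of the table rows is encoded, and two behaviours are assumptions: the first rule to set a primary keeps it, and the final fallback simply picks the alphabetically first candidate as a stand-in for the priority-based fallback.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisambiguationResult:
    primary_tradition: str
    applied_union_rules: tuple
    precedence_trace: str


# (name, traditions that must all be present, primary, union rule or None).
# Only a subset of the rows from the tables is encoded here.
RULES = [
    ("rct_economic_union", {"controlled_intervention", "economic_evaluation"},
     "controlled_intervention", "+CHEERS_union"),
    ("arrive_supersedes_consort", {"animal_preclinical", "controlled_intervention"},
     "animal_preclinical", None),
]
OVERLAPS = [
    ("tripod_consort_union", {"prediction_model_validation", "controlled_intervention"},
     "controlled_intervention", "+TRIPOD_union"),
    ("strobe_prisma_moose", {"observational_correlational", "systematic_synthesis"},
     "systematic_synthesis", "+MOOSE_override"),
]


def disambiguate_sketch(candidates):
    cands = set(candidates)
    if not cands:
        raise ValueError("at least one candidate tradition is required")
    primary, unions, trace = None, [], []

    # Rule 1: prisma_dominance (systematic_synthesis + any other tradition).
    if "systematic_synthesis" in cands and len(cands) > 1:
        primary = "systematic_synthesis"
        trace.append("rule_prisma_dominance")
        if "prediction_model_validation" in cands:  # the "1 (exception)" row
            unions.append("+TRIPOD_SRMA")

    # Remaining rules: ASSUMPTION - the first rule to set a primary keeps it.
    for name, trigger, rule_primary, union in RULES:
        if trigger <= cands:
            primary = primary or rule_primary
            if union:
                unions.append(union)
            trace.append(f"rule_{name}")

    # Overlaps add union rules and supply a primary only if no rule did.
    for name, trigger, overlap_primary, union in OVERLAPS:
        if trigger <= cands:
            primary = primary or overlap_primary
            unions.append(union)
            trace.append(f"overlap_{name}")

    if primary is None:
        # ASSUMPTION: placeholder for the priority-based fallback.
        primary = sorted(cands)[0]
    return DisambiguationResult(primary, tuple(unions), "+".join(trace))
```

Calling `disambiguate_sketch(["observational_correlational", "systematic_synthesis"])` fires `prisma_dominance` and then the `strobe_prisma_moose` overlap, so the trace has the same shape as the example trace string given for `precedence_trace`.
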
When disambiguation is invoked, the result includes:
- `primary_tradition`: The selected primary methodology tradition name
- `applied_union_rules`: Tuple of union rule strings accumulated from matching rules/overlaps
- `precedence_trace`: String describing which rules and overlaps fired (e.g., `rule_prisma_dominance+overlap_strobe_prisma_moose`)

This lens uses a two-stage matching process to identify the correct methodology tradition and its venue-specific appendix figures.
Stage A identifies the primary methodology tradition using keyword scoring across all bundled
traditions. This is the existing classify_methodology + disambiguate flow documented in the
Multi-Match Disambiguation Rules above. The result is a primary_tradition that anchors
the analysis.
Stage B runs after Stage A and detects ML sub-area specific figure requirements by
checking the resolved tradition's venue_specific_appendices for keyword matches in the
plan text. Each tradition YAML may contain venue-specific appendix entries for ML sub-areas
that attach to it as a primary or alternate parent.
Conditional branching: Some ML sub-areas have alternate parent traditions triggered by
specific keywords. For example, a Foundation Models plan with "calibration" and "held-out"
keywords routes to prediction_model_validation instead of the default method_comparison_benchmarking.
A plan with "psychometric" and "item response theory" keywords routes to
measurement_instrument_validation_tradition — but only when explicit construct measurement
keywords are also present (constraint evaluation).
Constraint evaluation: Some alternate-parent branches require explicit evidence of specific
constructs. The Foundation Models → COSMIN route is gated by only_if_explicit_construct_measurement,
which checks for keywords like "construct measurement", "item response theory", or
"latent trait model".
Result: resolve_venue_appendices(plan_text) returns a list of VenueAppendixMatch objects,
each containing the sub-area slug, the resolved parent tradition, the matching appendix
definition, and a re_routed flag indicating whether the parent was the primary or an
alternate.
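
The conditional branching and constraint evaluation described for Foundation Models can be sketched as below. This is an illustration under stated assumptions, not the real `resolve_venue_appendices`: `VenueAppendixMatchSketch`, `route_foundation_models`, and the `foundation_models` slug are hypothetical, the appendix definition field is omitted, `re_routed=True` is assumed to mean "an alternate parent was chosen", and the order in which the two alternate branches are tried is also an assumption.

```python
from dataclasses import dataclass

# Keywords named for the only_if_explicit_construct_measurement gate.
CONSTRUCT_KEYWORDS = (
    "construct measurement",
    "item response theory",
    "latent trait model",
)


@dataclass(frozen=True)
class VenueAppendixMatchSketch:
    sub_area: str
    parent_tradition: str
    re_routed: bool  # ASSUMPTION: True when an alternate parent was selected


def _has_all(text, keywords):
    return all(keyword in text for keyword in keywords)


def route_foundation_models(plan_text):
    text = plan_text.lower()
    # Default parent for the sub-area.
    parent, re_routed = "method_comparison_benchmarking", False
    if _has_all(text, ("calibration", "held-out")):
        parent, re_routed = "prediction_model_validation", True
    elif _has_all(text, ("psychometric", "item response theory")) and any(
        keyword in text for keyword in CONSTRUCT_KEYWORDS
    ):
        # Alternate branch gated by explicit construct-measurement evidence.
        parent, re_routed = "measurement_instrument_validation_tradition", True
    return VenueAppendixMatchSketch("foundation_models", parent, re_routed)
```

Because "item response theory" appears both as a trigger keyword and as a construct keyword, the gate in this sketch is always satisfied once the trigger matches; a real gate may use a stricter keyword set.
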
When tradition_slug is provided via context file: Stage A is skipped (the tradition is
provided directly), but Stage B still runs against the loaded tradition's venue_specific_appendices
to detect ML sub-area specific figure requirements.
Some methodology traditions (e.g., qualitative_interpretive_tradition) have an empty
mandatory_figures list. These traditions do not mandate specific figure types — rigor
is assessed through trustworthiness, transferability, dependability, and confirmability
rather than statistical figures.
Detection: After loading the tradition in Step 0, call is_out_of_scope_tradition(spec)
from recipe/methodology_tradition_registry.py. This returns True when
len(spec.mandatory_figures) == 0.
When detected (BEFORE Steps 1-4):
- Emit the GO advisory (per `docs/research/silent-type-convention.md`):

      verdict: GO
      advisory_context:
        subject_kind: methodology_tradition
        subject_name: <tradition.name>
        reasoning: "<tradition.display_name> traditions do not mandate statistical figures. Rigor lives in trustworthiness/transferability/dependability/confirmability."
        reference_framework: "<tradition.canonical_guideline.name>"
        strongly_expected_figures:  # map each entry in spec.strongly_expected_figures
          - "<figure text from entry.figure> (<entry.source>)"
          - "..."
        requires_decision: false

- Record the advisory in `visualization-plan-trace.md` in the output directory
  (`{{AUTOSKILLIT_TEMP}}/plan-visualization/visualization-plan-trace.md`), appended
  after any existing Tier-C routing metadata
- Write the advisory to the vis_spec file (`{{AUTOSKILLIT_TEMP}}/vis-lens-methodology-norms/vis_spec_methodology_norms_{timestamp}.md`)
  under a `## Out-of-Scope Advisory` heading instead of the normal coverage/gap tables
- Emit the `diagram_path` structured output token as usual (pointing to the vis_spec file)

Populating the advisory fields:
- `subject_name`: Use `spec.name` (e.g., `qualitative_interpretive_tradition`)
- `reasoning`: Compose from `spec.display_name` + the standard trustworthiness rationale
- `reference_framework`: Use `spec.canonical_guideline["name"]` (e.g., "COREQ/SRQR")
- `strongly_expected_figures`: Map each entry in `spec.strongly_expected_figures` to its
  figure value with the source in parentheses (e.g.,
  "Coding Tree or Thematic Map (COREQ item 28)")

ML sub-area fallback path: If no `tradition_slug` is provided and the ML sub-area
keyword detection path is taken, the out-of-scope gate does not apply — it only fires
when a tradition is loaded (either via tradition_slug or via classify_methodology).
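
The out-of-scope gate and the advisory field mapping can be sketched as below. `TraditionSpecSketch`, `is_out_of_scope`, and `build_advisory` are hypothetical stand-ins for the loaded tradition spec and the registry helper; representing each strongly expected figure as a dict with `figure` and `source` keys, and nesting `requires_decision` under `advisory_context`, are both assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TraditionSpecSketch:
    # Hypothetical stand-in for the spec loaded from a tradition YAML.
    name: str
    display_name: str
    canonical_guideline: dict
    mandatory_figures: list = field(default_factory=list)
    strongly_expected_figures: list = field(default_factory=list)


def is_out_of_scope(spec):
    # Mirrors the documented check: no mandatory figures means out of scope.
    return len(spec.mandatory_figures) == 0


def build_advisory(spec):
    return {
        "verdict": "GO",
        "advisory_context": {
            "subject_kind": "methodology_tradition",
            "subject_name": spec.name,
            "reasoning": (
                f"{spec.display_name} traditions do not mandate statistical figures. "
                "Rigor lives in trustworthiness/transferability/"
                "dependability/confirmability."
            ),
            "reference_framework": spec.canonical_guideline["name"],
            # ASSUMPTION: entries are dicts with "figure" and "source" keys.
            "strongly_expected_figures": [
                f"{entry['figure']} ({entry['source']})"
                for entry in spec.strongly_expected_figures
            ],
            # ASSUMPTION: this key is nested here rather than at top level.
            "requires_decision": False,
        },
    }
```

In this sketch the caller would check `is_out_of_scope(spec)` first and only build the advisory when it returns `True`.
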
NEVER:
- Create files outside `{{AUTOSKILLIT_TEMP}}/vis-lens-methodology-norms/`

ALWAYS:
- Identify the ML sub-area from the experiment plan or context before checking mandatory figures
- For each mandatory figure type, assign one of three statuses: present, partial, absent
- Sort the gap list absent-first, then partial
- BEFORE creating any diagram, LOAD the `/autoskillit:mermaid` skill using the Skill tool. This is MANDATORY
- If the Skill tool cannot be used (disable-model-invocation) or refuses this invocation, do NOT proceed with diagram creation. Abort this step and omit the diagram from output.
- Write output to `{{AUTOSKILLIT_TEMP}}/vis-lens-methodology-norms/vis_spec_methodology_norms_{YYYY-MM-DD_HHMMSS}.md` (relative to the current working directory)
- After writing the file, emit the structured output token as literal plain text with no markdown formatting on the token name (the adjudicator performs a regex match):
  `diagram_path = /absolute/path/to/{{AUTOSKILLIT_TEMP}}/vis-lens-methodology-norms/vis_spec_methodology_norms_{...}.md`
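
As an illustration of the output-path and token rules above, the sketch below builds the timestamped file name and the plain-text token line. The helper names and `TOKEN_PATTERN` are hypothetical; in particular, the adjudicator's actual regex is not given in this document, so the pattern here is only a guess at its shape. `temp_dir` stands for whatever `{{AUTOSKILLIT_TEMP}}` resolves to at runtime.

```python
import re
from datetime import datetime
from pathlib import Path

# HYPOTHETICAL pattern; the adjudicator's real regex is not documented here.
TOKEN_PATTERN = re.compile(r"^diagram_path = (/\S+\.md)$", re.MULTILINE)


def spec_output_path(temp_dir, now):
    # File name follows vis_spec_methodology_norms_{YYYY-MM-DD_HHMMSS}.md.
    stamp = now.strftime("%Y-%m-%d_%H%M%S")
    return (
        Path(temp_dir)
        / "vis-lens-methodology-norms"
        / f"vis_spec_methodology_norms_{stamp}.md"
    )


def diagram_path_token(path):
    # Plain text, no markdown around the token name, absolute path.
    return f"diagram_path = {Path(path).resolve()}"


if __name__ == "__main__":
    out = spec_output_path(".autoskillit/temp", datetime.now())
    print(diagram_path_token(out))
```

Resolving the path before printing keeps the token absolute even though the output location is specified relative to the current working directory.
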
If positional arg 1 (context_path) is provided and the file exists, read it. Check for
a ## Methodology Tradition section containing tradition_slug. If present:
- Load `recipes/methodology-traditions/{tradition_slug}.yaml`
- Use its `mandatory_figures`, `strongly_expected_figures`, and `anti_patterns`
  as the norm source (instead of the ML Sub-Area table above)
- Call `is_out_of_scope_tradition(spec)`. If True, follow the Out-of-Scope Tradition
  Handling section: skip Steps 1-4 and emit the GO advisory.

If no `tradition_slug` is provided, fall back to ML sub-area keyword detection:
Identify the ML sub-area by scanning for keywords:
- `classification`, `clf`, `precision`, `recall`, `confusion_matrix` → Supervised Classification
- `NLP`, `language model`, `BLEU`, `ROUGE`, `perplexity`, `token` → NLP
- `image`, `detection`, `segmentation`, `mAP`, `COCO`, `ImageNet` → Computer Vision
- `RL`, `reinforcement`, `reward`, `episode`, `policy`, `agent`, `environment` → Reinforcement Learning
- `GAN`, `VAE`, `diffusion`, `FID`, `IS`, `generation` → Generative Models
- `LLM`, `few-shot`, `zero-shot`, `foundation`, `BIG-bench`, `scaling` → Foundation Models
- `agentic`, `tool use`, `task success`, `step trace`, `function call` → Agentic Systems
- `time series`, `forecasting`, `temporal`, `ACF`, `seasonal`, `trend` → Time-Series

If multiple sub-areas match, analyze for all matching sub-areas.
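
A minimal sketch of this keyword scan follows. The function names are hypothetical, and the matching policy is an assumption made for the example: keywords containing uppercase letters (such as `IS` or `RL`) are matched case-sensitively as whole words so they do not fire on ordinary words, while lowercase keywords are matched case-insensitively at the start of a word so plurals still count.

```python
import re

SUB_AREA_KEYWORDS = {
    "Supervised Classification": ["classification", "clf", "precision", "recall",
                                  "confusion_matrix"],
    "NLP": ["NLP", "language model", "BLEU", "ROUGE", "perplexity", "token"],
    "Computer Vision": ["image", "detection", "segmentation", "mAP", "COCO",
                        "ImageNet"],
    "Reinforcement Learning": ["RL", "reinforcement", "reward", "episode", "policy",
                               "agent", "environment"],
    "Generative Models": ["GAN", "VAE", "diffusion", "FID", "IS", "generation"],
    "Foundation Models": ["LLM", "few-shot", "zero-shot", "foundation", "BIG-bench",
                          "scaling"],
    "Agentic Systems": ["agentic", "tool use", "task success", "step trace",
                        "function call"],
    "Time-Series": ["time series", "forecasting", "temporal", "ACF", "seasonal",
                    "trend"],
}


def _keyword_in(keyword, text):
    if keyword != keyword.lower():
        # ASSUMPTION: acronyms and mixed-case names match case-sensitively
        # as whole words.
        pattern = rf"(?<!\w){re.escape(keyword)}(?!\w)"
        return re.search(pattern, text) is not None
    # ASSUMPTION: lowercase keywords match case-insensitively at a word start.
    pattern = rf"(?<!\w){re.escape(keyword)}"
    return re.search(pattern, text, re.IGNORECASE) is not None


def detect_sub_areas(text):
    # Returns every matching sub-area, in catalog order.
    return [
        sub_area
        for sub_area, keywords in SUB_AREA_KEYWORDS.items()
        if any(_keyword_in(keyword, text) for keyword in keywords)
    ]
```

Returning a list rather than a single label reflects the instruction to analyze all matching sub-areas.
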
Scan experiment plan, context file, and codebase for:
**Existing Figures**
- `*.png`, `*.pdf`, `*.svg` in results/figures directories
- `savefig`, `plt.save`, `fig.write_image`

**Planned Figures**
- `figure`, `plot`, `diagram`, `visualization`, `chart` in planning documents

**Figure Types Present**
For each mandatory figure type in the identified sub-area, assign one of three statuses: present, partial, or absent.
Collect all absent and partial mandatory figures. Sort absent first, then partial.
For each gap, assign a recommended figure spec (chart_type, data_source estimate).
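
The gap ordering can be sketched as a stable sort over the coverage results. `build_gap_list` and its input shape (a list of `(figure, status)` pairs in catalog order) are hypothetical; numbering priorities from 1 in sorted order follows the Gap Analysis table in the output template.

```python
VALID_STATUSES = {"present", "partial", "absent"}
GAP_ORDER = {"absent": 0, "partial": 1}


def build_gap_list(coverage):
    """Return prioritized gaps from (figure, status) pairs, absent first."""
    for figure, status in coverage:
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status {status!r} for {figure!r}")
    gaps = [(figure, status) for figure, status in coverage if status in GAP_ORDER]
    # list.sort is stable, so figures with equal status keep catalog order.
    gaps.sort(key=lambda item: GAP_ORDER[item[1]])
    return [
        {"priority": rank, "figure": figure, "status": status}
        for rank, (figure, status) in enumerate(gaps, start=1)
    ]


if __name__ == "__main__":
    coverage = [
        ("Confusion matrix", "present"),
        ("PR curve", "partial"),
        ("ROC-AUC", "absent"),
        ("Learning curves", "absent"),
    ]
    for gap in build_gap_list(coverage):
        print(gap["priority"], gap["figure"], gap["status"])
```
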
For each absent or partial mandatory figure, emit one yaml:figure-spec fenced block as a
recommendation. Then LOAD /autoskillit:mermaid and create the coverage diagram.
# Domain Norms Spec: {System / Experiment Name}
**Lens:** Domain Norms (Domain-Normative)
**Question:** Which domain-specific figures are expected by reviewers?
**Date:** {YYYY-MM-DD}
**ML Sub-Area:** {detected sub-area}
**Scope:** {What was analyzed}
## Coverage Summary
| Mandatory Figure | Status | Evidence |
|-----------------|--------|----------|
| {Confusion matrix} | present | results/figures/confusion_matrix.pdf |
| {PR curve} | partial | code exists; figure not generated |
| {ROC-AUC} | absent | no evidence found |
| {Learning curves} | absent | no evidence found |
## Gap Analysis
| Priority | Figure Type | Status | Recommendation |
|----------|-------------|--------|----------------|
| 1 | ROC-AUC | absent | Add roc_curve plot with CI band |
| 2 | Learning curves | absent | Plot train/val loss vs epoch |
| 3 | PR curve | partial | Generate from existing pr_curve.py |
## Recommended Figure Specs
```yaml
# yaml:figure-spec - canonical schema (spec_version: "1.0")
figure_id: "fig-missing-roc-auc"
figure_title: "ROC-AUC Curve"
spec_version: "1.0"
chart_type: "line"
chart_type_fallback: "scatter"
perceptual_justification: "Line chart with position encoding for TPR vs FPR; standard domain norm for classification."
data_source: "results/predictions.csv"
data_mapping:
  x: "fpr"
  y: "tpr"
  color: "model"
  size: ""
  facet: ""
layout:
  width_inches: 5.0
  height_inches: 5.0
  dpi: 300
stat_overlay:
  type: "ci_band"
  measure: "CI95"
  n_seeds: 5
annotations: ["AUC = {value}", "diagonal baseline shown"]
anti_patterns: []
palette: "wong"
format: "pdf"
target_dpi: 300
library: "matplotlib"
report_section: "Section 4 Evaluation"
priority: "P0"
placement_tier: "main"
conflicts: []
metadata:
  created_by: "vis-lens-methodology-norms"
  reviewed_by: ""
  last_updated: "{YYYY-MM-DD}"
```

```mermaid
%%{init: {'flowchart': {'nodeSpacing': 50, 'rankSpacing': 60, 'curve': 'basis'}}}%%
flowchart TB
    %% CLASS DEFINITIONS %%
    classDef cli fill:#1a237e,stroke:#7986cb,stroke-width:2px,color:#fff;
    classDef stateNode fill:#004d40,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef handler fill:#e65100,stroke:#ffb74d,stroke-width:2px,color:#fff;
    classDef newComponent fill:#2e7d32,stroke:#81c784,stroke-width:2px,color:#fff;
    classDef output fill:#00695c,stroke:#4db6ac,stroke-width:2px,color:#fff;
    classDef detector fill:#b71c1c,stroke:#ef5350,stroke-width:2px,color:#fff;
    classDef gap fill:#ff6f00,stroke:#ffa726,stroke-width:2px,color:#000;

    subgraph SubArea ["ML SUB-AREA"]
        SA["{Supervised Classification}<br/>━━━━━━━━━━<br/>NeurIPS/ICML norms"]
    end

    subgraph Present ["PRESENT"]
        P1["{Confusion Matrix}<br/>━━━━━━━━━━<br/>results/figures/cm.pdf"]
    end

    subgraph Partial ["PARTIAL"]
        W1["{PR Curve}<br/>━━━━━━━━━━<br/>code exists; not generated"]
    end

    subgraph Absent ["ABSENT"]
        G1["{ROC-AUC}<br/>━━━━━━━━━━<br/>no evidence"]
        G2["{Learning Curves}<br/>━━━━━━━━━━<br/>no evidence"]
    end

    SA --> P1
    SA --> W1
    SA --> G1
    SA --> G2

    class SA cli;
    class P1 newComponent;
    class W1 handler;
    class G1,G2 gap;
```

**Color Legend:**

| Color | Category | Description |
|-------|----------|-------------|
| Dark Blue | Sub-Area | Identified ML domain |
| Green | Present | Mandatory figure covered |
| Orange | Partial | Figure planned but incomplete |
| Amber | Absent | Mandatory figure missing |

---
## Pre-Diagram Checklist
Before creating the diagram, verify:
- [ ] LOADED `/autoskillit:mermaid` skill using the Skill tool
- [ ] Using ONLY classDef styles from the mermaid skill (no invented colors)
- [ ] Diagram will include a color legend table
- [ ] ML sub-area has been identified from context or experiment plan
- [ ] All mandatory figures for the sub-area have been checked
- [ ] Gap list is sorted absent-first, then partial