skills/artifact-evaluation-prep/SKILL.md
Prepare research artifact packages for evaluation or public release. Use for reproduction commands, environment checks, data packaging, and artifact forms.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add a-green-hand-jack/ml-research-skills artifact-evaluation-prep
Prepare a paper's code, data, checkpoints, scripts, and instructions so an external artifact reviewer can reproduce the paper-facing claims with minimal ambiguity.
Use this skill when:
Do not use this skill as a general code-release skill. Use release-code for public repository hygiene, licensing, CITATION files, tags, and GitHub releases. Use this skill for reviewer-facing artifact execution and claim reproduction.
Pair this skill with:
- camera-ready-finalizer to recover accepted-paper obligations and final claim/evidence state
- release-code to prepare public repository hygiene after artifact obligations are clear
- reproducibility-audit when environment, data, or execution drift needs a broader audit
- run-experiment for generating or testing reproduction commands
- figure-results-review when artifact outputs must match paper figures or tables
- citation-audit when artifact metadata cites datasets, code, or prior artifacts
- research-project-memory when artifact status, blockers, and reviewer-facing instructions should persist

<installed-skill-dir>/
├── SKILL.md
└── references/
    ├── artifact-audit.md
    ├── memory-writeback.md
    ├── package-manifest.md
    ├── report-template.md
    └── reviewer-instructions.md
Read references/artifact-audit.md, references/package-manifest.md, and references/reviewer-instructions.md.
Read references/report-template.md before writing a saved artifact evaluation report.
Read references/memory-writeback.md when the project has memory/, component .agent/ folders, or the user asks for persistent state.
Collect:
If no venue is specified, produce a venue-agnostic artifact package but mark venue-specific fields as unresolved.
For each paper-facing claim or result, record:
Do not imply full reproducibility if only a smoke test or cached output is provided.
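One way to keep the claim record honest is to store an explicit evidence level next to each claim, so a smoke test or cached output can never be labeled as a full reproduction. The sketch below is illustrative only: the field names (`claim_id`, `paper_location`, `command`, `evidence`) and the three level names are assumptions, not a schema defined by this skill.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    # Hypothetical levels, ordered from weakest to strongest support.
    CACHED_OUTPUT = 1  # reviewer only inspects shipped results
    SMOKE_TEST = 2     # command runs end to end on a tiny input
    FULL_RUN = 3       # command regenerates the paper-facing result


@dataclass
class ClaimRecord:
    claim_id: str        # illustrative identifier, e.g. "table2-accuracy"
    paper_location: str  # where the claim appears in the paper
    command: str         # reproduction command shown to the reviewer
    evidence: Evidence

    def status_label(self) -> str:
        # Only a full run may be described as reproducible.
        if self.evidence is Evidence.FULL_RUN:
            return "reproducible"
        if self.evidence is Evidence.SMOKE_TEST:
            return "smoke-tested only"
        return "cached output only"


def unsupported_claims(records):
    """Return ids of claims that must not be advertised as fully reproducible."""
    return [r.claim_id for r in records if r.evidence is not Evidence.FULL_RUN]
```

Listing the output of `unsupported_claims` in the reviewer-facing document makes the gap between tested and claimed results visible up front.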
Read references/package-manifest.md.
Create or update a manifest that lists:
Prefer small, stable names such as ARTIFACT.md, REPRODUCE.md, or docs/artifact_evaluation.md unless the venue requires a specific filename.
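A manifest is easier for a reviewer to trust when each entry can be verified mechanically. The following is a minimal sketch, assuming the manifest records a relative path, size, and SHA-256 digest per file; these three keys are an assumption for illustration, and references/package-manifest.md remains the authority on what the manifest must list.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large checkpoints do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> list:
    """List every file under root with its relative path, size, and hash."""
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        entries.append({
            "path": path.relative_to(root).as_posix(),  # portable separators
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return entries
```

Serializing the result with `json.dumps(build_manifest(Path(".")), indent=2)` gives a block that can be pasted into ARTIFACT.md or saved beside it.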
Read references/reviewer-instructions.md.
Provide:
Instructions should be copy-pasteable and should not require the reviewer to infer hidden paths or environment variables.
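The copy-paste requirement can be partly checked before a human reads the instructions. This sketch scans instruction text for shell variables that are referenced but never assigned, and for paths under a personal home directory. The function name `hidden_assumptions` and the regular expressions are hypothetical and deliberately simple; they will miss cases and will flag variables such as `$HOME` that the shell normally provides.

```python
import re

# Matches $NAME and ${NAME}; a simple, shell-like pattern.
VAR_REF = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")
# Matches `export NAME=` or a leading `NAME=` assignment at line start.
VAR_DEF = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=", re.MULTILINE)
# Paths that probably exist only on the author's machine.
HOME_PATH = re.compile(r"(?:/home/|/Users/)[^\s/]+")


def hidden_assumptions(instructions: str) -> dict:
    """Report variables used without a definition and personal paths."""
    defined = set(VAR_DEF.findall(instructions))
    referenced = set(VAR_REF.findall(instructions))
    return {
        "undefined_vars": sorted(referenced - defined),
        "personal_paths": sorted(set(HOME_PATH.findall(instructions))),
    }
```

An empty report does not prove the instructions are complete; it only removes two common sources of reviewer guesswork.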
When allowed by the user and environment, run at least:
If commands are too expensive, record the exact reason and create a minimal substitute test.
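Recording the outcome of each test run, including the reason a command was not completed, can be sketched as a small wrapper around `subprocess.run`. The time budget, the `run_smoke` name, and the result keys are assumptions for illustration; the final line runs a trivial Python one-liner as a stand-in for a real reproduction command.

```python
import subprocess
import sys
import time


def run_smoke(command: list, timeout_s: float) -> dict:
    """Run one command and record its outcome for the evaluation report."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
        status = "passed" if proc.returncode == 0 else "failed"
        detail = f"exit code {proc.returncode}"
    except subprocess.TimeoutExpired:
        # Too expensive for this session: record the exact reason.
        status = "skipped"
        detail = f"exceeded {timeout_s} s budget; needs a smaller substitute test"
    return {
        "command": " ".join(command),
        "status": status,
        "detail": detail,
        "seconds": round(time.monotonic() - start, 2),
    }


# Stand-in command; replace with the artifact's own smoke test.
result = run_smoke([sys.executable, "-c", "print('ok')"], timeout_s=30)
```

A `skipped` entry is a prompt to write the minimal substitute test, not a passing result.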
Audit:
Route public release issues to release-code; route environment drift to reproducibility-audit if available.
Read references/report-template.md.
If saving to a project and no path is given, use:
docs/submission/artifact_evaluation_prep_YYYY-MM-DD.md
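The date placeholder in that default path can be filled as in the sketch below, assuming `YYYY-MM-DD` means the ISO date on which the report is written. The helper name `default_report_path` is hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def default_report_path(project_root: Path, today: Optional[date] = None) -> Path:
    """Build docs/submission/artifact_evaluation_prep_YYYY-MM-DD.md under the project."""
    today = today or date.today()
    name = f"artifact_evaluation_prep_{today.isoformat()}.md"
    return project_root / "docs" / "submission" / name
```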
The report must include:
Read references/memory-writeback.md when memory exists.
Update artifact status, reproduction commands, blockers, claim support, release actions, and final handoff notes without copying full command logs into memory.
Before finalizing: