content/skills/research-learning-knowledge/paper2code/SKILL.md
Citation-anchored paper-to-code workflow for turning research papers into a minimal, honest Python implementation. Use this whenever the user wants to implement a paper from an arXiv ID, arXiv URL, local PDF, OpenReview forum page, or OpenReview PDF URL. Trigger even when the user says things like “implement this paper”, “复现这篇” ("reproduce this paper"), or “把这篇论文写成代码” ("turn this paper into code") without naming the source type explicitly. Reject DOI-only requests and unsupported landing pages instead of pretending the paper can be fetched.
npx skillsauth add bahayonghang/my-claude-code-settings paper2code
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Repo-native successor to the arXiv-only reference workflow. Keep the original paper2code discipline:
Accept exactly these inputs:
.pdf
Reject these explicitly:
If the user already has a paper PDF locally, prefer the local PDF path over network retrieval.
Extract:
PAPER_SOURCE: the paper input supplied by the user
MODE: minimal (default), full, or educational
FRAMEWORK: pytorch (default), jax, or numpy
Normalize the source into one of these internal source kinds:
arxiv_id
arxiv_url
local_pdf
openreview_page
openreview_pdf
If the source is unsupported, fail fast with a clear reason. Do not silently fall back to unrelated fetch logic.
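The normalization step above can be sketched roughly as follows. This is a minimal illustration, not the logic in the bundled scripts: the function name `classify_source`, the arXiv ID pattern, and the URL shapes checked here are assumptions, and the sketch does not verify that a local PDF actually exists.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed pattern for new-style arXiv IDs such as 2106.09685 or 2106.09685v2.
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def classify_source(source: str) -> str:
    """Return one of the five source kinds, or raise ValueError (fail fast)."""
    s = source.strip()
    if ARXIV_ID.match(s):
        return "arxiv_id"
    url = urlparse(s)
    if url.scheme in ("http", "https"):
        host = url.netloc.lower()
        if host in ("arxiv.org", "www.arxiv.org") and url.path.startswith(("/abs/", "/pdf/")):
            return "arxiv_url"
        if host == "openreview.net" and "id" in parse_qs(url.query):
            if url.path == "/forum":
                return "openreview_page"
            if url.path == "/pdf":
                return "openreview_pdf"
        # Any other URL (DOI resolvers, landing pages) is rejected explicitly.
        raise ValueError(f"unsupported URL: {s}")
    if s.lower().endswith(".pdf"):
        return "local_pdf"
    raise ValueError(f"unsupported paper source: {s}")
```

A DOI-only input such as `10.1000/xyz123` matches none of the branches and raises `ValueError`, which matches the fail-fast rule rather than falling back to some other fetch path.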
This workflow depends on the bundled Python scripts and their parser packages. Before running the pipeline:
scripts/, pipeline/, guardrails/, knowledge/, and scaffolds/ exist
If dependencies are missing, report the missing packages and the install command needed. Do not silently install new dependencies unless the user explicitly asked for environment setup.
Suggested command when setup is approved:
python -m pip install pymupdf4llm pdfplumber pymupdf requests pyyaml
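One way to report missing dependencies without installing anything is sketched below. The mapping from pip package name to import name is an assumption (in particular, older PyMuPDF releases are imported as `fitz` rather than `pymupdf`), and `missing_packages` and `install_hint` are hypothetical helper names.

```python
import importlib.util

# Assumed mapping: pip package name -> importable module name.
REQUIRED = {
    "pymupdf4llm": "pymupdf4llm",
    "pdfplumber": "pdfplumber",
    "pymupdf": "pymupdf",
    "requests": "requests",
    "pyyaml": "yaml",
}


def missing_packages(required=REQUIRED):
    """Return pip names whose module cannot be located (nothing is imported)."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


def install_hint(missing):
    """Build the command to show the user; this never runs pip itself."""
    if not missing:
        return ""
    return "python -m pip install " + " ".join(missing)


if __name__ == "__main__":
    hint = install_hint(missing_packages())
    print(hint or "All paper2code dependencies found.")
```

The sketch only prints the command, which keeps the decision to install with the user.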
Run the acquisition script first so it can determine the PAPER_KEY. Use a
temporary directory of the form:
.paper2code_work/{PAPER_KEY}/
All intermediate artifacts live there. The final generated project goes in the
current directory under {paper_slug}/.
Read and follow: pipeline/01_paper_acquisition.md
Run:
python "$SKILL_DIR/scripts/fetch_paper.py" "$PAPER_SOURCE" ".paper2code_work"
The script should:
.paper2code_work/{PAPER_KEY}/paper_text.md, paper_metadata.json, and paper.pdf when a PDF exists locally
Then run structure extraction:
python "$SKILL_DIR/scripts/extract_structure.py" \
".paper2code_work/{PAPER_KEY}/paper_text.md" \
".paper2code_work/{PAPER_KEY}"
Stop here if acquisition artifacts are missing. Do not continue into codegen with partial paper text unless the pipeline document explicitly allows it.
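The stop condition can be expressed as a small pre-flight check. This sketch assumes that `paper_text.md` and `paper_metadata.json` are the required artifacts and that `paper.pdf` is optional, since it is only written when a PDF exists locally; `missing_artifacts` is a hypothetical helper, not part of the bundled scripts.

```python
from pathlib import Path

# Assumption: these two files are mandatory; paper.pdf is treated as optional.
REQUIRED_ARTIFACTS = ("paper_text.md", "paper_metadata.json")


def missing_artifacts(work_root, paper_key):
    """List required acquisition artifacts that are absent from the work directory."""
    work_dir = Path(work_root) / paper_key
    return [name for name in REQUIRED_ARTIFACTS if not (work_dir / name).is_file()]


def require_acquisition(work_root, paper_key):
    """Raise before code generation if any required artifact is missing."""
    missing = missing_artifacts(work_root, paper_key)
    if missing:
        raise FileNotFoundError(
            f"acquisition incomplete for {paper_key}: missing {', '.join(missing)}"
        )
```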
Read and follow: pipeline/02_contribution_identification.md
Write:
.paper2code_work/{PAPER_KEY}/contribution.md
The output must isolate the single core contribution that will define scope.
Read and follow: pipeline/03_ambiguity_audit.md
Before that stage, also read:
guardrails/hallucination_prevention.md
Write:
.paper2code_work/{PAPER_KEY}/ambiguity_audit.md
If the ambiguity audit says a detail is unspecified, preserve that uncertainty. Do not "fill the gap" with a confident guess.
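One possible way to keep that uncertainty visible in generated code is to record where each implementation choice comes from. The `Choice` record and its field names below are hypothetical, and the sample values are placeholders rather than details from any real paper.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Choice:
    """One implementation decision and where it comes from."""
    name: str
    value: Any
    citation: Optional[str]  # e.g. a section reference; None means the paper is silent

    @property
    def unspecified(self) -> bool:
        return self.citation is None


def count_unspecified(choices):
    """Number of choices the paper does not specify."""
    return sum(1 for c in choices if c.unspecified)


# Placeholder values for illustration only.
choices = [
    Choice("hidden_dim", 512, citation="Section 3.1"),
    Choice("weight_init", "framework default", citation=None),  # UNSPECIFIED in paper
]
print(count_unspecified(choices))
```

Keeping the gap as an explicit `None` citation means an unspecified detail is counted and reported instead of being presented as if the paper stated it.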
Read and follow: pipeline/04_code_generation.md
Before writing code, also read:
guardrails/scope_enforcement.md
guardrails/badly_written_papers.md
knowledge/
scaffolds/
Generate the project under {paper_slug}/ in the current working directory.
Read and follow: pipeline/05_walkthrough_notebook.md
Generate:
{paper_slug}/notebooks/walkthrough.ipynb
minimal
full
educational: minimal, but add extra teaching comments and a richer walkthrough notebook
Remove .paper2code_work/ only after successful completion. If the workflow fails midstream, keep the work directory so the user can inspect artifacts.
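The cleanup rule can be sketched as a single guarded call. `finish_run` is a hypothetical helper name; the only behavior taken from the text is that the work directory is removed on success and kept on failure.

```python
import shutil
from pathlib import Path


def finish_run(work_root=".paper2code_work", succeeded=False):
    """Delete the work directory only after a successful run.

    Returns True if the directory was removed, False if it was kept
    (or never existed), so the caller can mention it in the summary.
    """
    work_root = Path(work_root)
    if succeeded and work_root.is_dir():
        shutil.rmtree(work_root)
        return True
    return False
```

Defaulting `succeeded` to `False` means a caller that forgets to pass the flag keeps the artifacts rather than deleting them.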
Print:
✓ paper2code complete for: {paper_title}
Source kind: {source_kind}
Output directory: {paper_slug}/
Files generated: {list of files}
Unspecified choices: {count} (see REPRODUCTION_NOTES.md)
Mode: {MODE} | Framework: {FRAMEWORK}
If the run stopped early, replace the success marker with a failure summary that names the exact failing stage and the artifact or dependency that blocked it.
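A small formatter for this summary might look like the following sketch. The success lines mirror the template above; the wording of the failure summary (`✗ paper2code stopped at stage: ...`) is an assumption, since the text only requires that it name the failing stage and the blocking artifact or dependency.

```python
def format_summary(title, source_kind, slug, files, unspecified, mode, framework):
    """Render the success summary using the template fields."""
    return "\n".join([
        f"✓ paper2code complete for: {title}",
        f"Source kind: {source_kind}",
        f"Output directory: {slug}/",
        f"Files generated: {', '.join(files)}",
        f"Unspecified choices: {unspecified} (see REPRODUCTION_NOTES.md)",
        f"Mode: {mode} | Framework: {framework}",
    ])


def format_failure(stage, blocker):
    """Render a failure summary; this wording is hypothetical."""
    return f"✗ paper2code stopped at stage: {stage}\nBlocked by: {blocker}"
```

For example, `format_failure("01_paper_acquisition", "paper_text.md missing")` names both the stage and the artifact that blocked the run.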
Always apply:
guardrails/hallucination_prevention.md
guardrails/scope_enforcement.md
guardrails/badly_written_papers.md
Consult the relevant knowledge files before implementing:
knowledge/transformer_components.md
knowledge/training_recipes.md
knowledge/loss_functions.md
knowledge/paper_to_code_mistakes.md