skills/auto-paper-improvement-loop/SKILL.md
Run multi-round review-implement-recompile improvement cycles on a paper draft. Use when a draft needs iterative writing quality passes with reviewer independence (fresh context per review round), edit-whitelist gating, and crash-resumable state. Distinct from paper-reviewer-simulator (report only) and paper-draft-consistency-editor (single pass).
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add a-green-hand-jack/ml-research-skills auto-paper-improvement-loop
Run controlled, multi-round review → implement → recompile cycles on a paper draft. Each review round uses a fresh context to prevent confirmation bias; an edit-whitelist gates what may be changed; state is checkpointed after each round so sessions can resume.
Use this skill when:
Do not use this skill as a substitute for real reviewer feedback — use paper-reviewer-simulator first to identify structural risks. Do not use this skill to make decisions about experimental results — use result-diagnosis or research-results-auditor before running improvement loops.
Pair this skill with:
- paper-reviewer-simulator before the first loop round to identify high-priority issues
- paper-draft-consistency-editor for a single targeted pass when full multi-round iteration is not needed
- paper-writing-assistant when a round's review flags sections that need substantial rewriting
- submit-paper after the final round to verify submission readiness
<installed-skill-dir>/
├── SKILL.md
└── templates/
    └── improvement-log.md
- Read templates/improvement-log.md before starting a new loop.
- Read paper-writing-assistant/references/edit-whitelist-contract.md to select or customize the edit whitelist preset for this loop.
- Read paper/.agent/writing-contract.md when it exists to understand protected invariants.
- Read paper/.agent/PAPER_IMPROVEMENT_STATE.json when resuming an interrupted loop.
Reviewer independence is non-negotiable. A reviewer that continues from the writer's session context produces inflated scores. Each review sub-task must start with no memory of prior rounds or the author's intentions, only the paper text.
Edit-whitelist prevents scope creep. A writing-quality pass should not silently introduce new claims, new citations, or new numerical values. Declare what is frozen before the loop starts.
Two rounds is usually enough. Round 1 catches the most obvious issues. Round 2 catches what round 1's fixes introduced. A third round rarely finds genuinely new problems and risks over-polishing.
Checkpoint after every round. Multi-round loops over long documents take time. Write state after each completed round.
Decide before starting:
Rounds: 2 (default) | 1 (quick) | 3 (high-stakes submission)
Mode: writing | theory | format | full
Edit whitelist — FROZEN (may not be changed):
- [ ] Theorem/lemma/proof bodies
- [ ] Any numerical result values
- [ ] Citation keys and reference list
- [ ] Section structure and ordering
Edit whitelist — ALLOWED:
- [ ] Prose rewording for clarity and flow
- [ ] Paragraph restructuring within sections
- [ ] Caption rewording
- [ ] Transition sentences
- [ ] Notation consistency fixes
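The whitelist above can be enforced mechanically once each proposed fix carries a category label. The following is a minimal sketch, not part of the skill itself: the short category names, the `ProposedFix` type, and the `gate_fix` helper are all hypothetical, and how a fix gets its category (manual triage or a classifier) is left open.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical short names for the checklist entries above.
FROZEN = {"theorems", "numerics", "citations", "structure"}
ALLOWED = {"prose", "paragraphs", "captions", "transitions", "notation"}


@dataclass
class ProposedFix:
    description: str
    category: str  # assumed to be assigned while triaging the review output


def gate_fix(fix: ProposedFix, frozen: set[str] = FROZEN) -> tuple[bool, str]:
    """Return (accepted, reason); reason is empty when the fix is accepted."""
    if fix.category in frozen:
        return False, f"touches frozen category {fix.category}"
    if fix.category not in ALLOWED:
        # Unknown categories are rejected rather than silently allowed.
        return False, f"unrecognized category {fix.category}"
    return True, ""
```

Rejecting unknown categories by default keeps the gate conservative: a fix has to be explicitly allowed before it reaches the source.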
Save the configuration and a snapshot of the current PDF (or .tex hash) as the baseline.
Create paper/.agent/PAPER_IMPROVEMENT_STATE.json:
{
  "loop_id": "<paper-dir>-<YYYY-MM-DD>",
  "rounds_planned": 2,
  "rounds_completed": 0,
  "mode": "writing",
  "edit_whitelist_frozen": ["theorems", "numerics", "citations"],
  "baseline_tex_hash": "<sha256>",
  "round_summaries": [],
  "status": "in-progress"
}
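One way to produce this file is sketched below. It assumes the baseline hash covers every .tex file under the paper directory and that `loop_id` uses the directory name plus today's date; the function names `hash_tex_sources` and `init_state` are hypothetical, and the skill may compute the hash differently.

```python
from __future__ import annotations

import hashlib
import json
from datetime import date
from pathlib import Path


def hash_tex_sources(paper_dir: Path) -> str:
    """SHA-256 over all .tex files, visited in sorted order for determinism."""
    digest = hashlib.sha256()
    for tex in sorted(paper_dir.rglob("*.tex")):
        # Include the relative path so renames also change the hash.
        digest.update(tex.relative_to(paper_dir).as_posix().encode("utf-8"))
        digest.update(tex.read_bytes())
    return digest.hexdigest()


def init_state(paper_dir: Path, rounds: int = 2, mode: str = "writing",
               frozen: tuple[str, ...] = ("theorems", "numerics", "citations")) -> Path:
    """Write the initial PAPER_IMPROVEMENT_STATE.json and return its path."""
    state = {
        "loop_id": f"{paper_dir.resolve().name}-{date.today().isoformat()}",
        "rounds_planned": rounds,
        "rounds_completed": 0,
        "mode": mode,
        "edit_whitelist_frozen": list(frozen),
        "baseline_tex_hash": hash_tex_sources(paper_dir),
        "round_summaries": [],
        "status": "in-progress",
    }
    state_path = paper_dir / ".agent" / "PAPER_IMPROVEMENT_STATE.json"
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    return state_path
```

Calling `init_state(Path("paper"))` from the project root would write the file at paper/.agent/PAPER_IMPROVEMENT_STATE.json.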
For each round:
Prepare a self-contained review prompt that includes only the paper text — no prior review history, no author intent, no session context.
Run the review as an isolated task using sidecar-task-runner with a fresh Codex session (codex exec --ephemeral), or explicitly start a new Claude session with no continuity. Never continue from the current agent session to run the review.
The review prompt should ask for:
Save the review output to paper/.agent/sidecars/improvement-round-<N>/output.md.
For each fix from the review:
REJECTED: [fix description] — touches frozen category [category]
.tex source.
When mode includes theory:
This check catches accidental divergence introduced by prose edits near theorem environments.
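A divergence check of this kind can be approximated by fingerprinting theorem-like environments before and after a round. This is a sketch under assumptions: the environment names in `THEOREM_ENVS` are guesses to be matched against the paper's preamble, starred and nested environments are not handled, and `changed_environments` is a hypothetical helper rather than something the skill provides.

```python
from __future__ import annotations

import hashlib
import re

# Assumed environment names; adjust to the paper's \newtheorem declarations.
THEOREM_ENVS = ("theorem", "lemma", "proposition", "corollary", "proof")

_ENV_RE = re.compile(
    r"\\begin\{(" + "|".join(THEOREM_ENVS) + r")\}(.*?)\\end\{\1\}",
    re.DOTALL,
)


def theorem_fingerprints(tex: str) -> list[tuple[str, str]]:
    """Return (environment name, hash of whitespace-normalized body) in document order."""
    fingerprints = []
    for match in _ENV_RE.finditer(tex):
        body = " ".join(match.group(2).split())  # ignore pure reflow changes
        fingerprints.append((match.group(1), hashlib.sha256(body.encode("utf-8")).hexdigest()))
    return fingerprints


def changed_environments(before: str, after: str) -> list[int]:
    """Indices of theorem-like environments whose bodies differ between two versions."""
    old, new = theorem_fingerprints(before), theorem_fingerprints(after)
    if len(old) != len(new):
        raise ValueError("number of theorem-like environments changed")
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
```

Whitespace is normalized before hashing so that line rewrapping alone does not trigger a false alarm, while any token-level change inside a frozen body does.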
After each round, update PAPER_IMPROVEMENT_STATE.json:
{
  "rounds_completed": <N>,
  "round_summaries": [
    {
      "round": 1,
      "review_output": "paper/.agent/sidecars/improvement-round-1/output.md",
      "fixes_implemented": <count>,
      "fixes_rejected": <count>,
      "recompile_status": "success"
    }
  ]
}
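The per-round update can be sketched as a small helper that appends a summary and rewrites the state file. The `record_round` name and its signature are hypothetical; the write-to-temp-then-replace step is an added precaution, assumed here so that a crash mid-write does not leave a truncated state file.

```python
from __future__ import annotations

import json
import os
from pathlib import Path


def record_round(state_path: Path, round_no: int, review_output: str,
                 fixes_implemented: int, fixes_rejected: int,
                 recompile_status: str) -> dict:
    """Append one round summary and checkpoint the state file."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    state["round_summaries"].append({
        "round": round_no,
        "review_output": review_output,
        "fixes_implemented": fixes_implemented,
        "fixes_rejected": fixes_rejected,
        "recompile_status": recompile_status,
    })
    state["rounds_completed"] = round_no

    # Write to a sibling temp file, then swap it in with a single rename.
    tmp_path = state_path.with_name(state_path.name + ".tmp")
    tmp_path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    os.replace(tmp_path, state_path)
    return state
```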
Write a human-readable log using templates/improvement-log.md.
After all rounds:
- \label{} keys
- \ref{} to undefined labels
- Overfull \hbox lines > 10pt
- submit-paper: run final submission preflight after the loop
- paper-reviewer-simulator: run a fresh simulation if structural issues were found during the loop
- paper-writing-assistant: draft new content for sections flagged as needing substantial work
- Set status: "complete" in PAPER_IMPROVEMENT_STATE.json
If the loop is interrupted:
- Read paper/.agent/PAPER_IMPROVEMENT_STATE.json to find rounds_completed.
- If recompile_status is not success for the last completed round, fix the compile error before continuing.
Before marking the loop complete:
- PAPER_IMPROVEMENT_STATE.json has status: "complete"
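The resume procedure for an interrupted loop can be sketched as a function that inspects the state file and reports what to do next. The returned action labels and the `next_action` name are hypothetical, and the sketch assumes the loop continues at round `rounds_completed + 1` once the last recorded round compiled successfully.

```python
from __future__ import annotations

import json
from pathlib import Path


def next_action(state_path: Path) -> str:
    """Decide how to resume a loop from its checkpointed state."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    if state.get("status") == "complete":
        return "done"

    summaries = state.get("round_summaries", [])
    if summaries and summaries[-1].get("recompile_status") != "success":
        # The last completed round left the paper in a non-compiling state.
        return "fix-compile"

    completed = state.get("rounds_completed", 0)
    if completed >= state.get("rounds_planned", 2):
        return "finalize"
    return f"run-round-{completed + 1}"
```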