skills/paper-write/SKILL.md
Draft LaTeX paper section by section from an outline. Use when user says "写论文", "write paper", "draft LaTeX", "开始写", or wants to generate LaTeX content from a paper plan.
Draft a LaTeX paper based on: $ARGUMENTS
- gpt-5.5: Model used via Codex MCP for section review. Must be an OpenAI model.
- ICLR: Default venue. Supported: ICLR, NeurIPS, ICML, CVPR (also ICCV/ECCV), ACL (also EMNLP/NAACL), AAAI, ACM (ACM MM, SIGIR, KDD, CHI, etc.), IEEE_JOURNAL (IEEE Transactions / Letters, e.g., T-PAMI, JSAC, TWC, TCOM, TSP, TIP), IEEE_CONF (IEEE conferences, e.g., ICC, GLOBECOM, INFOCOM, ICASSP). Determines style file and formatting.
- false for camera-ready. Note: most IEEE venues do NOT use anonymous submission, so set false for IEEE.
- false to use legacy behavior (LLM search + [VERIFY] markers).
- /paper-plan)
- figures/ (from /paper-figure)
- figures/latex_includes.tex (from /paper-figure)
- .bib file, or will create one

If no PAPER_PLAN.md exists, ask the user to run /paper-plan first or provide a brief outline.
Keep the existing insleep workflow, file layout, and defaults. Use the shared references below only when they improve writing quality:
- ../shared-references/writing-principles.md before drafting the Abstract, Introduction, Related Work, or when prose feels generic.
- ../shared-references/venue-checklists.md during the final write-up and submission-readiness pass.
- ../shared-references/citation-discipline.md only when the built-in DBLP/CrossRef workflow is insufficient.

These references are support material, not extra workflow phases.
— style-ref: <source>, opt-in)

Lets the user steer structural style (section ordering, theorem density, sentence cadence, figure density, bibliography style) toward a reference paper. Default OFF: when the user does not pass — style-ref, do nothing differently from before.
Only when — style-ref: <source> appears in $ARGUMENTS, run the helper FIRST, before drafting:
# Resolve $STYLE_HELPER via the canonical strict-safe chain (see
# shared-references/integration-contract.md §2). Policy A — gate:
# unresolved helper means --style-ref cannot be satisfied, so abort.
cd "$(git rev-parse --show-toplevel 2>/dev/null || pwd)" || exit 1
if [ -z "${ARIS_REPO:-}" ] && [ -f .aris/installed-skills.txt ]; then
    ARIS_REPO=$(awk -F'\t' '$1=="repo_root"{print $2; exit}' .aris/installed-skills.txt 2>/dev/null) || true
fi
STYLE_HELPER=".aris/tools/extract_paper_style.py"
[ -f "$STYLE_HELPER" ] || STYLE_HELPER="tools/extract_paper_style.py"
[ -f "$STYLE_HELPER" ] || { [ -n "${ARIS_REPO:-}" ] && STYLE_HELPER="$ARIS_REPO/tools/extract_paper_style.py"; }
[ -f "$STYLE_HELPER" ] || {
    echo "ERROR: extract_paper_style.py not resolved at .aris/tools/, tools/, or \$ARIS_REPO/tools/." >&2
    echo " Fix: rerun bash tools/install_aris.sh, export ARIS_REPO, or copy the helper to tools/." >&2
    echo " --style-ref cannot be satisfied; aborting." >&2
    exit 1
}
STYLE_STATUS=0
CACHE=$(python3 "$STYLE_HELPER" --source "<source>") || STYLE_STATUS=$?
case "$STYLE_STATUS" in
    0) ;; # use $CACHE/style_profile.md as structural guidance
    2) echo "warning: style-ref skipped (missing optional dep)" >&2 ;;
    3) echo "error: --style-ref source failed; aborting draft" >&2 ; exit 1 ;;
    *) echo "error: helper failed unexpectedly; aborting draft" >&2 ; exit 1 ;;
esac
Sources accepted: local TeX dir / file, local PDF, arXiv id (2501.12345 or arxiv:2501.12345), http(s) URL. Overleaf URLs and project IDs are rejected — clone via /overleaf-sync setup <id> first and pass the local clone path.
Strict rules (full contract in tools/extract_paper_style.py docstring):
- Use style_profile.md as structural guidance only. Match section count, section ordering tendency, theorem-environment density, caption-length distribution, sentence cadence, math display ratio, citation style.
- Do not pass — style-ref (or the cache contents) to reviewer / auditor sub-agents. Cross-model review independence (../shared-references/reviewer-independence.md) requires that reviewers see only the artifact and the user's prompt, not the author's stylistic context.

<!-- DATA_NEEDED --> markers (when GAP_REPORT.md exists)

If /paper-plan ran with — style-ref, it will have emitted GAP_REPORT.md alongside PAPER_PLAN.md. This file lists structural slots (ablation tables, scaling experiments, failure-case analyses, …) that the exemplar implies but the user has no evidence to fill.
When GAP_REPORT.md is present and a section slot is classified as status: missing:
Do not fabricate numerical results, figure references, or qualitative claims to fill that slot.
Emit an HTML-comment placeholder at the exact location the missing content would go:
<!-- DATA_NEEDED: GAP_S5_ABLATION — ablation table comparing X across the 3 axes implied by exemplar -->
Slot ID and one-line description come straight from GAP_REPORT.md. Never invent Slot IDs. Never reword the description to be more confident than the report.
The marker is intentionally an HTML comment so it is invisible in the rendered PDF but searchable via grep -r "DATA_NEEDED" paper/sections/ for human triage / /experiment-bridge follow-up.
For status: partial, write what the user has and emit <!-- DATA_NEEDED: <Slot ID> — <what specifically is short> --> at the gap point in the same paragraph (do not split the section).
Carve-out from "no placeholder" rule. The default /paper-write discipline (no placeholders such as "see supplementary" or "TBD") still applies for everything except GAP_REPORT-listed missing slots. The marker is the principled way to surface genuine evidence deficits without compromising claim integrity.
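For triage, the markers can also be collected programmatically. The following is a minimal sketch, not part of the skill itself: the function names `find_markers` and `scan_sections` are hypothetical, and the default `paper/sections` path assumes the file layout described later in this document.

```python
import re
from pathlib import Path

# Matches <!-- DATA_NEEDED: <Slot ID> — <description> -->.
# Assumption: the separator is an em dash or a plain hyphen.
MARKER = re.compile(r"<!--\s*DATA_NEEDED:\s*(\S+)\s*[—-]\s*(.*?)\s*-->", re.DOTALL)


def find_markers(text):
    """Return (slot_id, description) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in MARKER.finditer(text)]


def scan_sections(root="paper/sections"):
    """Map each .tex file under root to the markers it contains."""
    found = {}
    for path in sorted(Path(root).glob("*.tex")):
        markers = find_markers(path.read_text(errors="ignore"))
        if markers:
            found[path.name] = markers
    return found
```

Calling `scan_sections()` from the project root would return a dictionary keyed by section file name, which can be compared against the Slot IDs listed in GAP_REPORT.md.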
Original idea: @zhangpelf in #217.
The skill includes conference templates in templates/. Select based on TARGET_VENUE:
ICLR:
\documentclass{article}
\usepackage{iclr2026_conference,times}
% \iclrfinalcopy % Uncomment for camera-ready
NeurIPS:
\documentclass{article}
\usepackage[preprint]{neurips_2025}
% \usepackage[final]{neurips_2025} % Camera-ready
ICML:
\documentclass{article}
\usepackage{icml2025}
% \usepackage[accepted]{icml2025} % Camera-ready
IEEE Journal (Transactions, Letters):
\documentclass[journal]{IEEEtran}
\usepackage{cite} % IEEE uses \cite{}, NOT natbib
% Author block uses \author{Name~\IEEEmembership{Member,~IEEE}}
IEEE Conference (ICC, GLOBECOM, INFOCOM, ICASSP, etc.):
\documentclass[conference]{IEEEtran}
\usepackage{cite} % IEEE uses \cite{}, NOT natbib
% Author block uses \IEEEauthorblockN / \IEEEauthorblockA
Generate this file structure:
paper/
├── main.tex # master file (includes sections)
├── iclr2026_conference.sty # or neurips_2025.sty / icml2025.sty / IEEEtran.cls + IEEEtran.bst
├── math_commands.tex # shared math macros
├── references.bib # bibliography (filtered — only cited entries)
├── sections/
│ ├── 0_abstract.tex
│ ├── 1_introduction.tex
│ ├── 2_related_work.tex
│ ├── 3_method.tex # or preliminaries, setup, etc.
│ ├── 4_experiments.tex
│ ├── 5_conclusion.tex
│ └── A_appendix.tex # proof details, extra experiments
└── figures/ # symlink or copy from project figures/
Section files are FLEXIBLE: If the paper plan has 6-8 sections, create corresponding files (e.g., 4_theory.tex, 5_experiments.tex, 6_analysis.tex, 7_conclusion.tex).
If paper/ already exists, back up to paper-backup-{timestamp}/ before overwriting. Never silently destroy existing work.
CRITICAL: Clean stale files. When changing section structure (e.g., 5 sections → 7 sections), delete section files that are no longer referenced by main.tex. Stale files (e.g., old 5_conclusion.tex left behind when conclusion moved to 7_conclusion.tex) cause confusion and waste space.
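One way to detect stale section files is to compare what main.tex actually pulls in against what sits in sections/. This is a rough sketch under stated assumptions: the helper names are hypothetical, sections are included with `\input` or `\include`, and the check only reports candidates rather than deleting anything.

```python
import re
from pathlib import Path

INPUT_RE = re.compile(r"\\(?:input|include)\{([^}]+)\}")


def _base(name):
    """File name without a trailing .tex extension."""
    return name[:-4] if name.endswith(".tex") else name


def referenced_sections(main_tex):
    """Return base names of files that main.tex inputs or includes."""
    names = set()
    for line in main_tex.splitlines():
        code = re.sub(r"(?<!\\)%.*", "", line)  # ignore commented-out inputs
        for target in INPUT_RE.findall(code):
            names.add(_base(Path(target.strip()).name))
    return names


def stale_sections(paper_dir="paper"):
    """List section files that main.tex no longer references."""
    root = Path(paper_dir)
    used = referenced_sections((root / "main.tex").read_text(errors="ignore"))
    return sorted(
        p.name for p in (root / "sections").glob("*.tex") if _base(p.name) not in used
    )
```

Reviewing the returned list before deleting keeps the "never silently destroy existing work" rule intact.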
- paper/ directory
- templates/: the template already includes:
  - \crefname{assumption} fix
- math_commands.tex with paper-specific notation

Author block (anonymous mode):
\author{Anonymous Authors}
Create shared math macros based on the paper's notation:
% math_commands.tex — shared notation
\newcommand{\R}{\mathbb{R}}
\newcommand{\E}{\mathbb{E}}
\DeclareMathOperator*{\argmin}{arg\,min}
\DeclareMathOperator*{\argmax}{arg\,max}
% Add paper-specific notation here
Process sections in order. For each section:
- If GAP_REPORT.md exists and the section has slots with status: missing, emit <!-- DATA_NEEDED: <Slot ID> — <description> --> at those points instead of inventing data; see the DATA_NEEDED markers subsection above.
- figures/latex_includes.tex
- \citep{} / \citet{} (natbib). For IEEE venues: use \cite{} (numeric style via cite package). Never mix natbib and cite commands.

Before drafting the front matter, re-read the one-sentence contribution from PAPER_PLAN.md. The Abstract and Introduction should make that takeaway obvious before the reader reaches the full method.
§0 Abstract:
- ../shared-references/writing-principles.md: what, why hard, how, evidence, strongest result
- \begin{abstract}: that's in main.tex

§1 Introduction:

§2 Related Work:
- \paragraph{Category Name.}

§3 Method / Preliminaries / Setup:
- \begin{definition}, \begin{theorem} environments for formal statements
- algorithm2e or algorithmic)

§4 Experiments:

§5 Conclusion:

Appendix:
Run this pass after drafting all sections and before building the bibliography.
Trigger heuristic: treat the paper as theory-heavy if PAPER_PLAN.md labels it as theory/analysis, or if the drafted sections contain 5 or more formal result environments (\begin{theorem}, \begin{lemma}, \begin{proposition}, \begin{corollary}).
Proof source search: search the workspace for any standalone full-proof source file whose name or contents indicate a canonical proof version (proof, appendix, full, complete, supplement, supplementary). If such a file exists, prompt the user exactly:
Inline full proofs from {file}? [Y/n]
Default to Y.
If the user accepts:
- A_appendix.tex or the appendix file named by the plan)

If no standalone full-proof source exists:

Restatement audit:
- stationary vs terminal, changed assumption names, or missing case splits as mismatches unless explicitly documented

Empirical motivation: in a real theory-paper run, the default behavior generated "see supplementary proof document" placeholders in the appendix. The author had to manually pull hundreds of lines of full proofs from a standalone proofs file (e.g. proof_full.tex). Without this pass, theory papers ship with sketch-only appendices that fail at theory venues.
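The trigger heuristic for this pass (a theory label in the plan, or five or more formal result environments) could be checked roughly as follows. This is a sketch only: `count_result_envs` and `is_theory_heavy` are hypothetical names, matching starred variants such as `theorem*` is an added assumption, and how the plan label is read from PAPER_PLAN.md is left to the caller.

```python
import re

RESULT_ENVS = ("theorem", "lemma", "proposition", "corollary")
# Assumption: starred environments (e.g. theorem*) count as formal results too.
ENV_RE = re.compile(r"\\begin\{(?:" + "|".join(RESULT_ENVS) + r")\*?\}")


def count_result_envs(tex):
    """Count formal result environments in the drafted LaTeX."""
    return len(ENV_RE.findall(tex))


def is_theory_heavy(tex, plan_labels_theory=False, threshold=5):
    """Apply the heuristic: plan label OR at least `threshold` result environments."""
    return plan_labels_theory or count_result_envs(tex) >= threshold
```

Passing the concatenated text of all section files as `tex` keeps the count independent of how the paper is split across files.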
CRITICAL: Only include entries that are actually cited in the paper.
- \citep{}/\citet{} for ML conferences, \cite{} for IEEE venues)
- .bib files in the project/narrative docs
- [VERIFY] comment
- references.bib containing ONLY cited entries (no bloat)

Three-step fallback chain (zero install, zero auth, all real BibTeX):
Step A: DBLP (best quality — full venue, pages, editors)
# 1. Search by title + first author
curl -s "https://dblp.org/search/publ/api?q=TITLE+AUTHOR&format=json&h=3"
# 2. Extract DBLP key from result (e.g., conf/nips/VaswaniSPUJGKP17)
# 3. Fetch real BibTeX
curl -s "https://dblp.org/rec/{key}.bib"
Step B: CrossRef DOI (fallback — works for arXiv preprints)
# If paper has a DOI or arXiv ID (arXiv DOI = 10.48550/arXiv.{id})
curl -sLH "Accept: application/x-bibtex" "https://doi.org/{doi}"
Step C: Mark [VERIFY] (last resort)
If both DBLP and CrossRef return nothing, mark the entry with % [VERIFY] comment. Do NOT fabricate.
Why this matters: LLM-generated BibTeX frequently hallucinates venue names, page numbers, or even co-authors. DBLP and CrossRef return publisher-verified metadata. Upstream skills (/research-lit, /novelty-check) may mention papers from LLM memory — this fetch chain is the gate that prevents hallucinated citations from entering the final .bib.
If the DBLP/CrossRef flow is not enough, load ../shared-references/citation-discipline.md for stricter fallback rules before adding placeholders.
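The request shapes used in Steps A and B can be expressed with the Python standard library alone. The sketch below only mirrors the curl commands above; the function names are hypothetical, and error handling and response parsing are deliberately left out.

```python
import urllib.parse
import urllib.request


def dblp_search_url(title, first_author="", hits=3):
    """Step A.1: DBLP publication search by title plus first author."""
    query = urllib.parse.quote_plus(f"{title} {first_author}".strip())
    return f"https://dblp.org/search/publ/api?q={query}&format=json&h={hits}"


def dblp_bib_url(key):
    """Step A.3: BibTeX record for a DBLP key such as conf/nips/VaswaniSPUJGKP17."""
    return f"https://dblp.org/rec/{key}.bib"


def arxiv_doi(arxiv_id):
    """Step B: arXiv DOI = 10.48550/arXiv.{id}."""
    return f"10.48550/arXiv.{arxiv_id}"


def doi_bibtex_request(doi):
    """Step B: DOI content negotiation asking for BibTeX."""
    return urllib.request.Request(
        f"https://doi.org/{doi}", headers={"Accept": "application/x-bibtex"}
    )


def fetch_text(url_or_request, timeout=20):
    """Perform the request; network failures propagate to the caller."""
    with urllib.request.urlopen(url_or_request, timeout=timeout) as resp:
        return resp.read().decode("utf-8", "ignore")
```

If both lookups come back empty, the entry still falls through to Step C and gets a `% [VERIFY]` comment rather than invented metadata.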
Automated bib cleaning — use this Python pattern to extract only cited entries:
import re
# 1. Grep all \citep{...}, \citet{...}, and \cite{...} from all .tex files
# 2. Extract unique keys (handle multi-cite like \citep{a,b,c} or \cite{a,b,c})
# 3. Parse the full .bib file, keep only entries whose key is in the cited set
# 4. Write the filtered bib
This prevents bib bloat (e.g., 948 lines → 215 lines in testing).
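The four commented steps can be fleshed out as below. This is a minimal, dependency-free sketch rather than the skill's actual implementation: the function names are hypothetical, entries are split by simple brace matching, and unusual .bib constructs (for example entries delimited by parentheses) are not handled.

```python
import re

# \cite, \citep, \citet, ... with optional starred form and optional [..] arguments
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")


def cited_keys(tex):
    """Steps 1-2: collect unique keys, splitting multi-cites like \\citep{a,b,c}."""
    keys = set()
    for group in CITE_RE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def split_entries(bib):
    """Step 3a: return (key, raw_entry) pairs found by brace matching."""
    entries = []
    for m in ENTRY_RE.finditer(bib):
        if m.group(1).lower() in ("comment", "string", "preamble"):
            continue
        depth = 0
        for i in range(m.start(), len(bib)):
            if bib[i] == "{":
                depth += 1
            elif bib[i] == "}":
                depth -= 1
                if depth == 0:
                    entries.append((m.group(2), bib[m.start():i + 1]))
                    break
    return entries


def filter_bib(bib, tex):
    """Steps 3b-4: keep only entries whose key is cited; return the filtered text."""
    keep = cited_keys(tex)
    return "\n\n".join(raw for key, raw in split_entries(bib) if key in keep) + "\n"
```

In practice `tex` would be the concatenation of main.tex and every file in sections/, and the result of `filter_bib` would be written to references.bib.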
Enforced Bib Hygiene Validation — run immediately after the filtered references.bib is written.
python3 - <<'PY'
import io, json, re, sys, urllib.parse, urllib.request
from pathlib import Path

try:
    import bibtexparser
except ImportError:
    sys.exit("Missing dependency: pip install bibtexparser")

ROOT = Path("paper")
tex_paths = [ROOT / "main.tex", *sorted((ROOT / "sections").glob("*.tex"))]
tex = "\n".join(p.read_text(errors="ignore") for p in tex_paths if p.exists())

# Collect every key used in a \cite...{} command.
cited = set()
for m in re.finditer(r'\\cite[a-zA-Z]*\{([^}]*)\}', tex):
    cited.update(k.strip() for k in m.group(1).split(',') if k.strip())

with (ROOT / "references.bib").open() as fh:
    bib = bibtexparser.load(fh)
entries = {e["ID"]: e for e in bib.entries}

# Entries present in the .bib but never cited.
dead = sorted(set(entries) - cited)
if dead:
    print("DEAD ENTRIES:")
    for key in dead:
        print(" ", key)

def norm(s):
    return re.sub(r'[^a-z0-9]+', ' ', (s or '').lower()).strip()

def dblp_hits(title):
    q = urllib.parse.quote(title)
    url = f"https://dblp.org/search/publ/api?q={q}&format=json&h=3"
    with urllib.request.urlopen(url, timeout=20) as r:
        data = json.load(r)
    return [h.get("info", {}) for h in data.get("result", {}).get("hits", {}).get("hit", [])]

def crossref_entry(doi):
    req = urllib.request.Request(f"https://doi.org/{doi}", headers={"Accept": "application/x-bibtex"})
    with urllib.request.urlopen(req, timeout=20) as r:
        parsed = bibtexparser.loads(r.read().decode("utf-8", "ignore"))
    return parsed.entries[0] if parsed.entries else {}

# Compare each cited entry against DBLP, falling back to CrossRef by DOI.
for key in sorted(cited & set(entries)):
    e = entries[key]
    title = e.get("title", "").strip("{}")
    hits = dblp_hits(title) if title else []
    hit = hits[0] if hits else None
    source = "DBLP"
    if hit is None and e.get("doi"):
        try:
            hit = crossref_entry(e["doi"])
            source = "CrossRef"
        except Exception:
            hit = None
    if hit is None:
        print(f"VERIFY {key}: no DBLP/CrossRef hit")
        continue
    issues = []
    year_a = str(e.get("year", "")).strip()
    year_b = str(hit.get("year", "")).strip()
    if year_a and year_b and year_a != year_b:
        issues.append(f"year {year_a} != {year_b}")
    venue_a = e.get("journal") or e.get("booktitle") or ""
    venue_b = hit.get("journal") or hit.get("booktitle") or hit.get("venue") or ""
    if norm(venue_a) and norm(venue_b) and norm(venue_a) != norm(venue_b):
        issues.append(f"venue {venue_a} != {venue_b}")
    authors_a = [norm(a) for a in re.split(r'\s+and\s+', e.get("author", "")) if a.strip()]
    authors_b = [norm(a) for a in re.split(r'\s+and\s+', hit.get("author", "")) if a.strip()]
    if authors_a and authors_b and authors_a[:2] != authors_b[:2]:
        issues.append("author list differs")
    if issues:
        print(f"MISMATCH {key} ({source}): " + "; ".join(issues))
PY
If DEAD ENTRIES is printed, remove those keys from references.bib before continuing.
If VERIFY or MISMATCH is printed, do not invent metadata:
- % [VERIFY] marker

Citation reachability rule: an entry is dead if its key does not appear in any \cite...{} command in paper/main.tex or any paper/sections/*.tex file.
Empirical motivation: in a real submission run, several dead bib entries sat in references.bib for many improvement rounds, and at least one entry had a key/year mismatch. Neither was flagged by the existing automated cleaning.
Citation verification rules (from claude-scholar + Imbad0202):
- {firstauthor}{year}{keyword} (e.g., ho2020denoising)

After drafting all sections, run five sequential audit passes. Based on Sainani's "Writing in the Sciences" methodology: every word must earn its place.
Pass 1: Clutter Extraction — Strip sentences to cleanest components.
| Cluttered phrase | Replace with |
|------------------|--------------|
| Due to the fact that | Because |
| In order to | To |
| A number of | Several |
| It is worth noting that | (delete; just state the point) |
| It is important to note that | (delete) |
| At the present time | Now |
| On the basis of | Based on |
| In light of the fact that | Because |
| Have an effect on | Affect |
| Give rise to | Cause |
Also remove redundancies: "completely eliminate" → "eliminate", "future plans" → "plans", "unexpected surprise" → "surprise".
Remove AI-isms: delve, pivotal, landscape, tapestry, underscore, noteworthy, intriguingly.
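A first sweep for Pass 1 can be automated before the manual read. The sketch below flags the cluttered phrases and AI-isms listed above by line number; `flag_clutter` is a hypothetical helper, and it reports candidates only, since a word such as "landscape" can be legitimate technical vocabulary.

```python
import re

# Phrase -> suggested replacement; an empty string means "delete".
CLUTTER = {
    "due to the fact that": "because",
    "in order to": "to",
    "a number of": "several",
    "it is worth noting that": "",
    "it is important to note that": "",
    "at the present time": "now",
    "on the basis of": "based on",
    "in light of the fact that": "because",
    "have an effect on": "affect",
    "give rise to": "cause",
}
AI_ISMS = ("delve", "pivotal", "landscape", "tapestry", "underscore", "noteworthy", "intriguingly")


def flag_clutter(text):
    """Return (line_number, phrase, suggestion) tuples for human review."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase, fix in CLUTTER.items():
            if phrase in lowered:
                findings.append((lineno, phrase, fix))
        for word in AI_ISMS:
            # \w* also catches inflections such as "delves" or "underscores"
            if re.search(rf"\b{word}\w*", lowered):
                findings.append((lineno, word, ""))
    return findings
```

Running it over each section file gives a short worklist; the actual rewrite still needs judgment about whether a flagged phrase carries meaning in context.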
Pass 2: Active Voice and Verb Vitality — Identify who did what.
Passive voice IS acceptable for: established facts, methods where agent is irrelevant, or when required by venue style.
Pass 3: Sentence Architecture — Structure and flow.
Pass 4: Keyword Consistency — The Banana Rule.
Do not call a "banana" an "elongated yellow fruit" to avoid repetition. If the Methods say "obese group," the Results must not switch to "heavier group." Synonym variation for technical terms forces the reader to wonder whether a new category has been introduced.
Pass 5: Numerical and Citation Integrity
Send the complete draft to GPT-5.5 xhigh:
mcp__codex__codex:
  model: gpt-5.5
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    Review this [VENUE] paper draft (main body, excluding appendix).
    Focus on:
    1. Does each claim from the intro have supporting evidence?
    2. Is the writing clear, concise, and free of AI-isms?
    3. Any logical gaps or unclear explanations?
    4. Does it fit within [MAX_PAGES] pages (to end of Conclusion)?
    5. Is related work sufficiently comprehensive (≥1 page)?
    6. For theory papers: are proof sketches adequate?
    7. Are figures/tables clearly described and properly referenced?
    8. Would a skim reader understand the contribution from the title, abstract, introduction, and Figure 1?
    For each issue, specify: severity (CRITICAL/MAJOR/MINOR), location, and fix.
    [paste full draft text]
Apply CRITICAL and MAJOR fixes. Document MINOR issues for the user.
After drafting all sections:
Before declaring done:
- \ref{} and \label{} match (no undefined references)
- \citep{}/\citet{} for ML conferences, \cite{} for IEEE) have corresponding BibTeX entries
- [VERIFY] markers left unchecked
- sections/ is \inputed by main.tex
- \input paths are consistent
- ../shared-references/venue-checklists.md if needed)
- cat << 'EOF' > file) to write in chunks. Do NOT ask the user for permission; just do it silently.
- latexmk without errors (modulo missing figures)
- natbib (\citep/\citet); IEEE venues use cite package (\cite{}, numeric). Never mix.
- \cited
- paper/ directory without backing up
- ../shared-references/writing-principles.md: story framing, abstract/introduction patterns, sentence-level clarity, reviewer reading order
- ../shared-references/venue-checklists.md: ICLR/NeurIPS/ICML/IEEE submission requirements to check before declaring done
- ../shared-references/citation-discipline.md: stricter fallback for ambiguous citations

Keep using the reverse-outline test and anti-inflation polish from the main workflow above; the shared references are there to improve quality without adding a new phase.
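The \ref{} / \label{} item in the checklist above can be pre-checked without compiling. This is a rough sketch with a hypothetical helper name; the set of reference commands it recognises (\ref, \eqref, \cref, \Cref, \autoref) is an assumption, and a latexmk run remains the authoritative check.

```python
import re

LABEL_RE = re.compile(r"\\label\{([^}]+)\}")
REF_RE = re.compile(r"\\(?:ref|eqref|cref|Cref|autoref)\{([^}]+)\}")


def undefined_refs(tex):
    """Return sorted reference keys that have no matching label in the given LaTeX."""
    labels = set(LABEL_RE.findall(tex))
    refs = set()
    for group in REF_RE.findall(tex):
        # cleveref-style commands accept comma-separated keys
        refs.update(k.strip() for k in group.split(",") if k.strip())
    return sorted(refs - labels)
```

Feeding it the concatenated text of main.tex and all section files avoids false positives for labels defined in a different file from the reference.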
Writing methodology adapted from Research-Paper-Writing-Skills (CCF award-winning methodology). Citation verification from claude-scholar and Imbad0202/academic-research-skills. This hybrid pack's writing-guidance overlay is adapted from Orchestra Research's paper-writing materials.