skills/reference-corpus-analyzer/SKILL.md
Produce a multi-paper comparison matrix across a literature corpus with tiered read depth. Use when multiple papers need to be compared side-by-side for method differences, performance gaps, closest-work ranking, or trend identification — distinct from per-paper source cards (reference-reading-summarizer) and single-paper project linking (reference-project-synthesizer).
npx skillsauth add a-green-hand-jack/ml-research-skills reference-corpus-analyzer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Synthesize a literature corpus into a structured comparison matrix. This skill answers: across these N papers, who does what, how do they differ, and where is the open space?
Use this skill when:
Do not use this skill to create per-paper source cards — use reference-reading-summarizer for that. Do not use this skill to link a paper to your project's claims — use reference-project-synthesizer for that. Use this skill after source cards exist; avoid re-reading raw PDFs unless a card is insufficient.
Pair this skill with:
- reference-reading-summarizer upstream: produce source cards before running corpus analysis
- reference-project-synthesizer upstream or downstream: link individual papers to project memory before or after comparison
- related-work-positioning-writer downstream: use the comparison matrix to write novelty-boundary paragraphs
- baseline-selection-audit downstream: use the ranking to identify must-have baselines
- literature-review-sprint when the corpus is not yet assembled and a broader topic survey is needed first

<installed-skill-dir>/
├── SKILL.md
└── templates/
    └── comparison-matrix.md
- reference/cards/ to find available source cards before reading raw sources.
- reference/.agent/source-index.md or reference/.agent/reference-index.md to get the corpus inventory.
- memory/claim-board.md when the comparison should be anchored to specific project claims.
- templates/comparison-matrix.md before writing the output matrix.

Read reference/.agent/source-index.md (or reference-index.md) to list available sources.
For each source, record:
- has-card, no-card, partial-card
- core, related, background, tangential

If cards are missing for sources that appear highly relevant, route to reference-reading-summarizer first.
Assign read depth to each source:
| Tier | Sources | Read depth |
|---|---|---|
| Deep | Top 3–5 closest by task + method overlap | Full source card; re-read raw source if card is insufficient |
| Standard | Next 5–10 related works | Source card only |
| Survey | Remaining background papers | Title + abstract + card summary |
Criterion for "closest": same task, same claim type, overlapping method family, or shared benchmark.
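The tier assignment above can be sketched in Python. This is a minimal illustration, not part of the skill: the `Source` record, its four boolean criterion fields, the count-based `closeness` score, and the default cutoffs (5 deep, 10 standard) are all assumptions chosen to fit the ranges in the table.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical record for one corpus entry (not defined by the skill)."""
    title: str
    same_task: bool = False
    same_claim_type: bool = False
    overlapping_method: bool = False
    shared_benchmark: bool = False

    def closeness(self) -> int:
        # Assumed scoring: count how many "closest" criteria the source meets.
        return sum([self.same_task, self.same_claim_type,
                    self.overlapping_method, self.shared_benchmark])


def assign_tiers(sources, n_deep=5, n_standard=10):
    """Rank sources by closeness and split them into Deep / Standard / Survey."""
    ranked = sorted(sources, key=lambda s: s.closeness(), reverse=True)
    tiers = {}
    for i, src in enumerate(ranked):
        if i < n_deep:
            tiers[src.title] = "Deep"
        elif i < n_deep + n_standard:
            tiers[src.title] = "Standard"
        else:
            tiers[src.title] = "Survey"
    return tiers


# Toy corpus with small cutoffs so every tier appears.
corpus = [
    Source("Paper A", same_task=True, overlapping_method=True, shared_benchmark=True),
    Source("Paper B", same_task=True),
    Source("Paper C"),
]
tiers = assign_tiers(corpus, n_deep=1, n_standard=1)
print(tiers)
```

An unweighted count is the simplest possible score; a real run would more likely weigh task and method overlap above a shared benchmark, in line with the Deep row of the table.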
Read templates/comparison-matrix.md.
Dimensions to compare (select those relevant to the project):
For each tier-1 (deep) paper, also add:
Save to reference/corpus-analysis-<date>.md.
Produce a ranked list of the top-5 closest papers with:
Rank: 1
Paper: [title] ([venue year])
Overlap: task=high / method=medium / claim=high
Closest claim: [their specific claim that overlaps ours]
Differentiator: [one sentence: how we differ]
Novelty risk: high / medium / low
Reviewer action: cite as closest work / cite as baseline / cite as background
A paper with novelty risk: high and method=high overlap is the paper whose related-work paragraph needs the clearest boundary statement.
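As a rough sketch of how the ranking could be computed from per-paper overlap ratings, the Python below sorts entries and flags the paper that needs the clearest boundary statement. The `RankEntry` class, the numeric mapping in `LEVEL`, and the unweighted sum are assumptions made for illustration; the skill itself does not prescribe a scoring formula.

```python
from dataclasses import dataclass

# Assumed numeric mapping for the high / medium / low ratings.
LEVEL = {"low": 1, "medium": 2, "high": 3}


@dataclass
class RankEntry:
    """Hypothetical container for one row of the closest-work ranking."""
    paper: str
    task: str
    method: str
    claim: str
    novelty_risk: str

    def overlap_score(self) -> int:
        # Assumed: unweighted sum of the three overlap levels.
        return LEVEL[self.task] + LEVEL[self.method] + LEVEL[self.claim]

    def needs_boundary_statement(self) -> bool:
        # Mirrors the rule above: high novelty risk plus high method overlap.
        return self.novelty_risk == "high" and self.method == "high"


def rank_closest(entries, top_k=5):
    """Return the top_k entries by descending overlap score."""
    ranked = sorted(entries, key=lambda e: e.overlap_score(), reverse=True)
    return ranked[:top_k]


entries = [
    RankEntry("Paper A", task="high", method="medium", claim="high", novelty_risk="medium"),
    RankEntry("Paper B", task="high", method="high", claim="high", novelty_risk="high"),
    RankEntry("Paper C", task="low", method="low", claim="medium", novelty_risk="low"),
]
top = rank_closest(entries)
for rank, e in enumerate(top, start=1):
    flag = " (needs clearest boundary statement)" if e.needs_boundary_statement() else ""
    print(f"Rank: {rank}  Paper: {e.paper}  "
          f"Overlap: task={e.task} / method={e.method} / claim={e.claim}{flag}")
```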
Gaps: what combinations of (task, method, constraint, benchmark) are not yet addressed by any paper in the corpus?
Gap: no paper addresses [X] under [constraint Y] with [method family Z]
Evidence: papers A, B, C address X but not under Y; papers D, E address Y but not X
Opportunity: our work fills this gap by [brief description]
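One way to surface candidate gaps is to enumerate every combination of values observed in the corpus and subtract the combinations some paper already covers. The sketch below assumes a hypothetical `coverage` mapping with three dimensions (task, method family, constraint) and placeholder values; it only generates candidates, and each one still needs a judgment call on whether the combination is meaningful.

```python
from itertools import product

# Hypothetical coverage records: (task, method family, constraint) per paper.
coverage = {
    "Paper A": ("X", "Z", "none"),
    "Paper B": ("X", "W", "none"),
    "Paper D": ("Q", "Z", "Y"),
}


def find_gaps(coverage):
    """Return combinations of observed values that no paper covers."""
    covered = set(coverage.values())
    # Collect the distinct values seen along each dimension.
    axes = [sorted(set(column)) for column in zip(*covered)]
    return [combo for combo in product(*axes) if combo not in covered]


gaps = find_gaps(coverage)
for task, method, constraint in gaps:
    print(f"Gap: no paper addresses [{task}] under [{constraint}] with [{method}]")
```

Adding a fourth element such as benchmark to each tuple works with the same `find_gaps` function; only the unpacking in the final loop would need to change.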
Trends (optional, for survey mode):
- reference/.agent/source-index.md with read-depth assignments
- memory/risk-board.md for any high-novelty-risk closest-work findings
- memory/claim-board.md if the comparison changes the novelty framing of a claim
- reference/: do not copy it into memory/

Before finishing:
- related-work-positioning-writer