mathews-tom/citation-audit

Name: citation-audit
Author: mathews-tom

skills/citation-audit/SKILL.md

npx skillsauth add mathews-tom/armory citation-audit

Citation Audit Skill

Purpose

Verify every citation in a manuscript against its actual source. LLMs hallucinate citations, invent arXiv IDs, misattribute findings, and confuse authors. This skill catches all of that by fetching and reading each cited work.

Why This Exists

LLMs are unreliable with citations in three distinct ways:

Ghost papers — The paper does not exist. Title, authors, or venue are fabricated.
Wrong metadata — The paper exists but the bib entry has the wrong arXiv ID, wrong authors, wrong year, or wrong venue.
Inverted claims — The paper exists and the bib is correct, but the manuscript mischaracterizes what the paper says.

All three are invisible to structural audits (cross-reference checks, compilation tests). They require reading the actual cited work.

Inputs

The manuscript .tex file(s)
The .bib file
Web access (to fetch papers from arXiv, conference sites, URLs)

Execution

Phase 1: Extract citation contexts

For each \citep{}, \citet{}, \cite{} in the manuscript:

Record the bib key
Record the surrounding sentence or paragraph (the claim context)
Classify the claim type:
- FACTUAL: "X et al. found Y" / "X et al. measured Y"
- METHODOLOGICAL: "We follow X" / "We use the benchmark from X"
- POSITIONAL: "Unlike X, we..." / "X does not measure..."
- PARENTHETICAL: "(X, 2024)" — no specific claim, just a reference
For FACTUAL and POSITIONAL claims, extract the specific assertion the manuscript makes about the cited work
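The extraction step can be sketched in Python as follows. This is a minimal illustration rather than the skill's own implementation: the function name `extract_citations`, the record layout, and the use of the citing line as a stand-in for the claim context are all assumptions. Classifying the claim type still requires reading each context.

```python
import re

# Matches \cite, \citep and \citet, with an optional star and optional [..] arguments.
CITE_RE = re.compile(r"\\cite[pt]?\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def extract_citations(tex: str) -> list[dict]:
    """Return one record per (bib key, citing line) pair found in LaTeX source."""
    records = []
    for line_no, line in enumerate(tex.splitlines(), start=1):
        for match in CITE_RE.finditer(line):
            # One command may cite several keys, e.g. \citep{a,b}.
            for key in match.group(1).split(","):
                key = key.strip()
                if key:
                    records.append(
                        {"key": key, "line": line_no, "context": line.strip()}
                    )
    return records


if __name__ == "__main__":
    sample = (
        "Smith et al. found Y \\citep{smith2024}.\n"
        "We follow \\cite[Sec. 2]{jones2023, lee2022}."
    )
    for record in extract_citations(sample):
        print(record)
```

A line-based context keeps the sketch short. A fuller version would widen the context to the enclosing sentence or paragraph, as the step above asks.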

Phase 2: Verify bib entry metadata

For each bib entry, verify against the actual source:

For arXiv papers (eprint field present):

Fetch https://arxiv.org/abs/{eprint_id}
Compare: title, authors, year
If the fetched paper has a DIFFERENT title/authors than the bib entry, this is a WRONG ID or GHOST PAPER

For conference/journal papers (booktitle or journal field):

Search for the paper by title + author on the web
Verify: venue, year, author list
If the paper cannot be found at the stated venue, flag as UNVERIFIABLE or GHOST PAPER

For web resources (howpublished with URL):

Fetch the URL
Verify it loads and the content matches the described resource
If the URL is dead or redirects to unrelated content, flag as DEAD LINK

For each entry, check:

[ ] Paper exists (reachable via arXiv, DOI, URL, or web search)
[ ] Title matches (exact or near-exact)
[ ] Authors match (at least first author correct)
[ ] Year matches
[ ] Venue matches (if applicable)
[ ] Entry type appropriate (@inproceedings for conferences, @article for journals, @misc for preprints/blogs)
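Part of this checklist can be expressed as a small comparison routine. The sketch below is hedged: it assumes the bib entry and the fetched record have already been parsed into dictionaries with `title`, `authors` (a list of "First Last" strings), and `year` keys. The 0.9 similarity threshold for a "near-exact" title is an arbitrary illustrative choice, not a value taken from the skill.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class MetadataCheck:
    title: bool
    first_author: bool
    year: bool


def _norm(text: str) -> str:
    """Lowercase, drop punctuation such as BibTeX braces, collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def _surname(name: str) -> str:
    """Last whitespace-separated token of a normalized 'First Last' name."""
    parts = _norm(name).split()
    return parts[-1] if parts else ""


def check_metadata(bib: dict, actual: dict, title_threshold: float = 0.9) -> MetadataCheck:
    """Compare a parsed bib entry against the record fetched from the source."""
    ratio = SequenceMatcher(None, _norm(bib["title"]), _norm(actual["title"])).ratio()
    bib_first = _surname(bib["authors"][0]) if bib["authors"] else ""
    actual_first = _surname(actual["authors"][0]) if actual["authors"] else ""
    return MetadataCheck(
        title=ratio >= title_threshold,
        first_author=bool(bib_first) and bib_first == actual_first,
        year=str(bib["year"]) == str(actual["year"]),
    )
```

The entries passed in must come from an actual fetch. The routine only compares two records and cannot establish that the paper exists.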

Phase 3: Verify claim accuracy

For each FACTUAL or POSITIONAL claim:

Read the cited paper (abstract + relevant sections at minimum)
Compare the manuscript's claim against what the paper actually says
Classify:
- ACCURATE — The claim faithfully represents the cited work
- INACCURATE — The claim mischaracterizes the cited work
- INVERTED — The claim says the opposite of what the paper found
- OVERCLAIMED — The claim is stronger than what the paper supports
- UNDERCLAIMED — The cited work supports a stronger claim than stated
- UNVERIFIABLE — Cannot access the paper to verify

For INACCURATE and INVERTED findings, provide:

What the manuscript claims
What the cited paper actually says
The specific section/page of the cited paper that contradicts the claim
A suggested correction

Phase 4: Check for missing citations

Scan the manuscript for:

Claims that cite no source but should (empirical claims without attribution)
Tools, benchmarks, or datasets mentioned by name without citation
Methods described as "standard" or "well-known" that have a canonical citation
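A rough heuristic for the third item, offered only as a sketch: flag lines that call something "standard" or "well-known" but contain no citation command. The function name and the line-level granularity are assumptions. The output is a list of candidates for review, not a verdict.

```python
import re

HINT_RE = re.compile(r"\b(standard|well-known)\b", re.IGNORECASE)
CITE_RE = re.compile(r"\\cite[pt]?\*?(?:\[[^\]]*\])*\{[^}]*\}")


def flag_uncited_hints(tex: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that use a hint word but cite nothing."""
    flagged = []
    for line_no, line in enumerate(tex.splitlines(), start=1):
        if HINT_RE.search(line) and not CITE_RE.search(line):
            flagged.append((line_no, line.strip()))
    return flagged
```

The first two items in the list (unattributed empirical claims and uncited tools or datasets) need the manuscript read in context and are not covered by this pattern match.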

Output Format

Per-citation report

### [bib_key] — [VERDICT]

**Bib entry:** [title] by [authors] ([year])
**Actual paper:** [actual title] by [actual authors] ([actual year])
**Metadata match:** title [✓/✗] | authors [✓/✗] | year [✓/✗] | venue [✓/✗]

**Claim in manuscript (line N):** "[exact text]"
**What the paper actually says:** "[summary of actual finding]"
**Claim accuracy:** [ACCURATE / INACCURATE / INVERTED / OVERCLAIMED / UNDERCLAIMED / UNVERIFIABLE]

**Fix required:** [description of what needs to change, or "None"]

Summary table

| Bib Key | Exists | Metadata | Claim | Verdict |
|---------|--------|----------|-------|---------|
| key1    | ✓      | ✓        | ✓     | PASS    |
| key2    | ✓      | ✗        | ✗     | FAIL    |
| key3    | ✗      | —        | —     | GHOST   |

Verdict categories

PASS — Paper exists, metadata correct, claims accurate
METADATA — Paper exists, bib entry has errors (wrong ID, wrong authors, wrong year)
CLAIM — Paper exists, metadata correct, but manuscript mischaracterizes it
GHOST — Paper does not exist as described
DEAD — URL/link is broken
UNVERIFIABLE — Cannot access the paper to verify
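One way to derive a single verdict from the individual checks is sketched below. The source does not say which category wins when several checks fail at once, so the precedence used here (UNVERIFIABLE, then GHOST, DEAD, METADATA, CLAIM) is an assumption. So are the function and parameter names.

```python
def verdict(
    *,
    accessible: bool = True,
    exists: bool = True,
    link_ok: bool = True,
    metadata_ok: bool = True,
    claim_ok: bool = True,
) -> str:
    """Map check results to one verdict category (assumed precedence order)."""
    if not accessible:
        return "UNVERIFIABLE"  # the source could not be read at all
    if not exists:
        return "GHOST"
    if not link_ok:
        return "DEAD"
    if not metadata_ok:
        return "METADATA"
    if not claim_ok:
        return "CLAIM"
    return "PASS"
```

Because only the first failing check is reported, an entry with both metadata and claim problems would show as METADATA here. A report that needs both could return a list of categories instead.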

Severity

CRITICAL: GHOST papers, INVERTED claims
HIGH: Wrong arXiv IDs, wrong authors, INACCURATE claims
MEDIUM: Wrong year, wrong venue, OVERCLAIMED
LOW: Missing citations, incomplete bib entries, UNDERCLAIMED
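The severity levels above can be encoded as a lookup so that findings are reported most-severe first. This is an illustrative sketch: the finding labels such as `WRONG_ARXIV_ID` and the `kind` field are hypothetical names chosen for the example, not identifiers defined by the skill.

```python
SEVERITY = {
    "GHOST": "CRITICAL",
    "INVERTED": "CRITICAL",
    "WRONG_ARXIV_ID": "HIGH",
    "WRONG_AUTHORS": "HIGH",
    "INACCURATE": "HIGH",
    "WRONG_YEAR": "MEDIUM",
    "WRONG_VENUE": "MEDIUM",
    "OVERCLAIMED": "MEDIUM",
    "MISSING_CITATION": "LOW",
    "INCOMPLETE_BIB": "LOW",
    "UNDERCLAIMED": "LOW",
}
ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


def sort_by_severity(findings: list[dict]) -> list[dict]:
    """Return findings ordered from CRITICAL to LOW; ties keep input order."""
    return sorted(findings, key=lambda f: ORDER.index(SEVERITY[f["kind"]]))
```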

Important notes

NEVER trust your own knowledge of papers. ALWAYS fetch and verify. Your training data contains hallucinated citations. The only way to verify is to read the actual source.
For arXiv papers, always fetch the abstract page to confirm the paper exists and matches.
For conference papers, search DBLP, ACM DL, or the conference site.
WebFetch and WebSearch are your primary tools. Do not skip verification because a citation "looks right."
Blog posts and documentation URLs change. Always check that the URL still works and points to the described content.
When a bib entry has both an eprint (arXiv ID) and a booktitle (venue), verify both independently.

Phase 5: Fix (when invoked with "fix" or "on")

When the user invokes with an argument containing "fix" or "on", execute Phases 1–4 as above, then apply fixes for every non-PASS citation.
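A small sketch of how the mode switch might be detected. It assumes whole-word matching, which is an interpretation rather than something the text specifies. A plain substring test for "on" would also fire on words such as "citation", so the example splits the argument into tokens first. The function name is hypothetical.

```python
def fix_requested(argument: str) -> bool:
    """Return True if the invocation argument contains the token 'fix' or 'on'."""
    tokens = argument.lower().split()
    return any(token in {"fix", "on"} for token in tokens)
```

With this reading, `fix_requested("fix")` and `fix_requested("audit on")` enable Phase 5, while `fix_requested("citation audit")` does not.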

What to auto-fix (no user confirmation needed)

These are mechanical corrections with a single correct answer:

METADATA errors (paper exists, bib entry wrong):

Wrong arXiv ID → replace eprint with the correct ID
Wrong authors → replace with authors from the actual paper
Wrong year → replace with year from the actual paper
Wrong title → replace with title from the actual paper
Wrong venue → replace with venue from the actual paper
Wrong entry type → change to @inproceedings, @article, or @misc as appropriate

DEAD links:

URL redirects → update howpublished URL to the final destination
URL 404 but resource found at different URL → update URL
URL 404 and resource gone → flag as HUMAN-REQUIRED

Minor author corrections:

Misspelled author names → fix spelling
Missing authors from author list → add them
Collective author name where individual names are available → replace (keep collective name as a note if it is how the group identifies)

What requires a HUMAN-REQUIRED decision

Present these and wait for the user:

GHOST papers:

Paper does not exist at all → present options:
- (a) Replace with a real paper that makes the same point
- (b) Remove the citation and adjust the prose
- (c) The user knows the paper exists and provides the correct reference

INVERTED or INACCURATE claims:

The manuscript says X about a paper that actually says Y → present:
- What the manuscript claims
- What the paper actually says
- A suggested rewrite of the prose that accurately represents the paper
- Whether the paper still supports the manuscript's argument (and how)
- Let the user decide the final wording

Dead URLs with no replacement found:

Blog post / resource deleted with no archive or alternative
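The split between mechanical fixes and decisions that wait for the user can be sketched as a triage step. The finding labels below are hypothetical names for the cases listed above. Sending any unrecognized kind to the human queue is a conservative assumption of this example.

```python
AUTO_FIX = {
    "WRONG_ARXIV_ID", "WRONG_AUTHORS", "WRONG_YEAR", "WRONG_TITLE",
    "WRONG_VENUE", "WRONG_ENTRY_TYPE", "URL_REDIRECT", "URL_MOVED",
    "MISSPELLED_AUTHOR", "MISSING_AUTHOR", "COLLECTIVE_AUTHOR",
}


def triage(findings: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split findings into (auto-fixable, human-required) lists."""
    auto, human = [], []
    for finding in findings:
        if finding["kind"] in AUTO_FIX:
            auto.append(finding)
        else:
            # GHOST, INVERTED, INACCURATE, URL_GONE and anything unknown
            # are presented to the user rather than edited automatically.
            human.append(finding)
    return auto, human
```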

Fix procedure

Apply all auto-fixes to the .bib file
For each HUMAN-REQUIRED item, present the options clearly
After user decisions, apply prose changes to the .tex file
Verify: re-read the .bib and .tex to confirm all fixes applied
Update the audit report: mark each finding as [FIXED], [RESOLVED], or [DEFERRED]

Safety rules

NEVER invent a replacement citation. If a ghost paper needs replacing, search for real papers that make the cited point. Present candidates to the user with abstracts. Let the user choose.
NEVER change the manuscript's argument. If an inverted claim needs fixing, present the rewrite as a suggestion, not an edit.
NEVER remove a citation without user confirmation, even if it is a ghost paper. The user may know something you do not.
When fixing URLs, always verify the new URL loads and contains the expected content before writing it.

Save report as

[name]-citation-audit.md in the manuscript directory.
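As an illustration of the naming rule, the sketch below builds the report path next to the manuscript. It assumes that `[name]` means the manuscript file name without its extension. The text does not define `[name]`, so treat that as one plausible reading.

```python
from pathlib import Path


def report_path(manuscript: str) -> Path:
    """Return '<stem>-citation-audit.md' in the manuscript's own directory."""
    tex = Path(manuscript)
    return tex.with_name(f"{tex.stem}-citation-audit.md")
```

For a manuscript at `paper/main.tex`, this yields `paper/main-citation-audit.md`.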

Verify that every citation in a manuscript is real, correctly attributed, and accurately described. Detects ghost papers, wrong arXiv IDs, inverted claims, and dead links by fetching each cited work. Optional fix mode applies bib metadata corrections and surfaces prose rewrites for claim errors. Triggers on: "check my citations", "verify references", "citation audit", "are my references real", "check bib", "reference check", "bib audit", "citation verification". Companion to manuscript-review (Pass 5 hygiene); this skill audits factual truth.

221 stars

testing

Updated May 4, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add mathews-tom/armory citation-audit

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 4, 2026, 7:03 AM (292.6s, 1 file scanned)

SKILL.md

name: citation-audit
description: Verify that every citation in a manuscript is real, correctly attributed, and accurately described. Detects ghost papers, wrong arXiv IDs, inverted claims, and dead links by fetching each cited work. Optional fix mode applies bib metadata corrections and surfaces prose rewrites for claim errors. Triggers on: "check my citations", "verify references", "citation audit", "are my references real", "check bib", "reference check", "bib audit", "citation verification". Companion to manuscript-review (Pass 5 hygiene); this skill audits factual truth.
version: 1.0.0
complements: [manuscript-review, research-critique, literature-review]
category: review
tags: [citations, bibliography, fact-check, hallucination, references]
difficulty: advanced
phase: review

Install manually

# Clone the repo
git clone https://github.com/mathews-tom/armory.git

# Create the global Claude Code skills folder if it does not exist, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r armory/skills/citation-audit ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: mathews-tom/armory (221 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT