plugins/muna-technical-writer/skills/fact-checking/SKILL.md
Use when fact-checking a research paper or document - dual-verified web search with claim extraction, independent researcher and adversarial verifier, structured JSON output and human-readable exception report. Invoke explicitly via /fact-check; expensive operation.
This skill orchestrates a four-phase pipeline to fact-check research papers with maximum rigor. Every verifiable claim is extracted, then independently verified by two agents (a researcher and an adversarial verifier) using web search. Results are reconciled and output as structured JSON plus a human-readable exception report.
This is a deliberately expensive, token-heavy skill. Only invoke it when you need rigorous fact-checking, not speculatively.
`/fact-check <paths>`

This skill requires the following tools to be available:
If WebSearch or WebFetch are unavailable, this skill cannot function. Inform the user and stop.
The user provides one or more file paths to the paper:
`/fact-check path/to/paper.md`
`/fact-check path/to/section1.md path/to/section2.md`

All paths are read in order and treated as one continuous document.
Spawn a single agent to extract all verifiable claims from the paper. This agent reads but does NOT verify.
Use the Agent tool with these parameters:
Prompt the agent with:
You are a claim extractor. Read the following research paper files and extract every verifiable factual claim.
Files to read: [INSERT FILE PATHS]
For each claim, output a JSON object on its own line with these fields:
- `id`: sequential integer starting at 1
- `text`: the verbatim claim from the paper (quote exactly)
- `category`: one of `quantitative`, `citation`, `scientific`, `historical`, `definitional`
- `section`: the section heading or location where the claim appears
- `context`: 1-2 surrounding sentences for disambiguation

Extraction rules:
Output ONLY a JSON array of claim objects. No commentary, no markdown formatting, no explanation.
Parse the returned JSON array. Count the claims and report to the user:
"Extracted N claims from the paper. Organizing into batches of 8 for dual verification..."
If the JSON is malformed, retry the extraction once. If it fails again, report the error to the user and stop.
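The parse-and-retry step above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: `run_extractor` is a hypothetical callable standing in for the Agent tool call, and the per-field validation is an assumption added here (the skill itself only requires that the JSON be well-formed).

```python
import json

# Field and category names are taken from the extractor prompt above.
REQUIRED_FIELDS = {"id", "text", "category", "section", "context"}
CATEGORIES = {"quantitative", "citation", "scientific", "historical", "definitional"}


def parse_claims(raw: str) -> list:
    """Parse the extractor output; raise ValueError if it is not a valid claim array."""
    claims = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(claims, list):
        raise ValueError("expected a JSON array of claim objects")
    for claim in claims:
        if not isinstance(claim, dict):
            raise ValueError("each claim must be a JSON object")
        missing = REQUIRED_FIELDS - claim.keys()
        if missing:
            raise ValueError(f"claim is missing fields: {sorted(missing)}")
        if claim["category"] not in CATEGORIES:
            raise ValueError(f"unknown category: {claim['category']!r}")
    return claims


def extract_with_retry(run_extractor, max_attempts: int = 2) -> list:
    """Call the hypothetical run_extractor(), retrying once on malformed output."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_claims(run_extractor())
        except ValueError as exc:
            last_error = exc
    # Both attempts failed: surface the error so the caller can report it and stop.
    raise RuntimeError(f"claim extraction failed after {max_attempts} attempts: {last_error}")
```

With `max_attempts=2` the extractor runs at most twice, which matches the "retry once, then stop" rule.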
For each batch of 8 claims, spawn a researcher agent. Multiple batches can run in parallel.
Use the Agent tool with:
Prompt the agent with:
You are a research fact-checker. For each claim below, search the web for evidence that supports or refutes it.
CRITICAL RULES:
Claims to verify:
[INSERT JSON ARRAY OF CLAIMS FOR THIS BATCH]
For each claim, output a JSON object with:
- `id`: the claim's id (match the input)
- `verdict`: one of `verified`, `refuted`, `uncertain`
  - `verified`: found reliable source(s) confirming the claim
  - `refuted`: found reliable source(s) contradicting the claim
  - `uncertain`: could not find sufficient evidence either way
- `confidence`: one of `high`, `medium`, `low`
- `reasoning`: 1-3 sentences explaining how the evidence supports your verdict
- `evidence`: array of objects, each with:
  - `url`: the actual URL you visited
  - `quote`: a relevant excerpt from the page
  - `relevance`: why this source matters

Output ONLY a JSON array of result objects. No commentary.
For the same batches, spawn independent verifier agents. These run IN PARALLEL with the researchers and must never see the researcher's (Agent A's) results.
Use the Agent tool with:
Prompt the agent with:
You are an adversarial fact-checker. Your job is to attempt to DISPROVE each claim below. Search for counter-evidence, check if numbers are misquoted, verify that cited sources actually say what is attributed to them.
CRITICAL RULES:
- verified.
- uncertain; do NOT default to verified.

Claims to check:
[INSERT JSON ARRAY OF CLAIMS FOR THIS BATCH]
For each claim, output a JSON object with:
- `id`: the claim's id (match the input)
- `verdict`: one of `verified`, `refuted`, `uncertain`
- `confidence`: one of `high`, `medium`, `low`
- `reasoning`: 1-3 sentences explaining your adversarial assessment
- `evidence`: array of objects, each with:
  - `url`: the actual URL you visited
  - `quote`: a relevant excerpt from the page
  - `relevance`: why this source matters

Output ONLY a JSON array of result objects. No commentary.
Split the claims array into batches of 8. For N claims, create ceil(N/8) batches. Batch size of 8 balances agent context limits against per-batch overhead.
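As a small illustration of the batching arithmetic, the sketch below splits a claim list into slices of 8 and checks the count against ceil(N/8). The function name `split_into_batches` and the sample claims are hypothetical.

```python
import math

BATCH_SIZE = 8  # batch size stated by the skill


def split_into_batches(claims, batch_size=BATCH_SIZE):
    """Split claims into consecutive batches; the last batch may be smaller."""
    return [claims[i:i + batch_size] for i in range(0, len(claims), batch_size)]


claims = [{"id": i} for i in range(1, 21)]  # 20 placeholder claims
batches = split_into_batches(claims)
assert len(batches) == math.ceil(len(claims) / BATCH_SIZE)
print([len(batch) for batch in batches])  # prints [8, 8, 4]
```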
Launch up to 2 batch-pairs per message. For each batch-pair, spawn the Researcher and Verifier agents in the same message (parallel Agent tool calls):
Message 1:
Agent tool call 1: Researcher batch 1
Agent tool call 2: Verifier batch 1
Agent tool call 3: Researcher batch 2
Agent tool call 4: Verifier batch 2
After these agents return, report progress, then launch the next set of batch-pairs.
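The launch schedule can be sketched as a plan of which agent calls go into which message. This is only an illustration of the ordering; `plan_messages` is a hypothetical helper, and the real calls are made with the Agent tool.

```python
def plan_messages(num_batches, pairs_per_message=2):
    """Return, per message, the ordered agent calls as (role, batch_number) tuples."""
    messages = []
    for start in range(1, num_batches + 1, pairs_per_message):
        calls = []
        stop = min(start + pairs_per_message, num_batches + 1)
        for batch_number in range(start, stop):
            # Researcher and verifier for the same batch go in the same message.
            calls.append(("researcher", batch_number))
            calls.append(("verifier", batch_number))
        messages.append(calls)
    return messages


for index, calls in enumerate(plan_messages(3), start=1):
    print(f"Message {index}: {calls}")
```

For three batches this yields two messages: the first carries four calls (batches 1 and 2), the second carries the remaining pair for batch 3.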
As each agent returns, parse its JSON output and store the results keyed by claim ID. If an agent fails or returns malformed JSON, retry that specific agent once. If it fails again, mark all claims in that batch as uncertain with reasoning "agent failure — verification could not be completed".
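One way to picture the result handling is the sketch below. It rests on several assumptions: `run_agent` is a hypothetical callable that returns the agent's raw JSON text and raises `RuntimeError` on failure, and the fallback `confidence` of `low` and the empty `evidence` list are choices made here, since the skill only specifies the verdict and the reasoning string.

```python
import json

# "\u2014" is the dash character used in the skill's own reasoning string.
AGENT_FAILURE_REASONING = "agent failure \u2014 verification could not be completed"


def collect_batch(batch, run_agent, max_attempts=2):
    """Return {claim_id: result} for one agent (researcher or verifier) on one batch."""
    for _ in range(max_attempts):
        try:
            results = json.loads(run_agent(batch))
            return {result["id"]: result for result in results}
        except (ValueError, KeyError, TypeError, RuntimeError):
            continue  # malformed output or agent failure: try again if attempts remain
    # Both attempts failed: mark every claim in the batch as uncertain.
    return {
        claim["id"]: {
            "id": claim["id"],
            "verdict": "uncertain",
            "confidence": "low",  # assumption: not specified by the skill
            "reasoning": AGENT_FAILURE_REASONING,
            "evidence": [],  # assumption: no evidence is available after a failure
        }
        for claim in batch
    }
```

Keying by claim id makes the later reconciliation a simple lookup of the researcher and verifier result for the same id.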
After each message's agents return, report progress before launching the next set:
"Batches N-M of T complete — verified: X, refuted: Y, uncertain: Z so far..."
After ALL batches complete, reconcile the results. This runs in the main conversation — no subagent needed.
For each claim, compare the researcher verdict and verifier verdict:
| Researcher | Verifier | Final Verdict |
|-----------|----------|---------------|
| verified | verified | verified |
| refuted | refuted | refuted |
| uncertain | uncertain | uncertain |
| uncertain (agent failure) | any definitive | uncertain |
| any definitive | uncertain (agent failure) | uncertain |
| Any other mismatch | Any other mismatch | disputed |
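The table above can be read as a small decision function. The sketch below is one possible encoding; it assumes an agent-failure result is recognizable because its `reasoning` starts with "agent failure", which is an inference from the fallback strings used elsewhere in this skill.

```python
def is_agent_failure(result):
    """True for the placeholder result recorded when an agent failed twice."""
    return (
        result["verdict"] == "uncertain"
        and result.get("reasoning", "").startswith("agent failure")
    )


def reconcile(researcher, verifier):
    """Apply the reconciliation table to one claim's two results."""
    if researcher["verdict"] == verifier["verdict"]:
        return researcher["verdict"]  # verified, refuted, or uncertain on both sides
    if is_agent_failure(researcher) or is_agent_failure(verifier):
        return "uncertain"  # one side never actually ran
    return "disputed"  # any other mismatch


researcher = {"verdict": "verified", "reasoning": "primary source confirms"}
verifier = {"verdict": "uncertain", "reasoning": "no web sources found"}
print(reconcile(researcher, verifier))  # prints disputed
```

Note that a genuine `uncertain` from one agent against a definitive verdict from the other is a mismatch, so it reconciles to `disputed` rather than `uncertain`.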
For each claim, construct the final record:
{
"id": 1,
"text": "<verbatim claim>",
"category": "<category>",
"section": "<section>",
"context": "<context>",
"verdict": "<reconciled verdict>",
"research": {
"verdict": "<researcher verdict>",
"confidence": "<researcher confidence>",
"reasoning": "<researcher reasoning>",
"evidence": [{"url": "...", "quote": "...", "relevance": "..."}]
},
"verification": {
"verdict": "<verifier verdict>",
"confidence": "<verifier confidence>",
"reasoning": "<verifier reasoning>",
"evidence": [{"url": "...", "quote": "...", "relevance": "..."}]
}
}
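Assembling that record from its three inputs might look like the sketch below. `build_record` is a hypothetical helper; the field names come from the record shown above, and the reconciled verdict is passed in rather than computed here.

```python
CLAIM_FIELDS = ("id", "text", "category", "section", "context")
RESULT_FIELDS = ("verdict", "confidence", "reasoning", "evidence")


def build_record(claim, researcher, verifier, verdict):
    """Merge one extracted claim with both agents' results and the reconciled verdict."""
    record = {field: claim[field] for field in CLAIM_FIELDS}
    record["verdict"] = verdict
    # Copy only the documented result fields; the agents' own "id" is dropped.
    record["research"] = {field: researcher[field] for field in RESULT_FIELDS}
    record["verification"] = {field: verifier[field] for field in RESULT_FIELDS}
    return record
```

Building the dictionary in this order keeps the keys in the same sequence as the template, which makes the pretty-printed JSON easier to compare against it.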
Write fact-check-results.json to the same directory as the first input file. If the file already exists or the directory is not writable, ask the user for an alternative path before writing.
{
"metadata": {
"paper_files": ["<list of input file paths>"],
"total_claims": 142,
"verified": 118,
"refuted": 12,
"disputed": 8,
"uncertain": 4,
"timestamp": "<ISO 8601 timestamp>"
},
"claims": [
{
"id": 1,
"text": "The transformer architecture was introduced in 2017",
"category": "historical",
"section": "Section 2.1 - Background",
"context": "...surrounding text...",
"verdict": "verified",
"research": {
"verdict": "verified",
"confidence": "high",
"reasoning": "Multiple sources confirm Vaswani et al. published 'Attention Is All You Need' in June 2017",
"evidence": [
{
"url": "https://arxiv.org/abs/1706.03762",
"quote": "Submitted on 12 Jun 2017",
"relevance": "Primary source — the original paper"
}
]
},
"verification": {
"verdict": "verified",
"confidence": "high",
"reasoning": "Independently confirmed via multiple secondary sources",
"evidence": [
{
"url": "https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)",
"quote": "introduced in 2017 by a team at Google Brain",
"relevance": "Secondary source confirming date and attribution"
}
]
}
}
]
}
Use the Write tool to create this file. Pretty-print the JSON (2-space indent).
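The skill writes this file with the Write tool, so the sketch below is only meant to show how the metadata counts and the output path could be derived. `build_results` and `write_results` are hypothetical names, and raising `FileExistsError` is a stand-in for asking the user for an alternative path.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

VERDICTS = ("verified", "refuted", "disputed", "uncertain")


def build_results(paper_files, records):
    """Assemble the top-level results object with per-verdict counts."""
    counts = Counter(record["verdict"] for record in records)
    metadata = {"paper_files": list(paper_files), "total_claims": len(records)}
    metadata.update({verdict: counts.get(verdict, 0) for verdict in VERDICTS})
    metadata["timestamp"] = datetime.now(timezone.utc).isoformat()  # ISO 8601
    return {"metadata": metadata, "claims": records}


def write_results(paper_files, records, filename="fact-check-results.json"):
    """Write the results next to the first input file, refusing to overwrite."""
    target = Path(paper_files[0]).parent / filename
    if target.exists():
        # The skill asks the user for another path here; this sketch just raises.
        raise FileExistsError(f"{target} already exists")
    payload = json.dumps(build_results(paper_files, records), indent=2, ensure_ascii=False)
    target.write_text(payload, encoding="utf-8")
    return target
```

`indent=2` gives the 2-space pretty-printing, and `ensure_ascii=False` keeps quoted claim text readable when it contains non-ASCII characters.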
Write fact-check-exceptions.md to the same directory as the JSON results.
This report contains ONLY claims that are NOT verified. Verified claims are omitted entirely.
# Fact-Check Exception Report
**Paper**: <comma-separated input file paths>
**Date**: <YYYY-MM-DD>
**Summary**: <total> claims checked — <verified> verified, <refuted> refuted, <disputed> disputed, <uncertain> uncertain
---
## Refuted (<count>)
### Claim #<id> — <category> | <section>
> "<verbatim claim text>"
**Research**: <researcher verdict> (<researcher confidence>)
- <researcher reasoning> — [source](<url>)
**Verification**: <verifier verdict> (<verifier confidence>)
- <verifier reasoning> — [source](<url>)
---
## Disputed (<count>)
[Same format as Refuted, for each disputed claim]
---
## Uncertain (<count>)
[Same format as Refuted, for each uncertain claim]
Omit any section that has zero claims (e.g., if there are no disputed claims, omit the Disputed section entirely).
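The grouping and omission rule can be sketched as follows. This covers only the section headings and which claims fall under each; rendering the per-claim entries is left out, and `exception_sections` is a hypothetical helper name.

```python
SECTION_ORDER = ("refuted", "disputed", "uncertain")  # verified claims are never reported


def exception_sections(records):
    """Return (heading, claims) pairs in report order, skipping empty sections."""
    sections = []
    for verdict in SECTION_ORDER:
        matching = [record for record in records if record["verdict"] == verdict]
        if matching:
            sections.append((f"## {verdict.capitalize()} ({len(matching)})", matching))
    return sections


records = [
    {"id": 1, "verdict": "verified"},
    {"id": 2, "verdict": "refuted"},
    {"id": 3, "verdict": "uncertain"},
    {"id": 4, "verdict": "refuted"},
]
print([heading for heading, _ in exception_sections(records)])
# prints ['## Refuted (2)', '## Uncertain (1)']
```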
- uncertain with reasoning "agent failure".
- uncertain with reasoning "no web sources found". Agent must NOT fall back to training knowledge.