Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

oborchers/hallucination-prevention

Name: hallucination-prevention
Author: oborchers

deep-research/skills/hallucination-prevention/SKILL.md

npx skillsauth add oborchers/fractional-cto hallucination-prevention

Hallucination Prevention

Hallucination is the single most important engineering concern for research agents. A 5,000-word report with 100 claims at a 5% hallucination probability per claim has a 99.4% chance of containing at least one hallucinated claim, assuming the claims fail independently (1 - 0.95^100 ≈ 0.994). Even the best-performing models hallucinate at measurable rates: 0.7% for simple summarization, rising to 5-13% on harder tasks (Vectara Hallucination Leaderboard, 2025). In multi-agent systems, hallucinations compound across agent boundaries (OWASP ASI08).
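The 99.4% figure can be reproduced by treating each claim as an independent trial, which is a simplifying assumption rather than something the leaderboard data guarantees. A minimal sketch of the arithmetic (the function name is illustrative, not part of the skill):

```python
def prob_at_least_one_hallucination(num_claims: int, per_claim_rate: float) -> float:
    """P(at least one hallucinated claim), assuming claims fail independently."""
    return 1.0 - (1.0 - per_claim_rate) ** num_claims


# 100 claims at a 5% per-claim hallucination probability
print(f"{prob_at_least_one_hallucination(100, 0.05):.1%}")  # 99.4%
```

Because the per-claim survival probability is raised to the number of claims, longer reports are affected far more than the per-claim rate alone suggests.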

Hallucination Taxonomy

Seven distinct hallucination types, ordered by detection difficulty:

| Type | What Happens | Detection | Prevention |
|------|-------------|-----------|------------|
| Citation hallucination | Inventing papers, URLs, or authors that do not exist | Easy: verify URL/DOI exists | Never cite a source not actually retrieved and read |
| Temporal hallucination | Wrong dates or temporal ordering | Moderate: check against known timelines | Include dates from source text, not from memory |
| Factual fabrication | Entirely false statements presented as fact | Moderate: requires external lookup | Only state facts found in retrieved sources |
| Numerical hallucination | Fabricated statistics, percentages, counts | Hard: requires finding actual source | Copy numbers verbatim from source; never round or approximate without noting |
| Attribution hallucination | Real fact attributed to wrong source | Hard: requires cross-referencing | Track which source produced which claim |
| Negation hallucination | Reversing the polarity of a claim | Hard: requires careful reading | Quote or closely paraphrase source language |
| Conflation hallucination | Merging details from different sources into one false claim | Very hard: each component may be correct | Maintain per-source notes; do not blend findings until synthesis |

The Cardinal Rules

These rules are non-negotiable for all research output:

Never cite a source not actually retrieved. If WebFetch was not called on a URL, that URL cannot appear as a citation.
Copy numbers from sources verbatim. Do not round, approximate, or "recall" statistics. If the source says "83.7%", write "83.7%", not "approximately 84%".
Preserve qualifiers. If the source says "may reduce", do not write "reduces". If it says "in a limited study", include that context.
Track provenance per-claim. Every factual claim in a research document must trace back to a specific source. Orphaned claims (facts with no source) are hallucination candidates.
Flag uncertainty explicitly. When confidence is low, say "This could not be independently verified" rather than asserting or omitting.
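The first and fourth rules can be enforced mechanically by keeping a ledger of what was fetched and which source each claim came from. The sketch below is one possible shape for such a ledger; `Claim`, `ProvenanceLedger`, and the example URLs are hypothetical names chosen for illustration and are not defined by the skill.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Claim:
    text: str
    source_url: Optional[str] = None  # None means no provenance was recorded


@dataclass
class ProvenanceLedger:
    fetched_urls: set[str] = field(default_factory=set)
    claims: list[Claim] = field(default_factory=list)

    def record_fetch(self, url: str) -> None:
        """Call only after the page content was actually retrieved."""
        self.fetched_urls.add(url)

    def add_claim(self, text: str, source_url: Optional[str] = None) -> None:
        self.claims.append(Claim(text, source_url))

    def orphaned_claims(self) -> list[Claim]:
        """Claims with no source, or citing a URL that was never fetched."""
        return [c for c in self.claims if c.source_url not in self.fetched_urls]


ledger = ProvenanceLedger()
ledger.record_fetch("https://example.com/report")
ledger.add_claim("Adoption reached 83.7%", "https://example.com/report")
ledger.add_claim("The study covered 12 countries")  # no source recorded
ledger.add_claim("Costs may fall", "https://example.com/never-fetched")
print([c.text for c in ledger.orphaned_claims()])
```

Anything returned by `orphaned_claims()` is a hallucination candidate under the rules above and should be sourced, flagged, or dropped before it reaches the report.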

Circuit Breaker Patterns

Prevent hallucination cascading across agent boundaries:

Treat each agent's output as untrusted input. When a research-worker agent returns findings, the synthesizing agent must not assume those findings are correct. Cross-reference key claims against other workers' findings or against the original sources.

Structured error responses. When a search or fetch fails, return an explicit error — never let an agent fill in missing data from "memory." A structured gap ("No information found on X") is infinitely better than a fabricated answer.

Validation gates between pipeline stages. Before synthesis begins, verify:

Every cited URL was actually fetched
Numerical claims match the source text
Key facts appear in at least 2 independent sources (for critical claims)
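As a rough illustration of the second and third patterns, the sketch below pairs a structured fetch result with a gate that runs before synthesis. All names (`FetchResult`, `synthesis_gate`, `independent_source_count`) are hypothetical, and counting distinct hostnames is only a crude stand-in for source independence. Numerical matching is left out here to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class FetchResult:
    url: str
    ok: bool
    content: Optional[str] = None
    error: Optional[str] = None  # explicit gap, never a guessed answer


def independent_source_count(urls: list[str]) -> int:
    """Count distinct hostnames; a rough proxy for independence (assumption)."""
    return len({urlparse(u).hostname for u in urls})


def synthesis_gate(results: list[FetchResult],
                   critical_claims: dict[str, list[str]]) -> list[str]:
    """Return a list of problems; an empty list means synthesis may begin."""
    fetched = {r.url for r in results if r.ok}
    problems = [f"Fetch failed for {r.url}: {r.error}" for r in results if not r.ok]
    for claim, urls in critical_claims.items():
        for u in urls:
            if u not in fetched:
                problems.append(f"Cited URL not fetched: {u}")
        usable = [u for u in urls if u in fetched]
        if independent_source_count(usable) < 2:
            problems.append(f"Fewer than 2 independent sources: {claim}")
    return problems
```

Returning the problems as data, rather than raising on the first one, lets the synthesizer report every gap explicitly instead of silently working around it.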

Citation Verification Rules

For citation formatting and management (inline format, Sources section structure), see the synthesis-and-reporting skill. This section covers verification of citation accuracy.

Core principle, following Perplexity's rule: "You are not supposed to say anything that you didn't retrieve."

Verification checklist for each citation:

[ ] URL was actually fetched via WebFetch (not generated from memory)
[ ] The cited claim actually appears in the fetched content
[ ] Numbers match the source exactly
[ ] The source is attributed to the correct author/organization
[ ] Qualifiers from the source are preserved

Confidence Scoring

Assign confidence levels to findings and communicate them in the output:

| Level | Criteria | Report Language |
|-------|---------|----------------|
| High | Claim appears in 2+ independent T1-T3 sources with consistent numbers (see source-evaluation skill for tier definitions) | State directly with citations |
| Moderate | Claim appears in 1 T1-T3 source or 2+ T4-T5 sources | "According to [Source]..." |
| Low | Single T4+ source, or sources partially conflict | "One source reports... though this could not be independently verified" |
| Unverified | No retrieved source supports the claim | Do not include, or explicitly flag as unverified |

Propagation rule: When combining findings from multiple agents, confidence of the combined finding equals the lowest confidence of its component claims.
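One way to read the scoring table and the propagation rule in code is shown below. This is a sketch under stated assumptions: tiers are integers where 1-3 are the stronger tiers, the listed sources are already known to be independent, and any conflict between sources drops the finding to Low. The names `Confidence`, `score`, and `combine` are illustrative.

```python
from enum import IntEnum


class Confidence(IntEnum):
    UNVERIFIED = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3


def score(source_tiers: list[int], sources_conflict: bool = False) -> Confidence:
    """Map the tiers of the supporting sources to a confidence level."""
    if not source_tiers:
        return Confidence.UNVERIFIED
    if sources_conflict:
        return Confidence.LOW  # assumption: any conflict caps the level at Low
    strong = sum(1 for t in source_tiers if t <= 3)
    weak = len(source_tiers) - strong
    if strong >= 2:
        return Confidence.HIGH
    if strong == 1 or weak >= 2:
        return Confidence.MODERATE
    return Confidence.LOW


def combine(levels: list[Confidence]) -> Confidence:
    """Propagation rule: a combined finding is only as strong as its weakest claim."""
    if not levels:
        return Confidence.UNVERIFIED  # assumption: nothing supports the finding
    return min(levels)


print(combine([Confidence.HIGH, Confidence.MODERATE, Confidence.LOW]).name)  # LOW
```

Using an ordered enum makes the propagation rule a plain `min`, so a single weak component can never be averaged away by stronger ones.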

Ground-Truth Validation

Prefer deterministic validation over LLM-based validation:

| Check Type | Method | Example |
|-----------|--------|---------|
| URL existence | HTTP HEAD request | Verify cited URLs return 200 |
| Date verification | Parse and compare | Check if stated dates match source dates |
| Numerical consistency | String matching | Compare quoted numbers to source text |
| Cross-reference | Multi-source comparison | Same fact from independent sources |

LLM-based verification (asking a model "is this true?") is unreliable — models exhibit 17.8-57.3% bias-consistent behavior and high sycophancy rates (up to 58% initial compliance with wrong premises). Use code-based checks wherever possible.
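Two of the code-based checks from the table might look like the following. This is a hedged sketch using only the Python standard library; the function names and the number-matching regular expression are assumptions made for illustration, and the regex is deliberately strict (it treats "84" and "84%" as different strings).

```python
import re
import urllib.request

# Matches integers, decimals, and thousands-separated numbers, with optional "%"
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def numbers_missing_from_source(claim: str, source_text: str) -> list[str]:
    """Numbers quoted in the claim that do not appear verbatim in the source."""
    source_numbers = set(NUMBER.findall(source_text))
    return [n for n in NUMBER.findall(claim) if n not in source_numbers]


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """HTTP HEAD check. Some servers reject HEAD, so False is not proof of absence."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):  # network errors, HTTP errors, malformed URLs
        return False


source = "The survey found that 83.7% of 5,000 respondents agreed."
print(numbers_missing_from_source("Approximately 84% of 5,000 respondents agreed.", source))
# ['84%']
```

The string-matching check catches the rounding case from the Cardinal Rules ("83.7%" rewritten as "approximately 84%") without asking a model to judge anything.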

Verification Pipeline Integration

The /deep-research:research command enforces hallucination prevention through a three-stage pipeline:

Workers produce findings with a structured Verifiable Claims Table (exact values + verbatim source text)
Verifiers re-fetch sources independently and check claims (one verifier per worker, in parallel)
Synthesizer reads both worker docs and verification reports, applying corrections before writing

This architecture addresses three hallucination failure modes:

Self-verification failure: Workers cannot reliably verify their own work due to sycophancy and lost context from incremental writing. Verifiers operate in a fresh context with adversarial instructions.
Cascading trust: The synthesizer previously trusted worker output implicitly. Verification reports make trust explicit and graduated (High/Moderate/Low/Corrected).
Missing confidence signals: The Verifiable Claims table and verification reports feed directly into the Confidence Assessment appendix in the final output.

Pipeline enforcement of confidence levels:

High: Claim verified by the verifier AND corroborated by 2+ independent sources
Moderate: Claim verified by the verifier from a single source
Low: Claim could not be verified, or verifier found conflicting information
Corrected: Original claim was incorrect; corrected value from verification
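The four pipeline levels can be expressed as a small decision function. The sketch below is not the command's actual implementation; `VerificationResult` and its fields are hypothetical, and the precedence (Corrected first, then Low, then High, then Moderate) is an assumption about how overlapping conditions would be resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerificationResult:
    verified: bool                         # verifier found the claim in the re-fetched source
    independent_sources: int = 0           # sources that corroborate the claim
    conflicting: bool = False              # verifier found conflicting information
    corrected_value: Optional[str] = None  # set when the original claim was wrong


def pipeline_level(result: VerificationResult) -> str:
    """Assign High / Moderate / Low / Corrected from a verifier's report."""
    if result.corrected_value is not None:
        return "Corrected"
    if not result.verified or result.conflicting:
        return "Low"
    if result.independent_sources >= 2:
        return "High"
    return "Moderate"


print(pipeline_level(VerificationResult(verified=True, independent_sources=2)))  # High
```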

Known Limitations

Things current automated systems cannot reliably verify:

Whether a source itself is accurate (garbage in, garbage out)
Subtle semantic errors (correct words, wrong meaning)
Whether a claim's context changes its truth value
Claims about very recent events not yet indexed
Absence of evidence vs. evidence of absence

For high-stakes research, human review of the final output remains essential.

Reference Files

For detailed hallucination research, OWASP ASI08 analysis, and verification architecture:

references/hallucination-research.md — Quantitative hallucination rates, cascading failure mechanics, AgentAsk error taxonomy, and multi-agent consensus patterns

This skill should be used when producing any research output, verifying claims from web sources, checking citation accuracy, assessing confidence in findings, preventing hallucination cascading across agent boundaries, or reviewing research documents for factual reliability. Covers the hallucination taxonomy (7 types), OWASP ASI08 cascading failures, circuit breaker patterns, citation verification rules, confidence scoring, ground-truth validation, and known limitations of automated verification.

10 stars

development

Updated May 13, 2026

npx skillsauth add oborchers/fractional-cto hallucination-prevention

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: May 13, 2026, 6:43 AM | 162.6s | 2 files scanned

SKILL.md

name: hallucination-prevention
description: This skill should be used when producing any research output, verifying claims from web sources, checking citation accuracy, assessing confidence in findings, preventing hallucination cascading across agent boundaries, or reviewing research documents for factual reliability. Covers the hallucination taxonomy (7 types), OWASP ASI08 cascading failures, circuit breaker patterns, citation verification rules, confidence scoring, ground-truth validation, and known limitations of automated verification.
version: 1.2.0

Install manually

# Clone the repo
git clone https://github.com/oborchers/fractional-cto.git

# Copy into Claude Code skills folder (global)
cp -r fractional-cto/deep-research/skills/hallucination-prevention ~/.claude/skills/

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT