
oaustegard/verifying-claims

Name: verifying-claims
Author: oaustegard

verifying-claims/SKILL.md

npx skillsauth add oaustegard/claude-skills verifying-claims


verifying-claims

Check that what a document says about code is true, by reading the document, the code, and the tests together and reporting where they disagree.

What changed (v0.1 → v0.2)

v0.1 was a comment-DSL: you hand-wrote a claim comment next to the prose, and a script checked that comment against the code. That had a fatal gap: the comment and the prose were two artifacts stapled together, and only the comment was checked, while humans read the prose. The prose could lie with a green run.

v0.2 drops the DSL. The reviewer is the agent: it reads the prose's meaning directly and compares it to what the code does and what the tests assert. No shadow copy, because the thing being checked is the thing the human reads. (Existing tools already own the alternatives — Gherkin binds executable scenarios, Lean's Verso transcludes facts into prose, TDD couples code to tests. This fills the remaining slot: free-prose documentation, judged.)

Division of labor — read this first

This skill does NOT gate merges and is NOT a test framework.

The test suite (TDD/CI) owns the behavioral contract: deterministic, cheap, auditable, gated. A green check is something you can hold CI to.
This skill owns the prose layer: does the documentation match reality? That needs semantic judgment across artifacts, which is non-deterministic and fallible — so it runs as a triggered review (before docs ship, on request, as a sweep), not as a per-commit gate. "The agent said the docs match" is not a guarantee you gate a merge on; it's a review you act on.

Tests are the anchor. The docs are correct when they agree with what the tests assert about the code. So write/keep good tests first; this skill keeps the prose pinned to them.

Procedure

1. Identify the document(s) to check and the code + tests they describe.
2. Gather consistent input: run scripts/gather_context.py --doc DOC --src SRC --tests TESTS. It ast-parses source (no imports, no execution) and bundles the document text, the public API surface, and the test inventory.
3. Extract the claims the prose makes: every checkable assertion about the code (signatures, behavior, return shapes, defaults, guarantees, examples). Do this by reading; there are no claim markers.
4. Judge each claim against the API surface and the tests:
   - Does the code actually do what the prose says?
   - Is the claim backed by a test, or merely asserted?
   - Does it reference something that no longer exists?
5. Report drift, ranked by severity, each finding citing the prose claim and the contradicting reality (file/function). Use the verdicts below.
6. Optionally fix: rewrite the prose to match reality, and/or flag claims that need a test (an UNSUPPORTED claim is a missing test, not just a doc bug).

Verdicts

PASS — the prose claim matches the code and is exercised by a test.
FAIL — the code contradicts the claim (the doc is wrong, or the code regressed and the doc caught it).
UNSUPPORTED — the claim matches the current code but no test backs it, so nothing protects it from future drift. Surface as a missing test.
STALE — the claim refers to something removed or renamed.
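The four verdicts can be read as a small decision table over three judgments the reviewer makes per claim. The sketch below encodes that reading; the names `Judgment` and `verdict_for` are hypothetical, and the precedence (STALE checked before FAIL, FAIL before UNSUPPORTED) is an assumption, since the definitions above do not state an order. The three booleans are the agent's semantic judgments, not values a script can compute.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    UNSUPPORTED = "UNSUPPORTED"
    STALE = "STALE"


@dataclass(frozen=True)
class Judgment:
    """Hypothetical record of what the reviewer concluded about one claim."""
    target_exists: bool   # the function/class/file the claim names still exists
    matches_code: bool    # the code does what the prose says
    backed_by_test: bool  # at least one test exercises the claimed behavior


def verdict_for(j: Judgment) -> Verdict:
    """Map a judgment to a verdict (assumed precedence: STALE, FAIL, UNSUPPORTED, PASS)."""
    if not j.target_exists:
        return Verdict.STALE
    if not j.matches_code:
        return Verdict.FAIL
    if not j.backed_by_test:
        return Verdict.UNSUPPORTED
    return Verdict.PASS
```

Checking existence first keeps a removed function from being reported as a contradiction: there is no code left for the claim to disagree with.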

Invoking

"Check the README against the code before I publish it."
"Does docs/api.md still match pkg/?"
"Sweep the docs for drift after this refactor."

Run it at moments that matter — pre-publish, post-refactor, on a docs PR — not on every commit. The deterministic gate is the test suite; this is the layer tests can't reach.

Honest limits

Non-deterministic and fallible: a review can miss drift or misjudge. Treat output as a careful review, not a proof.
Cost/latency: reading three artifacts and reasoning is expensive next to a test run. Don't wire it where a cheap deterministic check belongs.
It checks prose against code+tests; it does not verify the tests themselves are correct. Garbage tests → confident-but-wrong PASS. TDD discipline upstream still matters.

When NOT to use

As a CI merge gate (use the test suite).
To verify behavior (write a test).
On prose with no factual claims about code (nothing to check).

Files

scripts/gather_context.py — deterministic input bundler (doc + API surface + test inventory), ast-only, no imports.
references/drift-report-example.md — what a review report looks like.
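As an illustration of a report "ranked by severity, each finding citing the prose claim and the contradicting reality", the sketch below sorts findings and renders one line per problem. It does not reproduce references/drift-report-example.md: the `Finding` type, the output format, the sample claims, and the severity order (FAIL, then STALE, then UNSUPPORTED) are all assumptions.

```python
from dataclasses import dataclass

# Assumed ranking: lower number means more severe. PASS is sorted last and omitted.
SEVERITY = {"FAIL": 0, "STALE": 1, "UNSUPPORTED": 2, "PASS": 3}


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of a single review finding."""
    verdict: str   # one of the keys in SEVERITY
    claim: str     # the prose claim, quoted from the document
    evidence: str  # the contradicting reality, as file/function


def render_report(findings: list[Finding]) -> str:
    """Render non-PASS findings, most severe first (sorted() is stable)."""
    ranked = sorted(findings, key=lambda f: SEVERITY[f.verdict])
    return "\n".join(
        f"[{f.verdict}] {f.claim} (see {f.evidence})"
        for f in ranked
        if f.verdict != "PASS"
    )


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    print(render_report([
        Finding("UNSUPPORTED", "retries default to 3", "pkg/client.py/fetch"),
        Finding("PASS", "returns a dict", "pkg/client.py/fetch"),
        Finding("FAIL", "raises ValueError on empty input", "pkg/parse.py/parse"),
    ]))
```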


Check that a document's claims about code are actually true by reading the prose, the code, and the tests and reporting (or fixing) where they disagree. Use whenever the user wants to verify a README, guide, spec, or docstring still matches the code; whenever they mention documentation drift, doc-code sync, "is this still accurate", stale docs, or keeping docs/tests/code consistent; before publishing or merging a docs change; or as a periodic doc-accuracy sweep. The agent reads the prose's meaning directly — there is no claim-comment DSL to maintain. Pairs with TDD — the test suite is the deterministic behavioral gate, this skill is the semantic prose-vs-reality review.

125 stars

development

Updated Jun 15, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add oaustegard/claude-skills verifying-claims

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Jun 15, 2026, 7:07 AM (207.4s, 5 files scanned)

SKILL.md

name: verifying-claims
version: 0.2.0




Install manually


# Clone the repo
git clone https://github.com/oaustegard/claude-skills.git

# Copy into Claude Code skills folder (global)
cp -r claude-skills/verifying-claims ~/.claude/skills/


Repository

oaustegard/claude-skills


Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT