Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

raddue/ledger

Name: ledger
Author: raddue

skills/ledger/SKILL.md

npx skillsauth add raddue/crucible ledger

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Ledger weekly renderer

Renders docs/ledger/weekly-YYYY-Www.md from the central ledger ~/.claude/crucible/ledger/runs.jsonl (override CRUCIBLE_LEDGER_DIR; plus falsification.jsonl beside it for the cross-link). The SKILL.md is a thin prompt wrapper; the source of truth is the testable Python core at scripts/render_ledger.py. Reports are written to docs/ledger/ relative to your cwd — run from the crucible repo to update the committed reports.

Skill type: Utility — direct execution, no subagent dispatch.

Trigger

Manual /ledger [--weeks N] (default N=1, the most recent ISO week).

What it does

Invoke the renderer directly. Resolve scripts/render_ledger.py by absolute path from the plugin root (it self-locates its own modules via __file__, so no PYTHONPATH is needed and it runs from any cwd):

python3 <plugin_root>/scripts/render_ledger.py --weeks N

That command:

Reads the central ledger ~/.claude/crucible/ledger/runs.jsonl (the --ledger default; CRUCIBLE_LEDGER_DIR overrides). First-time-ever: if the file is MISSING, print "no ledger data yet" and exit 0 — do NOT write an empty report.
Groups entries by ISO week (YYYY-Www).
For each of the most-recent N weeks, writes docs/ledger/weekly-YYYY-Www.md.

Algorithm summary

Tolerant load (load_runs): skips blank / malformed / partial-trailing lines, dedups defensively by (run_id, skill) latest-position-wins.
Honest "caught N" headline (caught_count): counts entries with would_have_shipped_without_gate == true, EXCLUDING backfilled == true. Backfilled entries carry severity_histogram: null ⇒ WHS null (the mechanical L-3 rule) and are reported in a SEPARATE "Backfilled historical context" section — never in the headline (L-5). This is what test T-5 asserts.
Per-skill significant_rate / fatal_rate (week_summary): computed from forward-captured entries only (backfilled == false AND severity_histogram != null). Raw rates are printed from week 1.
Inflation detector (§4a) (inflation_alert): alerts when a skill's significant_rate or fatal_rate exceeds 3× its 4-week rolling median. Silent for a skill until 4 weeks of forward data exist (the v1 bootstrap — with no forward history yet, the detector is silent, but raw rates still print so a human can eyeball drift).
Falsified cross-link (falsified_count): the all-time count of falsified verdicts via the L-9 latest-entry-wins reduction over falsification.jsonl (scripts.ledger_reduce.reduce, the canonical helper cited above). Graceful degradation: if falsification.jsonl is absent (Phase 4 reconciler not built yet), the reduction returns {} ⇒ count 0; the renderer never crashes.
Monthly spot-check (§4a): on the first /ledger run of a calendar month, the report appends an advisory checklist prompting a spot-check of 5 random would_have_shipped_without_gate: true entries against skills/shared/severity-rubric.md.

Commit citations

Design §4 calls for "findings with commit citations". The v1 schema has no commit field (artifact_hash is null for backfill). So:

Backfilled entries (backfill-<PR>-quality-gate) cite PR #<PR>, extracted from the deterministic run_id.
Forward entries (UUIDv7 run_id) have no commit in v1 ⇒ the citation is omitted gracefully. No SHAs are invented. A future schema rev capturing the gating commit can replace this with a real SHA.

Honest-count rule (binding)

The headline is would_have_shipped_without_gate == true minus backfilled. Backfilled entries seed the corpus but never inflate the caught-N number; they appear only in the separate historical-context section. Inflation drift is defended structurally (the 3× detector) and culturally (the monthly spot-check).

raddue/ledger

skills/ledger/SKILL.md

Render the Crucible calibration ledger weekly report — the honest "Crucible caught N silent bugs" headline, verdict breakdown, per-skill severity rates, and the inflation detector. Triggers on "/ledger", "weekly report", "weekly ledger", "caught N", "quality ledger", "calibration report", "render the ledger".

10 stars

testing

Updated Jun 2, 2026

$ install --global

skillsauth

npx skillsauth add raddue/crucible ledger

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Jun 2, 2026, 5:06 AM175.9s1 file scanned

SKILL.md

name:: ledger
description:: Render the Crucible calibration ledger weekly report — the honest "Crucible caught N silent bugs" headline, verdict breakdown, per-skill severity rates, and the inflation detector. Triggers on "/ledger", "weekly report", "weekly ledger", "caught N", "quality ledger", "calibration report", "render the ledger".
origin:: crucible

Ledger weekly renderer

Skill type: Utility — direct execution, no subagent dispatch.

Trigger

Manual /ledger [--weeks N] (default N=1, the most recent ISO week).

What it does

python3 <plugin_root>/scripts/render_ledger.py --weeks N

That command:

Reads the central ledger ~/.claude/crucible/ledger/runs.jsonl (the --ledger default; CRUCIBLE_LEDGER_DIR overrides). First-time-ever: if the file is MISSING, print "no ledger data yet" and exit 0 — do NOT write an empty report.
Groups entries by ISO week (YYYY-Www).
For each of the most-recent N weeks, writes docs/ledger/weekly-YYYY-Www.md.

Algorithm summary

Tolerant load (load_runs): skips blank / malformed / partial-trailing lines, dedups defensively by (run_id, skill) latest-position-wins.
Honest "caught N" headline (caught_count): counts entries with would_have_shipped_without_gate == true, EXCLUDING backfilled == true. Backfilled entries carry severity_histogram: null ⇒ WHS null (the mechanical L-3 rule) and are reported in a SEPARATE "Backfilled historical context" section — never in the headline (L-5). This is what test T-5 asserts.
Per-skill significant_rate / fatal_rate (week_summary): computed from forward-captured entries only (backfilled == false AND severity_histogram != null). Raw rates are printed from week 1.
Inflation detector (§4a) (inflation_alert): alerts when a skill's significant_rate or fatal_rate exceeds 3× its 4-week rolling median. Silent for a skill until 4 weeks of forward data exist (the v1 bootstrap — with no forward history yet, the detector is silent, but raw rates still print so a human can eyeball drift).
Falsified cross-link (falsified_count): the all-time count of falsified verdicts via the L-9 latest-entry-wins reduction over falsification.jsonl (scripts.ledger_reduce.reduce, the canonical helper cited above). Graceful degradation: if falsification.jsonl is absent (Phase 4 reconciler not built yet), the reduction returns {} ⇒ count 0; the renderer never crashes.
Monthly spot-check (§4a): on the first /ledger run of a calendar month, the report appends an advisory checklist prompting a spot-check of 5 random would_have_shipped_without_gate: true entries against skills/shared/severity-rubric.md.

Commit citations

Design §4 calls for "findings with commit citations". The v1 schema has no commit field (artifact_hash is null for backfill). So:

Backfilled entries (backfill-<PR>-quality-gate) cite PR #<PR>, extracted from the deterministic run_id.
Forward entries (UUIDv7 run_id) have no commit in v1 ⇒ the citation is omitted gracefully. No SHAs are invented. A future schema rev capturing the gating commit can replace this with a real SHA.

Honest-count rule (binding)

Related Skills

raddue/delve

testing

VerifiedTrustedCommunity

Standalone instance-bug reviewer — runs a parallel finder fan-out + verify gate over a diff or a path and prints ranked, verified findings. Use when the user says "delve", "find bugs in this diff", "review this for bugs", "scan this file/subsystem for defects", "instance-bug sweep", or wants concrete reproducible defects (not a merge verdict, not systemic health). Works on a PR id, a base..head range, or a path, on any forge (GitHub, GitLab, Bitbucket, self-hosted).

10SKILL.mdUpdated Jun 4, 2026

raddue/grudge

development

VerifiedTrustedCommunity

The Book of Grudges — cross-session bug graveyard. Every fixed bug is recorded as a structured "grudge"; before touching code, skills query the grudgebook for the files in scope and surface past regressions as forced "DO NOT REPEAT" context. Read mode (pre-flight) and write mode (on bug resolution / fix(*) PR). Machine-local, per-repo, never committed. Triggers on /grudge, "check grudges", "record a grudge", "any past bugs here", "regression oracle", "bug graveyard".

10SKILL.mdUpdated Jun 2, 2026

raddue/calibration-reconcile

testing

VerifiedTrustedCommunity

Reconcile the Crucible calibration ledger — walk merged fix/hotfix branches to falsify the originating gating-verdicts, compute per-skill Brier calibration scores, and append a falsification log. Triggers on "/calibration-reconcile", "reconcile ledger", "reconcile calibration", "falsify verdicts", "brier score", "calibration reconcile", "compute brier".

10SKILL.mdUpdated Jun 2, 2026

raddue/calibration-reconcile

raddue/temper-eval-collect

tools

VerifiedTrustedCommunity

Live-dispatch phase of temper eval harness. Reads stage-manifest.json from a pre-staged dispatch dir; fans Task-tool reviewer dispatches in parallel (max 6); writes per-seq result files; exits. Single bounded session. Pairs with `python -m skills.temper.evals.run_evals stage` and `score`.

10SKILL.mdUpdated May 25, 2026

raddue/temper-eval-collect

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/raddue/crucible.git

# Copy into Claude Code skills folder (global)
cp -r crucible/skills/ledger ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

raddue/crucible

10 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT