skills/inquisitor/SKILL.md
Use when a full feature is assembled and you want to hunt cross-component bugs before the final quality gate. Runs 5 parallel adversarial dimensions against the complete implementation diff. Triggers on 'inquisitor', 'hunt bugs', 'cross-component test', 'find integration issues', or automatically in build pipeline Phase 4.
`npx skillsauth add raddue/crucible inquisitor`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Hunt cross-component bugs by dispatching 5 parallel adversarial dimensions against the full feature diff. Each dimension writes and runs tests targeting a different class of failure mode that per-task testing misses.
Announce at start: "I'm using the inquisitor skill to hunt cross-component bugs across the full implementation."
Skill type: Rigid -- follow exactly, no shortcuts.
Model: Opus (orchestrating parallel adversarial subagents requires precise coordination)
<!-- CANONICAL: shared/dispatch-convention.md -->All subagent dispatches use disk-mediated dispatch. See shared/dispatch-convention.md for the full protocol.
Per-task adversarial testing (Phase 3) sees one task's diff. It catches bugs within each task's scope. But the bugs you find instantly after a big feature lands live in the seams -- wiring that's almost right, state that initializes in one task but gets consumed in another, edge cases that only appear when the whole feature is assembled.
The inquisitor sees the FULL diff and attacks the interactions.
| Agent | Question | Output | Scope |
|-------|----------|--------|-------|
| Red-team | "What's wrong with this artifact?" | Written findings (Fatal/Significant/Minor) | Attacks designs, plans, code quality |
| Adversarial Tester | "What runtime behavior breaks?" | Executable tests (per-task) | Single task's changes |
| Inquisitor | "What breaks between components?" | Executable tests (full feature) | Cross-component interactions across all tasks |
Each dimension is a parallel subagent with a specific attack lens. All 5 are dispatched simultaneously.
Dimension: Wiring
Question: "Is everything actually connected?"
Looks for:
Test style: Instantiate the system and verify new components are reachable and callable through their intended entry points.
Dimension: Integration
Question: "Do the new pieces talk to each other correctly?"
Looks for:
Test style: Set up 2+ new components and verify data flows correctly end-to-end through the interaction chain.
Dimension: Edge Cases
Question: "What happens at the boundaries?"
Looks for:
Test style: Call new APIs with boundary inputs and verify graceful handling (no crash, correct error, or documented behavior).
Dimension: State & Lifecycle
Question: "Is state managed correctly across the feature?"
Looks for:
Test style: Exercise lifecycle sequences (create, use, dispose, re-create) and verify correct behavior at each stage.
Dimension: Regression
Question: "Did we break anything that used to work?"
Looks for:
Test style: Exercise existing functionality through paths that touch newly modified code, verifying prior behavior is preserved.
Compute the diff covering ALL implementation changes:
- Build pipeline: `git diff <base-sha>..HEAD`.
- Standalone: run `git merge-base HEAD main` (or the user-specified base branch) to find the diverge point, then use `git diff <merge-base>..HEAD`.

If the diff is empty or contains only non-behavioral files (.md, .json, .yaml, .uss, .uxml), report "No behavioral changes to investigate" and stop.
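The empty-or-non-behavioral check can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names `behavioral_files`, `changed_files`, and `should_investigate` are hypothetical, and the sketch assumes the standalone mode where the base is found with `git merge-base`.

```python
import subprocess
from pathlib import PurePosixPath
from typing import Iterable, List

# Extensions this step treats as non-behavioral (taken from the text of this step).
NON_BEHAVIORAL_SUFFIXES = {".md", ".json", ".yaml", ".uss", ".uxml"}


def behavioral_files(changed: Iterable[str]) -> List[str]:
    """Return only the changed paths whose extension is not in the non-behavioral set."""
    return [
        path for path in changed
        if PurePosixPath(path).suffix.lower() not in NON_BEHAVIORAL_SUFFIXES
    ]


def changed_files(base_branch: str = "main") -> List[str]:
    """List files changed since the merge base with base_branch (standalone mode)."""
    merge_base = subprocess.run(
        ["git", "merge-base", "HEAD", base_branch],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    names = subprocess.run(
        ["git", "diff", "--name-only", f"{merge_base}..HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in names.splitlines() if line]


def should_investigate(changed: Iterable[str]) -> bool:
    """False means: report "No behavioral changes to investigate" and stop."""
    return bool(behavioral_files(changed))
```

In use, `should_investigate(changed_files())` would gate the rest of the run. Files without an extension are treated as behavioral in this sketch, which is an assumption rather than something the skill states.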
Calibration-weighted dispatch (advisory). Before the 5-dimension fan-out, derive the file list from Step 1's diff (git diff --name-only <base>..HEAD), resolve scripts/brier_advisory.py by absolute path from the plugin root, and run python3 <script> advise inquisitor <file list…>. If it prints a DispatchAdvice block, include it verbatim in each dimension subagent's dispatch context as scrutiny hints (NOT as findings, NOT scored). Best-effort: on empty output or any error, dispatch normally. See shared/calibration-weighted-dispatch.md.
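The best-effort behavior described here (use the advice if any is printed, otherwise dispatch normally) might look like the sketch below. The function name `get_dispatch_advice` and the timeout are assumptions for illustration. The sketch runs the script with the current interpreter instead of a literal `python3`, and it treats any non-empty output as the advice block, since the exact format of a DispatchAdvice block is not shown here.

```python
import subprocess
import sys
from typing import List, Optional


def get_dispatch_advice(
    script_path: str, files: List[str], timeout_s: float = 30.0
) -> Optional[str]:
    """Run the advisory script; return its output verbatim, or None on any failure."""
    try:
        result = subprocess.run(
            [sys.executable, script_path, "advise", "inquisitor", *files],
            capture_output=True, text=True, timeout=timeout_s,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # best-effort: an error must never block dispatch
    if result.returncode != 0 or not result.stdout.strip():
        return None  # non-zero exit or empty output: dispatch normally
    return result.stdout  # passed along verbatim as scrutiny hints
```

A `None` result simply means the five dimensions are dispatched without extra hints.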
For each dimension, dispatch a fresh subagent (Opus) using ./inquisitor-prompt.md:
All 5 dimensions are dispatched in parallel. Do not wait for one to finish before starting another.
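As a rough illustration of "start all five, then wait", the sketch below submits every dimension before collecting any result. It is only an analogy: the real dispatch is disk-mediated subagent dispatch, and `run_dimension` is a hypothetical callable standing in for one subagent run.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

DIMENSIONS = ["Wiring", "Integration", "Edge Cases", "State & Lifecycle", "Regression"]


def dispatch_all(run_dimension: Callable[[str], str]) -> Dict[str, str]:
    """Start every dimension before waiting on any of them, then collect reports."""
    with ThreadPoolExecutor(max_workers=len(DIMENSIONS)) as pool:
        # Submit all five first so none waits for another to finish.
        futures = {name: pool.submit(run_dimension, name) for name in DIMENSIONS}
        # Only now block on results, in dimension order.
        return {name: future.result() for name, future in futures.items()}
```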
Status update format while dispatching:
"Inquisitor: dispatching 5 dimensions in parallel — Wiring, Integration, Edge Cases, State & Lifecycle, Regression."
When all 5 subagents complete, aggregate their reports into a single INQUISITOR REPORT (see Report Format below).
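One way to picture the aggregation is a small helper that turns per-dimension pass counts into the summary numbers and the one-line status update used in this step. The names `summarize` and `format_status` and the `{dimension: (passed, total)}` input shape are assumptions for this sketch, and ERROR tests are assumed to be discarded before counting.

```python
from typing import Dict, Tuple

Results = Dict[str, Tuple[int, int]]  # dimension -> (tests passed, tests run)


def summarize(results: Results) -> Dict[str, int]:
    """Compute the headline counts for the report summary."""
    total = sum(run for _, run in results.values())
    passing = sum(passed for passed, _ in results.values())
    clean = sum(1 for passed, run in results.values() if passed == run)
    return {
        "total": total,
        "passing": passing,
        "failing": total - passing,
        "dimensions_clean": clean,
    }


def format_status(results: Results) -> str:
    """Build the per-dimension part of the status update."""
    parts = []
    for name, (passed, run) in results.items():
        part = f"{name} {passed}/{run} PASS"
        if run - passed:
            part += f" ({run - passed} FAIL)"
        parts.append(part)
    return "Inquisitor complete: " + ", ".join(parts) + "."
```

The sentence about dispatching fixes would be appended separately when any dimension has failures.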
Classify overall results:
Status update format:
"Inquisitor complete: Wiring 3/3 PASS, Integration 4/4 PASS, Edge Cases 3/5 PASS (2 FAIL), State 4/4 PASS, Regression 3/3 PASS. Dispatching fixes for 2 edge case failures."
For each dimension with FAIL results:
Cascading fix detection: If fixes for one dimension cause failures in another dimension's tests, escalate to user immediately -- this indicates a deeper design issue that automated fixes can't resolve.
Commit passing fixes with the message `fix: inquisitor [dimension] findings`.
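The fix cycle can be sketched as a bounded retry loop with a cascade check. This is an illustrative sketch under stated assumptions: the 2-attempt cap is taken from the ledger verdict rules later in this document, while `apply_fix`, `failing_dimensions`, and the outcome strings are hypothetical stand-ins for the real fix dispatch and test re-run.

```python
from typing import Callable, Dict, Set

MAX_FIX_ATTEMPTS = 2  # the 2-attempt cap referenced in the ledger verdict rules


def run_fix_cycle(
    initial_failures: Set[str],
    apply_fix: Callable[[str], None],
    failing_dimensions: Callable[[], Set[str]],
) -> Dict[str, str]:
    """Try to fix each failing dimension; stop at once on a cascading failure."""
    outcomes: Dict[str, str] = {}
    expected_failing = set(initial_failures)
    for dim in sorted(initial_failures):
        for _attempt in range(MAX_FIX_ATTEMPTS):
            apply_fix(dim)
            now_failing = failing_dimensions()  # re-run all dimensions' tests
            if now_failing - expected_failing:
                # A fix broke a dimension that was not failing before.
                outcomes[dim] = "ESCALATED: cascading failure"
                return outcomes
            if dim not in now_failing:
                outcomes[dim] = "FIXED"
                expected_failing.discard(dim)
                break
        else:
            outcomes[dim] = "ESCALATED: attempt cap reached"
    return outcomes
```

Because a fixed dimension is removed from `expected_failing`, a later fix that breaks it again also counts as a cascade in this sketch.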
Output the aggregated INQUISITOR REPORT including fix outcomes. The report must include:
## INQUISITOR REPORT
### Summary
- Dimensions dispatched: 5
- Total attack vectors tested: N
- Tests PASSING (robust): N
- Tests FAILING (weaknesses found): N
- Tests ERROR (discarded): N
- Dimensions clean: N/5
- Fix cycles required: N
### Dimension: Wiring
#### Attack Vector 1: [Title]
- **What was tested:** [specific wiring concern]
- **Likelihood:** High/Medium/Low
- **Impact:** High/Medium/Low
- **Test:** `TestClassName.TestMethodName`
- **Result:** PASS/FAIL
- **If FAIL -- fix guidance:** [what to change]
[repeat for each attack vector]
### Dimension: Integration
[same structure]
### Dimension: Edge Cases
[same structure]
### Dimension: State & Lifecycle
[same structure]
### Dimension: Regression
[same structure]
### Fix Outcomes (if applicable)
- [Dimension]: [N] failures fixed in [N] attempts
- [Dimension]: [N] failures escalated to user
### Fix Footprint
- Pre-inquisitor SHA: [commit SHA before any fixes]
- Files changed by fixes: [N]
- Code review re-run recommended: YES/NO
After the aggregated INQUISITOR REPORT is emitted (Step 5: Final Report — the terminal point of an inquisitor run), emit ONE Tier B STUB JSONL line to the central ledger (~/.claude/crucible/ledger/runs.jsonl) via the emit CLI per the canonical protocol at skills/shared/ledger-append.md — resolve scripts/ledger_append.py by absolute path from the plugin root and run python3 <script> emit - '<json>'.
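The stub record can be pictured as a small builder over the field list given in this section. This is a sketch only: `build_stub_record` is a hypothetical helper, `run_id` is passed in instead of being generated (the document says it comes from scripts/uuid7.py), and the `+00:00` timestamp suffix is an assumption about what the ledger accepts as ISO-8601 UTC.

```python
import json
from datetime import datetime, timezone
from typing import List


def build_stub_record(run_id: str, gated_files: List[str], escalated: bool) -> str:
    """Return one JSONL line for a Tier B stub; repo is left to the emit CLI."""
    record = {
        "schema_version": 2,
        "run_id": run_id,
        "skill": "inquisitor",
        "tier": "B",
        # PASS: all clean or all FAILs fixed within the cap; otherwise ESCALATED.
        "verdict": "ESCALATED" if escalated else "PASS",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "gated_files": gated_files,
        "artifact_type": "code",
        # Stub fields: all null for this skill.
        "severity_histogram": None,
        "highest_finding": None,
        "would_have_shipped_without_gate": None,
        "findings_count": None,
        "confidence": None,
        "chunk_hash": None,
        "rounds": None,
        "predicted_falsifier": None,
        "gated_files_truncated": 0,
        "comment": None,
        "backfilled": False,
        "falsified": None,
        "falsified_by": None,
    }
    return json.dumps(record)  # single line, suitable as the '<json>' argument
```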
The emit CLI owns the mechanics: graceful skip on CRUCIBLE_CALIBRATION_DISABLED=1 (L-6), dedup by (run_id, skill="inquisitor") (L-2), and auto-fill of repo + schema_version. If the script can't be resolved, warn to stderr and skip; never block.

Fields to set (repo is auto-filled by the emit CLI):
- schema_version: 2
- run_id (UUIDv7 via scripts/uuid7.py)
- skill: "inquisitor"
- tier: "B"
- verdict: PASS when all dimensions are clean or all FAILs were fixed within the cap; ESCALATED when a dimension was escalated to the user after the 2-attempt cap or a cascading-fix escalation
- timestamp (ISO-8601 UTC)
- gated_files (the files in the investigated feature diff, repo-relative)
- artifact_type: "code" (inquisitor only operates on behavioral code diffs)

Stub fields per shared/ledger-append.md: severity_histogram, highest_finding, would_have_shipped_without_gate, findings_count, confidence, chunk_hash, rounds, predicted_falsifier are all null. Also gated_files_truncated: 0 and comment: null. Keep backfilled: false, falsified: null, falsified_by: null.

Inquisitor subagents must NOT:
The orchestrator must NOT:
When used within the build pipeline (Phase 4):
- `git diff <base-sha>..HEAD`, where base-sha is the commit before Phase 3 execution began

The build orchestrator handles the inquisitor as a single step. It does NOT invoke the inquisitor's fix cycle independently -- the inquisitor skill manages its own fix cycle internally and reports the final outcome.
Invoke directly with /inquisitor when you want to assault a feature branch:
- Base branch (main, or user-specified)
- `git merge-base HEAD <base-branch>`
- `git diff <merge-base>..HEAD`

The orchestrator (not this skill) decides whether to skip. When used standalone, use your judgment:
This skill produces adversarial tests across 5 dimensions. The tests themselves are the quality mechanism. When used in the build pipeline, the quality-gate that follows reviews the final state including any inquisitor fixes. No additional quality gate is needed on the inquisitor's own output.
Disabled by default due to cost: with N external models, this adds 5*N API calls per inquisitor run (one per dimension per model).
Only active if skills.inquisitor is explicitly set to true in external review config. If the external_review MCP tool is unavailable or the call fails for any dimension, skip silently and proceed with host findings only.
Per-dimension: after dispatching the host Opus subagent for each dimension, call the external_review MCP tool with:
- prompt: contents of skills/shared/external-review-prompt.md
- context: same diff + dimension-specific framing (dimension name, focus areas, attack lens)
- skill: "inquisitor" (top-level argument for per-skill toggle enforcement)
- metadata: {"skill": "inquisitor", "dimension": "<dimension_name>"} (traceability)

Append external perspectives per dimension alongside the host subagent's findings in the dimension section of the INQUISITOR REPORT. External findings inform fix guidance but do not independently trigger fix cycles; only host-written executable test failures drive fixes.
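The per-dimension arguments and the 5*N cost can be sketched as below. The helper names are hypothetical, and the way the diff and framing are joined into `context` is an assumption; only the four argument names and the `skill`/`metadata` values come from the list in this section.

```python
from typing import Dict


def external_review_args(
    prompt_text: str, diff: str, dimension: str, framing: str
) -> Dict[str, object]:
    """Build the arguments for one external_review call for one dimension."""
    return {
        "prompt": prompt_text,  # contents of skills/shared/external-review-prompt.md
        # Assumed layout: the diff followed by the dimension-specific framing.
        "context": f"{diff}\n\nDimension: {dimension}\n{framing}",
        "skill": "inquisitor",
        "metadata": {"skill": "inquisitor", "dimension": dimension},
    }


def expected_api_calls(num_models: int, num_dimensions: int = 5) -> int:
    """One call per dimension per external model."""
    return num_dimensions * num_models
```

With three external models configured, for example, `expected_api_calls(3)` gives 15 extra calls per run, which is why the feature is off by default.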
- crucible:build (Phase 4, after temper, before quality-gate)
- ./inquisitor-prompt.md
- crucible:cartographer-skill context (when available) for module-aware attack surface analysis
- crucible:adversarial-tester (per-task, Phase 3) -- inquisitor is the full-feature complement
- inquisitor-prompt.md (for dimension subagent dispatch)

Multi-model consensus: Not applicable. Inquisitor dimensions write and run executable tests, requiring local filesystem and test runner access. See skills/consensus/SKILL.md for exclusion rationale.