skills/oracle-doubt/SKILL.md
Adversarial review of non-trivial decisions using fresh-context scrutiny. Use when correctness matters more than speed, when stakes are high (production, security-sensitive logic, irreversible operations), or before committing significant architectural or implementation choices.
npx skillsauth add martinffx/claude-code-atelier oracle-doubtInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
A confident answer is not a correct one. Long sessions accumulate context that quietly turns assumptions into "facts" without anyone noticing. Doubt-driven development is the discipline of materializing a fresh-context reviewer — biased to disprove, not approve — before any non-trivial output stands.
This is not /review. /review is a verdict on a finished artifact. This is an in-flight posture: non-trivial decisions get cross-examined while course-correction is still cheap.
A decision is non-trivial when at least one of these is true:
Apply the skill when:
When NOT to use:
If you doubt every keystroke, you ship nothing. The skill applies only to non-trivial decisions as defined above.
This skill is designed for the main-session orchestrator, where Step 3 (DOUBT) can spawn fresh-context reviewers via the task tool.
skills: frontmatter. A persona that follows Step 3 would spawn another persona — the orchestration anti-pattern explicitly forbidden by references/orchestration-patterns.md ("personas do not invoke other personas").Copy this checklist when applying the skill:
Doubt cycle:
- [ ] Step 1: CLAIM — wrote the claim + why-it-matters
- [ ] Step 2: EXTRACT — isolated artifact + contract, stripped reasoning
- [ ] Step 3: DOUBT — invoked fresh-context reviewer with adversarial prompt
- [ ] Step 4: RECONCILE — classified every finding against the artifact text
- [ ] Step 5: STOP — met stop condition (trivial findings, 3 cycles, or user override)
Name the decision in two or three lines:
CLAIM: "The new caching layer is thread-safe under the
read-heavy workload described in the spec."
WHY THIS MATTERS: a race here corrupts user data and is
hard to detect in QA.
If you can't write the claim that compactly, you have a vibe, not a decision. Surface it before scrutinizing it.
A fresh-context reviewer needs the artifact and the contract, not the journey.
Strip your reasoning. If you hand over conclusions, you'll get back validation of your conclusions. The unit must be small enough that a reviewer can hold it in mind in one read — if it's a 500-line PR, decompose first.
The reviewer's prompt must be adversarial. Framing decides the answer.
Use the adversarial prompt template from references/adversarial-prompt.md. The template is pasted verbatim into the subagent invocation to override any persona's default balanced response shape.
Pass ARTIFACT + CONTRACT only. Do NOT pass the CLAIM. Handing the reviewer your conclusion biases it toward agreement. The reviewer must independently determine whether the artifact satisfies the contract.
A single-model reviewer shares blind spots with the original author — a colder, different-architecture perspective catches them. Doubt-driven is already opt-in for non-trivial decisions, so within that scope offering cross-model is part of the skill's value, not optional friction.
Interactive sessions: always offer. Never silently skip.
Step 1: Ask the user
After the single-model review in Step 3 above, but before RECONCILE, pause and ask:
"Single-model review complete. Want a cross-model second opinion? Options: spawn parallel subagents with different types (oracle + architect), manual external review (you paste it elsewhere), or skip."
This question is mandatory in every interactive doubt cycle — even on artifacts that feel low-stakes. The user — not the agent — decides whether the cost is worth it. The agent's job is to surface the choice.
Step 2: If the user picks parallel subagents — dispatch
Spawn two task subagents in parallel with the same adversarial prompt + ARTIFACT + CONTRACT:
task tool:
subagent_type: oracle
task tool:
subagent_type: architect
Each subagent starts with isolated context by design. Take both outputs into Step 4 (RECONCILE).
Step 3: If the user skips
Acknowledge the skip in the output ("Proceeding with single-model findings only") and continue to RECONCILE. Skipping is fine; silent skipping is not.
Non-interactive contexts (CI, scheduled runs):
Cross-model adds cost, latency, and tool fragility. The agent surfaces the choice every cycle; the user decides whether this artifact warrants it.
The reviewer's output is data, not verdict. You are still the orchestrator. Re-read the artifact text against each finding before classifying — rubber-stamping the reviewer is the same failure mode as ignoring it.
For each finding, classify in this precedence order (first matching class wins):
A fresh reviewer can be wrong because it lacks context. Don't defer just because it's "fresh."
Stop when:
If after 3 cycles the reviewer still surfaces substantive issues, the artifact may not be ready. Surface this to the user — three unresolved cycles is information about the artifact, not a reason to keep looping.
If 3 cycles is "obviously insufficient" because the artifact is large: the artifact is too big — return to Step 2 and decompose. Do not lift the bound.
| Rationalization | Reality |
|---|---|
| "I'm confident, skip the doubt step" | Confidence correlates poorly with correctness on novel problems. Moments of certainty are exactly when blind spots hide. |
| "Spawning a reviewer is expensive" | Debugging a wrong commit in production is more expensive. The check is bounded; the bug isn't. |
| "The reviewer will just nitpick" | Only if unscoped. Constrain the prompt to "issues that would make this fail under the contract." |
| "I'll do doubt at the end with /review" | /review is a final gate. Doubt-driven catches wrong directions early when course-correction is cheap. By PR time it's too late. |
| "If I doubt every step I'll never ship" | The skill applies to non-trivial decisions, not every keystroke. Re-read "When NOT to Use." |
| "Two opinions are always better than one" | Not when the second has less context and produces noise. Reconcile, don't defer. |
| "The reviewer disagreed so I was wrong" | The reviewer lacks your context — disagreement is information, not verdict. Re-read the artifact, classify, then decide. |
| "Cross-model is always better" | Cross-model catches blind spots a single model shares with itself, but it adds cost and tool fragility. Offer it every interactive doubt cycle — the user decides whether the artifact warrants it. The agent's job is to surface the choice, not to gate it. |
| "User said yes once, so I can keep invoking subagents" | Each invocation is its own authorization. The artifact, the prompt, and the context change between calls — re-confirm with the user before every run. |
/review, not doubt-driven developmentcode-review / /review: complementary. /review is post-hoc PR verdict; doubt-driven is in-flight per-decision. Use both.oracle-architect / spec-research: SDD verifies facts about frameworks against official docs. Doubt-driven verifies your reasoning about the artifact. SDD checks the API exists; doubt-driven checks you used it correctly under the contract.oracle-testing: TDD's RED step is doubt made concrete — a failing test is a disproof attempt. When TDD applies, that failing test is the doubt step for behavioral claims.code-debug: when the reviewer surfaces a real failure mode, drop into the debugging skill to localize and fix.After applying doubt-driven development:
Doubt an architectural decision:
/atelier-doubt
CLAIM: "I will use DynamoDB single-table design for the Order service."
WHY: "Single-table simplifies access patterns but may couple unrelated domains."
ARTIFACT: "Proposed key schema: PK=ORDER#<id>, SK=METADATA#<id>. All order data in one table."
CONTRACT: "Must support: 1) lookup by order ID, 2) list by customer, 3) list by status. No cross-domain queries."
Doubt a code change before commit:
/atelier-doubt
CLAIM: "The retry logic in PaymentService is safe under concurrent requests."
WHY: "Race here could double-charge customers."
ARTIFACT: "<paste diff or function>"
CONTRACT: "Must be idempotent: same paymentId processed twice produces same result, no duplicate charges."
Doubt a non-obvious assertion:
/atelier-doubt
CLAIM: "This migration is safe to run online without downtime."
WHY: "Downtime here blocks checkout for all users."
ARTIFACT: "ALTER TABLE orders ADD COLUMN tracking_id VARCHAR(255); CREATE INDEX idx_tracking ON orders(tracking_id);"
CONTRACT: "Must not lock table for more than 1 second. Must not fail on 10M row table. Must be backward-compatible with running application code."
| Skill | Use When | |---|---| | oracle-doubt | In-flight adversarial review before a decision stands. Catches wrong directions early. | | oracle-challenge | Critically evaluating an existing approach or decision after it's been made. | | code-review | Post-hoc PR verdict on a finished artifact. | | oracle-testing | TDD RED step serves as the doubt mechanism for behavioral claims. | | oracle-thinkdeep | Deep exploration and alternative discovery for complex decisions (use doubt after thinkdeep narrows to a specific choice). |
Key distinction:
oracle-doubt = "Before I commit to this, prove me wrong" (pre-commit adversarial review)oracle-challenge = "Is this approach valid?" (critical evaluation of standing decisions)code-review = "Judge this finished artifact" (post-hoc verdict)development
Security architecture and threat modeling knowledge. Auto-invokes when designing features that handle untrusted data, authentication, authorization, external integrations, file uploads, or sensitive data. Provides risk assessment frameworks, trust boundary analysis, and security design principles — not implementation code.
development
Compact the current conversation into a handoff document for another agent to pick up.
testing
Socratic interrogation of plans against the project's domain model and documented decisions. Use when the user wants to stress-test a plan, clarify terminology, or validate assumptions against existing domain language. Updates CONTEXT.md and ADRs inline as decisions crystallise.
development
Generate and validate conventional commit messages following the conventionalcommits.org spec. Use whenever the user wants to commit code, mentions commit messages, git commit, or asks to create a commit. Triggers on "commit", "git commit", "conventional", or when reviewing commit message format.