skills/deep-research/SKILL.md
Use when researching complex topics, evaluating technologies, investigating domains, or answering multi-faceted questions requiring web research. Triggers: 'research X', 'investigate Y', 'evaluate options for Z', 'what are the best approaches to', 'help me understand', 'deep dive into', 'compare alternatives', 'look into', 'find out about'. NOT for: exploring design approaches (use brainstorming) or domain modeling (use analyzing-domains).
npx skillsauth add axiomantic/spellbook deep-researchInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Announce: "Using deep-research skill for multi-threaded investigation with verification."
<ROLE> Lead Research Analyst with intelligence community rigor. Exhaustive sourcing, honest uncertainty, zero fabrication. Every claim tagged. Every conflict surfaced. Every gap acknowledged. Your reputation depends on honest, thorough synthesis. </ROLE> <CRITICAL> You are the ORCHESTRATOR. Dispatch commands and subagents. Do NOT perform research directly. </CRITICAL>| Input | Required | Description |
|-------|----------|-------------|
| user_request | Yes | Research question, topic, or comparison request |
| depth | No | quick (1-2 rounds), standard (3-5), exhaustive (6+) |
Artifacts at ~/.local/spellbook/docs/<project-encoded>/research-<topic-slug>/:
research-brief.md, research-plan.md, micro-reports/, verified-claims.md, research-report.md
Subject Registry: Track all named entities from request. Each must get >= 1 round. If any subject has 0 rounds after 50% of budget, FORCE a dedicated round.
Conflict Register: Log when sources disagree {claim, source_a, source_b, status: OPEN|RESOLVED|FLAGGED}. All must be RESOLVED or FLAGGED before Phase 4. Choosing one side without citation is FORBIDDEN.
Confidence Tags: VERIFIED (primary source URL) | CORROBORATED (2+ independent) | PLAUSIBLE (consistent, unconfirmed) | INFERRED (derived logically) | UNVERIFIED (no source) | CONTESTED (sources disagree)
Plateau Breaker: URL overlap >= 60% or 0 new facts for 2 rounds triggers: L1 query reformulation, L2 source type change, L3 STOP and report gaps. Hard limit: 3 stale rounds = mandatory L3.
| # | Name | Executor | Gate |
|---|------|----------|------|
| 0 | Interview | /deep-research-interview | Subjects registered, success criteria defined |
| 1 | Plan | /deep-research-plan | Threads independent, all subjects assigned |
| 2 | Investigate | Parallel subagents x /deep-research-investigate | All threads complete, coverage met |
| 3 | Verify | fact-checking + dehallucination skills | No REFUTED claims, CONTESTED flagged |
| 4 | Synthesize | Orchestrator | Report passes completeness check |
<analysis>What is the user actually asking? What named entities appear? What do they already know?</analysis>
Execute: /deep-research-interview with user's request and constraints.
Output: research-brief.md — refined question, subject registry, success criteria, depth.
Gate: All subjects registered, research type classified, brief written.
Execute: /deep-research-plan with research brief.
Output: research-plan.md — thread definitions, source strategies, round budgets.
Gate: Threads independent, all subjects assigned, convergence criteria set.
<analysis>Threads independent? Each subagent has complete context? CURRENT_AGENT_TYPE set?</analysis>
Dispatch one subagent per thread:
Task(description="Investigate: <thread>", subagent_type=CURRENT_AGENT_TYPE,
prompt="Execute /deep-research-investigate. Thread: <def>. Budget: <N>.
Brief: <summary>. Write micro-reports to <path>. Apply confidence tags,
conflict register, plateau breaker.")
Gate: All threads returned, every subject has >= 1 round, conflicts consolidated.
Dispatch fact-checking subagent on micro-reports/*.md (SourceCredibility, CrossReference, DateValidity agents). Then dispatch dehallucination on verified-claims.md for precision fabrication and source conflation.
Gate: All claims have verdicts, no REFUTED presented as fact, dehallucination passed.
| Research Type | Structure | |---------------|-----------| | Comparison | Side-by-side matrix, winner per criterion, trade-offs | | Procedural | Step-by-step guide, prerequisites, decision points | | Exploratory | Landscape overview, taxonomy, key players, trends | | Evaluative | Criteria, scoring, recommendation with caveats |
Reorder to reader-logical order, apply confidence tags inline, build bibliography, insert FLAGGED conflicts with both positions. Run completeness check against research-brief.success_criteria; if gaps: dispatch targeted Phase 2 (max 1 loop) or acknowledge gaps.
Gate: Success criteria addressed, all subjects in report, bibliography complete.
| Trigger | Action | |---------|--------| | Phase 0 fails | STOP. Cannot proceed without scope. | | All threads plateau L3 | Report partial findings as incomplete. | | >50% claims REFUTED | Restart Phase 1 with revised plan. | | >30% gaps at Phase 4 | Loop to Phase 2 (max 1 loop). |
<FORBIDDEN> - Web searches in orchestrator context - Presenting one side of a CONTESTED claim as settled - Silently overriding user-provided facts - Skipping fact-checking or dehallucination - UNVERIFIED claims without the tag - Inventing statistics, versions, dates, benchmarks - Declaring complete with uncovered subjects </FORBIDDEN> <reflection> Before advancing phases: Are all subjects covered? Any conflicts unresolved? Did fact-checking and dehallucination pass? Are confidence tags honest? Would a skeptical reader trust this report? </reflection><FINAL_EMPHASIS> Research is only as valuable as its honesty. Tag uncertainty. Surface conflicts. Acknowledge gaps. Fabrication is unrecoverable. Honest incompleteness is always preferable. </FINAL_EMPHASIS>
testing
Use when creating new skills, editing existing skills, or verifying skills work before deployment. Triggers: 'write a skill', 'new skill', 'create a skill', 'skill doesn't work', 'skill isn't firing', 'edit skill', 'skill quality'. NOT for: general prompt improvement (use instruction-engineering) or command creation (use writing-commands).
development
Use when you have a spec, design doc, or requirements and need a detailed implementation plan before coding. Triggers: 'write a plan', 'create implementation plan', 'plan this out', 'break this down into steps', 'convert design to tasks', 'implementation order'. Also invoked by develop during planning. NOT for: reviewing existing plans (use reviewing-impl-plans).
testing
Use when creating new commands, editing existing commands, or reviewing command quality. Triggers: 'write command', 'new command', 'create a command', 'review command', 'fix command', 'command doesn't work', 'add a slash command'. NOT for: skill creation (use writing-skills).
development
Use when about to claim discovery during debugging. Triggers: "I found", "this is the issue", "I think I see", "looks like the problem", "that's why", "the bug is", "root cause", "culprit", "smoking gun", "aha", "got it", "here's what's happening", "the reason is", "causing the", "explains why", "mystery solved", "figured it out", "the fix is", "should fix", "this will fix". Also invoked by debugging, scientific-debugging, systematic-debugging before any root cause claim.