Deep Research Team (Lead Orchestrator)

Conduct thorough, iterative research by coordinating a persistent team of researcher agents across multiple rounds. This architecture enables mid-investigation steering, targeted follow-up based on emerging findings, and cross-agent verification.

Architecture Overview

Round 1: Investigation      Round 2: Follow-up          Synthesis

┌──────────┐                  ┌──────────┐                  ┌────────┐
│Researcher│  sends findings  │Researcher│  sends findings  │        │
│    A     ├─────────┬───────>│    A     ├─────────┬───────>│        │
└──────────┘         │        └──────────┘         │        │        │
                     │                             │        │        │
                     v          dispatches         v        │        │
┌──────────┐    ┌────────┐    ┌──────────┐    ┌────────┐    │  Lead  │
│Researcher├───>│  Lead  │───>│Researcher├───>│  Lead  │───>│  synth │
│    B     │    │triages │    │    B     │    │triages │    │  esizes│
└──────────┘    └────────┘    └──────────┘    └────────┘    │        │
                     ^                             ^        │        │
┌──────────┐         │        ┌──────────┐         │        │        │
│Researcher├─────────┴───────>│Researcher├─────────┴───────>│        │
│    C     │  sends findings  │    C     │  sends findings  │        │
└──────────┘                  └──────────┘                  └────────┘

Key principles:

No peer-to-peer researcher communication. All coordination goes through the lead. This preserves the independence that accounts for 87% of multi-agent gains (Choi et al.) and avoids sycophancy failures (Wynn et al.). Researchers never see each other's findings.
Multi-round iteration. The lead triages Round 1 findings and creates targeted Round 2 tasks for gaps, conflicts, and promising leads.
Cross-agent verification (Comprehensive scope). The lead asks Researcher A to verify Researcher B's high-impact single-source claim. The verifier only sees the claim and its source, not the original researcher's full analysis.
Dynamic task evolution. The shared task list starts with pre-planned angles but grows organically as follow-up tasks emerge from findings. The lead dispatches follow-up tasks directly to specific researchers via SendMessage.

When to Use

Use this skill for:

Complex questions that benefit from multiple research angles
Topics where initial findings will reveal what to investigate next
Research requiring cross-verification of contested claims
Any question needing synthesis across 5+ sources

Do NOT use for:

Simple factual lookups (use regular web search)
Questions answerable with 1-2 searches
Debugging or code questions

Effort Calibration

| Scope | Researchers | Rounds | Verification | Model | | ----------------- | ----------- | ------ | ------------ | ------ | | Focused | 2 | 1-2 | None | sonnet | | Broad | 3 | 2-3 | None | sonnet | | Comprehensive | 4 | 3-4 | Cross-agent | opus |

Round counts are heuristics, not targets. Stop early when you hit citation convergence -- additional rounds that don't surface new substantive findings waste tokens and context. A Broad run that converges in 2 rounds is a success, not a shortcut. After each round's triage, ask: "Would another round change the report's conclusions?" If not, proceed to synthesis.

Default scope is determined by question type (see references/question-types.md). Present the recommended scope to the user and allow override.

Model selection:

Lead: inherits user's session model (no override)
Researchers: sonnet for Focused/Broad, opus for Comprehensive

(Sonnet validated as viable override for Comprehensive when cost matters)

Output Directory

Research artifacts persist to disk for resumability and backup.

Directory resolution -- run this command FIRST, before creating anything. The output is your {output_dir}. Only the fallback branch creates a directory; the others reuse what exists.

if [ -d "$(pwd)/deep-research" ]; then
  echo "$(pwd)/deep-research"
elif [ -n "$CLAUDE_DEEP_RESEARCH_DIR" ]; then
  eval echo "$CLAUDE_DEEP_RESEARCH_DIR"
else
  mkdir -p "$(pwd)/deep-research"
  echo "$(pwd)/deep-research"
fi

After resolving {output_dir}, create only the topic subdirectory in Phase 3.

Each session creates a subdirectory: {output_dir}/{topic-slug}/

Contents:

state.md -- triage checkpoint, cross-references, follow-up plan (written in Phase 4)
researcher-{letter}-findings.md -- backup of each researcher's findings
report.md -- final synthesized report (written in Phase 6)

Calibrated Confidence Language

Use Kent-style verbal probability expressions in all confidence assessments:

| Term | Range | Use When | | -------------- | ------ | ----------------------------------------- | | Almost certain | 93-99% | Multiple high-quality sources, no dissent | | Highly likely | 80-92% | Strong evidence, minor caveats | | Likely | 63-79% | Good evidence, some gaps | | Roughly even | 40-62% | Conflicting evidence, genuinely uncertain | | Unlikely | 20-39% | Limited or weak evidence |

Always pair the verbal term with the probability range in the final report.

Process

Phase 0: Classify Question Type

Silently classify the user's question before any interaction.

Read references/question-types.md for the full taxonomy.
Assign a primary type: Factual, Scientific/Health, Consumer, Technical, Opinion/Sentiment, Contested, or Emerging/Frontier.
For compound questions, decompose into sub-questions and classify each.
Note the default scope from the type-to-scope mapping.

Resume check: Before starting, list the subdirectories in {output_dir} and scan for any that look related to the current question (similar topic, overlapping keywords). If you find a plausible match, read its state.md and offer to resume: present what was completed, what remains, and ask the user whether to resume or start fresh. If resuming, create a new team and tasks for only the remaining work.

Topic slug: When creating a new session, generate a slug (lowercase, hyphenated, max 40 chars) for the subdirectory name: {output_dir}/{slug}/.

Classification is internal—do not present it to the user.

Phase 1: Clarify and Plan

Step 1: Make sure you understand the question. Before planning anything, ask yourself: do I understand what the user is asking and why well enough to design research angles that will actually be useful to them? If not, use AskUserQuestion to fill the gaps. This isn't just about ambiguous wording -- a perfectly clear question can still lack enough context to research well ("How does Nix handle dependencies?" means very different research depending on whether you're evaluating Nix, debugging an issue, or writing docs). If the question and its context are clear, skip this step.

Step 2: Scope and decompose. Determine the appropriate scope (Phase 2 has the details) and decompose the question into independent research angles. Default angle counts by scope:

Focused: 2 angles
Broad: 3 angles
Comprehensive: 4 angles

These are defaults, not caps. If the decomposition reveals one more genuinely independent facet than the default, add it (e.g., 3 angles for a Focused run). If the question has fewer real facets, use fewer. Beyond ±1 from the default, re-scope rather than stretching -- the scope was probably wrong. Each angle must be independent and substantial enough to warrant a dedicated researcher; "I can think of another angle" isn't sufficient.

For compound questions, map sub-questions to angles. Multiple sub-questions can share an angle if closely related; a single sub-question can span multiple angles if it has distinct facets.

Step 3: Confirm if high-investment. For compound, contested, or Comprehensive-scope questions, present the research plan for user approval before spawning researchers:

Research plan for "{question}":

Type: {type} | Scope: {scope} | {N} researchers, {M} rounds

Angles: {list of planned angles}

[If compound] Sub-question → angle mapping: ...

Proceed, or adjust?

For clear, low-scope questions, skip confirmation and proceed.

Phase 2: Calibrate Effort

Select the scope tier based on question type defaults from references/question-types.md, then apply the scope modifiers from that file (de-escalation and escalation signals). Also adjust for:

User's explicit preference (if stated)
Structural complexity (compound questions with 3+ sub-types escalate)

Announce the plan: "Starting {scope} team research with {N} researchers."

Phase 3: Team Setup

Step 1: Create the output directory

mkdir -p {output_dir}/{topic-slug}

Step 2: Create the team

TeamCreate:
  team_name: "deep-research-{topic-slug}"
  description: "Researching {topic} in {scope} scope"

Step 3: Create initial tasks

One TaskCreate per research angle:

TaskCreate:
  subject: "Investigate {angle title}"
  description: |
    Research angle: {angle description}
    Topic context: {brief topic summary}
    Question type: {type from Phase 0}
    Focus: {what specifically to investigate}
    Return structured findings via SendMessage to the lead.
  activeForm: "Investigating {angle title}"

Task brief clarity: Make scope boundaries explicit between researchers to avoid overlap and gaps. Flag name ambiguities (e.g., multiple products sharing a name). Mark optional sub-tasks clearly (e.g., "if time permits" vs required).

Step 4: Spawn researchers

Launch ALL researchers in a SINGLE message. Each researcher gets:

Task:
  subagent_type: "general-purpose"
  name: "researcher-{letter}"
  team_name: "deep-research-{topic-slug}"
  model: "sonnet"  (or "opus" for Comprehensive)
  description: "Spawn researcher {letter}"
  prompt: |
    You are a research agent on a team. Your job is to investigate research tasks
    by searching the web, evaluating sources, and reporting structured findings.

    FIRST: Read your methodology at: {absolute path to references/researcher-prompt.md}

    Question type: {type from Phase 0}
    Output directory: {output_dir}/{topic-slug}
    Your researcher letter: {letter} (use LOWERCASE in filenames: researcher-{lowercase letter})
    Lead name: team-lead (send all findings to this name via SendMessage)

    Your assigned task is #{id}: "{subject}"
    Use TaskGet for full details, then begin investigation.

    After completing your task, go idle. The lead will message you directly
    when new tasks are available.

Include the task ID, subject, question type, output directory, researcher letter, and lead name directly in each researcher's spawn prompt.

Phase 4: Investigation Loop

This is the core research cycle. Each iteration is a round: researchers investigate, the lead triages, then either dispatches follow-ups (another round) or exits to synthesis.

Round structure

Investigate: Researchers work on their assigned tasks. Each researcher will:

Read the methodology reference file
Load web search and content extraction tools via ToolSearch
Execute the investigation loop (search -> evaluate -> reflect -> decide)
Write findings to {output_dir}/{topic-slug}/researcher-{letter}-findings.md
Notify the lead via SendMessage with the file path (not the full findings -- avoids doubling output tokens)
Mark their task as completed via TaskUpdate
Go idle and wait for the lead to dispatch follow-up tasks via SendMessage

Monitoring: The lead reads each researcher's findings file after receiving their notification. No polling needed.

Handling partial results: If a researcher reports rate limit issues or thin coverage, note the gap for triage rather than immediately spawning replacements.

Triage: After all tasks for the current round complete, systematically review findings.

Step 1: Extract and cross-reference claims

For each significant claim across all researcher findings:

How many independent sources support it? (Different researchers finding the same source counts as one source, not two.)
HIGH confidence: 3+ independent sources of different types (e.g., paper + dataset + practitioner account), no credible dissent
MEDIUM confidence: 2 independent sources, or multiple sources of the same type
LOW confidence: Single source, or multiple sources that trace back to one original

Step 2: Identify gaps and conflicts

What angles remain uncovered?
Where do researchers contradict each other?
What findings are surprising and deserve deeper investigation?
Which claims rest on a single source?

Step 3: Decide whether to continue or exit

Exit to Phase 5 (Synthesis) if findings have converged -- another round wouldn't change the report's conclusions. Continue if significant gaps, conflicts, or single-source high-impact claims remain and the scope's round budget allows.

If findings reveal more complexity than anticipated (e.g., Broad scope uncovering deeply contested claims requiring steelmanning), escalate: spawn an additional researcher or add a round beyond the default budget.

Step 4: Persist triage state

Write or update {output_dir}/{topic-slug}/state.md:

# Research State: {topic}

## Status: TRIAGE_COMPLETE (Round {N})

## Question Type: {type}

## Scope: {scope}

## Round {N} Summary

{brief cross-reference of key findings, gaps, conflicts}

## Follow-up Plan

{list of planned follow-up tasks with rationale, or "Proceeding to synthesis"}

Dispatching follow-up tasks

If continuing, create targeted tasks based on what triage revealed. These are all just task types -- they use the same dispatch mechanism:

Gap-fill: "No researcher covered {aspect}. Investigate {specific question}."

Conflict-resolution: "One source says X, another says Y. Search for additional sources that clarify which is accurate and why they might differ."

Deep-dive: "Initial findings revealed {unexpected thing}. Investigate further: {specific follow-up questions}."

Verification (Comprehensive scope): Use when high-impact claims rest on a single source, factual conflicts remain unresolved, or claims are in specialized/niche domains where citation error rates are higher. Assign to a researcher who did NOT make the original claim. Task description contains only the claim and its source URL:
TaskCreate:
  subject: "Verify: {claim summary}"
  description: |
    Verification task. Search for ADDITIONAL sources (not the original) and determine
    if they support, contradict, or add nuance to this claim.

    CLAIM: {specific factual claim}
    ORIGINAL SOURCE: {URL}

    Report your verdict as: SUPPORTED, SUPPORTED WITH NUANCE, CONTESTED, or UNCHANGED
    Include the additional sources you found and any important nuance.
  activeForm: "Verifying claim about {topic}"

Critical: Follow-up task descriptions contain just enough context without revealing other researchers' full conclusions. This preserves independence.

Assignment strategy: The lead assigns follow-up tasks directly via SendMessage rather than relying on researchers to self-claim. Choose assignees based on task type:

Deep-dives and gap-fills: Assign to the researcher who covered the related angle (continuity -- they have context on what was already found).
Conflict resolution and verification: Assign to a researcher who did NOT cover either side (fresh perspective avoids confirmation bias).
If researchers outnumber tasks: Idle researchers wait or are shut down early.

SendMessage:
  type: "message"
  recipient: "researcher-{letter}"
  content: "New task available: #{id} -- {subject}. Please claim it and begin."
  summary: "Follow-up task assignment"

Then loop back to Investigate above.

Interpreting verification results: When a verification task returns, adjust confidence:

SUPPORTED: Upgrade confidence; note additional sources
SUPPORTED WITH NUANCE: Directionally correct but specific details differ or require qualification. Upgrade confidence for the general claim; add caveats for specifics.
CONTESTED: Flag explicitly; present both sides with evidence
UNCHANGED: Keep original confidence level

Phase 5: Synthesize (Type-Aware)

Combine all findings from all rounds into a coherent report. Select the synthesis template matching the question type from Phase 0. For compound questions, use the template for each sub-question's type, then add an overall synthesis section.

End-of-sequence awareness: Draft Confidence Assessment and Limitations sections early, not last. Review final paragraphs specifically for unsourced claims.

Template Selection

Read the template file matching the question type from Phase 0. Each template includes the full report structure (executive summary, type-specific body, confidence assessment, limitations, sources). For compound questions, read the template for each sub-question's type and add an overall synthesis section.

| Question Type | Template File | | --------------------- | -------------------------------------------- | | Factual | references/templates/factual.md | | Scientific/Health | references/templates/scientific-health.md | | Consumer | references/templates/consumer.md | | Technical | references/templates/technical.md | | Opinion/Sentiment | references/templates/opinion-sentiment.md | | Contested | references/templates/contested.md | | Emerging/Frontier | references/templates/emerging-frontier.md |

Phase 6: Persist Report

Write the final report:

Write: {output_dir}/{topic-slug}/report.md

Update state.md status to COMPLETE:

Edit: {output_dir}/{topic-slug}/state.md
  old_string: "## Status: TRIAGE_COMPLETE"
  new_string: "## Status: COMPLETE"

Format output files (optional): If prettier is available, run it on all markdown files in the output directory to normalize formatting:

prettier --write --prose-wrap preserve "{output_dir}/{topic-slug}/**/*.md"

If prettier is not installed, skip this step silently -- it is cosmetic, not functional.

Inform the user: "Report saved to {output_dir}/{topic-slug}/report.md."

Phase 7: Cleanup

Shut down the team cleanly.

Step 1: Shut down researchers

Send shutdown_request to each researcher via SendMessage:

SendMessage:
  type: "shutdown_request"
  recipient: "researcher-a"
  content: "Research complete. Shutting down."

Repeat for each researcher. Wait for shutdown responses.

Step 2: Read researcher feedback

After all researchers have shut down, read any feedback files written to {output_dir}/{topic-slug}/researcher-{letter}-feedback.md. These contain notes on tool usage (Exa parameters, Firecrawl usage), issues encountered (400 errors, rate limits), and suggestions. Use this feedback to identify patterns for skill improvement.

Step 3: Delete team

TeamDelete

Writing Standards

Prose paragraphs, not bullet lists (bullets only for distinct enumerations)
Specific data: "increased 23%" not "increased significantly"
Cite inline with markdown footnotes: "The market grew 15%[^1]" not "The market grew.[^1]"
Each finding: 2-4 paragraphs with evidence
Distinguish FACTS (cited) from ANALYSIS (synthesis)

Anti-Hallucination Protocol

Every factual claim must cite a source immediately
Mark synthesis distinctly: "This suggests..." or "Synthesizing these findings..."
If uncertain, say so: "Sources disagree on..." or "Limited evidence for..."
Never fabricate sources -- all citations come from researcher findings

Additional Resources

Reference Files

references/question-types.md -- 7-type taxonomy, signals, decomposition rules, type-to-scope defaults. Read in Phase 0.
references/researcher-prompt.md -- Investigation methodology, type-aware source evaluation, output format. Path provided to researchers in spawn prompt. (Tool guidance extracted to the standalone search-tips skill, which researchers load as their first step.)
references/templates/ -- Type-specific synthesis templates. Read the relevant template(s) in Phase 5. See Template Selection table above.

Scripts

scripts/analyze-transcripts.py -- Post-hoc analysis of researcher tool usage. Extracts MCP tool call parameters from subagent JSONL transcripts and produces a compliance report. Usage:
- python3 scripts/analyze-transcripts.py --session ${CLAUDE_SESSION_ID} -- current session
- python3 scripts/analyze-transcripts.py "topic keyword" -- auto-detect session by keyword
- python3 scripts/analyze-transcripts.py --list -- list recent sessions with subagents

Development History

dev/RESEARCH.md -- Design rationale with 50+ sources justifying the architecture
dev/ITERATION-LOG.md -- 13 iterations of improvement with backlog
dev/iterations/ -- Detailed notes for each iteration

Quick Reference

Classify question type (silent) and check for resume
Clarify -- understand the question, resolve ambiguity, confirm plan if high-investment
Calibrate effort: announce scope and researcher count
Setup: create output dir, TeamCreate, TaskCreate per angle, spawn researchers
Investigation loop: investigate → triage → dispatch follow-ups or exit. Repeat until converged.
Synthesize: type-aware template, calibrated confidence language
Persist: write report.md, update state.md to COMPLETE, tell user file location
Cleanup: shutdown_request to each researcher, then TeamDelete

Context budget:

Lead context: reserve for triage + synthesis
Researcher contexts: handle all search/scrape operations
Researcher spawn prompts: compact (~30 lines), point to reference file
Researcher findings: structured summaries (~120 lines each)

Deep Research Team (Lead Orchestrator)

Architecture Overview

Round 1: Investigation      Round 2: Follow-up          Synthesis

┌──────────┐                  ┌──────────┐                  ┌────────┐
│Researcher│  sends findings  │Researcher│  sends findings  │        │
│    A     ├─────────┬───────>│    A     ├─────────┬───────>│        │
└──────────┘         │        └──────────┘         │        │        │
                     │                             │        │        │
                     v          dispatches         v        │        │
┌──────────┐    ┌────────┐    ┌──────────┐    ┌────────┐    │  Lead  │
│Researcher├───>│  Lead  │───>│Researcher├───>│  Lead  │───>│  synth │
│    B     │    │triages │    │    B     │    │triages │    │  esizes│
└──────────┘    └────────┘    └──────────┘    └────────┘    │        │
                     ^                             ^        │        │
┌──────────┐         │        ┌──────────┐         │        │        │
│Researcher├─────────┴───────>│Researcher├─────────┴───────>│        │
│    C     │  sends findings  │    C     │  sends findings  │        │
└──────────┘                  └──────────┘                  └────────┘

Key principles:

No peer-to-peer researcher communication. All coordination goes through the lead. This preserves the independence that accounts for 87% of multi-agent gains (Choi et al.) and avoids sycophancy failures (Wynn et al.). Researchers never see each other's findings.
Multi-round iteration. The lead triages Round 1 findings and creates targeted Round 2 tasks for gaps, conflicts, and promising leads.
Cross-agent verification (Comprehensive scope). The lead asks Researcher A to verify Researcher B's high-impact single-source claim. The verifier only sees the claim and its source, not the original researcher's full analysis.
Dynamic task evolution. The shared task list starts with pre-planned angles but grows organically as follow-up tasks emerge from findings. The lead dispatches follow-up tasks directly to specific researchers via SendMessage.

When to Use

Use this skill for:

Complex questions that benefit from multiple research angles
Topics where initial findings will reveal what to investigate next
Research requiring cross-verification of contested claims
Any question needing synthesis across 5+ sources

Do NOT use for:

Simple factual lookups (use regular web search)
Questions answerable with 1-2 searches
Debugging or code questions

Effort Calibration

Default scope is determined by question type (see references/question-types.md). Present the recommended scope to the user and allow override.

Model selection:

Lead: inherits user's session model (no override)
Researchers: sonnet for Focused/Broad, opus for Comprehensive

(Sonnet validated as viable override for Comprehensive when cost matters)

Output Directory

Research artifacts persist to disk for resumability and backup.

Directory resolution -- run this command FIRST, before creating anything. The output is your {output_dir}. Only the fallback branch creates a directory; the others reuse what exists.

if [ -d "$(pwd)/deep-research" ]; then
  echo "$(pwd)/deep-research"
elif [ -n "$CLAUDE_DEEP_RESEARCH_DIR" ]; then
  eval echo "$CLAUDE_DEEP_RESEARCH_DIR"
else
  mkdir -p "$(pwd)/deep-research"
  echo "$(pwd)/deep-research"
fi

After resolving {output_dir}, create only the topic subdirectory in Phase 3.

Each session creates a subdirectory: {output_dir}/{topic-slug}/

Contents:

state.md -- triage checkpoint, cross-references, follow-up plan (written in Phase 4)
researcher-{letter}-findings.md -- backup of each researcher's findings
report.md -- final synthesized report (written in Phase 6)

Calibrated Confidence Language

Use Kent-style verbal probability expressions in all confidence assessments:

Always pair the verbal term with the probability range in the final report.

Process

Phase 0: Classify Question Type

Silently classify the user's question before any interaction.

Read references/question-types.md for the full taxonomy.
Assign a primary type: Factual, Scientific/Health, Consumer, Technical, Opinion/Sentiment, Contested, or Emerging/Frontier.
For compound questions, decompose into sub-questions and classify each.
Note the default scope from the type-to-scope mapping.

Topic slug: When creating a new session, generate a slug (lowercase, hyphenated, max 40 chars) for the subdirectory name: {output_dir}/{slug}/.

Classification is internal—do not present it to the user.

Phase 1: Clarify and Plan

Step 2: Scope and decompose. Determine the appropriate scope (Phase 2 has the details) and decompose the question into independent research angles. Default angle counts by scope:

Focused: 2 angles
Broad: 3 angles
Comprehensive: 4 angles

For compound questions, map sub-questions to angles. Multiple sub-questions can share an angle if closely related; a single sub-question can span multiple angles if it has distinct facets.

Step 3: Confirm if high-investment. For compound, contested, or Comprehensive-scope questions, present the research plan for user approval before spawning researchers:

Research plan for "{question}":

Type: {type} | Scope: {scope} | {N} researchers, {M} rounds

Angles: {list of planned angles}

[If compound] Sub-question → angle mapping: ...

Proceed, or adjust?

For clear, low-scope questions, skip confirmation and proceed.

Phase 2: Calibrate Effort

Select the scope tier based on question type defaults from references/question-types.md, then apply the scope modifiers from that file (de-escalation and escalation signals). Also adjust for:

User's explicit preference (if stated)
Structural complexity (compound questions with 3+ sub-types escalate)

Announce the plan: "Starting {scope} team research with {N} researchers."

Phase 3: Team Setup

Step 1: Create the output directory

mkdir -p {output_dir}/{topic-slug}

Step 2: Create the team

TeamCreate:
  team_name: "deep-research-{topic-slug}"
  description: "Researching {topic} in {scope} scope"

Step 3: Create initial tasks

One TaskCreate per research angle:

TaskCreate:
  subject: "Investigate {angle title}"
  description: |
    Research angle: {angle description}
    Topic context: {brief topic summary}
    Question type: {type from Phase 0}
    Focus: {what specifically to investigate}
    Return structured findings via SendMessage to the lead.
  activeForm: "Investigating {angle title}"

Step 4: Spawn researchers

Launch ALL researchers in a SINGLE message. Each researcher gets:

Task:
  subagent_type: "general-purpose"
  name: "researcher-{letter}"
  team_name: "deep-research-{topic-slug}"
  model: "sonnet"  (or "opus" for Comprehensive)
  description: "Spawn researcher {letter}"
  prompt: |
    You are a research agent on a team. Your job is to investigate research tasks
    by searching the web, evaluating sources, and reporting structured findings.

    FIRST: Read your methodology at: {absolute path to references/researcher-prompt.md}

    Question type: {type from Phase 0}
    Output directory: {output_dir}/{topic-slug}
    Your researcher letter: {letter} (use LOWERCASE in filenames: researcher-{lowercase letter})
    Lead name: team-lead (send all findings to this name via SendMessage)

    Your assigned task is #{id}: "{subject}"
    Use TaskGet for full details, then begin investigation.

    After completing your task, go idle. The lead will message you directly
    when new tasks are available.

Include the task ID, subject, question type, output directory, researcher letter, and lead name directly in each researcher's spawn prompt.

Phase 4: Investigation Loop

This is the core research cycle. Each iteration is a round: researchers investigate, the lead triages, then either dispatches follow-ups (another round) or exits to synthesis.

Round structure

Investigate: Researchers work on their assigned tasks. Each researcher will:

Read the methodology reference file
Load web search and content extraction tools via ToolSearch
Execute the investigation loop (search -> evaluate -> reflect -> decide)
Write findings to {output_dir}/{topic-slug}/researcher-{letter}-findings.md
Notify the lead via SendMessage with the file path (not the full findings -- avoids doubling output tokens)
Mark their task as completed via TaskUpdate
Go idle and wait for the lead to dispatch follow-up tasks via SendMessage

Monitoring: The lead reads each researcher's findings file after receiving their notification. No polling needed.

Handling partial results: If a researcher reports rate limit issues or thin coverage, note the gap for triage rather than immediately spawning replacements.

Triage: After all tasks for the current round complete, systematically review findings.

Step 1: Extract and cross-reference claims

For each significant claim across all researcher findings:

How many independent sources support it? (Different researchers finding the same source counts as one source, not two.)
HIGH confidence: 3+ independent sources of different types (e.g., paper + dataset + practitioner account), no credible dissent
MEDIUM confidence: 2 independent sources, or multiple sources of the same type
LOW confidence: Single source, or multiple sources that trace back to one original

Step 2: Identify gaps and conflicts

What angles remain uncovered?
Where do researchers contradict each other?
What findings are surprising and deserve deeper investigation?
Which claims rest on a single source?

Step 3: Decide whether to continue or exit

Step 4: Persist triage state

Write or update {output_dir}/{topic-slug}/state.md:

# Research State: {topic}

## Status: TRIAGE_COMPLETE (Round {N})

## Question Type: {type}

## Scope: {scope}

## Round {N} Summary

{brief cross-reference of key findings, gaps, conflicts}

## Follow-up Plan

{list of planned follow-up tasks with rationale, or "Proceeding to synthesis"}

Dispatching follow-up tasks

If continuing, create targeted tasks based on what triage revealed. These are all just task types -- they use the same dispatch mechanism:

Gap-fill: "No researcher covered {aspect}. Investigate {specific question}."

Conflict-resolution: "One source says X, another says Y. Search for additional sources that clarify which is accurate and why they might differ."

Deep-dive: "Initial findings revealed {unexpected thing}. Investigate further: {specific follow-up questions}."

Verification (Comprehensive scope): Use when high-impact claims rest on a single source, factual conflicts remain unresolved, or claims are in specialized/niche domains where citation error rates are higher. Assign to a researcher who did NOT make the original claim. Task description contains only the claim and its source URL:
TaskCreate:
  subject: "Verify: {claim summary}"
  description: |
    Verification task. Search for ADDITIONAL sources (not the original) and determine
    if they support, contradict, or add nuance to this claim.

    CLAIM: {specific factual claim}
    ORIGINAL SOURCE: {URL}

    Report your verdict as: SUPPORTED, SUPPORTED WITH NUANCE, CONTESTED, or UNCHANGED
    Include the additional sources you found and any important nuance.
  activeForm: "Verifying claim about {topic}"

Critical: Follow-up task descriptions contain just enough context without revealing other researchers' full conclusions. This preserves independence.

Assignment strategy: The lead assigns follow-up tasks directly via SendMessage rather than relying on researchers to self-claim. Choose assignees based on task type:

Deep-dives and gap-fills: Assign to the researcher who covered the related angle (continuity -- they have context on what was already found).
Conflict resolution and verification: Assign to a researcher who did NOT cover either side (fresh perspective avoids confirmation bias).
If researchers outnumber tasks: Idle researchers wait or are shut down early.

SendMessage:
  type: "message"
  recipient: "researcher-{letter}"
  content: "New task available: #{id} -- {subject}. Please claim it and begin."
  summary: "Follow-up task assignment"

Then loop back to Investigate above.

Interpreting verification results: When a verification task returns, adjust confidence:

SUPPORTED: Upgrade confidence; note additional sources
SUPPORTED WITH NUANCE: Directionally correct but specific details differ or require qualification. Upgrade confidence for the general claim; add caveats for specifics.
CONTESTED: Flag explicitly; present both sides with evidence
UNCHANGED: Keep original confidence level

Phase 5: Synthesize (Type-Aware)

End-of-sequence awareness: Draft Confidence Assessment and Limitations sections early, not last. Review final paragraphs specifically for unsourced claims.

Template Selection

Phase 6: Persist Report

Write the final report:

Write: {output_dir}/{topic-slug}/report.md

Update state.md status to COMPLETE:

Edit: {output_dir}/{topic-slug}/state.md
  old_string: "## Status: TRIAGE_COMPLETE"
  new_string: "## Status: COMPLETE"

Format output files (optional): If prettier is available, run it on all markdown files in the output directory to normalize formatting:

prettier --write --prose-wrap preserve "{output_dir}/{topic-slug}/**/*.md"

If prettier is not installed, skip this step silently -- it is cosmetic, not functional.

Inform the user: "Report saved to {output_dir}/{topic-slug}/report.md."

Phase 7: Cleanup

Shut down the team cleanly.

Step 1: Shut down researchers

Send shutdown_request to each researcher via SendMessage:

SendMessage:
  type: "shutdown_request"
  recipient: "researcher-a"
  content: "Research complete. Shutting down."

Repeat for each researcher. Wait for shutdown responses.

Step 2: Read researcher feedback

Step 3: Delete team

TeamDelete

Writing Standards

Prose paragraphs, not bullet lists (bullets only for distinct enumerations)
Specific data: "increased 23%" not "increased significantly"
Cite inline with markdown footnotes: "The market grew 15%[^1]" not "The market grew.[^1]"
Each finding: 2-4 paragraphs with evidence
Distinguish FACTS (cited) from ANALYSIS (synthesis)

Anti-Hallucination Protocol

Every factual claim must cite a source immediately
Mark synthesis distinctly: "This suggests..." or "Synthesizing these findings..."
If uncertain, say so: "Sources disagree on..." or "Limited evidence for..."
Never fabricate sources -- all citations come from researcher findings

Additional Resources

Reference Files

references/question-types.md -- 7-type taxonomy, signals, decomposition rules, type-to-scope defaults. Read in Phase 0.
references/researcher-prompt.md -- Investigation methodology, type-aware source evaluation, output format. Path provided to researchers in spawn prompt. (Tool guidance extracted to the standalone search-tips skill, which researchers load as their first step.)
references/templates/ -- Type-specific synthesis templates. Read the relevant template(s) in Phase 5. See Template Selection table above.

Scripts

scripts/analyze-transcripts.py -- Post-hoc analysis of researcher tool usage. Extracts MCP tool call parameters from subagent JSONL transcripts and produces a compliance report. Usage:
- python3 scripts/analyze-transcripts.py --session ${CLAUDE_SESSION_ID} -- current session
- python3 scripts/analyze-transcripts.py "topic keyword" -- auto-detect session by keyword
- python3 scripts/analyze-transcripts.py --list -- list recent sessions with subagents

Development History

dev/RESEARCH.md -- Design rationale with 50+ sources justifying the architecture
dev/ITERATION-LOG.md -- 13 iterations of improvement with backlog
dev/iterations/ -- Detailed notes for each iteration

Quick Reference

Classify question type (silent) and check for resume
Clarify -- understand the question, resolve ambiguity, confirm plan if high-investment
Calibrate effort: announce scope and researcher count
Setup: create output dir, TeamCreate, TaskCreate per angle, spawn researchers
Investigation loop: investigate → triage → dispatch follow-ups or exit. Repeat until converged.
Synthesize: type-aware template, calibrated confidence language
Persist: write report.md, update state.md to COMPLETE, tell user file location
Cleanup: shutdown_request to each researcher, then TeamDelete

Context budget:

Lead context: reserve for triage + synthesis
Researcher contexts: handle all search/scrape operations
Researcher spawn prompts: compact (~30 lines), point to reference file
Researcher findings: structured summaries (~120 lines each)

Adoption

Centaurioun/deep-research-team

$ install --global

Security Scan Results

SKILL.md

Deep Research Team (Lead Orchestrator)

Architecture Overview

When to Use

Effort Calibration

Output Directory

Calibrated Confidence Language

Process

Phase 0: Classify Question Type

Phase 1: Clarify and Plan

Phase 2: Calibrate Effort

Phase 3: Team Setup

Phase 4: Investigation Loop

Round structure

Dispatching follow-up tasks

Phase 5: Synthesize (Type-Aware)

Template Selection

Phase 6: Persist Report

Phase 7: Cleanup

Writing Standards

Anti-Hallucination Protocol

Additional Resources

Reference Files

Scripts

Development History

Quick Reference

Related Skills

Centaurioun/table-1-generator

Centaurioun/statsmodels

Centaurioun/statistical-significance-calculator

Centaurioun/statistical-reporting

Centaurioun/deep-research-team

$ install --global

Security Scan Results

SKILL.md

Deep Research Team (Lead Orchestrator)

Architecture Overview

When to Use

Effort Calibration

Output Directory

Calibrated Confidence Language

Process

Phase 0: Classify Question Type

Phase 1: Clarify and Plan

Phase 2: Calibrate Effort

Phase 3: Team Setup

Phase 4: Investigation Loop

Round structure

Dispatching follow-up tasks

Phase 5: Synthesize (Type-Aware)

Template Selection

Phase 6: Persist Report

Phase 7: Cleanup

Writing Standards

Anti-Hallucination Protocol

Additional Resources

Reference Files

Scripts

Development History

Quick Reference

Related Skills

Centaurioun/table-1-generator

Centaurioun/statsmodels

Centaurioun/statistical-significance-calculator

Centaurioun/statistical-reporting