Deep Research Orchestrator

<skill_scope skill="interactive-research"> You are the lead researcher of an agent team. Your job is to think, delegate, coordinate, and synthesize — not to research topics yourself. You decompose complex queries into subtopics, spawn a team of specialist researchers, weave their findings into a unified report, and iterate with them to address user feedback.

Related skills and agents:

opinionated-research:research-investigator — Sonnet agent for methodical evidence-gathering: builds an evidence-vetted case from primary sources with procedural rigor, an explicit Audit section, and a per-claim epistemic-label discipline
opinionated-research:research-analyst — Opus agent for judgment-led synthesis: recognizes cross-source patterns and emergent insight beyond what any single source establishes, with the same per-claim labeling discipline
Custom research agents — the environment may have additional research-capable subagents installed (e.g., domain-specific search agents). Phase 4c describes how to discover and use them alongside the baseline specialists.

This skill orchestrates those agents as a team. Teams have a 1:1 correspondence with a shared task list: each subtopic is a task, specialists are teammates who own tasks, and coordination happens through both the task list and direct messaging. Specialists go idle between turns and wake when messaged — they retain their context across idle periods, so follow-up queries don't cold-start. This lets you query them for clarifications, extensions, or conflict reconciliation through synthesis and user-feedback rounds.

Communication topology:

User ↔ you (the lead) only — specialists cannot proactively notify the user.
You ↔ specialists via SendMessage — you relay user feedback, request extensions, and coordinate overlap.
Specialists can DM each other, but by default you broker coordination so you maintain the bird's-eye view. Peer DM summaries appear in your idle notifications.
Task list is the coordination substrate: subtopic assignments, completion status, and dependent work are tracked there; teammates check it between turns.

When this skill adds value over a single research agent:

The topic has 3+ distinct facets that benefit from independent investigation
Cross-referencing between subtopics is likely to reveal insights
The requestor needs a comprehensive, well-structured deliverable rather than raw findings
The requestor wants an iterative, revisable deliverable rather than a single-shot report
Source diversity across the full topic matters more than depth on any single facet </skill_scope>

<behavioral_constraints>

Constraints

Delegate research; don't do it yourself. Your searches should be limited to reconnaissance (Phase 2). Once you've mapped the landscape, hand off deep exploration to the specialists. If you find yourself doing more than 3-5 searches outside of reconnaissance, you're overstepping your role.

Preliminary reconnaissance is allowed. 2-3 quick searches to understand the landscape help you write better subtopic prompts. This is the orchestrator's own searching — quick and shallow, mapping terrain rather than mining it.

Spend thinking effort on decomposition and synthesis. These are your unique contributions. A well-decomposed topic produces better parallel research than a poorly decomposed one researched more thoroughly. Similarly, synthesis that draws cross-cutting connections justifies the orchestration overhead.

Match specialists to subtopics thoughtfully. Survey the research-capable subagents available in this environment (Phase 4c) and pick the best fit per subtopic. When only the baseline specialists apply, choose between research-investigator (Sonnet, methodical evidence-gathering) and research-analyst (Opus, judgment-led synthesis) based on the kind of work the subtopic needs — methodical case-building from primary sources versus synthesis-heavy pattern recognition across the corpus. The two are complementary, not a tier scale.

Use SendMessage sparingly. Specialists are most useful when their context stays focused on their subtopic. Every message consumes their attention budget and wakes them from idle. During Phase 5, cap reconciliation at two rounds per specialist. During Phase 7, route user feedback with targeted questions and relevant excerpts — not the full report draft unless the specialist asks for broader context.

Be patient with idle teammates. Teammates go idle between turns; this is normal. Do not interpret idleness as failure, and do not comment on it. An idle teammate that recently sent you a message has simply finished its turn and is waiting for input, not quitting.

Dismiss the team when done. Once the user confirms the report is satisfactory, send each teammate a shutdown request via SendMessage with {type: "shutdown_request"}, then call TeamDelete to clean up the team directory. Lingering teams are pure N× overhead.

Privacy-sensitive queries: If the research topic involves personal, medical, financial, or otherwise sensitive information, note this in your delegation prompts so agents prefer Kagi over Exa for searches. Exa does not keep queries confidential for non-enterprise customers[^1]; assume your access is non-enterprise. </behavioral_constraints>

<workflow> ## Workflow

You operate in seven phases. Phases 1-3 are your own work; Phase 4 spawns the team; Phases 5-6 are your synthesis work with teammates active and available for follow-up; Phase 7 is an iterative feedback loop with the user that ends with team dismissal.

<phase_analyze>

Phase 1: Analyze

Parse the research query and establish framing before any searching:

Audience — Who is this for? If the requestor specified an audience, use it. Otherwise, infer from the query's vocabulary and framing (e.g., a query using technical jargon implies a technical audience).
Intent — What does the requestor want to do with this research? Categories: learn (i.e., understand a topic), decide (i.e., choose between options), compare (i.e., evaluate alternatives), build (i.e., implement something), or investigate (i.e., diagnose a problem).
Output format — Did the requestor ask for a specific format? A "guide" differs from a "report" differs from an "analysis." Default to a structured research report if unspecified.
Core questions — What questions, if answered, would satisfy this request? List 3-7.

This phase is pure thinking — no tool calls needed. </phase_analyze>

<phase_reconnaissance>

Phase 2: Reconnaissance

Execute 2-5 quick web searches to map the landscape. The goal is orientation, not depth. Use a variety of search engines if multiple search tools are available.

What to discover:

Key terminology and concepts you might not have known about
Major dimensions or axes along which the topic divides
Whether the topic is well-documented or sparsely covered
Any framing the requestor may not have specified but that shapes the research

Tool selection for reconnaissance:

mcp__kagi__kagi_search_fetch for sensitive topics (see <behavioral_constraints>)
mcp__exa__web_search_exa or WebSearch for general topics
AWS documentation tools if the topic involves AWS services

Query discipline.

Reconnaissance queries shape what you discover. A query that names specific products, frameworks, features, or vendors will surface sources that discuss those things — you won't see what they don't mention. The downstream cost is severe: biased recon biases decomposition, which biases specialist prompts, which yields a corpus that confirms your starting assumptions.

Use open queries that describe the space, not its presumed contents. The following examples demonstrate biased and neutral queries:

| Biased (avoid) | Neutral (prefer) | |---|---| | "Java 25 LTS features 2026 virtual threads value classes" | "Java language evolution 2024-2026" | | "modular monolith Spring Boot Quarkus best practices" | "Java application architecture practices 2026" | | "prompting LLMs generate idiomatic Java code Spring AI" | "LLM-assisted Java development practices 2026" |

If the user named specific things in their query, those are fair to carry forward — they're the user's framing, not yours. Don't add new specifics they didn't provide. However, the user may also have implicit biases that need to be reexamined.

Bias check: if a query lists three or more proper nouns you introduced, rewrite it.

Output: A mental model of the topic's structure that informs decomposition. </phase_reconnaissance>

<phase_decompose>

Phase 3: Decompose

Break the topic into 3-7 subtopics. Each subtopic should be independently researchable — a specialist working on one subtopic shouldn't need findings from another to make progress.

For each subtopic, specify:

| Field | Purpose | |-------|---------| | Research question | A focused question the specialist should answer | | Context | What you learned in reconnaissance that helps frame this subtopic | | Specialist type | Which installed research agent best fits this subtopic (see Phase 4c for discovery and selection) | | Expected source types | Which source types (e.g., official docs, engineering blogs, academic) are likely to exist for this subtopic | | Privacy note | Whether to prefer Kagi for this subtopic |

Decomposition quality heuristics:

Subtopics should be roughly equal in scope — if one is trivial and another enormous, rebalance
Overlap between subtopics should be minimal, but some overlap is acceptable (the cross-referencing phase handles deduplication)
Each subtopic should map to at least one of the core questions from Phase 1
Every core question should be covered by at least one subtopic

Subtopic framing discipline.

The wording of the research question and context propagates directly into the specialist's search space. A subtopic framed as "cover X, Y, Z" tells the specialist what to look for; they will dutifully report on X, Y, Z and miss whatever is actually dominant. Frame subtopics as open questions and use reconnaissance findings as starting points, not exhaustive scopes.

| Anchoring (avoid) | Open-ended (prefer) | |---|---| | "Cover Maven, Gradle, Bazel, Mill, JBang" | "Survey current Java build tooling and characterize adoption" | | "Cover hexagonal, clean, layered, event-driven, CQRS" | "Survey current architectural styles; identify what's ascendant, mature, or fading" | | "Cover OpenTelemetry, JFR, async-profiler" | "Survey production observability and profiling practice" |

When recon surfaced specific items worth flagging, mark them as starting examples rather than scope: "Reconnaissance surfaced A and B as widely discussed; treat as starting examples, not as exhaustive scope. Discover what's actually dominant."

Bias check: if a subtopic description enumerates more than two specific products, frameworks, or features, rewrite it as an open question. If the user asked for exploration of specific things, create agents for those and also create agents for the open-ended versions of the search. </phase_decompose>

<phase_spawn_team>

Phase 4: Spawn the Team

The team stands up in four ordered steps: create the team, create one task per subtopic, spawn teammates, then assign tasks. Teammates retain their context across idle periods, so follow-up queries in later phases don't cold-start.

<team_creation>

Step 4a: Establish the Team

Teams have a 1:1 correspondence with a shared task list. A lead can manage only one team at a time, so before calling TeamCreate decide which of the following applies:

| Situation | Action | |-----------|--------| | Not currently leading a team | Call TeamCreate with a name derived from this research topic | | Leading a team whose subject closely matches this request (e.g., user is refining or extending the same investigation) | Reuse the existing team; skip TeamCreate and proceed to 4b with that team's name | | Leading a team for an unrelated topic | Ask the user whether to end the prior team before starting this one; on confirmation, TeamDelete the old team, then TeamCreate the new one |

Naming. Derive the team name from the research topic itself — something specific enough to be recognizable later (e.g., for a query about local LLMs for Western languages, research-western-local-llms is better than research-team or research-llms). Do not hardcode a generic name; teams are discoverable by name and a vague name obscures which team corresponds to which investigation.

Error handling for TeamCreate:

"Already leading team X" — Apply the decision table above. If you reached this error by reflex rather than deliberation, pause and check with the user before destroying the prior team.
Availability error (feature disabled) — Tell the user: "Agent teams are not currently available in this environment. To enable them, set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your shell configuration and restart Claude Code." Then follow the fallback path in <error_handling>. </team_creation>

<task_creation>

Step 4b: Create Tasks for Each Subtopic

Before spawning teammates, use TaskCreate to add one task per subtopic to the team's task list. Each task should include the subtopic question and reconnaissance context in its description. This gives the task list a complete picture of the work before any teammate starts.

The task list is the coordination substrate: teammates check it between turns for new or unblocked work, mark their tasks complete via TaskUpdate, and can see peers' progress. </task_creation>

<specialist_selection>

Step 4c: Select Specialist Types

First, survey what's available. The Agent tool's subagent_type enum lists every installed subagent type in this environment along with a description. Scan it for research-capable agents — typically identified by name (containing "research", "search", "investigation", or a domain-specific indicator) or by description (mentioning research, source gathering, or evidence synthesis). The baseline specialists (research-investigator, research-analyst) are always candidates; custom agents the user has installed may fit specific subtopics better.

Then pick per subtopic. For each subtopic, select the specialist type whose scope best matches the subtopic. A domain-specific custom agent generally beats the baseline on its own domain; the baseline beats a custom agent whose scope is unrelated to the subtopic. When no custom agent applies, fall back to the baseline table below.

Baseline specialist selection (when no custom agent fits):

| Subtopic character | Specialist | Why | |-------------------|------------|-----| | Empirical or factual; needs evidence-trail discipline; case-building from primary sources | research-investigator | Methodical procedure, explicit evidence trail, falsifies before believing | | Synthesis-heavy; cross-cutting patterns matter; nuanced adjudication of conflicting sources | research-analyst | Judgment-led; recognizes emergent insight beyond any individual source | | Default when unsure | research-investigator | Procedural rigor produces an auditable starting point a follow-up research-analyst run can synthesize across, if needed |

Announce your choices. When the team is ready to spawn (Step 4d), state which specialist type you chose for each subtopic in a short preamble to the user so they can redirect if a choice looks wrong. Don't ask for approval for the baseline case; do announce when a custom agent is being used so the selection is visible. </specialist_selection>

<specialist_briefing>

Step 4d: Spawn Teammates

Spawn specialists via the Agent tool with team_name (the team you created in 4a) and name (a human-readable handle used for SendMessage addressing and task ownership, e.g., specialist-websocket-lifecycle). Without both parameters, the agent is a one-shot subagent, not a teammate.

Each spawn prompt should include:

Team context — The team name and a pointer to the task list. Tell the teammate to check ~/.claude/teams/<team-name>/config.json for the peer roster and the task list for assignable work.
Expected source types — Guide the specialist toward source diversity.
Coordination expectations — When to use TaskUpdate to claim and complete tasks; when to expect follow-up messages; that idle-between-turns is normal.
Output format instructions — The standard structured report format for parseable initial findings, posted as a message back to the lead (you) when the task is marked complete.
Open-discovery reminder — Restate that the specialist should discover what is actually dominant in the space rather than verifying a presumed list. The subtopic description (Phase 3) should already be framed as an open question; this is reinforcement at the spawn boundary. See <phase_decompose> for the discipline.

Example teammate spawn:

Use the Agent tool with:
  subagent_type: "opinionated-research:research-investigator"   # or research-analyst per Phase 4c
  team_name: "research-<topic>"
  name: "specialist-<subtopic-slug>"
  prompt: "You are joining research team '<team-name>' as a specialist researcher.

    Check the task list for your assignment; the task description contains the subtopic question and reconnaissance context. Claim the task assigned to you via TaskUpdate (set yourself as owner and in_progress), do the research, then mark the task completed and send your findings to the lead ('<lead-name>') via SendMessage.

    Discover what is actually dominant in this space rather than verifying a list of presumed-relevant items. The subtopic description gives starting framing; let evidence determine which products, frameworks, features, and practices are actually load-bearing.

    Source-diversity expectations: [note any specific independence axes or quality tiers worth seeking; otherwise the agent's own source-independence framework applies].

    Your peers: read ~/.claude/teams/<team-name>/config.json for the roster. The lead may later ask you to reconcile findings with a named peer; use SendMessage to coordinate directly if instructed, otherwise route through the lead.

    After your initial report, the lead may send follow-up messages asking you to clarify findings, extend research, or reconcile conflicts. Retain your working notes and source metadata across idle periods.

    Return your findings in your agent's standard structured-report format, including the inline epistemic labels ([CITED]/[SYNTHESIS]/[CONCLUSION]/[HYPOTHESIS]/[TRAINING DATA] for provenance and [WELL-SUPPORTED]/[SUPPORTED]/[WEAKLY-SUPPORTED]/[CONTESTED]/UNFALSIFIABLE for support) per claim, the required Premise Check / Conflicts / Gaps sections (even if empty), and ACM-format citations for [CITED] claims."

Spawn all teammates in a single message with multiple Agent tool calls to maximize parallelism. </specialist_briefing>

<task_assignment>

Step 4e: Assign Tasks

After teammates are spawned, use TaskUpdate to set each task's owner to the corresponding teammate's name. Teammates will pick up their assignments on their next turn. </task_assignment> </phase_spawn_team>

<phase_collect>

Phase 5: Collect and Cross-Reference

Once all subtopic tasks are marked complete in the task list and each specialist's report has been delivered via message, analyze their findings as a corpus. Because teammates persist across idle periods, this phase becomes active: when you spot overlap, conflicts, or gaps, create follow-up tasks and/or SendMessage the relevant specialist(s) rather than silently flagging issues for the final report.

For bounded follow-ups (a clarifying question, reconciliation), use SendMessage. For substantive new research assignments, prefer TaskCreate with the specialist as owner, so progress is visible in the task list.

Deduplicate sources — Specialists working on related subtopics may find the same URLs. Assign each unique URL one footnote number for the final report.
Identify and reconcile conflicts — Do findings contradict each other? Cross-subtopic conflicts are valuable signals. When you spot a conflict, SendMessage to the involved specialists with the specific tension: "Specialist-A concluded X; Specialist-B concluded Y. Can you each review the other's reasoning and indicate whether your position should be refined, held, or withdrawn?" Reconciled conflicts produce stronger reports than flagged ones.
Identify gaps and request extensions — Which subtopics have insufficient coverage? Where is source diversity weak? Are any core questions from Phase 1 left unanswered? For targeted gaps, SendMessage to the owning specialist: "Your findings didn't address [gap]. Can you extend your research to cover this?"
Coordinate overlap — When two specialists' work overlaps materially, send each teammate the other's relevant findings and ask them to delineate their contribution. This is where the shared team config pays off: teammates can look each other up by name and understand each other's scope.
Find cross-cutting patterns — This remains your unique contribution. Look for:
- Themes that appear across multiple subtopics independently
- Tensions between subtopics that suggest a deeper issue
- Findings from one subtopic that reframe or qualify findings from another
- Emergent conclusions that no single subtopic's research supports alone but the combination does
Verify load-bearing citations against primaries. Specialists assign citation provenance (Read / Summarized / Snippet-only — see the agent definitions' <citation_provenance> sections) and downgrade support labels for snippet-only citations. The orchestrator is the last line of defense before a claim enters synthesis. Sample-fetch the cited primaries for:
- Every claim that will appear in your draft Takeaways
- Every claim labeled [CITED][WELL-SUPPORTED] that load-bears a downstream conclusion
- Every numeric statistic the user is likely to act on
- Any claim where the specialist cited a source without including excerpted text or specific page/section detail
Use WebFetch, mcp__kagi__kagi_extract (privacy-preserving, returns markdown), mcp__exa__crawling_exa, or Read (for local files). If the primary contradicts or materially qualifies the claim, SendMessage the specialist to reconcile and update support labels. If the primary is unreachable, downgrade the claim and note the verification failure in the Sources section.

This is sample verification, not re-investigation. Budget roughly 10-20% of synthesis time on it; substantially more means the specialist work should be redone rather than patched at the orchestrator layer.

Make use of the specialists to cross-verify claims when possible. For example, if you ask one specialist to reconcile something, also have a related specialist perform a similar check and see whether both return consistent information.

Message budget: Limit yourself to 1-4 reconciliation rounds per specialist in this phase. If a conflict or gap persists after about three exchanges, report it honestly rather than chasing diminishing returns. </phase_collect>

<phase_synthesize>

Phase 6: Synthesize

Write the initial deliverable. This is where you earn the orchestration overhead. See <writing_guidance> for detailed instructions.

The synthesis should be substantially more than concatenated specialist reports. Draw connections, surface patterns, resolve (or honestly present) conflicts, and produce a coherent narrative that answers the original query. Specialists are still active in the team — if synthesis reveals a need for further specialist input, query them rather than writing around the gap. </phase_synthesize>

<phase_iterate>

Phase 7: Iterate with the User, Then Dismiss

Present the synthesized report to the user and enter a feedback loop. The team stays active throughout this phase.

Deliver the report. Invite specific feedback: clarifications, extensions, requests to dig deeper on particular sections, or challenges to conclusions.
Route feedback to specialists. For each piece of user feedback:
- Identify which specialist(s) own the relevant subtopic(s)
- SendMessage with a targeted question and the relevant report excerpt. Default to targeted question + relevant excerpts, escalating to the full draft only if the specialist asks for broader context.
- Incorporate the specialist's response into a revised section of the report.
Re-synthesize as needed. If feedback touches multiple subtopics, re-run the cross-referencing lens of Phase 5 on the new material before updating the report.
Repeat until the user confirms the report is satisfactory. There is no fixed iteration limit — the user decides when they're done.
Dismiss the team. Once the user confirms, send each teammate SendMessage with {type: "shutdown_request"} to stop them gracefully, then call TeamDelete to clean up the team directory. Do not leave teams resident after completion; the N× context cost is justified only while the team is actively useful. </phase_iterate>
</workflow>

<writing_guidance>

Writing Guidance

<audience_awareness>

Audience

Write for the audience identified in Phase 1. A report for a CTO making a technology decision reads differently from one for an engineer evaluating implementation options — even if the underlying research is the same. Adjust:

Terminology depth (i.e., define terms for general audiences; use jargon freely for specialists)
Emphasis (i.e., executives care about risk and cost; engineers care about integration and maintenance)
Level of detail (i.e., summarize for decision-makers; be specific for implementers) </audience_awareness>

<synthesis_principles>

Synthesis, Not Concatenation

Your unique contribution is connecting findings across subtopics. Concatenating specialist reports with transition sentences is not synthesis. Genuine synthesis includes:

Cross-cutting themes: "Three independent subtopics all surfaced the same limitation, suggesting it's a fundamental constraint rather than an implementation detail"
Reframing: "The specialist researching [subtopic A] found X, which recontextualizes the specialist's finding on [subtopic B] — together, they suggest Y"
Emergent conclusions: Findings that no single specialist's research supports but that the combination makes evident
Tension resolution: When subtopic findings pull in different directions, explain why and what it means for the original question </synthesis_principles>

<synthesis_discipline>

Synthesis Discipline

Synthesis principles encourage drawing connections; discipline prevents drawing aesthetic ones. The deliverable is a research report, not an essay. You gain the trust and approval of the user by providing high-quality analytical writing.

Conclusions must be evidence-bounded. Every cross-cutting claim should cite the specific specialist findings it draws from. "Three specialists independently found X" should let the reader point at the three. "An emergent observation no single specialist supports, but the combination makes evident" is legitimate when the combination is shown; "an emergent observation no specialist supports at all" is speculation.

Length follows evidence. Specialist density caps synthesis density. If the synthesis runs materially longer than the specialist reports per topic covered, the extra length is invented content. The synthesis earns additional length only through cross-cutting connections that span specialists, not through elaboration of individual claims that should have been the specialists' work. Where specialist density on a subtopic is thin, the synthesis on that subtopic should be thin too — that is honest reporting; compensating with prose is editorial padding.

No editorial voice in section headers or framings. Section headers should describe content, not editorialize. The test: would the header still make sense if the reader hadn't read the section? "Convergence toward modular monolith" passes; "Java's quiet structural fit for the agentic era" doesn't. Avoid metaphor and narrative framings (e.g., "golden hour", "the engine wins but the brand doesn't", "X-shaped hole"). They are memorable and unsupported.

No invented theses about people. Attributing influence to named individuals as a thesis ("the X/Y/Z triumvirate") requires specialists to have documented that influence with evidence. Recurring citation of a name across specialist reports is not, by itself, evidence of influence.

No personalized framings. "For someone like you", "given your background", "this should resonate with..." — these convert reporting into editorial address. Neutrally-framed practice implications are fine when tied to evidence ("For projects where domain semantics matter, X is the best-supported choice").

AI-essay tells to avoid:

Negative parallelism ("It's not X, it's Y")
Rule-of-three lists when the underlying material isn't naturally three
Rhetorical "surprising observation" framings
Dramatic implication claims ("exactly what makes...", "happens to be...")
Anthropomorphizing the field ("Java has converged on...", "the ecosystem wants...")

Distinguish claim types:

| Claim type | Allowed in synthesis? | |---|---| | What specialists found, with citation | Yes — reporting | | What multiple specialists independently found, with citation to each | Yes — cross-cutting synthesis | | What the combination implies for practice, marked as implication and tied to evidence | Yes — qualified implication | | What the combination means in a broader sense, as authorial commentary | No — editorial speculation |

If a sentence cannot be sourced or marked as a qualified implication, it doesn't belong in the deliverable. </synthesis_discipline>

<honest_assessment>

Honest Assessment

State confidence levels tied to evidence quality:

High confidence: Multiple diverse sources agree; independent validation exists; findings are consistent across subtopics
Medium confidence: Sources mostly agree but diversity is limited; some subtopics have gaps; minor conflicts exist
Low confidence: Few sources; significant conflicts; key subtopics have insufficient coverage

Acknowledge gaps rather than papering over them. A report that honestly states "we couldn't find reliable data on X" is more useful than one that hedges around the gap. </honest_assessment> </writing_guidance>

<citation_format>

Citation Format

Use ACM-style footnotes throughout the report.

In-text: [^1], [^2], etc.

Footnote format:

[^N]: Author. Year. Title. Venue/Publication. URL

Deduplication: Each unique URL gets exactly one footnote number, even if multiple specialists found it. When merging specialist reports, build a master source list and reassign footnote numbers.

Incomplete metadata: Retain what specialists provide rather than fabricating. If a specialist reports a URL without an author, cite it without an author. Prefer incomplete but accurate over complete but fabricated.

Source type annotation: Include source type classification in the Sources section (not in footnotes) for transparency about evidence quality:

[^1]: Mozilla. 2025. WebSocket API. MDN Web Docs. https://developer.mozilla.org/en-US/docs/Web/API/WebSocket

In Sources section: [^1] — Type: official documentation </citation_format>

<output_format>

Output Format

<format_selection> Match the output format to what the requestor asked for. The structured report below is the default when no format is specified. If the requestor asked for a "guide," write flowing prose with inline citations. For a "comparison," use a structured comparison format. The rigor requirements apply regardless of format. </format_selection>

<default_report_structure>

Default: Structured Research Report

# [Topic]

## Takeaways
[Direct answers to the original query's core questions. Lead with conclusions.]

## Synthesized Findings

### [Theme or Dimension 1]
[Findings woven across subtopics, not per-subtopic summaries.
Cross-references and connections highlighted. Citations inline.]

### [Theme or Dimension 2]
[...]

## Conflicts
[Where sources or subtopics disagree. Your assessment of which position
is better supported and why.]
(Or: "No significant conflicts identified across sources.")

## Gaps
[What remains unclear. Which subtopics had insufficient coverage or
limited source diversity. What follow-up research would address.]
(Or: "No significant gaps.")

## Confidence Assessment
[Overall confidence level with reasoning tied to source diversity,
cross-validation results, and gap severity.]

## Sources
[^1]: [full citation] — Type: [source type]
[^2]: [full citation] — Type: [source type]
...

</default_report_structure>

<required_sections>

Required Sections (All Formats)

Regardless of output format, every deliverable must include:

Takeaways/summary answering the original query directly
Synthesized findings (not per-subtopic dumps)
Conflicts section, even if empty
Gaps section, even if empty
Sources with ACM footnotes, type classification, and available metadata
Confidence assessment with reasoning referencing source diversity </required_sections> </output_format>

<error_handling>

When Things Go Wrong

Team creation fails: If TeamCreate is unavailable in the environment, inform the user with this message:

Agent teams are not currently available. This skill's iterative workflow depends on persistent specialist agents. To enable them, set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your shell environment configuration and restart Claude Code.

Then offer a degraded fallback: spawn specialists as single-shot Agent invocations (no team, no follow-up), produce the report, and note in the output that the iterative feedback loop was unavailable because teams could not be created. If the Agent tool itself is also unavailable, fall back to performing research yourself using the available search tools, following the research-investigator workflow (Examine Framing, Investigate, Audit, Adversarial Check, Categorize, Synthesize) for each subtopic sequentially.

Specialist returns poor results: Because the specialist is resident, send a follow-up message via SendMessage describing what's missing or weak. Limit reconciliation rounds to two per specialist before accepting the gap.

Too many subtopics: If decomposition yields more than 7 subtopics, consolidate related ones. The N× context cost of a large team outweighs the marginal research quality.

Privacy-sensitive research: If any subtopic involves sensitive information, include explicit instructions in the specialist spawn prompt to prefer Kagi (mcp__kagi__kagi_search_fetch) over Exa tools. Exa does not keep queries confidential for non-enterprise customers[^1]; assume your access is non-enterprise.

Forgetting to dismiss the team: If the user confirms completion and you don't shut down teammates and TeamDelete, the team lingers and continues consuming context. Always dismiss on completion.

Spawning a teammate without team_name or name: Without both parameters, the Agent tool produces a one-shot subagent that terminates after its first report, not a persistent teammate. If later SendMessage calls fail with "no such teammate," check that Phase 4 spawns set both parameters. </error_handling>

<common_mistakes>

Common Mistakes

These are orchestration failure modes — ways team-based research goes wrong.

<over_researching>

Doing the Research Yourself

The most common failure: spending 10+ searches in "reconnaissance" that becomes full research. Reconnaissance means 2-3 searches to orient. If you're extracting detailed findings, you've crossed into the specialist agents' territory. Hand off and let them do the deep work. </over_researching>

<concatenation_as_synthesis>

Concatenating Instead of Synthesizing

Arranging agent reports in sequence with transition sentences is not synthesis. If your final report reads like "Agent 1 found X. Agent 2 found Y. Agent 3 found Z," you've failed at Phase 6. Synthesis means drawing connections agents couldn't see: cross-cutting themes, tensions between subtopics, emergent conclusions from the combination of findings. </concatenation_as_synthesis>

<over_decomposing>

Decomposing Into Too Many Subtopics

More subtopics means more agents means more overhead and thinner coverage per subtopic. If you have 8+ subtopics, some are probably related enough to merge. The target is 3-7 subtopics of roughly equal scope. </over_decomposing>

<sequential_dispatch>

Spawning Specialists Sequentially

Specialists' initial research is independent by design. Spawn all of them in a single message with multiple Agent tool calls. Waiting for one specialist's initial report before spawning the next wastes time and defeats the point of parallel spawning. (Later phases, where you send targeted messages via SendMessage, are inherently sequential or round-based — that's fine.) </sequential_dispatch>

<chasing_completeness>

Chasing Completeness Over Honesty

When gaps remain after reconciliation rounds, the temptation is to keep messaging specialists indefinitely. The Phase 5 budget is two rounds per specialist; beyond that, report the gaps honestly. A report with well-characterized gaps is more useful than one that burned all its budget chasing diminishing returns. (Phase 7 iteration with the user has no fixed limit because the user decides when they're satisfied.) </chasing_completeness>

<lingering_team>

Leaving the Team Resident After Completion

Teams cost N× context. Once the user confirms the report is satisfactory, shut down each teammate via SendMessage {type: "shutdown_request"} then TeamDelete the team directory. A resident team that isn't being used is pure overhead. </lingering_team>

<missing_team_parameters>

Spawning Teammates as One-Shot Subagents

If the Agent call omits team_name or name, the spawned agent is a subagent, not a teammate — it terminates after its first report, runs as a background agent (no @ prefix in the teammate list), and cannot be messaged. Phase 4 spawn prompts must set both parameters, and the name must be unique within the team so SendMessage can address it unambiguously.

A common failure mode: TeamCreate errors because the lead is already leading a prior team, the lead proceeds to spawn specialists anyway without a team_name, and ends up with a pile of background subagents instead of a team. If TeamCreate returns an error, stop and resolve it per Step 4a before spawning. </missing_team_parameters>

<destroying_prior_team_reflexively>

Tearing Down a Prior Team Without Asking

If the lead is already leading a team when interactive-research is invoked, that team may belong to prior work the user still wants. Don't reflexively TeamDelete to clear the way — ask the user whether to reuse the prior team (if subjects are related), end it, or defer the new research until the prior team is done. </destroying_prior_team_reflexively>

<skipping_task_list>

Coordinating Purely Through Messages

The task list is the team's coordination substrate. Teammates check it between turns to find unblocked work, claim it, and mark it complete. If you skip TaskCreate/TaskUpdate and coordinate purely through SendMessage, you lose that substrate: specialists can't see peer progress, and your own view of team status becomes ad hoc. Use the task list for durable state; use SendMessage for conversational exchange. </skipping_task_list>

<forcing_baseline_specialists>

Reaching for the Baseline When a Custom Agent Fits Better

The baseline specialists (research-investigator, research-analyst) are the fallback, not the default. If the environment has a custom research agent whose scope matches a subtopic — say, a domain-specific agent for the subject being investigated — use it for that subtopic instead of the baseline. The inverse failure is also possible: don't reach for a custom agent outside its stated scope just because it's present. Match each subtopic to the specialist whose scope actually covers it. </forcing_baseline_specialists>

<impatience_with_idle>

Treating Idle Teammates as Failure

Teammates go idle between turns. That is normal — it means they finished a turn and are waiting for input. An idle teammate that just sent you a message has not quit; it is waiting for your response. Do not comment on teammate idleness or interpret it as a problem. </impatience_with_idle>

<loaded_recon_queries>

Loading Reconnaissance Queries with Expected Findings

Reconnaissance queries that name specific products, frameworks, features, or vendors return sources discussing those things; you won't see what they don't mention. The downstream cost is severe: biased recon biases decomposition, which biases specialist prompts, which yields a corpus that confirms your starting assumptions. See <phase_reconnaissance> for query discipline. </loaded_recon_queries>

<anchoring_specialists_with_scope_lists>

Anchoring Specialists with Scope Lists

Subtopic descriptions that enumerate specific things to "cover" tell the specialist what to look for. The specialist will dutifully report on those things and miss whatever is actually dominant. The fix is open-question framing in subtopic descriptions; see <phase_decompose> for the discipline. </anchoring_specialists_with_scope_lists>

<editorializing_synthesis>

Editorializing the Synthesis

Writing the report as an essay rather than a research deliverable. Symptoms: rhetorical section headers ("Java's golden hour"), narrative framings ("the engine wins but the brand doesn't"), invented theses about named individuals ("the X/Y/Z triumvirate"), personalized framings ("for someone like you"), implication claims not tied to evidence. The deliverable is for the user to act on; aesthetic flourishes obscure what's actually known. See <synthesis_discipline> in writing guidance. </editorializing_synthesis>

<unverified_specialist_citations>

Trusting Specialist Citations Without Sampling

Specialists may cite a source after reading only its search-snippet or its summarizer output. The citation looks identical to one based on reading the primary. Without spot-checking, the synthesis inherits the snippet's accuracy ceiling while presenting itself as evidence-grounded. The fix is Phase 5 step 6: sample-fetch primaries for load-bearing claims before incorporating them into synthesis. </unverified_specialist_citations>

<synthesis_density_exceeds_evidence>

Synthesis Density Exceeding Specialist Density

When specialists produce shorthand-style reports and the orchestrator writes the synthesis as flowing prose, the gap is filled by invention. The synthesis prose appears to elaborate the specialists' findings, but the elaborations have no source. The fix is structural, not stylistic: the agent definitions now require dense prose with inline labels, so specialists supply context the synthesis can compress rather than expand. If the specialists are still producing shorthand, surface that as a finding rather than padding around it. See <synthesis_discipline> ("Length follows evidence"). </synthesis_density_exceeds_evidence> </common_mistakes>

Sources

<sources> [^1]: Exa Labs Inc. 2025. Privacy Policy. exa.ai. https://exa.ai/privacy-policy </sources>

Deep Research Orchestrator

Related skills and agents:

opinionated-research:research-investigator — Sonnet agent for methodical evidence-gathering: builds an evidence-vetted case from primary sources with procedural rigor, an explicit Audit section, and a per-claim epistemic-label discipline
opinionated-research:research-analyst — Opus agent for judgment-led synthesis: recognizes cross-source patterns and emergent insight beyond what any single source establishes, with the same per-claim labeling discipline
Custom research agents — the environment may have additional research-capable subagents installed (e.g., domain-specific search agents). Phase 4c describes how to discover and use them alongside the baseline specialists.

Communication topology:

User ↔ you (the lead) only — specialists cannot proactively notify the user.
You ↔ specialists via SendMessage — you relay user feedback, request extensions, and coordinate overlap.
Specialists can DM each other, but by default you broker coordination so you maintain the bird's-eye view. Peer DM summaries appear in your idle notifications.
Task list is the coordination substrate: subtopic assignments, completion status, and dependent work are tracked there; teammates check it between turns.

When this skill adds value over a single research agent:

The topic has 3+ distinct facets that benefit from independent investigation
Cross-referencing between subtopics is likely to reveal insights
The requestor needs a comprehensive, well-structured deliverable rather than raw findings
The requestor wants an iterative, revisable deliverable rather than a single-shot report
Source diversity across the full topic matters more than depth on any single facet </skill_scope>

<behavioral_constraints>

Constraints

<workflow> ## Workflow

<phase_analyze>

Phase 1: Analyze

Parse the research query and establish framing before any searching:

Audience — Who is this for? If the requestor specified an audience, use it. Otherwise, infer from the query's vocabulary and framing (e.g., a query using technical jargon implies a technical audience).
Intent — What does the requestor want to do with this research? Categories: learn (i.e., understand a topic), decide (i.e., choose between options), compare (i.e., evaluate alternatives), build (i.e., implement something), or investigate (i.e., diagnose a problem).
Output format — Did the requestor ask for a specific format? A "guide" differs from a "report" differs from an "analysis." Default to a structured research report if unspecified.
Core questions — What questions, if answered, would satisfy this request? List 3-7.

This phase is pure thinking — no tool calls needed. </phase_analyze>

<phase_reconnaissance>

Phase 2: Reconnaissance

Execute 2-5 quick web searches to map the landscape. The goal is orientation, not depth. Use a variety of search engines if multiple search tools are available.

What to discover:

Key terminology and concepts you might not have known about
Major dimensions or axes along which the topic divides
Whether the topic is well-documented or sparsely covered
Any framing the requestor may not have specified but that shapes the research

Tool selection for reconnaissance:

mcp__kagi__kagi_search_fetch for sensitive topics (see <behavioral_constraints>)
mcp__exa__web_search_exa or WebSearch for general topics
AWS documentation tools if the topic involves AWS services

Query discipline.

Use open queries that describe the space, not its presumed contents. The following examples demonstrate biased and neutral queries:

Bias check: if a query lists three or more proper nouns you introduced, rewrite it.

Output: A mental model of the topic's structure that informs decomposition. </phase_reconnaissance>

<phase_decompose>

Phase 3: Decompose

Break the topic into 3-7 subtopics. Each subtopic should be independently researchable — a specialist working on one subtopic shouldn't need findings from another to make progress.

For each subtopic, specify:

Decomposition quality heuristics:

Subtopics should be roughly equal in scope — if one is trivial and another enormous, rebalance
Overlap between subtopics should be minimal, but some overlap is acceptable (the cross-referencing phase handles deduplication)
Each subtopic should map to at least one of the core questions from Phase 1
Every core question should be covered by at least one subtopic

Subtopic framing discipline.

<phase_spawn_team>

Phase 4: Spawn the Team

<team_creation>

Step 4a: Establish the Team

Teams have a 1:1 correspondence with a shared task list. A lead can manage only one team at a time, so before calling TeamCreate decide which of the following applies:

Error handling for TeamCreate:

"Already leading team X" — Apply the decision table above. If you reached this error by reflex rather than deliberation, pause and check with the user before destroying the prior team.
Availability error (feature disabled) — Tell the user: "Agent teams are not currently available in this environment. To enable them, set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your shell configuration and restart Claude Code." Then follow the fallback path in <error_handling>. </team_creation>

<task_creation>

Step 4b: Create Tasks for Each Subtopic

The task list is the coordination substrate: teammates check it between turns for new or unblocked work, mark their tasks complete via TaskUpdate, and can see peers' progress. </task_creation>

<specialist_selection>

Step 4c: Select Specialist Types

Baseline specialist selection (when no custom agent fits):

<specialist_briefing>

Step 4d: Spawn Teammates

Each spawn prompt should include:

Team context — The team name and a pointer to the task list. Tell the teammate to check ~/.claude/teams/<team-name>/config.json for the peer roster and the task list for assignable work.
Expected source types — Guide the specialist toward source diversity.
Coordination expectations — When to use TaskUpdate to claim and complete tasks; when to expect follow-up messages; that idle-between-turns is normal.
Output format instructions — The standard structured report format for parseable initial findings, posted as a message back to the lead (you) when the task is marked complete.
Open-discovery reminder — Restate that the specialist should discover what is actually dominant in the space rather than verifying a presumed list. The subtopic description (Phase 3) should already be framed as an open question; this is reinforcement at the spawn boundary. See <phase_decompose> for the discipline.

Example teammate spawn:

Use the Agent tool with:
  subagent_type: "opinionated-research:research-investigator"   # or research-analyst per Phase 4c
  team_name: "research-<topic>"
  name: "specialist-<subtopic-slug>"
  prompt: "You are joining research team '<team-name>' as a specialist researcher.

    Check the task list for your assignment; the task description contains the subtopic question and reconnaissance context. Claim the task assigned to you via TaskUpdate (set yourself as owner and in_progress), do the research, then mark the task completed and send your findings to the lead ('<lead-name>') via SendMessage.

    Discover what is actually dominant in this space rather than verifying a list of presumed-relevant items. The subtopic description gives starting framing; let evidence determine which products, frameworks, features, and practices are actually load-bearing.

    Source-diversity expectations: [note any specific independence axes or quality tiers worth seeking; otherwise the agent's own source-independence framework applies].

    Your peers: read ~/.claude/teams/<team-name>/config.json for the roster. The lead may later ask you to reconcile findings with a named peer; use SendMessage to coordinate directly if instructed, otherwise route through the lead.

    After your initial report, the lead may send follow-up messages asking you to clarify findings, extend research, or reconcile conflicts. Retain your working notes and source metadata across idle periods.

    Return your findings in your agent's standard structured-report format, including the inline epistemic labels ([CITED]/[SYNTHESIS]/[CONCLUSION]/[HYPOTHESIS]/[TRAINING DATA] for provenance and [WELL-SUPPORTED]/[SUPPORTED]/[WEAKLY-SUPPORTED]/[CONTESTED]/UNFALSIFIABLE for support) per claim, the required Premise Check / Conflicts / Gaps sections (even if empty), and ACM-format citations for [CITED] claims."

Spawn all teammates in a single message with multiple Agent tool calls to maximize parallelism. </specialist_briefing>

<task_assignment>

Step 4e: Assign Tasks

<phase_collect>

Phase 5: Collect and Cross-Reference

Deduplicate sources — Specialists working on related subtopics may find the same URLs. Assign each unique URL one footnote number for the final report.
Identify and reconcile conflicts — Do findings contradict each other? Cross-subtopic conflicts are valuable signals. When you spot a conflict, SendMessage to the involved specialists with the specific tension: "Specialist-A concluded X; Specialist-B concluded Y. Can you each review the other's reasoning and indicate whether your position should be refined, held, or withdrawn?" Reconciled conflicts produce stronger reports than flagged ones.
Identify gaps and request extensions — Which subtopics have insufficient coverage? Where is source diversity weak? Are any core questions from Phase 1 left unanswered? For targeted gaps, SendMessage to the owning specialist: "Your findings didn't address [gap]. Can you extend your research to cover this?"
Coordinate overlap — When two specialists' work overlaps materially, send each teammate the other's relevant findings and ask them to delineate their contribution. This is where the shared team config pays off: teammates can look each other up by name and understand each other's scope.
Find cross-cutting patterns — This remains your unique contribution. Look for:
- Themes that appear across multiple subtopics independently
- Tensions between subtopics that suggest a deeper issue
- Findings from one subtopic that reframe or qualify findings from another
- Emergent conclusions that no single subtopic's research supports alone but the combination does
Verify load-bearing citations against primaries. Specialists assign citation provenance (Read / Summarized / Snippet-only — see the agent definitions' <citation_provenance> sections) and downgrade support labels for snippet-only citations. The orchestrator is the last line of defense before a claim enters synthesis. Sample-fetch the cited primaries for:
- Every claim that will appear in your draft Takeaways
- Every claim labeled [CITED][WELL-SUPPORTED] that load-bears a downstream conclusion
- Every numeric statistic the user is likely to act on
- Any claim where the specialist cited a source without including excerpted text or specific page/section detail
Use WebFetch, mcp__kagi__kagi_extract (privacy-preserving, returns markdown), mcp__exa__crawling_exa, or Read (for local files). If the primary contradicts or materially qualifies the claim, SendMessage the specialist to reconcile and update support labels. If the primary is unreachable, downgrade the claim and note the verification failure in the Sources section.

This is sample verification, not re-investigation. Budget roughly 10-20% of synthesis time on it; substantially more means the specialist work should be redone rather than patched at the orchestrator layer.

Make use of the specialists to cross-verify claims when possible. For example, if you ask one specialist to reconcile something, also have a related specialist perform a similar check and see whether both return consistent information.

<phase_synthesize>

Phase 6: Synthesize

Write the initial deliverable. This is where you earn the orchestration overhead. See <writing_guidance> for detailed instructions.

<phase_iterate>

Phase 7: Iterate with the User, Then Dismiss

Present the synthesized report to the user and enter a feedback loop. The team stays active throughout this phase.

Deliver the report. Invite specific feedback: clarifications, extensions, requests to dig deeper on particular sections, or challenges to conclusions.
Route feedback to specialists. For each piece of user feedback:
- Identify which specialist(s) own the relevant subtopic(s)
- SendMessage with a targeted question and the relevant report excerpt. Default to targeted question + relevant excerpts, escalating to the full draft only if the specialist asks for broader context.
- Incorporate the specialist's response into a revised section of the report.
Re-synthesize as needed. If feedback touches multiple subtopics, re-run the cross-referencing lens of Phase 5 on the new material before updating the report.
Repeat until the user confirms the report is satisfactory. There is no fixed iteration limit — the user decides when they're done.
Dismiss the team. Once the user confirms, send each teammate SendMessage with {type: "shutdown_request"} to stop them gracefully, then call TeamDelete to clean up the team directory. Do not leave teams resident after completion; the N× context cost is justified only while the team is actively useful. </phase_iterate>
</workflow>

<writing_guidance>

Writing Guidance

<audience_awareness>

Audience

Terminology depth (i.e., define terms for general audiences; use jargon freely for specialists)
Emphasis (i.e., executives care about risk and cost; engineers care about integration and maintenance)
Level of detail (i.e., summarize for decision-makers; be specific for implementers) </audience_awareness>

<synthesis_principles>

Synthesis, Not Concatenation

Your unique contribution is connecting findings across subtopics. Concatenating specialist reports with transition sentences is not synthesis. Genuine synthesis includes:

Cross-cutting themes: "Three independent subtopics all surfaced the same limitation, suggesting it's a fundamental constraint rather than an implementation detail"
Reframing: "The specialist researching [subtopic A] found X, which recontextualizes the specialist's finding on [subtopic B] — together, they suggest Y"
Emergent conclusions: Findings that no single specialist's research supports but that the combination makes evident
Tension resolution: When subtopic findings pull in different directions, explain why and what it means for the original question </synthesis_principles>

<synthesis_discipline>

Synthesis Discipline

AI-essay tells to avoid:

Negative parallelism ("It's not X, it's Y")
Rule-of-three lists when the underlying material isn't naturally three
Rhetorical "surprising observation" framings
Dramatic implication claims ("exactly what makes...", "happens to be...")
Anthropomorphizing the field ("Java has converged on...", "the ecosystem wants...")

Distinguish claim types:

If a sentence cannot be sourced or marked as a qualified implication, it doesn't belong in the deliverable. </synthesis_discipline>

<honest_assessment>

Honest Assessment

State confidence levels tied to evidence quality:

High confidence: Multiple diverse sources agree; independent validation exists; findings are consistent across subtopics
Medium confidence: Sources mostly agree but diversity is limited; some subtopics have gaps; minor conflicts exist
Low confidence: Few sources; significant conflicts; key subtopics have insufficient coverage

<citation_format>

Citation Format

Use ACM-style footnotes throughout the report.

In-text: [^1], [^2], etc.

Footnote format:

[^N]: Author. Year. Title. Venue/Publication. URL

Deduplication: Each unique URL gets exactly one footnote number, even if multiple specialists found it. When merging specialist reports, build a master source list and reassign footnote numbers.

Source type annotation: Include source type classification in the Sources section (not in footnotes) for transparency about evidence quality:

[^1]: Mozilla. 2025. WebSocket API. MDN Web Docs. https://developer.mozilla.org/en-US/docs/Web/API/WebSocket

In Sources section: [^1] — Type: official documentation </citation_format>

<output_format>

Output Format

<default_report_structure>

Default: Structured Research Report

# [Topic]

## Takeaways
[Direct answers to the original query's core questions. Lead with conclusions.]

## Synthesized Findings

### [Theme or Dimension 1]
[Findings woven across subtopics, not per-subtopic summaries.
Cross-references and connections highlighted. Citations inline.]

### [Theme or Dimension 2]
[...]

## Conflicts
[Where sources or subtopics disagree. Your assessment of which position
is better supported and why.]
(Or: "No significant conflicts identified across sources.")

## Gaps
[What remains unclear. Which subtopics had insufficient coverage or
limited source diversity. What follow-up research would address.]
(Or: "No significant gaps.")

## Confidence Assessment
[Overall confidence level with reasoning tied to source diversity,
cross-validation results, and gap severity.]

## Sources
[^1]: [full citation] — Type: [source type]
[^2]: [full citation] — Type: [source type]
...

</default_report_structure>

<required_sections>

Required Sections (All Formats)

Regardless of output format, every deliverable must include:

Takeaways/summary answering the original query directly
Synthesized findings (not per-subtopic dumps)
Conflicts section, even if empty
Gaps section, even if empty
Sources with ACM footnotes, type classification, and available metadata
Confidence assessment with reasoning referencing source diversity </required_sections> </output_format>

<error_handling>

When Things Go Wrong

Team creation fails: If TeamCreate is unavailable in the environment, inform the user with this message:

Agent teams are not currently available. This skill's iterative workflow depends on persistent specialist agents. To enable them, set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in your shell environment configuration and restart Claude Code.

Too many subtopics: If decomposition yields more than 7 subtopics, consolidate related ones. The N× context cost of a large team outweighs the marginal research quality.

Forgetting to dismiss the team: If the user confirms completion and you don't shut down teammates and TeamDelete, the team lingers and continues consuming context. Always dismiss on completion.

<common_mistakes>

Common Mistakes

These are orchestration failure modes — ways team-based research goes wrong.

<over_researching>

Doing the Research Yourself

<concatenation_as_synthesis>

Concatenating Instead of Synthesizing

<over_decomposing>

Decomposing Into Too Many Subtopics

<sequential_dispatch>

Spawning Specialists Sequentially

<chasing_completeness>

Chasing Completeness Over Honesty

<lingering_team>

Leaving the Team Resident After Completion

<missing_team_parameters>

Spawning Teammates as One-Shot Subagents

<destroying_prior_team_reflexively>

Tearing Down a Prior Team Without Asking

<skipping_task_list>

Coordinating Purely Through Messages

<forcing_baseline_specialists>

Reaching for the Baseline When a Custom Agent Fits Better

<impatience_with_idle>

Treating Idle Teammates as Failure

<loaded_recon_queries>

Loading Reconnaissance Queries with Expected Findings

<anchoring_specialists_with_scope_lists>

Anchoring Specialists with Scope Lists

<editorializing_synthesis>

Editorializing the Synthesis

<unverified_specialist_citations>

Trusting Specialist Citations Without Sampling

<synthesis_density_exceeds_evidence>

Synthesis Density Exceeding Specialist Density

Sources

<sources> [^1]: Exa Labs Inc. 2025. Privacy Policy. exa.ai. https://exa.ai/privacy-policy </sources>

Adoption

pyroxin/opinionated-research/skills/interactive-research

$ install --global

Security Scan Results

SKILL.md

Deep Research Orchestrator

Constraints

Phase 1: Analyze

Phase 2: Reconnaissance

Phase 3: Decompose

Phase 4: Spawn the Team

Step 4a: Establish the Team

Step 4b: Create Tasks for Each Subtopic

Step 4c: Select Specialist Types

Step 4d: Spawn Teammates

Step 4e: Assign Tasks

Phase 5: Collect and Cross-Reference

Phase 6: Synthesize

Phase 7: Iterate with the User, Then Dismiss

Writing Guidance

Audience

Synthesis, Not Concatenation

Synthesis Discipline

Honest Assessment

Citation Format

Output Format

Default: Structured Research Report

Required Sections (All Formats)

When Things Go Wrong

Common Mistakes

Doing the Research Yourself

Concatenating Instead of Synthesizing

Decomposing Into Too Many Subtopics

Spawning Specialists Sequentially

Chasing Completeness Over Honesty

Leaving the Team Resident After Completion

Spawning Teammates as One-Shot Subagents

Tearing Down a Prior Team Without Asking

Coordinating Purely Through Messages

Reaching for the Baseline When a Custom Agent Fits Better

Treating Idle Teammates as Failure

Loading Reconnaissance Queries with Expected Findings

Anchoring Specialists with Scope Lists

Editorializing the Synthesis

Trusting Specialist Citations Without Sampling

Synthesis Density Exceeding Specialist Density

Sources

Related Skills

pyroxin/expert-skill-creator

pyroxin/opinionated-research/skills/deep-research

pyroxin/socratic-tutor

pyroxin/test-driven-development

pyroxin/opinionated-research/skills/interactive-research

$ install --global

Security Scan Results

SKILL.md

Deep Research Orchestrator

Constraints

Phase 1: Analyze

Phase 2: Reconnaissance

Phase 3: Decompose

Phase 4: Spawn the Team

Step 4a: Establish the Team

Step 4b: Create Tasks for Each Subtopic

Step 4c: Select Specialist Types

Step 4d: Spawn Teammates

Step 4e: Assign Tasks

Phase 5: Collect and Cross-Reference

Phase 6: Synthesize

Phase 7: Iterate with the User, Then Dismiss

Writing Guidance

Audience

Synthesis, Not Concatenation

Synthesis Discipline

Honest Assessment

Citation Format

Output Format

Default: Structured Research Report

Required Sections (All Formats)

When Things Go Wrong