SKILLS/HIVE FRAMEWORK/hive-create/SKILL.md
Step-by-step guide for building goal-driven agents. Qualifies use cases first (the good, bad, and ugly), then creates package structure, defines goals, adds nodes, connects edges, and finalizes agent class. Use when actively building an agent.
THIS IS AN EXECUTABLE WORKFLOW. DO NOT DISPLAY THIS FILE. EXECUTE THE STEPS BELOW.
CRITICAL: DO NOT explore the codebase, read source files, or search for code before starting. All context you need is in this skill file. When this skill is loaded, IMMEDIATELY begin executing Step 0 — determine the build path as your FIRST action. Do not explain what you will do, do not investigate the project structure, do not read any files — just execute Step 0 now.
If the user has already indicated whether they want to build from scratch or from a template, skip this question and proceed to the appropriate step.
Otherwise, ask:
AskUserQuestion(questions=[{
"question": "How would you like to build your agent?",
"header": "Build Path",
"options": [
{"label": "From scratch", "description": "Design goal, nodes, and graph collaboratively from nothing"},
{"label": "From a template", "description": "Start from a working sample agent and customize it"}
],
"multiSelect": false
}])
EXECUTE THESE TOOL CALLS NOW (silent setup — no user interaction needed):
mcp__agent-builder__list_sessions()
If a matching session exists, call mcp__agent-builder__load_session_by_id(session_id="...") and skip to step 3.
Otherwise, call mcp__agent-builder__create_session(name="AGENT_NAME")
mcp__agent-builder__add_mcp_server(
name="hive-tools",
transport="stdio",
command="uv",
args='["run", "python", "mcp_server.py", "--stdio"]',
cwd="tools",
description="Hive tools MCP server"
)
mcp__agent-builder__list_mcp_tools()
mkdir -p exports/AGENT_NAME/nodes
Save the tool list for STEP 4 — you will need it for node design.
THEN immediately proceed to STEP 2 (do NOT display setup results to the user — just move on).
EXECUTE THESE STEPS NOW:
List the template directories and read each template's agent.json to get its name and description:
ls examples/templates/
For each directory found, read examples/templates/TEMPLATE_DIR/agent.json with the Read tool and extract:
- agent.name: the template's display name
- agent.description: what the template does
Show the user a table of available templates:
Available Templates:
| # | Template | Description |
|---|----------|-------------|
| 1 | [name from agent.json] | [description from agent.json] |
| 2 | ... | ... |
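The listing step above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the skill: the function names `list_templates` and `render_template_table` are hypothetical, and nesting `name` and `description` under an `agent` key is an assumption read off the `agent.name` / `agent.description` notation.

```python
import json
from pathlib import Path


def list_templates(templates_dir):
    """Return (directory, name, description) for each template that has an agent.json."""
    rows = []
    for entry in sorted(Path(templates_dir).iterdir()):
        agent_json = entry / "agent.json"
        if not agent_json.is_file():
            continue  # skip stray files and directories without a definition
        data = json.loads(agent_json.read_text(encoding="utf-8"))
        # Assumed layout: {"agent": {"name": ..., "description": ...}}
        agent = data.get("agent", {})
        rows.append((entry.name, agent.get("name", entry.name), agent.get("description", "")))
    return rows


def render_template_table(rows):
    """Format the rows as the markdown table shown to the user."""
    lines = ["| # | Template | Description |", "|---|----------|-------------|"]
    for index, (_directory, name, description) in enumerate(rows, start=1):
        lines.append(f"| {index} | {name} | {description} |")
    return "\n".join(lines)
```

Calling `render_template_table(list_templates("examples/templates"))` would produce the table body, and the directory names in the rows are what the later `cp -r` step needs.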
Then ask the user to pick a template and provide a name for their new agent:
AskUserQuestion(questions=[{
"question": "Which template would you like to start from?",
"header": "Template",
"options": [
{"label": "[template 1 name]", "description": "[template 1 description]"},
{"label": "[template 2 name]", "description": "[template 2 description]"},
...
],
"multiSelect": false
}, {
"question": "What should the new agent be named? (snake_case)",
"header": "Agent Name",
"options": [
{"label": "Use template name", "description": "Keep the original template name as-is"},
{"label": "Custom name", "description": "I'll provide a new snake_case name"}
],
"multiSelect": false
}])
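The new name ends up as a directory under exports/ and as the module name in `python -m AGENT_NAME`, so a custom name is worth checking before copying. The check below is a hedged sketch: `is_snake_case` is a hypothetical helper, and the exact naming rule (lowercase start, single underscores between words) is an assumption about what "snake_case" should mean here.

```python
import re

# Lowercase letter first, then lowercase letters or digits, words joined by single underscores.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_snake_case(name):
    """Return True when the name is usable as a snake_case agent name."""
    return SNAKE_CASE.fullmatch(name) is not None
```

A name such as `research_agent` passes, while `ResearchAgent` and `research-agent` are rejected and can be sent back to the user for correction.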
cp -r examples/templates/TEMPLATE_DIR exports/NEW_AGENT_NAME
First, check for existing sessions:
mcp__agent-builder__list_sessions()
If a matching session exists, call mcp__agent-builder__load_session_by_id(session_id="...") and skip to list_mcp_tools.
Otherwise, call mcp__agent-builder__create_session(name="NEW_AGENT_NAME")
Then register MCP and discover tools:
mcp__agent-builder__add_mcp_server(
name="hive-tools",
transport="stdio",
command="uv",
args='["run", "python", "mcp_server.py", "--stdio"]',
cwd="tools",
description="Hive tools MCP server"
)
mcp__agent-builder__list_mcp_tools()
Import the entire agent definition in one call:
mcp__agent-builder__import_from_export(agent_json_path="exports/NEW_AGENT_NAME/agent.json")
This reads the agent.json and populates the builder session with the goal, all nodes, and all edges.
THEN immediately proceed to STEP 2.
A responsible engineer doesn't jump into building. First, understand the problem and be transparent about what the framework can and cannot do.
If starting from a template, the goal is already loaded in the builder session. Present the existing goal to the user using the format below and ask for approval. Skip the collaborative drafting questions — go straight to presenting and asking "Do you approve this goal, or would you like to modify it?"
If the user has NOT already described what they want to build, start by asking what kind of agent they have in mind:
AskUserQuestion(questions=[{
"question": "What kind of agent do you want to build? Select an option below, or choose 'Other' to describe your own.",
"header": "Agent type",
"options": [
{"label": "Data collection", "description": "Gathers information from the web, analyzes it, and produces a report or sends outreach (e.g. market research, news digest, email campaigns, competitive analysis)"},
{"label": "Workflow automation", "description": "Automates a multi-step business process end-to-end (e.g. lead qualification, content publishing pipeline, data entry)"},
{"label": "Personal assistant", "description": "Handles recurring tasks or monitors for events and acts on them (e.g. daily briefings, meeting prep, file organization)"}
],
"multiSelect": false
}])
Use the user's selection (or their custom description if they chose "Other") as context when shaping the goal below. If the user already described what they want before this step, skip the question and proceed directly.
DO NOT propose a complete goal on your own. Instead, collaborate with the user to define it.
The core principle: Discovery should feel like progress, not paperwork. The stakeholder should walk away feeling like you understood them faster than anyone else would have.
Communication style: Be concise. Say less. Mean more. Impatient stakeholders don't want a wall of text; they want to know you get it. Every sentence you say should either move the conversation forward or prove you understood something. If it does neither, cut it.
Ask Question Rules: Respect Their Time. Every question must earn its place by:
If a question doesn't do one of these, don't ask it. Make an assumption, state it, and move on.
When the stakeholder describes what they want, don't just hear the words — listen for the architecture underneath. While they talk, mentally construct:
You are extracting a domain model from natural language in real time. Most stakeholders won't give you this structure explicitly — they'll give you a story. Your job is to hear the structure inside the story.
| They say... | You're hearing... |
|-------------|-------------------|
| Nouns they repeat | Your entities |
| Verbs they emphasize | Your core operations |
| Frustrations they mention | Your design constraints |
| Workarounds they describe | What the system must replace |
| People they name | Your user types |
You have broad knowledge of how systems work. Use it aggressively.
If they say "I need a research agent," you already know it probably involves: search, summarization, source tracking, and iteration. Don't ask about each — use them as your starting mental model and let their specifics override your defaults.
If they say "I need to monitor files and alert me," you know this probably involves: watch patterns, triggers, notifications, and state tracking.
The key move: Take your general knowledge of the domain and merge it with the specifics they've given you. The result is a draft understanding that's 60-80% right before you've asked a single question. Your questions close the remaining 20-40%.
After listening, present a concrete picture of what you think they need. Make it specific enough that they can spot what's wrong.
Pattern: "Here's what I heard — tell me where I'm off"
"OK here's how I'm picturing this: [User type] needs to [core action]. Right now they're [current painful workflow]. What you want is [proposed solution that replaces the pain].
The way I'd structure this: [key entities] connected by [key relationships], with the main flow being [trigger → steps → outcome].
For the MVP, I'd focus on [the one thing that delivers the most value] and hold off on [things that can wait].
Before I start — [1-2 specific questions you genuinely can't infer]."
Why this works:
Your questions should be narrow, specific, and consequential. Never ask what you could answer yourself.
Good questions (high-stakes, can't infer):
Bad questions (low-stakes, inferable):
| Turn | Who | What |
|------|-----|------|
| 1 | User | Describes what they need |
| 2 | Agent | Plays back understanding as a proposed model. Asks 1-2 critical questions max. |
| 3 | User | Corrects, confirms, or adds detail |
| 4 | Agent | Adjusts model, confirms MVP scope, states assumptions, declares starting point |
| (5) | (Only if Turn 3 revealed something that fundamentally changes the approach) |
AFTER the conversation, IMMEDIATELY proceed to 2b. DO NOT skip to building.
| Don't | Do Instead |
|-------|------------|
| Open with a list of questions | Open with what you understood from their request |
| "What are your requirements?" | "Here's what I think you need. Am I right?" |
| Ask about every edge case | Handle with smart defaults, flag in summary |
| 10+ turn discovery conversation | 3-8 turns. Start building, iterate with real software. |
| Be lazy and fail to understand what the user wants to achieve | Understand the "what" and the "why" |
| Ask for permission to start | State your plan and start |
| Wait for certainty | Start at 80% confidence, iterate the rest |
| Ask what tech/tools to use | That's your job. Decide, disclose, move on. |
After the user responds, analyze the fit. Present this assessment honestly:
Framework Fit Assessment
Based on what you've described, here's my honest assessment of how well this framework fits your use case:
What Works Well (The Good):
- [List 2-4 things the framework handles well for this use case]
- Examples: multi-turn conversations, human-in-the-loop review, tool orchestration, structured outputs
Limitations to Be Aware Of (The Bad):
- [List 2-3 limitations that apply but are workable]
- Examples: LLM latency means not suitable for sub-second responses, context window limits for very large documents, cost per run for heavy tool usage
Potential Deal-Breakers (The Ugly):
- [List any significant challenges or missing capabilities — be honest]
- Examples: no tool available for X, would require custom MCP server, framework not designed for Y
Be specific. Reference the actual tools discovered in Step 1. If the user needs send_email but it's not available, say so. If they need real-time streaming from a database, explain that's not how the framework works.
Identify specific gaps between what the user wants and what you can deliver:
| Requirement | Framework Support | Gap/Workaround |
|-------------|-------------------|----------------|
| [User need] | [✅ Supported / ⚠️ Partial / ❌ Not supported] | [How to handle or why it's a problem] |
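For the tool-related rows of this table, the comparison against the Step 1 tool list can be done mechanically. The sketch below is illustrative only: `find_tool_gaps` is a hypothetical helper, the workaround wording is a placeholder, and it assumes the tool list saved in Step 1 is available as plain tool names.

```python
def find_tool_gaps(required_tools, available_tools):
    """Return (requirement, support, workaround) rows for each tool the user needs."""
    available = set(available_tools)
    rows = []
    for tool in required_tools:
        if tool in available:
            rows.append((tool, "✅ Supported", "Available from the registered MCP server"))
        else:
            # Missing tools are the candidates for "The Ugly" section of the assessment.
            rows.append((tool, "❌ Not supported", "Needs a custom MCP server or a scope change"))
    return rows


rows = find_tool_gaps(
    ["web_search", "send_email"],
    ["web_search", "web_scrape", "save_data"],
)
```

Partial support still needs human judgment; the sketch only separates tools that exist from tools that do not.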
Examples of gaps to identify:
Give a clear recommendation:
My Recommendation:
[One of these three:]
✅ PROCEED — This is a good fit. The framework handles your core needs well. [List any minor caveats.]
⚠️ PROCEED WITH SCOPE ADJUSTMENT — This can work, but we should adjust: [specific changes]. Without these adjustments, you'll hit [specific problems].
🛑 RECONSIDER — This framework may not be the right tool for this job because [specific reasons]. Consider instead: [alternatives — simpler script, different framework, custom solution].
CALL AskUserQuestion:
AskUserQuestion(questions=[{
"question": "Based on this assessment, how would you like to proceed?",
"header": "Proceed",
"options": [
{"label": "Proceed as described", "description": "I understand the limitations, let's build it"},
{"label": "Adjust scope", "description": "Let's modify the requirements to fit better"},
{"label": "More questions", "description": "I have questions about the assessment"},
{"label": "Reconsider", "description": "Maybe this isn't the right approach"}
],
"multiSelect": false
}])
WAIT for user response.
Now that the use case is qualified, collaborate on the goal definition.
START by synthesizing what you learned:
Based on our discussion, here's my understanding of the goal:
Core purpose: [what you understood from 2a]
Success looks like: [what you inferred]
Key constraints: [what you inferred]
Let me refine this with you:
- What should this agent accomplish? (confirm or correct my understanding)
- How will we know it succeeded? (what specific outcomes matter)
- Are there any hard constraints? (things it must never do, quality bars)
WAIT for the user to respond. Use their input (and the agent type they selected) to draft:
PRESENT the draft goal for approval:
Proposed Goal: [Name]
[Description]
Success Criteria:
- [criterion 1]
- [criterion 2] ...
Constraints:
- [constraint 1]
- [constraint 2] ...
THEN call AskUserQuestion:
AskUserQuestion(questions=[{
"question": "Do you approve this goal definition?",
"header": "Goal",
"options": [
{"label": "Approve", "description": "Goal looks good, proceed to workflow design"},
{"label": "Modify", "description": "I want to change something"}
],
"multiSelect": false
}])
WAIT for user response.
If the user approves, call mcp__agent-builder__set_goal(...) with the goal details, then proceed to STEP 4.
If starting from a template, the nodes are already loaded in the builder session. Present the existing nodes using the table format below and ask for approval. Skip the design phase.
BEFORE designing nodes, review the available tools from Step 1. Nodes can ONLY use tools that exist.
DESIGN the workflow as a series of nodes. For each node, determine:
- node_type: "event_loop" (the only valid type; use client_facing: True for HITL)
Prefer fewer, richer nodes (4 nodes > 8 thin nodes). Each node boundary requires serializing outputs. A research node that searches, fetches, and analyzes keeps all source material in its conversation history.
PRESENT the nodes to the user for review:
Proposed Nodes ([N] total):
| # | Node ID | Type | Description | Tools | Client-Facing |
| --- | ---------- | ---------- | ----------------------------- | ---------------------- | :-----------: |
| 1 | intake | event_loop | Gather requirements from user | none | Yes |
| 2 | research | event_loop | Search and analyze sources | web_search, web_scrape | No |
| 3 | review | event_loop | Present findings for approval | none | Yes |
| 4 | report | event_loop | Generate final report | save_data | No |

Data Flow:
- intake produces: research_brief
- research receives: research_brief → produces: findings, sources
- review receives: findings, sources → produces: approved_findings or feedback
- report receives: approved_findings → produces: final_report
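Before presenting the nodes, the data flow can be sanity-checked: every key a node receives should be produced by some node or supplied as initial input. The following is a minimal sketch using the example nodes above; `find_unsatisfied_inputs` and the dict layout are hypothetical, and treating `topic` as the initial input is an assumption taken from the example graph.

```python
def find_unsatisfied_inputs(nodes, initial_keys=()):
    """Return {node_id: [input keys nobody produces]} for a list of node dicts."""
    produced = set(initial_keys)
    for node in nodes:
        produced.update(node["output_keys"])
    missing = {}
    for node in nodes:
        absent = [key for key in node["input_keys"] if key not in produced]
        if absent:
            missing[node["id"]] = absent
    return missing


EXAMPLE_NODES = [
    {"id": "intake", "input_keys": ["topic"], "output_keys": ["research_brief"]},
    {"id": "research", "input_keys": ["research_brief", "feedback"], "output_keys": ["findings", "sources"]},
    {"id": "review", "input_keys": ["findings", "sources"], "output_keys": ["approved_findings", "feedback"]},
    {"id": "report", "input_keys": ["approved_findings"], "output_keys": ["final_report"]},
]

problems = find_unsatisfied_inputs(EXAMPLE_NODES, initial_keys=["topic"])
```

The check ignores node order on purpose, because feedback loops let a later node (review) feed an earlier one (research).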
THEN call AskUserQuestion:
AskUserQuestion(questions=[{
"question": "Do you approve these nodes?",
"header": "Nodes",
"options": [
{"label": "Approve", "description": "Nodes look good, proceed to graph design"},
{"label": "Modify", "description": "I want to change the nodes"}
],
"multiSelect": false
}])
WAIT for user response.
If starting from a template, the edges are already loaded in the builder session. Render the existing graph as ASCII art and present it to the user for approval. Skip the edge design phase.
DETERMINE the edges connecting the approved nodes. For each edge:
- condition: on_success, on_failure, always, or conditional
DETERMINE the graph lifecycle. Not every agent needs a terminal node:
| Pattern | terminal_nodes | When to Use |
|---------|-------------------|-------------|
| Linear (finish) | ["last-node"] | Agent completes a task and exits (batch processing, one-shot generation) |
| Forever-alive (loop) | [] (empty) | Agent stays alive for continuous interaction (research assistant, personal assistant, monitoring) |
Forever-alive pattern: The deep_research_agent example uses terminal_nodes=[]. Every leaf node has edges that loop back to earlier nodes, creating a perpetual session. The agent only stops when the user explicitly exits. This is the preferred pattern for interactive, multi-turn agents.
Key design rules for forever-alive graphs:
- Use conversation_mode="continuous" to preserve conversation history across node transitions
- max_iterations should be set high (e.g., 100) since the agent is designed to run indefinitely
Ask the user which lifecycle pattern fits their agent. Default to forever-alive for interactive agents, linear for batch/one-shot tasks.
RENDER the complete graph as ASCII art. Make it large and clear — the user needs to see and understand the full workflow at a glance.
IMPORTANT: Make the ASCII art BIG and READABLE. Use a box-and-arrow style with generous spacing. Do NOT make it tiny or compressed. Example format:
┌─────────────────────────────────────────────────────────────────────────────┐
│  AGENT: Research Agent                                                      │
│                                                                             │
│  Goal: Thoroughly research technical topics and produce verified reports    │
└─────────────────────────────────────────────────────────────────────────────┘
    ┌───────────────────────┐
    │        INTAKE         │
    │    (client-facing)    │
    │                       │
    │ in:  topic            │
    │ out: research_brief   │
    └───────────┬───────────┘
                │ on_success
                ▼
    ┌───────────────────────┐
    │       RESEARCH        │
    │                       │
    │ tools: web_search,    │
    │        web_scrape     │
    │                       │
    │ in:  research_brief   │
    │      [feedback]       │
    │ out: findings,        │
    │      sources          │
    └───────────┬───────────┘
                │ on_success
                ▼
    ┌───────────────────────┐
    │        REVIEW         │
    │    (client-facing)    │
    │                       │
    │ in:  findings,        │
    │      sources          │
    │ out: approved_findings│
    │      OR feedback      │
    └───────┬───────┬───────┘
            │       │
   approved │       │ feedback (priority: -1)
            │       │
            ▼       └──────────────────┐
    ┌───────────────────────┐          │
    │        REPORT         │          │
    │                       │          │
    │ tools: save_data      │          │
    │                       │          │
    │ in:  approved_        │          │
    │      findings         │          │
    │ out: final_report     │          │
    └───────────────────────┘          │
                                       │
            ┌──────────────────────────┘
            │ loops back to RESEARCH
            ▼ (max_node_visits: 3)
EDGES:
──────
1. intake → research [on_success, priority: 1]
2. research → review [on_success, priority: 1]
3. review → report [conditional: approved_findings is not None, priority: 1]
4. review → research [conditional: feedback is not None, priority: -1]
PRESENT the graph and edges to the user:
Here is the complete workflow graph:
[ASCII art above]
Edge Summary:
| # | Edge | Condition | Priority |
| --- | ----------------- | -------------------------------------------- | -------- |
| 1 | intake → research | on_success | 1 |
| 2 | research → review | on_success | 1 |
| 3 | review → report | conditional: approved_findings is not None | 1 |
| 4 | review → research | conditional: feedback is not None | -1 |
THEN call AskUserQuestion:
AskUserQuestion(questions=[{
"question": "Do you approve this workflow graph?",
"header": "Graph",
"options": [
{"label": "Approve", "description": "Graph looks good, proceed to build the agent"},
{"label": "Modify", "description": "I want to change the graph"}
],
"multiSelect": false
}])
WAIT for user response.
NOW — and only now — write the actual code. The user has approved the goal, nodes, and graph.
If starting from a template, the copied files will be overwritten with the approved design. You MUST replace every occurrence of the old template name with the new agent name. Here is the complete checklist — miss NONE of these:
| File | What to rename |
|------|---------------|
| config.py | AgentMetadata.name — the display name shown in TUI agent selection |
| config.py | AgentMetadata.description — agent description |
| config.py | AgentMetadata.intro_message — greeting shown to user when TUI loads |
| agent.py | Module docstring (line 1) |
| agent.py | class OldNameAgent: → class NewNameAgent: |
| agent.py | GraphSpec(id="old-name-graph") → GraphSpec(id="new-name-graph") — shown in TUI status bar |
| agent.py | Storage path: Path.home() / ".hive" / "agents" / "old_name" → "new_name" |
| __main__.py | Module docstring (line 1) |
| __main__.py | from .agent import ... OldNameAgent → NewNameAgent |
| __main__.py | CLI help string in def cli() docstring |
| __main__.py | All OldNameAgent() instantiations |
| __main__.py | Storage path (duplicated from agent.py) |
| __main__.py | Shell banner string (e.g. "=== Old Name Agent ===") |
| __init__.py | Package docstring |
| __init__.py | from .agent import OldNameAgent import |
| __init__.py | __all__ list entry |
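Most entries in the checklist are spellings of the same agent name, so the variants can be derived from the snake_case name in one place. This is a hedged sketch: `derive_names` is a hypothetical helper, and the derivation rules are inferred from the `old_name` / `OldNameAgent` / `old-name-graph` / `"=== Old Name Agent ==="` examples in the table. A name that already ends in `_agent` would get a doubled suffix and needs a manual decision.

```python
def derive_names(agent_name):
    """Derive the name variants used across the package from a snake_case agent name."""
    words = agent_name.split("_")
    title = " ".join(word.capitalize() for word in words)
    return {
        "storage_dir": agent_name,                                           # ~/.hive/agents/<storage_dir>
        "class_name": "".join(word.capitalize() for word in words) + "Agent",
        "graph_id": "-".join(words) + "-graph",
        "banner": f"=== {title} Agent ===",
    }


names = derive_names("new_name")
```

Deriving the old template's variants the same way gives the exact strings to search for when replacing.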
If starting from a template and no modifications were made in Steps 2-5, the nodes and edges are already registered. Skip to validation (mcp__agent-builder__validate_graph()). If modifications were made, re-register the changed nodes/edges (the MCP tools handle duplicates by overwriting).
FOR EACH approved node, call:
mcp__agent-builder__add_node(
node_id="...",
name="...",
description="...",
node_type="event_loop",
input_keys='["key1", "key2"]',
output_keys='["key1"]',
tools='["tool1"]',
system_prompt="...",
client_facing=True/False,
nullable_output_keys='["key"]',
max_node_visits=1
)
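Note that the list-valued parameters in the call above are passed as JSON strings rather than Python lists. A small helper can build those arguments from an approved node design. This is a sketch only: `build_add_node_args` and the node dict layout are hypothetical, and the rule that nullable keys must also be declared as output keys is an assumption, not something the skill states.

```python
import json


def build_add_node_args(node):
    """Turn an approved node design (a dict) into keyword arguments for add_node."""
    nullable = node.get("nullable_output_keys", [])
    unknown = [key for key in nullable if key not in node["output_keys"]]
    if unknown:
        # Assumed rule: a nullable key has to be one of the declared outputs.
        raise ValueError(f"nullable_output_keys not in output_keys: {unknown}")
    return {
        "node_id": node["node_id"],
        "name": node["name"],
        "description": node["description"],
        "node_type": "event_loop",
        "input_keys": json.dumps(node.get("input_keys", [])),
        "output_keys": json.dumps(node["output_keys"]),
        "tools": json.dumps(node.get("tools", [])),
        "system_prompt": node["system_prompt"],
        "client_facing": node.get("client_facing", False),
        "nullable_output_keys": json.dumps(nullable),
        "max_node_visits": node.get("max_node_visits", 1),
    }
```

Using `json.dumps` avoids hand-written quoting mistakes such as single-quoted list items, which are not valid JSON.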
FOR EACH approved edge, call:
mcp__agent-builder__add_edge(
edge_id="source-to-target",
source="source-node-id",
target="target-node-id",
condition="on_success",
condition_expr="",
priority=1
)
VALIDATE the graph:
mcp__agent-builder__validate_graph()
EXPORT the graph data:
mcp__agent-builder__export_graph()
THEN write the Python package files using the exported data. Create these files in exports/AGENT_NAME/:
- config.py - Runtime configuration with model settings and AgentMetadata (including intro_message, the greeting shown when TUI loads)
- nodes/__init__.py - All NodeSpec definitions
- agent.py - Goal, edges, graph config, and agent class
- __init__.py - Package exports
- __main__.py - CLI interface
- mcp_servers.json - MCP server configurations
- README.md - Usage documentation
IMPORTANT entry_points format:
- {"start": "first-node-id"} (CORRECT)
- {"first-node-id": ["input_keys"]} (WRONG)
- {"first-node-id"} (WRONG - this is a set)
IMPORTANT mcp_servers.json format:
{
"hive-tools": {
"transport": "stdio",
"command": "uv",
"args": ["run", "python", "mcp_server.py", "--stdio"],
"cwd": "../../tools",
"description": "Hive tools MCP server"
}
}
- NO "mcpServers" wrapper (that's Claude Desktop format, NOT hive format)
- cwd MUST be "../../tools" (relative from exports/AGENT_NAME/ to tools/)
- command MUST be "uv" with "args": ["run", "python", ...] (NOT bare "python" which fails on Mac)
Use the example agent at .claude/skills/hive-create/examples/deep_research_agent/ as a template for file structure and patterns. It demonstrates: STEP 1/STEP 2 prompts, client-facing nodes, feedback loops, nullable_output_keys, and data tools.
AFTER writing all files, tell the user:
Agent package created:
exports/AGENT_NAME/
Files generated:
- __init__.py - Package exports
- agent.py - Goal, nodes, edges, agent class
- config.py - Runtime configuration
- __main__.py - CLI interface
- nodes/__init__.py - Node definitions
- mcp_servers.json - MCP server config
- README.md - Usage documentation
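Before running validation, the generated mcp_servers.json can be checked against the format rules stated earlier (flat dict, `uv` command, `../../tools` working directory). The function below is an illustrative sketch, not a framework utility: `check_mcp_servers_config` is a hypothetical name, and it only inspects the `hive-tools` entry.

```python
import json


def check_mcp_servers_config(text, server_name="hive-tools"):
    """Return a list of format problems found in an mcp_servers.json document."""
    config = json.loads(text)
    if "mcpServers" in config:
        return ['remove the "mcpServers" wrapper: hive expects a flat dict']
    server = config.get(server_name)
    if server is None:
        return [f'missing "{server_name}" entry']
    problems = []
    if server.get("command") != "uv":
        problems.append('command must be "uv"')
    args = server.get("args")
    if not isinstance(args, list) or args[:2] != ["run", "python"]:
        problems.append('args must be a list starting with "run", "python"')
    if server.get("cwd") != "../../tools":
        problems.append('cwd must be "../../tools"')
    return problems
```

An empty result means the file matches the documented shape; it does not prove the server actually starts.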
RUN validation:
cd /home/timothy/oss/hive && PYTHONPATH=exports uv run python -m AGENT_NAME validate
TELL the user the agent is ready and display the next steps box:
┌─────────────────────────────────────────────────────────────────────────────┐
│                          ✅ AGENT BUILD COMPLETE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  NEXT STEPS:                                                                │
│                                                                             │
│  1. SET UP CREDENTIALS (if agent uses tools like web_search, send_email):   │
│                                                                             │
│       /hive-credentials --agent AGENT_NAME                                  │
│                                                                             │
│  2. RUN YOUR AGENT:                                                         │
│                                                                             │
│       hive tui                                                              │
│                                                                             │
│     Then select your agent from the list and press Enter.                   │
│                                                                             │
│  3. DEBUG ANY ISSUES:                                                       │
│                                                                             │
│       /hive-debugger                                                        │
│                                                                             │
│     The debugger monitors runtime logs, identifies retry loops,             │
│     tool failures, and missing outputs, and provides fix recommendations.   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
| Type | tools param | Use when |
| ------------ | ----------------------- | --------------------------------------- |
| event_loop | '["tool1"]' or '[]' | All agent work (with or without tools, HITL via client_facing) |
| Field | Default | Description |
| ---------------------- | ------- | --------------------------------------------------------------------- |
| client_facing | False | Streams output to user, blocks for input between turns |
| nullable_output_keys | [] | Output keys that may remain unset (mutually exclusive outputs) |
| max_node_visits | 1 | Max executions per run. Set >1 for feedback loop targets. 0=unlimited |
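The `max_node_visits` semantics in the table (default 1, values above 1 for feedback-loop targets, 0 for unlimited) can be illustrated with a small counter. This is a sketch of the rule as described here, not the framework's implementation; `VisitTracker` is a hypothetical class.

```python
class VisitTracker:
    """Counts node executions in a run and enforces a per-node visit limit."""

    def __init__(self, limits):
        self.limits = limits  # node_id -> max_node_visits (0 means unlimited)
        self.counts = {}

    def try_visit(self, node_id):
        """Record a visit and return True, or return False if the limit is reached."""
        limit = self.limits.get(node_id, 1)  # default of 1, as in the table
        count = self.counts.get(node_id, 0)
        if limit != 0 and count >= limit:
            return False
        self.counts[node_id] = count + 1
        return True


tracker = VisitTracker({"research": 3, "chat": 0})
research_visits = [tracker.try_visit("research") for _ in range(4)]
```

With a limit of 3, the research node in the example graph can absorb two rounds of reviewer feedback after its first pass before the loop is cut off.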
| Condition | When edge is followed |
| ------------- | ------------------------------------- |
| on_success | Source node completed successfully |
| on_failure | Source node failed |
| always | Always, regardless of success/failure |
| conditional | When condition_expr evaluates to True |
Priority: Positive = forward edge (evaluated first). Negative = feedback edge (loops back to earlier node). Multiple ON_SUCCESS edges from same source = parallel execution (fan-out).
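One way to picture how conditions and priorities interact is to filter a node's outgoing edges by condition and order the survivors by priority, positive first. The sketch below is an assumption-laden illustration, not the framework's routing code: `eligible_edges` is hypothetical, edges are plain dicts, and a Python callable stands in for `condition_expr` so that nothing is evaluated from a string.

```python
def eligible_edges(edges, succeeded, outputs):
    """Return the edges whose condition holds, highest priority first."""

    def holds(edge):
        kind = edge["condition"]
        if kind == "always":
            return True
        if kind == "on_success":
            return succeeded
        if kind == "on_failure":
            return not succeeded
        if kind == "conditional":
            # A callable replaces condition_expr in this sketch.
            return bool(edge["predicate"](outputs))
        raise ValueError(f"unknown condition: {kind}")

    matching = [edge for edge in edges if holds(edge)]
    return sorted(matching, key=lambda edge: edge["priority"], reverse=True)


REVIEW_EDGES = [
    {"id": "review-to-research", "condition": "conditional", "priority": -1,
     "predicate": lambda out: out.get("feedback") is not None},
    {"id": "review-to-report", "condition": "conditional", "priority": 1,
     "predicate": lambda out: out.get("approved_findings") is not None},
]

only_feedback = eligible_edges(REVIEW_EDGES, True, {"feedback": "dig deeper"})
```

When the reviewer gives feedback and no approval, only the negative-priority loop back to research is eligible; if both outputs were set, the forward edge to report would be listed first.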
For internal event_loop nodes (not client-facing), instruct the LLM to use set_output:
Use set_output(key, value) to store your results. For example:
- set_output("search_results", <your results as a JSON string>)
Do NOT return raw JSON. Use the set_output tool to produce outputs.
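When several internal nodes need this instruction, the text can be generated from each node's output keys so the prompt and the node definition stay in sync. The helper below is a hypothetical sketch that reuses the wording above; `internal_node_output_prompt` is not a framework function.

```python
def internal_node_output_prompt(output_keys):
    """Build the set_output instructions for a node that is not client-facing."""
    lines = ["Use set_output(key, value) to store your results. For example:"]
    for key in output_keys:
        lines.append(f'- set_output("{key}", <your results as a JSON string>)')
    lines.append("Do NOT return raw JSON. Use the set_output tool to produce outputs.")
    return "\n".join(lines)


prompt_text = internal_node_output_prompt(["findings", "sources"])
```

The returned text can be appended to the node's system_prompt before the add_node call.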
For client-facing event_loop nodes, use the STEP 1/STEP 2 pattern:
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
- set_output("key", "value based on user's response")
This prevents the LLM from calling set_output before the user has had a chance to respond. The "NO tool calls" instruction in STEP 1 ensures the node blocks for user input before proceeding.
EventLoopNodes are auto-created by GraphExecutor at runtime. Both direct GraphExecutor and AgentRuntime / create_agent_runtime() handle event_loop nodes automatically. No manual node_registry setup is needed.
# Direct execution
from pathlib import Path  # needed for storage_path below
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime
# llm, tools, tool_executor, graph, goal, and input_data must already be defined by the caller
storage_path = Path.home() / ".hive" / "agents" / "my_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path)
executor = GraphExecutor(
    runtime=runtime,
    llm=llm,
    tools=tools,
    tool_executor=tool_executor,
    storage_path=storage_path,
)
# execute() is awaited, so this line must run inside an async function
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
DO NOT pass runtime=None to GraphExecutor — it will crash with 'NoneType' object has no attribute 'start_run'.
Agents have two lifecycle patterns:
Linear (terminal) graphs have terminal_nodes=["last-node"]. Execution ends when the terminal node completes. The session enters a "completed" state. Use for batch processing, one-shot generation, and fire-and-forget tasks.
Forever-alive graphs have terminal_nodes=[] (empty). Every node has at least one outgoing edge — the graph loops indefinitely. The session never enters a "completed" state — this is intentional. The agent stays alive until the user explicitly exits. Use for interactive assistants, research tools, and any agent where the user drives the conversation.
The deep_research_agent example demonstrates this: report loops back to either research (dig deeper) or intake (new topic). The agent is a persistent, interactive assistant.
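The two lifecycle patterns imply a simple structural check: in a forever-alive graph every node needs an outgoing edge, and in a linear graph only terminal nodes may lack one. The sketch below illustrates that rule with edges as plain (source, target) pairs; `check_lifecycle` is a hypothetical helper and is separate from whatever `validate_graph()` does internally.

```python
def check_lifecycle(node_ids, edges, terminal_nodes):
    """Return a list of lifecycle problems; edges is a list of (source, target) pairs."""
    problems = []
    known = set(node_ids)
    for node_id in terminal_nodes:
        if node_id not in known:
            problems.append(f"terminal node {node_id!r} is not a node in the graph")
    sources = {source for source, _target in edges}
    for node_id in node_ids:
        if node_id in sources or node_id in terminal_nodes:
            continue
        if terminal_nodes:
            problems.append(f"{node_id!r} has no outgoing edge and is not a terminal node")
        else:
            problems.append(f"{node_id!r} has no outgoing edge, so a forever-alive graph would stall there")
    return problems


NODES = ["intake", "research", "review", "report"]
LOOPING_EDGES = [
    ("intake", "research"), ("research", "review"),
    ("review", "report"), ("review", "research"),
    ("report", "research"), ("report", "intake"),
]
forever_alive_problems = check_lifecycle(NODES, LOOPING_EDGES, terminal_nodes=[])
```

Dropping the two edges that leave report and keeping `terminal_nodes=[]` would flag report as a dead end, which is the mistake the forever-alive pattern is meant to avoid.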
When conversation_mode="continuous" is set on the GraphSpec, the framework preserves a single conversation thread across all node transitions:
What the framework does automatically:
What this means for agent builders:
When to use continuous mode:
When NOT to use continuous mode:
Use this reference during STEP 2 to give accurate, honest assessments.
| Capability | Description |
|------------|-------------|
| Multi-turn conversations | Client-facing nodes stream to users and block for input |
| Human-in-the-loop review | Approval checkpoints with feedback loops back to earlier nodes |
| Tool orchestration | LLM can call multiple tools, framework handles execution |
| Structured outputs | set_output produces validated, typed outputs |
| Parallel execution | Fan-out/fan-in for concurrent node execution |
| Context management | Automatic compaction and spillover for large data |
| Error recovery | Retry logic, judges, and feedback edges for self-correction |
| Session persistence | State saved to disk, resumable sessions |
| Limitation | Impact | Workaround |
|------------|--------|------------|
| LLM latency | 2-10+ seconds per turn | Not suitable for real-time/low-latency needs |
| Context window limits | ~128K tokens max | Use data tools for spillover, design for chunking |
| Cost per run | LLM API calls cost money | Budget planning, caching where possible |
| Rate limits | API throttling on heavy usage | Backoff, queue management |
| Node boundaries lose context | Outputs must be serialized | Prefer fewer, richer nodes |
| Single-threaded within node | One LLM call at a time per node | Use fan-out for parallelism |
| Use Case | Why It's Problematic | Alternative |
|----------|---------------------|-------------|
| Persistent background daemons (no user) | Forever-alive graphs need a user at client-facing nodes; no autonomous background polling without user | External scheduler triggering agent runs |
| Sub-second responses | LLM latency is inherent | Traditional code, no LLM |
| Processing millions of items | Context windows and rate limits | Batch processing + sampling |
| Real-time streaming data | No built-in pub/sub or streaming input | Custom MCP server + agent |
| Guaranteed determinism | LLM outputs vary | Traditional code for deterministic parts |
| Offline/air-gapped | Requires LLM API access | Local models (not currently supported) |
| Multi-user concurrency | Single-user session model | Separate agent instances per user |
Before promising any capability, check list_mcp_tools(). Common gaps:
- send_email: check before promising email automation

- Check mcp__agent-builder__list_mcp_tools() first
- entry_points must be {"start": "node-id"}, NOT a set or list
- mcp_servers.json must be a flat dict (no "mcpServers" wrapper), cwd must be "../../tools", and command must be "uv" with args ["run", "python", ...]
- Interactive agents should use terminal_nodes=[] (forever-alive pattern). The agent never enters "completed" state, which is intentional. Only batch/one-shot agents need terminal nodes
- Use conversation_mode="continuous" to preserve context across node transitions. Without it, each node starts with a blank conversation and loses all prior context
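The entry_points mistake is easy to catch before writing agent.py. The check below is a minimal sketch under the format rule stated in this skill (a dict with a "start" key naming the first node); `check_entry_points` is a hypothetical helper, and whether the framework accepts additional keys is not covered here.

```python
def check_entry_points(entry_points, node_ids):
    """Return a list of problems; an empty list means the value has the expected shape."""
    if not isinstance(entry_points, dict):
        return [f"entry_points must be a dict, got {type(entry_points).__name__}"]
    start = entry_points.get("start")
    if not isinstance(start, str):
        return ['entry_points needs a "start" key whose value is a node id string']
    if start not in node_ids:
        return [f"start node {start!r} is not defined in the graph"]
    return []


NODE_IDS = ["intake", "research", "review", "report"]
ok = check_entry_points({"start": "intake"}, NODE_IDS)
```

Both wrong forms called out in this skill are rejected: a set such as `{"intake"}` fails the dict check, and `{"intake": ["topic"]}` fails because it has no "start" key.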