Google ADK Python — Agent Development Kit

Version: Python SDK (google-adk). Reference docs: https://google.github.io/adk-docs/

Quick Reference Index

| Topic | Section in this file |
|---|---|
| Installation & project setup | Setup |
| LLM Agent (core) | LlmAgent |
| Tools (function, long-running, agent-as-tool) | Tools |
| Multi-agent systems | Multi-Agent |
| Session, State, Memory | Context |
| Callbacks | Callbacks |
| Running agents (Runner, CLI, web) | Running Agents |
| Models (Gemini, Claude, LiteLLM, Ollama) | Models |
| Deployment (Agent Engine, Cloud Run) | Deployment |
| Streaming (bidi) | Streaming |
| MCP Integration | MCP Tools |
| Common patterns & gotchas | Patterns & Gotchas |

For deep reference on a specific topic, read references/<topic>.md (loaded on demand).

Setup

Requirements: Python 3.10+, pip

# Optional but recommended: create and activate a virtual environment first
python -m venv .venv && source .venv/bin/activate
pip install google-adk

Scaffold a new agent project:

adk create my_agent        # creates my_agent/ with agent.py, .env, __init__.py
adk run my_agent           # CLI interactive session
adk web --port 8000        # Web UI (dev only, not for production)

Project structure:

my_agent/
├── agent.py        # REQUIRED: defines root_agent
├── __init__.py
└── .env            # API keys / GCP project IDs

API Key (Gemini via Google AI Studio):

# my_agent/.env
GOOGLE_API_KEY="YOUR_KEY"

LlmAgent

LlmAgent (aliased as Agent) is the primary thinking agent. Import from:

from google.adk.agents import LlmAgent, Agent  # Agent is an alias

Minimal Agent

from google.adk.agents import Agent

root_agent = Agent(
    model="gemini-2.5-flash",
    name="root_agent",           # REQUIRED: unique string, no spaces
    description="Short summary of what this agent does.",  # used by other agents for routing
    instruction="You are a helpful assistant. ...",
)

Key Parameters

| Parameter | Type | Notes |
|---|---|---|
| model | str \| BaseLlm | Required. A model name such as "gemini-2.5-flash" or "gemini-2.0-flash", or a model wrapper such as LiteLlm(...). |
| name | str | Required. Unique. Avoid "user" (reserved). |
| instruction | str \| Callable | Core behavior. Supports {state_var} templating. |
| description | str | Used by parent agents for routing/delegation. |
| tools | list | Python functions or BaseTool instances. |
| sub_agents | list | Child agents for delegation. |
| output_key | str | Auto-save final response to session.state[output_key]. |
| output_schema | Pydantic BaseModel | Enforce JSON output. |
| input_schema | Pydantic BaseModel | Enforce JSON input. |
| include_contents | 'default' \| 'none' | Pass or suppress conversation history. |
| generate_content_config | GenerateContentConfig | Temperature, max tokens, safety. |
| planner | BasePlanner | BuiltInPlanner or PlanReActPlanner. |
| code_executor | BaseCodeExecutor | Enable code execution (e.g., BuiltInCodeExecutor). |

Instruction Templating (State Variables)

# Access session state in instructions with {var}
instruction = "User's name is {user_name?}. Greet them."
# {var?} = optional (won't error if missing); {var} = required
# {artifact.filename} = read artifact text content
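
The placeholder rules above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not ADK's implementation: `render_instruction` is a hypothetical helper, it handles simple `{name}` and `{name?}` placeholders only (no `artifact.` lookups), and raising `KeyError` for a missing required variable is an assumption of the sketch.

```python
import re

# Matches {name} and {name?}; group 2 is "?" when the placeholder is optional.
_PLACEHOLDER = re.compile(r"\{(\w+)(\?)?\}")


def render_instruction(template: str, state: dict) -> str:
    """Substitute {key} and {key?} placeholders with values from a state dict."""

    def replace(match: re.Match) -> str:
        key, optional = match.group(1), match.group(2)
        if key in state:
            return str(state[key])
        if optional:
            return ""  # optional placeholder: a missing key becomes empty text
        raise KeyError(f"Required state variable missing: {key}")

    return _PLACEHOLDER.sub(replace, template)


template = "User's name is {user_name?}. Greet them."
print(render_instruction(template, {"user_name": "Alice"}))
print(render_instruction(template, {}))
```

With an empty state the optional placeholder simply disappears, whereas a template using `{user_name}` would fail in this sketch. That is the practical reason to prefer `{var?}` for values that may not be set yet.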

Structured Output

from pydantic import BaseModel, Field

class SummaryOutput(BaseModel):
    title: str = Field(description="The document title")
    summary: str = Field(description="A 2-sentence summary")

agent = Agent(
    model="gemini-2.5-flash",
    name="summarizer",
    instruction='Respond ONLY with valid JSON matching the schema.',
    output_schema=SummaryOutput,
    output_key="summary_result",  # saves to session.state["summary_result"]
)
# NOTE: output_schema disables tool use. Use one or the other.

LLM Config (Temperature, Tokens, Safety)

from google.genai import types

agent = Agent(
    model="gemini-2.5-flash",
    name="careful_agent",
    generate_content_config=types.GenerateContentConfig(
        temperature=0.1,
        max_output_tokens=512,
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            )
        ],
    ),
)

Planners

from google.adk.agents import Agent
from google.adk.planners import BuiltInPlanner, PlanReActPlanner
from google.genai.types import ThinkingConfig

# For Gemini models with thinking support:
agent = Agent(
    model="gemini-2.5-pro",
    name="thinking_agent",
    planner=BuiltInPlanner(
        thinking_config=ThinkingConfig(include_thoughts=True, thinking_budget=1024)
    ),
)

# For models without built-in thinking (forces structured plan → act format):
agent = Agent(model="gemini-2.0-flash", name="react_agent", planner=PlanReActPlanner())

Tools

Function Tool (Python function → Tool)

ADK auto-wraps Python functions as FunctionTool. The docstring, type hints, and parameter names directly shape the schema sent to the LLM.

def get_weather(city: str, unit: str = "Celsius") -> dict:
    """Returns current weather for a city.

    Args:
        city: The city name.
        unit: Temperature unit, 'Celsius' or 'Fahrenheit'. Defaults to 'Celsius'.

    Returns:
        dict with 'status' and 'report' keys.
    """
    # ... real logic here
    return {"status": "success", "report": f"It is sunny in {city}, 22°{unit[0]}"}

agent = Agent(model="gemini-2.5-flash", name="weather_agent", tools=[get_weather])

Rules for tools:

Always return a dict. Non-dict returns are wrapped as {"result": value}.
Use status key ("success" / "error") — the LLM reads this.
Use clear docstrings — the LLM uses them to decide when/how to call the tool.
*args and **kwargs are ignored by the schema generator.
Optional[str] = None marks a parameter as optional.
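
A small sketch of these rules in plain Python, without importing ADK. `normalize_tool_result` is a hypothetical helper that only models the wrapping rule stated above, and `lookup_capital` is a made-up tool showing an optional parameter and the `status` convention.

```python
from typing import Any, Optional


def normalize_tool_result(value: Any) -> dict:
    """Model of the wrapping rule: dicts pass through, anything else becomes {"result": value}."""
    return value if isinstance(value, dict) else {"result": value}


def lookup_capital(country: str, language: Optional[str] = None) -> dict:
    """Returns the capital city of a country.

    Args:
        country: Country name in English.
        language: Optional language code for the answer; unused in this toy example.

    Returns:
        dict with 'status' and either 'capital' or 'error_message'.
    """
    capitals = {"France": "Paris", "Japan": "Tokyo"}
    if country not in capitals:
        return {"status": "error", "error_message": f"Unknown country: {country}"}
    return {"status": "success", "capital": capitals[country]}


print(normalize_tool_result(lookup_capital("France")))
print(normalize_tool_result(42))
```

Returning a dict explicitly keeps the key names under your control; relying on the automatic wrapping gives the model only a bare `result` key to interpret.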

Passing Context to Tools (`ToolContext`)

from google.adk.tools import ToolContext

def save_preference(preference: str, tool_context: ToolContext) -> dict:
    """Saves a user preference to session state."""
    tool_context.state["user_preference"] = preference
    return {"status": "success", "saved": preference}
# ADK injects tool_context automatically; do not document it in the docstring's Args section

Long-Running Tool

from google.adk.tools import LongRunningFunctionTool

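def process_large_file(file_path: str) -> dict:
    """Starts processing a large file and reports the initial job status."""
    # Kick off the long operation here and return right away with an initial status;
    # the final result is reported back to the agent later, once the work finishes.
    return {"status": "pending", "file_path": file_path}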

long_tool = LongRunningFunctionTool(func=process_large_file)
agent = Agent(model="gemini-2.5-flash", name="processor", tools=[long_tool])

Agent-as-Tool (`AgentTool`)

from google.adk.agents import Agent
from google.adk.tools.agent_tool import AgentTool

specialist = Agent(name="Specialist", model="gemini-2.5-flash",
                   description="Expert in data analysis.", instruction="...")

orchestrator = Agent(
    name="Orchestrator",
    model="gemini-2.5-flash",
    tools=[AgentTool(agent=specialist)],
    instruction="Use Specialist for data tasks.",
)
# Unlike sub_agents, AgentTool is called as a function and returns the result inline.

Multi-Agent Systems

Agent Hierarchy & Delegation

from google.adk.agents import LlmAgent

booking_agent = LlmAgent(name="Booker", model="gemini-2.5-flash",
                         description="Handles flight and hotel bookings.")
info_agent = LlmAgent(name="Info", model="gemini-2.5-flash",
                      description="Answers general questions and provides information.")

root_agent = LlmAgent(
    name="Coordinator",
    model="gemini-2.5-flash",
    instruction="Delegate booking tasks to Booker, info queries to Info.",
    sub_agents=[booking_agent, info_agent],
    # AutoFlow handles transfer_to_agent() calls automatically
)

Rules:

Each agent instance can only have one parent (ValueError if added twice).
Target agents need descriptive description fields for LLM routing.
Use root_agent.find_agent("name") to look up agents by name.

Sequential Agent

from google.adk.agents import LlmAgent, SequentialAgent

fetch = LlmAgent(name="Fetch", model="gemini-2.5-flash",
                 instruction="Fetch data about {topic}.", output_key="raw_data")
process = LlmAgent(name="Process", model="gemini-2.5-flash",
                   instruction="Process this data: {raw_data}.", output_key="result")

pipeline = SequentialAgent(name="Pipeline", sub_agents=[fetch, process])
# fetch runs first, saves to state['raw_data']; process reads it via {raw_data} template

Parallel Agent

from google.adk.agents import LlmAgent, ParallelAgent

weather = LlmAgent(name="Weather", model="gemini-2.5-flash",
                   instruction="Get weather for {city}.", output_key="weather")
news = LlmAgent(name="News", model="gemini-2.5-flash",
                instruction="Get news for {city}.", output_key="news")

gatherer = ParallelAgent(name="Gatherer", sub_agents=[weather, news])
# Runs concurrently. Both write to shared session.state (use distinct output_key values!)
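
Why distinct keys matter can be seen in a plain asyncio sketch. This does not use ADK: the two coroutines are hypothetical stand-ins for the Weather and News agents, and the shared dict stands in for session.state.

```python
import asyncio


async def fetch_weather(city: str, state: dict) -> None:
    await asyncio.sleep(0.01)  # simulate a model or API call
    state["weather"] = f"weather for {city}"


async def fetch_news(city: str, state: dict) -> None:
    await asyncio.sleep(0.01)
    state["news"] = f"news for {city}"


async def gather_all(city: str) -> dict:
    """Run both branches concurrently against one shared state dict."""
    state = {"city": city}
    await asyncio.gather(fetch_weather(city, state), fetch_news(city, state))
    return state


result = asyncio.run(gather_all("Paris"))
print(result)
```

If both branches wrote to the same key, the surviving value would depend on which branch finished last, so each parallel branch should own its key.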

Loop Agent

from typing import AsyncGenerator

from google.adk.agents import BaseAgent, LlmAgent, LoopAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event, EventActions

class StopWhenDone(BaseAgent):
    """Escalates (ends the loop) once state['task_complete'] is truthy."""
    async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
        done = ctx.session.state.get("task_complete", False)
        yield Event(author=self.name, actions=EventActions(escalate=done))

# The worker needs a tool that writes tool_context.state["task_complete"] to set the flag.
worker = LlmAgent(name="Worker", model="gemini-2.5-flash",
                  instruction="Do one step. Set state task_complete=True when done.")

loop = LoopAgent(
    name="RetryLoop",
    max_iterations=5,
    sub_agents=[worker, StopWhenDone(name="Checker")]
)
# Loop stops when Checker escalates OR max_iterations (5) reached.
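
The stopping rule can be sketched without ADK as a plain loop. In this hypothetical model each step is a function that mutates a state dict and returns True to escalate; `run_loop`, `worker_step`, and `checker_step` are illustrative names, not ADK APIs.

```python
from typing import Callable, List

Step = Callable[[dict], bool]  # returns True to escalate, which ends the loop


def run_loop(steps: List[Step], state: dict, max_iterations: int) -> int:
    """Run the steps in order, repeatedly; return the number of iterations started."""
    for iteration in range(1, max_iterations + 1):
        for step in steps:
            if step(state):
                return iteration  # escalation ends the loop immediately
    return max_iterations


def worker_step(state: dict) -> bool:
    state["count"] = state.get("count", 0) + 1
    state["task_complete"] = state["count"] >= 3
    return False  # the worker itself never escalates


def checker_step(state: dict) -> bool:
    return state.get("task_complete", False)


state: dict = {}
iterations = run_loop([worker_step, checker_step], state, max_iterations=5)
print(iterations, state)
```

The checker runs after the worker in each iteration, so the loop ends in the same iteration in which the flag is set; max_iterations is the safety net for the case where the flag never appears.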

Context: Session, State, Memory

Session & Runner Setup

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
import asyncio

APP_NAME = "my_app"
USER_ID = "user_001"
SESSION_ID = "session_001"

session_service = InMemorySessionService()
session = asyncio.run(session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
))
runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)

Note: InMemorySessionService is for dev/testing only. All data is lost on restart.
For production use VertexAiSessionService or DatabaseSessionService.

Reading & Writing State

from google.adk.tools import ToolContext

# In a tool:
def update_cart(item: str, tool_context: ToolContext) -> dict:
    cart = tool_context.state.get("cart", [])
    cart.append(item)
    tool_context.state["cart"] = cart
    return {"status": "success", "cart": cart}

# State key prefixes:
# "key"       → persists for session lifetime
# "user:key"  → persists across sessions for this user
# "app:key"   → persists across all users/sessions for this app
# "temp:key"  → only for current invocation turn (not persisted)
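
As a quick way to reason about where a key lives, the prefix rules can be expressed as a small lookup. `state_scope` is a hypothetical helper written for illustration; it only classifies key names and does not touch any ADK session service.

```python
def state_scope(key: str) -> str:
    """Classify a state key by its prefix: 'user', 'app', 'temp', or 'session'."""
    for prefix in ("user", "app", "temp"):
        if key.startswith(prefix + ":"):
            return prefix
    return "session"  # no recognised prefix: scoped to the current session


keys = ["cart", "user:language", "app:banner", "temp:scratch"]
for key in keys:
    print(f"{key!r} -> {state_scope(key)}")
```

A key such as "username" has no colon-terminated prefix, so it stays session-scoped even though it begins with the letters "user".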

Passing Initial State to Session

session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID,
    state={"user_name": "Alice", "language": "en"}
)

Running the Agent

from google.genai import types

async def call_agent(query: str):
    content = types.Content(role="user", parts=[types.Part(text=query)])
    async for event in runner.run_async(
        user_id=USER_ID, session_id=SESSION_ID, new_message=content
    ):
        if event.is_final_response() and event.content:
            print("Response:", event.content.parts[0].text)

Memory Service (Cross-Session)

from google.adk.memory import InMemoryMemoryService  # dev only

memory_service = InMemoryMemoryService()
runner = Runner(agent=root_agent, app_name=APP_NAME,
                session_service=session_service, memory_service=memory_service)
# For production: a Vertex AI-backed memory service, e.g. VertexAiRagMemoryService

Callbacks

Callbacks let you observe and modify agent behavior at key lifecycle points.

from typing import Optional

from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse
from google.adk.tools import BaseTool, ToolContext

# --- Before model call ---
def my_before_model(callback_context: CallbackContext, llm_request: LlmRequest) -> Optional[LlmResponse]:
    print(f"[Callback] About to call LLM. Invocation: {callback_context.invocation_id}")
    # Return an LlmResponse to SKIP the actual model call
    return None  # None = proceed normally

# --- After model call ---
def my_after_model(callback_context: CallbackContext, llm_response: LlmResponse) -> Optional[LlmResponse]:
    # Return a modified LlmResponse to replace the model output; None keeps the original
    return llm_response

# --- Before tool call ---
# Tool callbacks receive a ToolContext; the parameter must be named tool_context.
def my_before_tool(tool: BaseTool, args: dict, tool_context: ToolContext) -> Optional[dict]:
    print(f"[Callback] Tool '{tool.name}' called with {args}")
    # Return a dict to SHORT-CIRCUIT the tool call with that result
    return None  # None = proceed normally

agent = Agent(
    model="gemini-2.5-flash",
    name="monitored_agent",
    before_model_callback=my_before_model,
    after_model_callback=my_after_model,
    before_tool_callback=my_before_tool,
)

Available callbacks:

before_agent_callback / after_agent_callback
before_model_callback / after_model_callback
before_tool_callback / after_tool_callback
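
The short-circuit convention (return None to proceed, return a value to replace the call) can be modelled in plain Python. This sketch does not use ADK: `run_tool`, `delete_file`, and `block_system_paths` are hypothetical names, and the guard receives only a tool name and its arguments rather than ADK's real callback parameters.

```python
from typing import Any, Callable, Dict, Optional

ToolFn = Callable[..., dict]
BeforeTool = Callable[[str, Dict[str, Any]], Optional[dict]]


def run_tool(tool: ToolFn, args: Dict[str, Any], before_tool: BeforeTool) -> dict:
    """Call before_tool first; a non-None return value replaces the real tool call."""
    override = before_tool(tool.__name__, args)
    if override is not None:
        return override
    return tool(**args)


def delete_file(path: str) -> dict:
    """Toy tool: pretends to delete a file."""
    return {"status": "success", "deleted": path}


def block_system_paths(tool_name: str, args: Dict[str, Any]) -> Optional[dict]:
    """Guard: refuse any call that targets a path under /etc."""
    path = str(args.get("path", ""))
    if path.startswith("/etc"):
        return {"status": "error", "error_message": f"{tool_name} blocked for {path}"}
    return None  # None means: run the real tool


print(run_tool(delete_file, {"path": "/tmp/a.txt"}, block_system_paths))
print(run_tool(delete_file, {"path": "/etc/passwd"}, block_system_paths))
```

The same shape applies to the model callbacks: returning a response object in place of None skips the underlying call.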

Running Agents

CLI Commands

adk run my_agent              # Interactive CLI chat
adk web --port 8000           # Web UI (dev only)
adk api_server                # Start local REST API server
adk eval my_agent evals/      # Run evaluations
adk deploy agent_engine ...   # Deploy to Vertex AI Agent Engine

Async Runner Pattern (Recommended)

import asyncio
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

async def main():
    session_service = InMemorySessionService()
    session = await session_service.create_session(
        app_name="app", user_id="u1", session_id="s1"
    )
    runner = Runner(agent=root_agent, app_name="app", session_service=session_service)

    content = types.Content(role="user", parts=[types.Part(text="Hello!")])
    async for event in runner.run_async(user_id="u1", session_id="s1", new_message=content):
        if event.is_final_response() and event.content and event.content.parts:
            print(event.content.parts[0].text)

asyncio.run(main())

InMemoryRunner (Simple Testing)

import asyncio

from google.adk.runners import InMemoryRunner  # convenience wrapper with in-memory services

runner = InMemoryRunner(agent=root_agent)
session = asyncio.run(runner.session_service.create_session(
    app_name=runner.app_name, user_id="u1"
))
# then call runner.run_async(user_id="u1", session_id=session.id, ...) as above

Models

Gemini (default)

Agent(model="gemini-2.5-flash", ...)   # fast, efficient
Agent(model="gemini-2.5-pro", ...)     # most capable
Agent(model="gemini-2.0-flash", ...)   # balanced

Set GOOGLE_API_KEY in .env for Google AI Studio.
For Vertex AI: set GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION.

Claude (Anthropic) via LiteLLM

# pip install litellm
from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm

agent = Agent(
    model=LiteLlm(model="anthropic/claude-sonnet-4-6"),
    name="claude_agent",
    instruction="...",
)
# The anthropic/ prefix calls the Anthropic API and requires ANTHROPIC_API_KEY.
# Claude on Vertex AI needs a separate Vertex AI setup.

LiteLLM (100+ models)

from google.adk.models.lite_llm import LiteLlm

agent = Agent(
    model=LiteLlm(model="openai/gpt-4o"),
    name="gpt_agent",
    instruction="...",
)
# Set relevant API keys in .env (OPENAI_API_KEY, etc.)

Ollama (Local Models)

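from google.adk.models.lite_llm import LiteLlm

agent = Agent(
    # Use the ollama_chat provider prefix; the plain ollama prefix is reported to
    # cause tool-call loops and lost context. Pick a model that supports tool calling.
    model=LiteLlm(model="ollama_chat/llama3"),
    name="local_agent",
    instruction="...",
)
# Run: ollama serve (default: http://localhost:11434)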

Deployment

Vertex AI Agent Engine (Managed, Production)

# 1. Authenticate
gcloud auth login
gcloud auth application-default login

# 2. Enable APIs
# - Vertex AI API
# - Cloud Resource Manager API

# 3. Deploy
adk deploy agent_engine \
    --project=MY_PROJECT_ID \
    --region=us-central1 \
    --display_name="My Agent" \
    my_agent/

After deployment, interact via REST:

POST https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/reasoningEngines/{RESOURCE_ID}:query
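
The endpoint is just the template above with three values filled in. A minimal sketch, assuming a hypothetical helper name `query_url` and placeholder project, location, and resource values:

```python
def query_url(project: str, location: str, resource_id: str) -> str:
    """Build the :query endpoint URL from the template shown above."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"reasoningEngines/{resource_id}:query"
    )


# Placeholder values for illustration only.
print(query_url("my-project", "us-central1", "1234567890"))
```

Note that the location appears twice, once in the hostname and once in the resource path; both must match the region the agent was deployed to.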

Or via Vertex AI SDK:

from vertexai import agent_engines

agent_engine = agent_engines.get("projects/.../reasoningEngines/RESOURCE_ID")

Cloud Run

adk deploy cloud_run \
    --project=MY_PROJECT_ID \
    --region=us-central1 \
    my_agent/

Streaming

Bidi-Streaming (Live) Agent

from google.adk.agents import LiveRequestQueue
from google.adk.runners import Runner

runner = Runner(agent=root_agent, app_name="app", session_service=session_service)
live_request_queue = LiveRequestQueue()

async def stream_agent():
    async for event in runner.run_live(
        user_id="u1", session_id="s1",
        live_request_queue=live_request_queue
    ):
        if event.content:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="", flush=True)

# Send messages via live_request_queue.send_content(...)

MCP Tools

Use an MCP Server as Tools in ADK

from google.adk.agents import Agent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters

# Recent ADK releases construct the toolset directly and manage the connection
# lifecycle themselves. Older releases exposed an async MCPToolset.from_server(...)
# factory that returned (tools, exit_stack) instead; check your installed version.
root_agent = Agent(
    model="gemini-2.5-flash",
    name="mcp_agent",
    instruction="Use the filesystem tools to help the user.",
    tools=[
        MCPToolset(
            connection_params=StdioServerParameters(
                command="npx",
                args=["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"],
            )
        )
    ],
)

For SSE-based MCP servers:

from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseServerParams

sse_toolset = MCPToolset(
    connection_params=SseServerParams(url="http://localhost:3000/sse")
)
# Pass sse_toolset in the agent's tools list, as above.

Patterns & Gotchas

✅ Correct `root_agent` export (required by ADK)

# agent.py — root_agent must be defined at module level
from google.adk.agents import Agent

def my_tool(x: int) -> dict:
    """Does something."""
    return {"result": x * 2}

root_agent = Agent(
    model="gemini-2.5-flash",
    name="root_agent",
    instruction="You are helpful.",
    tools=[my_tool],
)

✅ Sequential state passing pattern

# Use output_key + {var} template for pipeline data flow
from google.adk.agents import Agent, SequentialAgent

MODEL = "gemini-2.5-flash"
step1 = Agent(name="Step1", model=MODEL, instruction="Extract topic from input.", output_key="topic")
step2 = Agent(name="Step2", model=MODEL, instruction="Research {topic} in depth.", output_key="research")
step3 = Agent(name="Step3", model=MODEL, instruction="Write a report about {research}.")
pipeline = SequentialAgent(name="Pipeline", sub_agents=[step1, step2, step3])

❌ Avoid `output_schema` + `tools` together

output_schema forces JSON-only mode which disables tool calls. Use one or the other.

❌ Avoid duplicate agent instances in sub_agents

# WRONG — agent can only have one parent
shared_agent = Agent(name="Shared", ...)
parent1 = Agent(name="P1", sub_agents=[shared_agent])
parent2 = Agent(name="P2", sub_agents=[shared_agent])  # ValueError!

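# RIGHT: create a separate instance for each parent
shared_for_p1 = Agent(name="SharedForP1", ...)
shared_for_p2 = Agent(name="SharedForP2", ...)
parent1 = Agent(name="P1", sub_agents=[shared_for_p1])
parent2 = Agent(name="P2", sub_agents=[shared_for_p2])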

✅ Async-first design

ADK is async-native. Always use run_async and asyncio.run(main()) in scripts.
For Jupyter/Colab, use await directly at the top level.

✅ Tool error handling

def safe_tool(param: str) -> dict:
    """Does something safely."""
    try:
        result = do_work(param)
        return {"status": "success", "result": result}
    except Exception as e:
        return {"status": "error", "error_message": str(e)}
# Always return {"status": "error", "error_message": "..."} on failure
# Never raise exceptions from tools — the LLM needs to read the error
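
To avoid repeating the try/except in every tool, the pattern can be factored into a decorator. This is a sketch in plain Python: `returns_status` is a hypothetical name, and whether a schema generator reads the original signature through the wrapper is an assumption worth verifying against your ADK version (functools.wraps preserves the name and docstring and sets `__wrapped__`, which inspect.signature follows).

```python
import functools
from typing import Any, Callable


def returns_status(func: Callable[..., Any]) -> Callable[..., dict]:
    """Wrap a function so it always returns a status dict instead of raising."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> dict:
        try:
            return {"status": "success", "result": func(*args, **kwargs)}
        except Exception as exc:  # report the failure as data the LLM can read
            return {"status": "error", "error_message": str(exc)}

    return wrapper


@returns_status
def divide(a: float, b: float) -> float:
    """Divides a by b."""
    return a / b


print(divide(6, 3))
print(divide(1, 0))
```

Catching Exception rather than BaseException lets KeyboardInterrupt and SystemExit propagate, so the process can still be stopped normally.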

✅ State key prefix reference

| Prefix | Scope |
|---|---|
| (none) | Current session |
| user: | All sessions for this user |
| app: | All sessions in this app |
| temp: | Current invocation only (not persisted) |

Reference Files

For deeper content on specific topics, read the relevant reference file:

references/callbacks.md — Callback patterns and best practices
references/custom-agents.md — Building BaseAgent subclasses
references/evaluate.md — Agent evaluation and testing
references/models-auth.md — Model authentication details for all providers
references/artifacts.md — Working with ADK Artifacts (file-like objects)

Load these files only when the user's task specifically requires that depth.

Google ADK Python — Agent Development Kit

Version: Python SDK (google-adk). Reference docs: https://google.github.io/adk-docs/

Quick Reference Index

For deep reference on a specific topic, read references/<topic>.md (loaded on demand).

Setup

Requirements: Python 3.10+, pip

pip install google-adk
# Optional: virtual environment (recommended)
python -m venv .venv && source .venv/bin/activate

Scaffold a new agent project:

adk create my_agent        # creates my_agent/ with agent.py, .env, __init__.py
adk run my_agent           # CLI interactive session
adk web --port 8000        # Web UI (dev only, not for production)

Project structure:

my_agent/
├── agent.py        # REQUIRED: defines root_agent
├── __init__.py
└── .env            # API keys / GCP project IDs

API Key (Gemini via Google AI Studio):

# my_agent/.env
GOOGLE_API_KEY="YOUR_KEY"

LlmAgent

LlmAgent (aliased as Agent) is the primary thinking agent. Import from:

from google.adk.agents import LlmAgent, Agent  # Agent is an alias

Minimal Agent

from google.adk.agents import Agent

root_agent = Agent(
    model="gemini-2.5-flash",
    name="root_agent",           # REQUIRED: unique string, no spaces
    description="Short summary of what this agent does.",  # used by other agents for routing
    instruction="You are a helpful assistant. ...",
)

Key Parameters

Instruction Templating (State Variables)

# Access session state in instructions with {var}
instruction = "User's name is {user_name?}. Greet them."
# {var?} = optional (won't error if missing); {var} = required
# {artifact.filename} = read artifact text content

Structured Output

from pydantic import BaseModel, Field

class SummaryOutput(BaseModel):
    title: str = Field(description="The document title")
    summary: str = Field(description="A 2-sentence summary")

agent = Agent(
    model="gemini-2.5-flash",
    name="summarizer",
    instruction='Respond ONLY with valid JSON matching the schema.',
    output_schema=SummaryOutput,
    output_key="summary_result",  # saves to session.state["summary_result"]
)
# NOTE: output_schema disables tool use. Use one or the other.

LLM Config (Temperature, Tokens, Safety)

from google.genai import types

agent = Agent(
    model="gemini-2.5-flash",
    name="careful_agent",
    generate_content_config=types.GenerateContentConfig(
        temperature=0.1,
        max_output_tokens=512,
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            )
        ],
    ),
)

Planners

from google.adk.planners import BuiltInPlanner, PlanReActPlanner
from google.genai.types import ThinkingConfig

# For Gemini models with thinking support:
agent = Agent(
    model="gemini-2.5-pro-preview-03-25",
    planner=BuiltInPlanner(
        thinking_config=ThinkingConfig(include_thoughts=True, thinking_budget=1024)
    ),
    ...
)

# For models without built-in thinking (forces structured plan → act format):
agent = Agent(model="gemini-2.0-flash", planner=PlanReActPlanner(), ...)

Tools

Function Tool (Python function → Tool)

ADK auto-wraps Python functions as FunctionTool. The docstring, type hints, and parameter names directly shape the schema sent to the LLM.

def get_weather(city: str, unit: str = "Celsius") -> dict:
    """Returns current weather for a city.

    Args:
        city: The city name.
        unit: Temperature unit, 'Celsius' or 'Fahrenheit'. Defaults to 'Celsius'.

    Returns:
        dict with 'status' and 'report' keys.
    """
    # ... real logic here
    return {"status": "success", "report": f"It is sunny in {city}, 22°{unit[0]}"}

agent = Agent(model="gemini-2.5-flash", name="weather_agent", tools=[get_weather])

Rules for tools:

Always return a dict. Non-dict returns are wrapped as {"result": value}.
Use status key ("success" / "error") — the LLM reads this.
Use clear docstrings — the LLM uses them to decide when/how to call the tool.
*args and **kwargs are ignored by the schema generator.
Optional[str] = None marks a parameter as optional.

Passing Context to Tools (`ToolContext`)

from google.adk.tools import ToolContext

def save_preference(preference: str, tool_context: ToolContext) -> dict:
    """Saves a user preference to session state."""
    tool_context.state["user_preference"] = preference
    return {"status": "success", "saved": preference}
# ADK injects ToolContext automatically — don't include in schema docstring

Long-Running Tool

from google.adk.tools import LongRunningFunctionTool

def process_large_file(file_path: str) -> dict:
    """Processes a large file asynchronously."""
    # ... long operation
    return {"status": "success", "result": "processed"}

long_tool = LongRunningFunctionTool(func=process_large_file)
agent = Agent(model="gemini-2.5-flash", name="processor", tools=[long_tool])

Agent-as-Tool (`AgentTool`)

from google.adk.tools import AgentTool

specialist = Agent(name="Specialist", model="gemini-2.5-flash",
                   description="Expert in data analysis.", instruction="...")

orchestrator = Agent(
    name="Orchestrator",
    model="gemini-2.5-flash",
    tools=[AgentTool(agent=specialist)],
    instruction="Use Specialist for data tasks.",
)
# Unlike sub_agents, AgentTool is called as a function and returns the result inline.

Multi-Agent Systems

Agent Hierarchy & Delegation

from google.adk.agents import LlmAgent

booking_agent = LlmAgent(name="Booker", model="gemini-2.5-flash",
                         description="Handles flight and hotel bookings.")
info_agent = LlmAgent(name="Info", model="gemini-2.5-flash",
                      description="Answers general questions and provides information.")

root_agent = LlmAgent(
    name="Coordinator",
    model="gemini-2.5-flash",
    instruction="Delegate booking tasks to Booker, info queries to Info.",
    sub_agents=[booking_agent, info_agent],
    # AutoFlow handles transfer_to_agent() calls automatically
)

Rules:

Each agent instance can only have one parent (ValueError if added twice).
Target agents need descriptive description fields for LLM routing.
Use root_agent.find_agent("name") to look up agents by name.

Sequential Agent

from google.adk.agents import SequentialAgent

fetch = LlmAgent(name="Fetch", instruction="Fetch data about {topic}.", output_key="raw_data")
process = LlmAgent(name="Process", instruction="Process this data: {raw_data}.", output_key="result")

pipeline = SequentialAgent(name="Pipeline", sub_agents=[fetch, process])
# fetch runs first, saves to state['raw_data']; process reads it via {raw_data} template

Parallel Agent

from google.adk.agents import ParallelAgent

weather = LlmAgent(name="Weather", instruction="Get weather for {city}.", output_key="weather")
news = LlmAgent(name="News", instruction="Get news for {city}.", output_key="news")

gatherer = ParallelAgent(name="Gatherer", sub_agents=[weather, news])
# Runs concurrently. Both write to shared session.state (use distinct output_key values!)

Loop Agent

from google.adk.agents import LoopAgent, BaseAgent
from google.adk.agents.invocation_context import InvocationContext
from google.adk.events import Event, EventActions
from typing import AsyncGenerator

class StopWhenDone(BaseAgent):
    async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
        done = ctx.session.state.get("task_complete", False)
        yield Event(author=self.name, actions=EventActions(escalate=done))

worker = LlmAgent(name="Worker", instruction="Do one step. Set state task_complete=True when done.")

loop = LoopAgent(
    name="RetryLoop",
    max_iterations=5,
    sub_agents=[worker, StopWhenDone(name="Checker")]
)
# Loop stops when Checker escalates OR max_iterations (5) reached.

Context: Session, State, Memory

Session & Runner Setup

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
import asyncio

APP_NAME = "my_app"
USER_ID = "user_001"
SESSION_ID = "session_001"

session_service = InMemorySessionService()
session = asyncio.run(session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID
))
runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)

Note: InMemorySessionService is for dev/testing only. All data is lost on restart.
For production use VertexAiSessionService or DatabaseSessionService.

Reading & Writing State

# In a tool:
def update_cart(item: str, tool_context: ToolContext) -> dict:
    cart = tool_context.state.get("cart", [])
    cart.append(item)
    tool_context.state["cart"] = cart
    return {"status": "success", "cart": cart}

# State key prefixes:
# "key"       → persists for session lifetime
# "user:key"  → persists across sessions for this user
# "app:key"   → persists across all users/sessions for this app
# "temp:key"  → only for current invocation turn (not persisted)

Passing Initial State to Session

session = await session_service.create_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID,
    state={"user_name": "Alice", "language": "en"}
)

Running the Agent

from google.genai import types

async def call_agent(query: str):
    content = types.Content(role="user", parts=[types.Part(text=query)])
    async for event in runner.run_async(
        user_id=USER_ID, session_id=SESSION_ID, new_message=content
    ):
        if event.is_final_response() and event.content:
            print("Response:", event.content.parts[0].text)

Memory Service (Cross-Session)

from google.adk.memory import InMemoryMemoryService  # dev only

memory_service = InMemoryMemoryService()
runner = Runner(agent=root_agent, app_name=APP_NAME,
                session_service=session_service, memory_service=memory_service)
# For production: VertexAiMemoryService

Callbacks

Callbacks let you observe and modify agent behavior at key lifecycle points.

from google.adk.agents.callback_context import CallbackContext
from google.adk.models import LlmRequest, LlmResponse
from google.adk.tools import BaseTool
from typing import Optional

# --- Before model call ---
def my_before_model(callback_context: CallbackContext, llm_request: LlmRequest) -> Optional[LlmResponse]:
    print(f"[Callback] About to call LLM. Turn: {callback_context.invocation_id}")
    # Return an LlmResponse to SKIP the actual model call
    return None  # None = proceed normally

# --- After model call ---
def my_after_model(callback_context: CallbackContext, llm_response: LlmResponse) -> Optional[LlmResponse]:
    # Modify or replace the response
    return llm_response  # return modified or original

# --- Before tool call ---
def my_before_tool(tool: BaseTool, args: dict, callback_context: CallbackContext) -> Optional[dict]:
    print(f"[Callback] Tool '{tool.name}' called with {args}")
    # Return a dict to SHORT-CIRCUIT the tool call with that result
    return None  # None = proceed normally

agent = Agent(
    model="gemini-2.5-flash",
    name="monitored_agent",
    before_model_callback=my_before_model,
    after_model_callback=my_after_model,
    before_tool_callback=my_before_tool,
)

Available callbacks:

before_agent_callback / after_agent_callback
before_model_callback / after_model_callback
before_tool_callback / after_tool_callback

Running Agents

CLI Commands

adk run my_agent              # Interactive CLI chat
adk web --port 8000           # Web UI (dev only)
adk api_server                # Start local REST API server
adk eval my_agent evals/      # Run evaluations
adk deploy agent_engine ...   # Deploy to Vertex AI Agent Engine

Async Runner Pattern (Recommended)

import asyncio
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

async def main():
    session_service = InMemorySessionService()
    session = await session_service.create_session(
        app_name="app", user_id="u1", session_id="s1"
    )
    runner = Runner(agent=root_agent, app_name="app", session_service=session_service)

    content = types.Content(role="user", parts=[types.Part(text="Hello!")])
    async for event in runner.run_async(user_id="u1", session_id="s1", new_message=content):
        if event.is_final_response():
            print(event.content.parts[0].text)

asyncio.run(main())
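
The loop above reads `event.content.parts[0].text` directly, which fails if a final event carries no content or a non-text first part. A defensive helper might look like the sketch below. `final_text` is a hypothetical name, and the demo uses stand-in objects rather than real ADK events.

```python
from types import SimpleNamespace
from typing import Any

def final_text(event: Any) -> str:
    """Join the text of every text part in an event; return "" when there is none."""
    content = getattr(event, "content", None)
    parts = getattr(content, "parts", None) or []
    return "".join(part.text for part in parts if getattr(part, "text", None))

# Stand-in events shaped like the ones iterated in the loop above.
with_text = SimpleNamespace(
    content=SimpleNamespace(
        parts=[
            SimpleNamespace(text="Hel"),
            SimpleNamespace(text=None),  # e.g. a function-call part with no text
            SimpleNamespace(text="lo"),
        ]
    )
)
without_content = SimpleNamespace(content=None)
```

Inside the loop this would replace the print call with `print(final_text(event))`.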

Sync Runner (Simple Testing)

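import asyncio
from google.adk.runners import InMemoryRunner  # convenience wrapper: Runner + in-memory services

runner = InMemoryRunner(agent=root_agent)
session = asyncio.run(runner.session_service.create_session(
    app_name=runner.app_name, user_id="u1"
))
# Then iterate runner.run_async(...) as above, passing session.id as session_id,
# or use the blocking runner.run(...) generator for quick local tests.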

Models

Gemini (default)

Agent(model="gemini-2.5-flash", ...)   # fast, efficient
Agent(model="gemini-2.5-pro", ...)     # most capable
Agent(model="gemini-2.0-flash", ...)   # balanced

Set GOOGLE_API_KEY in .env for Google AI Studio.
For Vertex AI: set GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION.
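
The Vertex AI settings above can also be applied from Python before the agent is created. The sketch below is a minimal helper under one assumption worth verifying against references/models-auth.md: the `GOOGLE_GENAI_USE_VERTEXAI` flag, which is not listed above, is what switches the underlying Gen AI SDK from API-key mode to Vertex AI. `configure_vertex_ai` is a hypothetical name.

```python
import os
from typing import MutableMapping, Optional

def configure_vertex_ai(
    project: str,
    location: str,
    env: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Write the Vertex AI settings into env (defaults to os.environ) and return it."""
    target = os.environ if env is None else env
    target["GOOGLE_GENAI_USE_VERTEXAI"] = "TRUE"  # assumption: selects Vertex AI mode
    target["GOOGLE_CLOUD_PROJECT"] = project
    target["GOOGLE_CLOUD_LOCATION"] = location
    return target

# Demo against a plain dict so the real process environment is left untouched.
demo_env = configure_vertex_ai("my-project", "us-central1", env={})
```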

Claude (Anthropic) via Vertex AI

# pip install google-adk[anthropic]
from google.adk.models.lite_llm import LiteLlm

agent = Agent(
    model=LiteLlm(model="anthropic/claude-sonnet-4-6"),
    name="claude_agent",
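    # ... remaining Agent parameters
)
# The anthropic/ prefix routes through LiteLLM to the Anthropic API and needs ANTHROPIC_API_KEY.
# Claude on Vertex AI is configured differently; see references/models-auth.md.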

LiteLLM (100+ models)

from google.adk.models.lite_llm import LiteLlm

agent = Agent(
    model=LiteLlm(model="openai/gpt-4o"),
    name="gpt_agent",
    # ... remaining Agent parameters
)
# Set relevant API keys in .env (OPENAI_API_KEY, etc.)

Ollama (Local Models)

from google.adk.models.lite_llm import LiteLlm

agent = Agent(
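    # ADK's docs recommend the ollama_chat/ prefix; plain ollama/ can cause tool-call loops
    model=LiteLlm(model="ollama_chat/llama3"),
    name="local_agent",
    # ... remaining Agent parameters
)
# Run: ollama serve (default: http://localhost:11434)
# Set OLLAMA_API_BASE if the server is not reachable at the default address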

Deployment

Vertex AI Agent Engine (Managed, Production)

# 1. Authenticate
gcloud auth login
gcloud auth application-default login

# 2. Enable APIs (Vertex AI API and Cloud Resource Manager API)
gcloud services enable aiplatform.googleapis.com cloudresourcemanager.googleapis.com

# 3. Deploy
adk deploy agent_engine \
    --project=MY_PROJECT_ID \
    --region=us-central1 \
    --display_name="My Agent" \
    my_agent/

After deployment, interact via REST:

POST https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/reasoningEngines/{RESOURCE_ID}:query
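
The endpoint template above can be filled in with a small helper. This sketch only builds the URL; authentication headers and the request body are deliberately left out, and `query_url` is a hypothetical name.

```python
def query_url(project: str, location: str, resource_id: str) -> str:
    """Fill in the Agent Engine :query endpoint template shown above."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project}/locations/{location}"
        f"/reasoningEngines/{resource_id}:query"
    )

example_url = query_url("my-project", "us-central1", "1234567890")
```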

Or via Vertex AI SDK:

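import vertexai
from vertexai import agent_engines

vertexai.init(project="MY_PROJECT_ID", location="us-central1")  # same values used for deployment
agent_engine = agent_engines.get("projects/.../reasoningEngines/RESOURCE_ID")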

Cloud Run

adk deploy cloud_run \
    --project=MY_PROJECT_ID \
    --region=us-central1 \
    my_agent/

Streaming

Bidi-Streaming (Live) Agent

from google.adk.agents import LiveRequestQueue
from google.adk.runners import Runner

runner = Runner(agent=root_agent, app_name="app", session_service=session_service)
live_request_queue = LiveRequestQueue()

async def stream_agent():
    async for event in runner.run_live(
        user_id="u1", session_id="s1",
        live_request_queue=live_request_queue
    ):
        if event.content:
            for part in event.content.parts:
                if part.text:
                    print(part.text, end="", flush=True)

# Send messages via live_request_queue.send_content(...)

MCP Tools

Use an MCP Server as Tools in ADK

from google.adk.agents import Agent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters

# ADK 1.0+ takes the toolset directly in tools and manages the MCP connection itself.
# Pre-1.0 releases used `await MCPToolset.from_server(...)`, which returned
# (tools, exit_stack); that helper has since been removed.
filesystem_toolset = MCPToolset(
    connection_params=StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"],
    )
)

root_agent = Agent(
    model="gemini-2.5-flash",
    name="mcp_agent",
    tools=[filesystem_toolset],
    instruction="Use the filesystem tools to help the user.",
)
# ... run agent

For SSE-based MCP servers:

from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, SseServerParams

sse_toolset = MCPToolset(
    connection_params=SseServerParams(url="http://localhost:3000/sse")
)
# Pass it to the agent as tools=[sse_toolset]

Patterns & Gotchas

✅ Correct `root_agent` export (required by ADK)

# agent.py — root_agent must be defined at module level
from google.adk.agents import Agent

def my_tool(x: int) -> dict:
    """Does something."""
    return {"result": x * 2}

root_agent = Agent(
    model="gemini-2.5-flash",
    name="root_agent",
    instruction="You are helpful.",
    tools=[my_tool],
)

✅ Sequential state passing pattern

# Use output_key + {var} template for pipeline data flow
from google.adk.agents import Agent, SequentialAgent

MODEL = "gemini-2.5-flash"
step1 = Agent(model=MODEL, name="Step1", instruction="Extract topic from input.", output_key="topic")
step2 = Agent(model=MODEL, name="Step2", instruction="Research {topic} in depth.", output_key="research")
step3 = Agent(model=MODEL, name="Step3", instruction="Write a report about {research}.")
pipeline = SequentialAgent(name="Pipeline", sub_agents=[step1, step2, step3])

❌ Avoid `output_schema` + `tools` together

Setting output_schema puts the agent in JSON-only mode, which disables tool calls, so use one or the other on a given agent.

❌ Avoid duplicate agent instances in sub_agents

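# WRONG: an agent instance can only have one parent
shared_agent = Agent(name="Shared", ...)
parent1 = Agent(name="P1", sub_agents=[shared_agent])
parent2 = Agent(name="P2", sub_agents=[shared_agent])  # ValueError!

# RIGHT: create a separate instance for each parent
shared_a = Agent(name="SharedA", ...)
shared_b = Agent(name="SharedB", ...)
parent1 = Agent(name="P1", sub_agents=[shared_a])
parent2 = Agent(name="P2", sub_agents=[shared_b])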

✅ Async-first design

ADK is async-native. Always use run_async and asyncio.run(main()) in scripts.
For Jupyter/Colab, use await directly at the top level: the notebook already runs an event loop, so asyncio.run() raises RuntimeError there.
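
The script-side pattern can be sketched without ADK installed. Here `fake_run_async` is a hypothetical stand-in for `runner.run_async`: an async generator that yields events one at a time. The point is only the shape: consume with `async for` inside a coroutine, and call `asyncio.run` once at the script's entry point.

```python
import asyncio
from typing import AsyncIterator

async def fake_run_async(prompt: str) -> AsyncIterator[str]:
    """Hypothetical stand-in for runner.run_async: yields events one by one."""
    for chunk in ("thinking", f"answer to {prompt!r}"):
        await asyncio.sleep(0)  # give control back to the event loop, as real I/O would
        yield chunk

async def last_event(prompt: str) -> str:
    """Consume the async generator and keep only the final event."""
    result = ""
    async for event in fake_run_async(prompt):
        result = event
    return result

def run_from_script(prompt: str) -> str:
    """Script entry point: start exactly one event loop with asyncio.run."""
    return asyncio.run(last_event(prompt))
```

In a notebook the equivalent call would be `await last_event("hi")` in a cell, with no `asyncio.run`.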

✅ Tool error handling

def safe_tool(param: str) -> dict:
    """Does something safely."""
    try:
        result = do_work(param)
        return {"status": "success", "result": result}
    except Exception as e:
        return {"status": "error", "error_message": str(e)}
# Always return {"status": "error", "error_message": "..."} on failure
# Never raise exceptions from tools — the LLM needs to read the error
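
To avoid repeating the try/except in every tool, the same convention can be factored into a decorator. This is a sketch, not an ADK feature: `errors_as_dict` and `divide` are hypothetical names. Whether ADK's schema generation sees the original signature through the wrapper is an assumption to verify. `functools.wraps` is used so the name, docstring, and annotations are carried over.

```python
import functools
from typing import Any, Callable

def errors_as_dict(func: Callable[..., dict]) -> Callable[..., dict]:
    """Wrap a tool so exceptions come back as the error dict described above."""
    @functools.wraps(func)  # copies __name__, __doc__, annotations; sets __wrapped__
    def wrapper(*args: Any, **kwargs: Any) -> dict:
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return {
                "status": "error",
                "error_message": f"{type(exc).__name__}: {exc}",
            }
    return wrapper

@errors_as_dict
def divide(a: float, b: float) -> dict:
    """Divide a by b."""
    return {"status": "success", "result": a / b}

ok = divide(6, 3)
failed = divide(1, 0)
```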

✅ State key prefix reference

| Prefix | Scope |
|---|---|
| (none) | Current session |
| user: | All sessions for this user |
| app: | All sessions in this app |
| temp: | Current invocation only (not persisted) |
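
The prefix is simply part of the key string. The sketch below uses a plain dict as a stand-in for session state and a hypothetical helper, `state_scope`, that maps a key to the scope in the table above. It illustrates the naming convention only, not how ADK stores state.

```python
# Ordered (prefix, scope) pairs taken from the table above.
PREFIX_SCOPES = (
    ("user:", "user"),
    ("app:", "app"),
    ("temp:", "invocation"),
)

def state_scope(key: str) -> str:
    """Return the scope a state key belongs to, based on its prefix."""
    for prefix, scope in PREFIX_SCOPES:
        if key.startswith(prefix):
            return scope
    return "session"  # no prefix: current session only

# Plain dict standing in for session.state (or tool_context.state inside a tool).
state = {}
state["draft"] = "first version"       # current session
state["user:language"] = "en"          # every session for this user
state["app:banner"] = "Maintenance"    # every session in this app
state["temp:raw_response"] = "{...}"   # current invocation only, not persisted

scopes = {key: state_scope(key) for key in state}
```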

Reference Files

For deeper content on specific topics, read the relevant reference file:

references/callbacks.md — Callback patterns and best practices
references/custom-agents.md — Building BaseAgent subclasses
references/evaluate.md — Agent evaluation and testing
references/models-auth.md — Model authentication details for all providers
references/artifacts.md — Working with ADK Artifacts (file-like objects)

Load these files only when the user's task specifically requires that depth.
