Pydantic AI

Python agent framework for building production-grade GenAI applications with the "FastAPI feeling".

Quick Navigation

| Topic        | Reference       |
| ------------ | --------------- |
| Agents       | agents.md       |
| Capabilities | agents.md       |
| Tools        | tools.md        |
| Models       | models.md       |
| Embeddings   | embeddings.md   |
| Evals        | evals.md        |
| Integrations | integrations.md |
| Graphs       | graphs.md       |
| UI Streams   | ui.md           |
| Installation | installation.md |

When to Use

Building AI agents with structured output
Need type-safe, IDE-friendly agent development
Require dependency injection for tools
Multi-model support (OpenAI, Anthropic, Gemini, etc.)
Production observability with Logfire
Complex workflows with graphs

Installation

See references/installation.md for full/slim install options and optional dependency groups. Requires Python 3.10+.

Release Highlights (1.96.1)

V2 preparation: Agent(..., prepare_tools=..., prepare_output_tools=..., event_stream_handler=...) is now the deprecated path; capability-based migration is the new direction.
Capability migration: use PrepareTools, PrepareOutputTools, and ProcessEventStream capabilities instead of wiring those behaviors through constructor sugar.
OpenAI fixes: the latest patch line also tightens OpenAIResponsesModel system-prompt-role handling and image-generation request shaping.

Release Highlights (1.97.0 → 1.102.0)

MCP migration: prefer MCPToolset for new MCP integrations. FastMCPToolset and the older MCPServer* client surface are now on the deprecation path.
Google provider split: GoogleProvider and GoogleCloudProvider are now distinct, and model ids move from google-gla: / google-vertex: to google: / google-cloud:.
Streaming migration: move from stream_responses() to stream_response(); the newer API yields ModelResponse objects directly.
Retry configuration: prefer Agent(retries=...) or AgentRetries(...) over older constructor-level retry knobs.
New runtime tools: ctx.enqueue() and MCP background tasks make it easier to queue follow-up work without forcing it into the current response turn.
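
The Google model-id rename can be applied mechanically to existing configuration. A minimal sketch, assuming the old prefixes map one-to-one onto the new ones in the order listed (google-gla: to google:, google-vertex: to google-cloud:); migrate_model_id is a hypothetical helper, not part of Pydantic AI:

```python
# Hypothetical helper: rewrites legacy Google model-id prefixes.
# Assumes a one-to-one mapping in the order the release notes list them.
LEGACY_PREFIXES = {
    'google-gla:': 'google:',
    'google-vertex:': 'google-cloud:',
}


def migrate_model_id(model_id: str) -> str:
    """Return model_id with a legacy Google prefix replaced, else unchanged."""
    for old, new in LEGACY_PREFIXES.items():
        if model_id.startswith(old):
            return new + model_id[len(old):]
    return model_id


print(migrate_model_id('google-gla:gemini-2.0-flash'))
print(migrate_model_id('google-vertex:gemini-2.0-flash'))
print(migrate_model_id('openai:gpt-4o'))
```

Ids from other providers pass through untouched, so the helper can be run over a whole config file.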

Release Highlights (1.75.0 → 1.84.1)

Capabilities: CapabilityOrdering adds explicit wrapping/ordering control (innermost, outermost, wraps, wrapped_by, requires) for complex capability stacks.
Compaction: new server-side compaction capabilities for OpenAI and Anthropic; OpenAI adds stateful compaction mode.
Models: Claude Opus 4.7 support and a native OllamaModel path with corrected Ollama capability flags for structured output.
Tools: tool hooks now consistently receive dict-shaped validated args for single-BaseModel tools, and internal output tools skip hook execution.
Hardening: Google FileSearchTool parsing received regex hardening in the 1.83/1.84 line.

Release Highlights (1.71.0 → 1.74.0)

Capabilities: composable, reusable units of agent behavior that bundle tools, lifecycle hooks, instructions, and model settings into a single class. Plug into any agent for maximum reuse.
AgentSpec: load agents from YAML/JSON files via Agent.from_file. Supports TemplateStr for templated instructions referencing deps.
Hooks capability: define hooks using decorators (@hooks.on_model_request, etc.).
Thinking capability: cross-provider thinking model setting for reasoning.
Provider-adaptive tools: WebSearch, WebFetch, MCP, ImageGeneration — automatically fall back from builtin (provider) tools to local tools.
Online evaluation: evaluation infrastructure in pydantic-evals.
TextContent: user-prompt text with attached metadata that is not sent to the model.
CaseLifecycle hooks: hooks for Dataset.evaluate lifecycle.
Model swapping in hooks: before_model_request / wrap hooks can swap models via ModelRequestContext.
ModelRetry from hooks: hooks can raise ModelRetry for retry control flow.
Other: sync tool preparation functions are now supported, and the MCP capability no longer requires an explicit url=.

Release Highlights (1.69.0 → 1.70.0)

Agents: Agent(description=...) adds a human-readable description to the run span as gen_ai.agent.description when instrumentation is enabled.
Models: FallbackModel now supports response-based fallback handlers for semantic failures in non-streaming runs.
Tools: multimodal tool results are passed directly to provider APIs instead of always being split into extra user-message parts.
Bedrock: bedrock_inference_profile is available on model and embedding settings for routing requests through an inference profile ARN.
Stability: provider fixes landed for OpenRouter Anthropic model matching, Cohere embeddings, Google image sizes, Bedrock tool-name sanitization, and malformed tool-call retry handling.

Quick Start

Basic Agent

from pydantic_ai import Agent

agent = Agent(
    'openai:gpt-4o',
    instructions='Be concise, reply with one sentence.'
)

result = agent.run_sync('Where does "hello world" come from?')
print(result.output)

With Structured Output

from pydantic import BaseModel
from pydantic_ai import Agent

class CityInfo(BaseModel):
    name: str
    country: str
    population: int

agent = Agent('openai:gpt-4o', output_type=CityInfo)
result = agent.run_sync('Tell me about Paris')
print(result.output)  # e.g. CityInfo(name='Paris', country='France', population=2161000)
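
What output_type buys you is Pydantic validation of the reply. The sketch below shows that validation step in isolation, using only pydantic (no agent, no network); the JSON payloads are hand-written stand-ins for what a model might return:

```python
from pydantic import BaseModel, ValidationError


class CityInfo(BaseModel):
    name: str
    country: str
    population: int


# Well-formed payload: parsed into a typed CityInfo instance.
good = CityInfo.model_validate_json(
    '{"name": "Paris", "country": "France", "population": 2161000}'
)
print(good.population + 1)  # population is a real int, not a string

# Malformed payload: population is not an integer, so validation fails.
try:
    CityInfo.model_validate_json(
        '{"name": "Paris", "country": "France", "population": "unknown"}'
    )
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc.error_count(), 'validation error')
```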

With Tools and Dependencies

from dataclasses import dataclass
from pydantic_ai import Agent, RunContext

@dataclass
class Deps:
    user_id: int

agent = Agent('openai:gpt-4o', deps_type=Deps)

@agent.tool
async def get_user_name(ctx: RunContext[Deps]) -> str:
    """Get the current user's name."""
    return f"User #{ctx.deps.user_id}"

result = agent.run_sync('What is my name?', deps=Deps(user_id=123))

Key Features

| Feature              | Description                     |
| -------------------- | ------------------------------- |
| Type-safe            | Full IDE support, type checking |
| Model-agnostic       | 30+ providers supported         |
| Dependency Injection | Pass context to tools           |
| Structured Output    | Pydantic model validation       |
| Embeddings           | Multi-provider vector support   |
| Logfire Integration  | Built-in observability          |
| MCP Support          | External tools and data         |
| Evals                | Systematic testing              |
| Graphs               | Complex workflow support        |

Supported Models

| Provider  | Models                                |
| --------- | ------------------------------------- |
| OpenAI    | GPT-4o, GPT-4, o1, o3                 |
| Anthropic | Claude Opus 4.6, Claude 4, Claude 3.5 |
| Google    | Gemini 2.0, Gemini 1.5                |
| xAI       | Grok-4 (native SDK)                   |
| Groq      | Llama, Mixtral                        |
| Mistral   | Mistral Large, Codestral              |
| Azure     | Azure OpenAI                          |
| Bedrock   | AWS Bedrock + Nova 2.0                |
| SambaNova | SambaNova models                      |
| Ollama    | Local models                          |

Best Practices

Use type hints — enables IDE support and validation
Define output types — guarantees structured responses
Use dependencies — inject context into tools
Add tool docstrings — LLM uses them as descriptions
Enable Logfire — for production observability
Use run_sync for simple synchronous scripts; await agent.run(...) in async code
Override deps for testing — agent.override(deps=...)
Set usage limits — prevent infinite loops with UsageLimits

Prohibitions

Do not expose API keys in code
Do not skip output validation in production
Do not ignore tool errors
Do not use run_stream without handling partial outputs
Do not forget to close MCP connections (async with agent)
Do not assume capability order is arbitrary once multiple wrappers/hooks are involved; define it explicitly when composition matters.
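
For the first rule, one common approach is to read keys from the environment and fail fast when one is missing. A minimal sketch using only the standard library; require_env and the EXAMPLE_API_KEY variable name are hypothetical:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or fail fast."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f'Set the {name} environment variable first')
    return value


# Placeholder for demonstration only; in practice the shell or a
# secrets manager sets this, never the source file.
os.environ.setdefault('EXAMPLE_API_KEY', 'placeholder-value')
api_key = require_env('EXAMPLE_API_KEY')
print(len(api_key) > 0)
```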

Common Patterns

Streaming Response

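# Assumes `agent` is defined as in Quick Start and this runs inside an async function.
async with agent.run_stream('Query') as response:
    # delta=True yields only the newly received text; the default yields the full text so far.
    async for text in response.stream_text(delta=True):
        print(text, end='')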

Fallback Models

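from pydantic_ai import Agent
from pydantic_ai.models.fallback import FallbackModel

# Models are tried in order; the model names here are illustrative.
fallback = FallbackModel('openai:gpt-4o', 'anthropic:claude-sonnet-4-5')
agent = Agent(fallback)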

MCP Integration

from pydantic_ai import Agent
from pydantic_ai.mcp import MCPToolset

toolset = MCPToolset(command='python', args=['mcp_server.py'])
agent = Agent('openai:gpt-4o', toolsets=[toolset])

Testing with TestModel

from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent(model=TestModel())
result = agent.run_sync('test')  # Deterministic output

Embeddings

import asyncio

from pydantic_ai import Embedder

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    # Embed a search query
    query_result = await embedder.embed_query('What is ML?')

    # Embed documents for indexing
    docs = ['Doc 1', 'Doc 2', 'Doc 3']
    docs_result = await embedder.embed_documents(docs)

asyncio.run(main())

See embeddings.md for providers and settings.
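
Query and document embeddings are commonly compared by cosine similarity to rank documents for retrieval. A minimal standard-library sketch; cosine_similarity is a hypothetical helper and the 3-dimensional vectors are made-up placeholders, not output from any embedding model:

```python
import math
from collections.abc import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors standing in for real embeddings.
query_vec = [1.0, 0.0, 0.0]
doc_vecs = {
    'Doc 1': [0.9, 0.1, 0.0],
    'Doc 2': [0.0, 1.0, 0.0],
    'Doc 3': [0.5, 0.5, 0.0],
}

# Most similar document first.
ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)
```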

xAI Provider

from pydantic_ai import Agent

agent = Agent('xai:grok-4-1-fast-non-reasoning')

See models.md for configuration details.

Exa Neural Search

import os
from pydantic_ai import Agent
from pydantic_ai.common_tools.exa import ExaToolset

api_key = os.getenv('EXA_API_KEY')
toolset = ExaToolset(api_key, num_results=5, include_search=True)
agent = Agent('openai:gpt-4o', toolsets=[toolset])

See tools.md for all Exa tools.
