agent-engine/SKILL.md
Deploy and manage AI agents on Vertex AI Agent Engine. Use when deploying ADK agents to production, configuring Agent Engine runtime, managing deployed agents, setting up sessions and memory, or integrating with A2A protocol. Covers deployment from agent objects and source files, environment configuration, scaling, sessions, memory bank, and agent management operations.
npx skillsauth add jswortz/my-skills agent-engineInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Deploy, manage, and scale AI agents in production on Google Cloud.
| Task | Pattern |
|------|---------|
| Install SDK | pip install google-cloud-aiplatform[agent_engines,adk] |
| Initialize client | vertexai.Client(project, location) |
| Deploy agent | client.agent_engines.create(agent=app, config={...}) |
| Query agent | remote_agent.stream_query(user_id, message) |
| Delete agent | remote_agent.delete(force=True) |
# Required IAM roles
- Vertex AI User (roles/aiplatform.user)
- Storage Admin (roles/storage.admin)
# Enable APIs
gcloud services enable aiplatform.googleapis.com storage.googleapis.com
pip install --upgrade google-cloud-aiplatform[agent_engines,adk]>=1.112
# For A2A support
pip install google-cloud-aiplatform[agent_engines,a2a]
import vertexai
client = vertexai.Client(
project="your-project-id",
location="us-central1"
)
Best for development and testing.
from google.adk.agents import Agent
from vertexai import agent_engines
# Create agent
def get_data(query: str) -> dict:
"""Gets data for a query.
Args:
query: The search query.
"""
return {"status": "success", "data": "..."}
agent = Agent(
model="gemini-2.0-flash",
name="data_agent",
instruction="You help users find data.",
tools=[get_data]
)
# Wrap in AdkApp
app = agent_engines.AdkApp(agent=agent)
# Deploy
remote_agent = client.agent_engines.create(
agent=app,
config={
"display_name": "data-agent",
"requirements": [
"google-cloud-aiplatform[agent_engines,adk]",
"requests"
],
"staging_bucket": "gs://your-bucket"
}
)
print(f"Deployed: {remote_agent.resource_name}")
Best for production deployments.
remote_agent = client.agent_engines.create(
config={
"display_name": "production-agent",
"source_packages": ["./agent_package"],
"entrypoint_module": "agent_package.agent",
"entrypoint_object": "root_agent",
"requirements_file": "requirements.txt",
"staging_bucket": "gs://your-bucket",
"env_vars": {
"API_KEY": "...",
"DATABASE_URL": "..."
}
}
)
config = {
# Required
"staging_bucket": "gs://your-bucket",
# Package options
"requirements": ["package1", "package2"], # OR
"requirements_file": "requirements.txt",
"extra_packages": ["./local_package", "/path/to/dependency"],
# Source deployment
"source_packages": ["./agent_dir"],
"entrypoint_module": "agent_dir.agent",
"entrypoint_object": "root_agent",
# Display
"display_name": "My Agent",
"description": "Agent description",
# Environment
"env_vars": {"KEY": "value"},
# Resources
"resource_config": {
"min_instances": 1,
"max_instances": 10,
"cpu": "2",
"memory": "4Gi"
}
}
from google.adk.sessions import InMemorySessionService
from vertexai.agent_engines import AdkApp
# Avoid httpx client issues in Agent Engine
session_service_builder = lambda **kwargs: InMemorySessionService()
app = AdkApp(
agent=agent,
session_service_builder=session_service_builder
)
Agent Engine provides managed sessions automatically when no custom service is specified.
response = remote_agent.query(
user_id="user-123",
message="What is the weather today?"
)
print(response)
for event in remote_agent.stream_query(
user_id="user-123",
message="Explain quantum computing"
):
print(event)
async for event in remote_agent.async_stream_query(
user_id="user-123",
message="Analyze this data"
):
print(event)
Deploy agents in phases: leaf → functional → orchestrator.
from google.adk.agents import Agent
from vertexai.agent_engines import AdkApp
from google.adk.sessions import InMemorySessionService
import a2a
import os
def get_pto_balance(user_id: str) -> dict:
"""Gets PTO balance for a user."""
return {"user_id": user_id, "balance": 15}
pto_agent = Agent(
name="pto_agent",
model="gemini-2.0-flash",
instruction="You check PTO balances. Use get_pto_balance when asked.",
tools=[get_pto_balance]
)
# Bundle a2a package
a2a_path = os.path.dirname(a2a.__file__)
pto_remote = client.agent_engines.create(
agent=AdkApp(
agent=pto_agent,
session_service_builder=lambda **kwargs: InMemorySessionService()
),
config={
"display_name": "pto-agent",
"requirements": [
"google-cloud-aiplatform[agent_engines,a2a]",
"google-adk[a2a]",
"a2a-sdk>=0.3.5"
],
"extra_packages": ["./agent_system", a2a_path],
"staging_bucket": "gs://your-bucket"
}
)
pto_url = f"https://us-central1-aiplatform.googleapis.com/v1/{pto_remote.resource_name}"
from google.adk.agents.remote_a2a_agent import RemoteA2aAgent
# Functional agent connects to leaf via A2A
# Pass leaf URL as environment variable
hr_remote = client.agent_engines.create(
agent=AdkApp(
agent=hr_agent,
session_service_builder=lambda **kwargs: InMemorySessionService()
),
config={
"display_name": "hr-agent",
"requirements": [...],
"env_vars": {"PTO_AGENT_URL": pto_url},
"extra_packages": ["./agent_system", a2a_path],
"staging_bucket": "gs://your-bucket"
}
)
orch_remote = client.agent_engines.create(
agent=AdkApp(
agent=orchestrator_agent,
session_service_builder=lambda **kwargs: InMemorySessionService()
),
config={
"display_name": "orchestrator",
"requirements": [...],
"env_vars": {
"HR_AGENT_URL": hr_url,
"FINANCE_AGENT_URL": finance_url
},
"extra_packages": ["./agent_system", a2a_path],
"staging_bucket": "gs://your-bucket"
}
)
agents = client.agent_engines.list()
for agent in agents:
print(f"{agent.display_name}: {agent.resource_name}")
agent = client.agent_engines.get(resource_name="projects/.../agentEngines/...")
print(agent.display_name)
print(agent.create_time)
agent.update(
display_name="Updated Agent Name",
description="New description"
)
# Safe delete
remote_agent.delete()
# Force delete (removes associated resources)
remote_agent.delete(force=True)
For detailed information on sessions, memory bank, scaling optimization, and framework integration, see:
references/advanced_deployment.md - Comprehensive Agent Engine reference including:
# Required
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
# For Vertex AI
export GOOGLE_GENAI_USE_VERTEXAI="1"
# Staging bucket
export BUCKET="gs://your-staging-bucket"
| Framework | Integration Level | |-----------|------------------| | ADK | Full | | LangChain | Full | | LangGraph | Full | | LlamaIndex | SDK | | AG2 | SDK | | CrewAI | Custom Template | | Custom | Custom Template |
import os
from google.cloud import aiplatform
from vertexai.agent_engines import AdkApp
from google.adk.agents import Agent
from google.adk.sessions import InMemorySessionService
import vertexai.agent_engines as vae
# Configuration
PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT")
LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION", "us-central1")
BUCKET = os.environ.get("BUCKET")
def deploy_agent(agent, name, env_vars=None, extra_packages=None):
"""Deploy an ADK agent to Agent Engine."""
aiplatform.init(
project=PROJECT_ID,
location=LOCATION,
staging_bucket=BUCKET
)
requirements = [
"google-cloud-aiplatform[agent_engines,a2a]",
"google-adk[a2a]",
"uvicorn",
"fastapi",
"a2a-sdk>=0.3.5"
]
app = AdkApp(
agent=agent,
session_service_builder=lambda **kwargs: InMemorySessionService()
)
remote = vae.create(
agent_engine=app,
display_name=name,
requirements=requirements,
env_vars=env_vars,
extra_packages=extra_packages
)
print(f"Deployed {name}: {remote.resource_name}")
return remote
# Usage
if __name__ == "__main__":
from my_agent import root_agent
remote = deploy_agent(
root_agent,
"my-production-agent",
env_vars={"API_KEY": "..."},
extra_packages=["./my_agent"]
)
google-cloud-aiplatform[agent_engines,adk]>=1.112
google-adk[a2a]
uvicorn
fastapi
a2a-sdk>=0.3.5
When deploying via agent_engines.create() with an agent object, ADK uses cloudpickle. Any module referenced by the agent must be importable in the Agent Engine runtime. If the agent imports from local modules (config files, tools), the pickle will fail with module-not-found errors.
Solution: Build the agent inline in the deploy script with zero external module dependencies:
def _build_agent():
"""Build agent inline for Agent Engine — no external imports."""
from google.adk.agents import LlmAgent
# Inline ALL data (no filesystem reads at runtime)
_PERSONAS = [
{"id": "budget", "name": "Budget Shopper", "budget": 120.00},
# ...
]
agents = []
for p in _PERSONAS:
agents.append(LlmAgent(
name=f"shopper_{p['id']}",
model="gemini-2.5-flash",
instruction=f"You are {p['name']}. Budget: ${p['budget']:.2f}.",
))
return LlmAgent(
name="orchestrator",
model="gemini-2.5-flash",
instruction="...",
sub_agents=agents,
)
# Deploy
app = agent_engines.AdkApp(agent=_build_agent(), enable_tracing=True)
remote = agent_engines.create(agent_engine=app, display_name="My Agent",
requirements=["google-adk>=1.19.0", "google-cloud-aiplatform"])
Alternative to programmatic deployment:
cd src && adk deploy agent_engine \
--project=my-project \
--region=us-central1 \
--staging_bucket=gs://my-bucket \
--display_name="My Agent" \
--trace_to_cloud \
agent # directory containing agent.py with root_agent
Memory Bank is auto-enabled on Agent Engine when ADK >= 1.5.0 and GOOGLE_CLOUD_AGENT_ENGINE_ID env var is set (automatic in Agent Engine runtime). No deploy flags needed.
client.agent_engines.memories.generate(
name=agent_resource,
direct_contents_source={"events": [
{"content": {"role": "user", "parts": [{"text": "I'm allergic to peanuts."}]}},
]},
scope={"user_id": "user-123"},
config={"wait_for_completion": True},
)
Before persisting generated insights or facts to the Memory Bank, use a "Critic Agent" pattern to evaluate candidates. Score insights based on:
Only write memories/insights that exceed a specific CRITIC_SURVIVAL_THRESHOLD (e.g., >= 0.85) to prevent Memory Bank pollution and maintain high signal-to-noise ratio.
client.agent_engines.memories.generate(
name=agent_resource,
direct_memories_source={"direct_memories": [
{"fact": "Customer prefers organic produce"},
]},
scope={"user_id": "user-123"},
)
results = client.agent_engines.memories.retrieve(
name=agent_resource,
scope={"user_id": "user-123"},
)
memories = list(results)
results = client.agent_engines.memories.retrieve(
name=agent_resource,
scope={"user_id": "user-123"},
similarity_search_params={"search_query": "dietary restrictions", "top_k": 3},
)
Each agent engine has its own memory namespace. To share memories, generate/retrieve against the same agent engine resource, or use the same user_id scope across multiple agents.
from google.adk.tools.preload_memory_tool import PreloadMemoryTool
# Add to agent — auto-loads memories at conversation start
tools = [PreloadMemoryTool()]
PreloadMemoryTool only reads. To write memories, call memory_service.add_session_to_memory(session) explicitly after a session ends.
Run evals against deployed agents:
from google import genai
from google.genai import types
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)
eval_set = types.EvalSet(eval_set_id="my_eval", eval_cases=[
types.EvalCase(
eval_case_id="test_1",
conversation_scenario=types.ConversationScenario(
starting_prompt="What are the closing procedures?",
conversation_plan="Ask for specific steps and safety checks.",
),
),
])
# Run inference then evaluate
result = client.evals.run_inference(agent=agent_resource, eval_set=eval_set,
config=types.RunInferenceConfig(eval_run_id="run_1"))
evaluation = client.evals.evaluate(eval_set=result, metrics=[
types.EvalMetric(metric_name="rubric_based_final_response_quality_v1"),
types.EvalMetric(metric_name="hallucinations_v1"),
types.EvalMetric(metric_name="safety_v1"),
types.EvalMetric(metric_name="tool_use_quality_v1"),
], config=types.EvaluateConfig(eval_run_id="run_1"))
import requests
import google.auth
from google.auth.transport.requests import Request
credentials, _ = google.auth.default()
credentials.refresh(Request())
url = (f"https://us-central1-aiplatform.googleapis.com/v1/"
f"projects/{project_number}/locations/us-central1/"
f"reasoningEngines/{agent_engine_id}:streamQuery")
resp = requests.post(url, headers={
"Authorization": f"Bearer {credentials.token}",
"Content-Type": "application/json",
}, json={"input": {"message": "Hello", "user_id": "user-1"}}, timeout=120)
# Response is newline-delimited JSON events
for line in resp.text.strip().split("\n"):
event = json.loads(line)
parts = event.get("content", {}).get("parts", [])
for part in parts:
if "text" in part:
print(part["text"])
| Issue | Solution |
|-------|----------|
| Package not found | Add to extra_packages or requirements |
| httpx client error | Use InMemorySessionService |
| Deployment timeout | Increase staging bucket size, reduce dependencies |
| Agent not responding | Check logs in Cloud Console |
| A2A connection failed | Verify agent URL ends with /.well-known/agent.json |
| Permission denied | Add required IAM roles |
| Bucket access error | Use gs:// prefix, ensure Storage Admin role |
| cloudpickle module not found | Inline all data in deploy script, no external imports |
| No query methods at runtime | Redeploy; broken deployments don't self-heal |
| Token overflow (>1M tokens) | Remove excess data stores from engine, reduce tool output size |
| Memories not persisting | Call add_session_to_memory() explicitly; PreloadMemoryTool is read-only |
| Eval not showing up | Need client.evals.run_inference() + client.evals.evaluate() calls |
research
Constant-Time Analyzer (ct-analyzer)
testing
--- name: condition-based-waiting description: -- name: Condition-Based Waiting description: Replace arbitrary timeouts with condition polling for reliable async tests when_to_use: when tests have ... --- -- name: Condition-Based Waiting description: Replace arbitrary timeouts with condition polling for reliable async tests when_to_use: when tests have race conditions, timing dependencies, or inconsistent pass/fail behavior version: 1.1.0 languages: all --- # Condition-Based Waiting ## Overvi
testing
--- name: collision-zone-thinking description: -- name: Collision-Zone Thinking description: Force unrelated concepts together to discover emergent properties - "What if we treated X like Y?" when_... --- -- name: Collision-Zone Thinking description: Force unrelated concepts together to discover emergent properties - "What if we treated X like Y?" when_to_use: when conventional approaches feel inadequate and you need breakthrough innovation by forcing unrelated concepts together version: 1.1.0
documentation
--- name: canvas-design description: -- name: canvas-design description: Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the... --- -- name: canvas-design description: Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid