.github/plugins/azure-skills/skills/microsoft-foundry/SKILL.md
Deploy, evaluate, and manage Foundry agents end-to-end: Docker build, ACR push, hosted/prompt agent create, container start, batch eval, prompt optimization, prompt optimizer workflows, agent.yaml, dataset curation from traces. USE FOR: deploy agent to Foundry, hosted agent, create agent, invoke agent, evaluate agent, run batch eval, optimize prompt, improve prompt, prompt optimization, prompt optimizer, improve agent instructions, optimize agent instructions, optimize system prompt, deploy model, Foundry project, RBAC, role assignment, permissions, quota, capacity, region, troubleshoot agent, deployment failure, create dataset from traces, dataset versioning, eval trending, create AI Services, Cognitive Services, create Foundry resource, provision resource, knowledge index, agent monitoring, customize deployment, onboard, availability. DO NOT USE FOR: Azure Functions, App Service, general Azure deploy (use azure-deploy), general Azure prep (use azure-prepare).
This skill helps developers work with Microsoft Foundry resources. It covers model discovery and deployment, the complete development lifecycle of AI agents, evaluation workflows, and troubleshooting.
MANDATORY: Before executing ANY workflow, you MUST read the corresponding sub-skill document. Do not call MCP tools for a workflow without reading its skill document. This applies even if you already know the MCP tool parameters — the skill document contains required workflow steps, pre-checks, and validation logic that must be followed. This rule applies on every new user message that triggers a different workflow, even if the skill is already loaded.
This skill includes specialized sub-skills for specific workflows. Use these instead of the main skill when they match your task:
| Sub-Skill | When to Use | Reference |
|-----------|-------------|-----------|
| deploy | Containerize, build, push to ACR, create/update/start/stop/clone agent deployments | deploy |
| invoke | Send messages to an agent, single or multi-turn conversations | invoke |
| observe | Evaluate agent quality, run batch evals, analyze failures, optimize prompts, improve agent instructions, compare versions, and set up CI/CD monitoring | observe |
| trace | Query traces, analyze latency/failures, correlate eval results to specific responses via App Insights customEvents | trace |
| troubleshoot | View container logs, query telemetry, diagnose failures | troubleshoot |
| create | Create new hosted agent applications. Supports Microsoft Agent Framework, LangGraph, or custom frameworks in Python or C#. Downloads starter samples from foundry-samples repo. | create |
| eval-datasets | Harvest production traces into evaluation datasets, manage dataset versions and splits, track evaluation metrics over time, detect regressions, and maintain full lineage from trace to deployment. Use for: create dataset from traces, dataset versioning, evaluation trending, regression detection, dataset comparison, eval lineage. | eval-datasets |
| project/create | Creating a new Azure AI Foundry project for hosting agents and models. Use when onboarding to Foundry or setting up new infrastructure. | project/create/create-foundry-project.md |
| resource/create | Creating Azure AI Services multi-service resource (Foundry resource) using Azure CLI. Use when manually provisioning AI Services resources with granular control. | resource/create/create-foundry-resource.md |
| models/deploy-model | Unified model deployment with intelligent routing. Handles quick preset deployments, fully customized deployments (version/SKU/capacity/RAI), and capacity discovery across regions. Routes to sub-skills: preset (quick deploy), customize (full control), capacity (find availability). | models/deploy-model/SKILL.md |
| quota | Managing quotas and capacity for Microsoft Foundry resources. Use when checking quota usage, troubleshooting deployment failures due to insufficient quota, requesting quota increases, or planning capacity. | quota/quota.md |
| rbac | Managing RBAC permissions, role assignments, managed identities, and service principals for Microsoft Foundry resources. Use for access control, auditing permissions, and CI/CD setup. | rbac/rbac.md |
💡 Tip: For a complete onboarding flow: project/create → agent workflows (deploy → invoke).
💡 Model Deployment: Use models/deploy-model for all deployment scenarios. It routes between quick preset deployment, customized deployment with full control, and capacity discovery across regions.
💡 Prompt Optimization: For requests like "optimize my prompt" or "improve my agent instructions," load observe and use the prompt_optimize MCP tool through that eval-driven workflow.
Match user intent to the correct workflow. Read each sub-skill in order before executing.
| User Intent | Workflow (read in order) |
|-------------|------------------------|
| Create a new agent from scratch | create → deploy → invoke |
| Deploy an agent (code already exists) | deploy → invoke |
| Update/redeploy an agent after code changes | deploy → invoke |
| Invoke/test/chat with an agent | invoke |
| Optimize / improve agent prompt or instructions | observe (Step 4: Optimize) |
| Evaluate and optimize agent (full loop) | observe |
| Troubleshoot an agent issue | invoke → troubleshoot |
| Fix a broken agent (troubleshoot + redeploy) | invoke → troubleshoot → apply fixes → deploy → invoke |
| Start/stop agent container | deploy |
Every agent source folder should keep Foundry-specific state under .foundry/:
<agent-root>/
  .foundry/
    agent-metadata.yaml
    datasets/
    evaluators/
    results/
agent-metadata.yaml is the required source of truth for environment-specific project settings, agent names, registry details, and evaluation test cases.
datasets/ and evaluators/ are local cache folders. Reuse them when they are current, and ask before refreshing or overwriting them.
Agent skills should run this step only when they need configuration values they don't already have. If a value (for example, agent root, environment, project endpoint, or agent name) is already known from the user's message or a previous skill in the same session, skip resolution for that value.
Search the workspace for .foundry/agent-metadata.yaml.
.foundry/ folder during setup; for all other workflows, stop and ask the user which agent source folder to initialize.
Read .foundry/agent-metadata.yaml and resolve the environment in this order:
defaultEnvironment from metadata
If the metadata contains multiple environments and none of the rules above selects one, prompt the user to choose. Keep the selected agent root and environment visible in every workflow summary.
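The discovery and environment-selection steps above could be sketched in Python roughly as follows. This is only an illustration, not part of the skill: `find_agent_roots` and `pick_environment` are hypothetical helper names, the metadata is assumed to be already parsed into a dict, and the two precedence choices marked in the comments are assumptions rather than documented rules.

```python
from pathlib import Path
from typing import Optional


def find_agent_roots(workspace: Path) -> list[Path]:
    """Return every folder that contains .foundry/agent-metadata.yaml."""
    roots = []
    for path in workspace.rglob("agent-metadata.yaml"):
        # Only count metadata files that sit directly inside a .foundry/ folder.
        if path.parent.name == ".foundry":
            roots.append(path.parent.parent)
    return sorted(roots)


def pick_environment(metadata: dict, preferred: Optional[str] = None) -> Optional[str]:
    """Return the environment key to use, or None when the user must choose."""
    environments = metadata.get("environments", {})
    # Assumption: a caller-supplied preference outranks defaultEnvironment.
    for candidate in (preferred, metadata.get("defaultEnvironment")):
        if candidate in environments:
            return candidate
    # Assumption: a single environment is unambiguous, so no prompt is needed.
    if len(environments) == 1:
        return next(iter(environments))
    return None
```

A `None` result from `pick_environment` corresponds to the case where the user should be prompted to choose an environment.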
Use the selected environment in agent-metadata.yaml as the primary source:
| Metadata Field | Resolves To | Used By |
|----------------|-------------|---------|
| environments.<env>.projectEndpoint | Project endpoint | deploy, invoke, observe, trace, troubleshoot |
| environments.<env>.agentName | Agent name | invoke, observe, trace, troubleshoot |
| environments.<env>.azureContainerRegistry | ACR registry name / image URL prefix | deploy |
| environments.<env>.testCases[] | Dataset + evaluator + threshold bundles | observe, eval-datasets |
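As a rough illustration of the table above, the following Python sketch reads the four fields from one environment of an already-parsed metadata dict and reports which ones are still missing. The helper name `resolve_settings` and the sample values are hypothetical; only the field names come from the table.

```python
from typing import Any

# Metadata field -> what it resolves to, as listed in the table above.
FIELDS = {
    "projectEndpoint": "Project endpoint",
    "agentName": "Agent name",
    "azureContainerRegistry": "ACR registry name / image URL prefix",
    "testCases": "Dataset + evaluator + threshold bundles",
}


def resolve_settings(metadata: dict[str, Any], env: str) -> tuple[dict[str, Any], list[str]]:
    """Return (resolved values, missing field names) for one environment."""
    selected = metadata.get("environments", {}).get(env, {})
    # Treat empty strings and empty lists the same as absent fields.
    resolved = {name: selected[name] for name in FIELDS if selected.get(name)}
    missing = [name for name in FIELDS if name not in resolved]
    return resolved, missing


# Hypothetical metadata with placeholder values.
example = {
    "environments": {
        "dev": {"projectEndpoint": "https://example.invalid/project", "agentName": "demo-agent"}
    }
}
resolved, missing = resolve_settings(example, "dev")
print(missing)
```

Any names left in `missing` are the values a skill would still need to obtain from another source, such as the user.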
If create/deploy is initializing a new .foundry workspace and metadata fields are still missing, check if azure.yaml exists in the project root. If found, run azd env get-values and use it to seed agent-metadata.yaml before continuing.
| azd Variable | Seeds |
|-------------|-------|
| AZURE_AI_PROJECT_ENDPOINT or AZURE_AIPROJECT_ENDPOINT | environments.<env>.projectEndpoint |
| AZURE_CONTAINER_REGISTRY_NAME or AZURE_CONTAINER_REGISTRY_ENDPOINT | environments.<env>.azureContainerRegistry |
| AZURE_SUBSCRIPTION_ID | Azure subscription for trace/troubleshoot lookups |
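The seeding step might look like the sketch below. It assumes the output of azd env get-values is a list of KEY="value" lines, and that when both alternative variables are set the first one listed in the table wins; neither assumption is stated in this document. `parse_env_values` and `seed_environment` are hypothetical helper names. AZURE_SUBSCRIPTION_ID is left out because the table does not map it to a metadata field.

```python
# Metadata field -> azd variables that can seed it, in the order the table lists them.
SEEDS = {
    "projectEndpoint": ("AZURE_AI_PROJECT_ENDPOINT", "AZURE_AIPROJECT_ENDPOINT"),
    "azureContainerRegistry": (
        "AZURE_CONTAINER_REGISTRY_NAME",
        "AZURE_CONTAINER_REGISTRY_ENDPOINT",
    ),
}


def parse_env_values(text: str) -> dict[str, str]:
    """Parse KEY="value" lines (assumed output format) into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key:
            values[key] = value.strip().strip('"')
    return values


def seed_environment(env_values: dict[str, str], existing: dict[str, str]) -> dict[str, str]:
    """Fill missing metadata fields from azd values without overwriting set ones."""
    seeded = dict(existing)
    for field, names in SEEDS.items():
        if seeded.get(field):
            continue  # keep values that are already present in metadata
        for name in names:
            if env_values.get(name):
                seeded[field] = env_values[name]
                break
    return seeded
```

Existing metadata values are never overwritten here, which matches the intent that azd only seeds fields that are still missing.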
Use the ask_user or askQuestions tool only for values not resolved from the user's message, session context, metadata, or azd bootstrap. Common values skills may need:
.foundry/agent-metadata.yaml
dev, prod, or another environment key from metadata
💡 Tip: If the user already provides the agent path, environment, project endpoint, or agent name, extract it directly; do not ask again.
All agent skills support two agent types:
| Type | Kind | Description |
|------|------|-------------|
| Prompt | "prompt" | LLM-based agents backed by a model deployment |
| Hosted | "hosted" | Container-based agents running custom code |
Use the agent_get MCP tool to determine an agent's type when needed.
Use the ask_user or askQuestions tool whenever collecting information from the user.
Use the task or runSubagent tool to delegate long-running or independent sub-tasks (e.g., env var scanning, status polling, Dockerfile generation).