.github/plugins/azure-skills/skills/microsoft-foundry/models/deploy-model/capacity/SKILL.md
Discovers available Azure OpenAI model capacity across regions and projects. Analyzes quota limits, compares availability, and recommends optimal deployment locations based on capacity requirements. USE FOR: find capacity, check quota, where can I deploy, capacity discovery, best region for capacity, multi-project capacity search, quota analysis, model availability, region comparison, check TPM availability. DO NOT USE FOR: actual deployment (hand off to preset or customize after discovery), quota increase requests (direct user to Azure Portal), listing existing deployments.
Finds available Azure OpenAI model capacity across all accessible regions and projects. Recommends the best deployment location based on capacity requirements.
| Property | Description |
|----------|-------------|
| Purpose | Find where you can deploy a model with sufficient capacity |
| Scope | All regions and projects the user has access to |
| Output | Ranked table of regions/projects with available capacity |
| Action | Read-only analysis — does NOT deploy. Hands off to preset or customize |
| Authentication | Azure CLI (az login) |
After discovery → hand off to preset or customize for actual deployment.
Pre-built scripts handle the complex REST API calls and data processing. Use these instead of constructing commands manually.
| Script | Purpose | Usage |
|--------|---------|-------|
| scripts/discover_and_rank.ps1 | Full discovery: capacity + projects + ranking | Primary script for capacity discovery |
| scripts/discover_and_rank.sh | Same as above (bash) | Primary script for capacity discovery |
| scripts/query_capacity.ps1 | Raw capacity query (no project matching) | Quick capacity check or version listing |
| scripts/query_capacity.sh | Same as above (bash) | Quick capacity check or version listing |
Confirm the active subscription before running discovery:

    az account show --query "{Subscription:name, SubscriptionId:id}" --output table
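The quota commands later in this skill reference `$SUBSCRIPTION_ID`, and the error table lists an expired `az login` as a common failure. A minimal sketch of a check that covers both is shown here. The helper name `require_az_login` is hypothetical and is not part of the skill's scripts; it relies only on `az account show --query id --output tsv`.

```bash
# Sketch only: require_az_login is a hypothetical helper, not one of the skill's scripts.
# It verifies that the Azure CLI is installed and logged in, then exports SUBSCRIPTION_ID.
require_az_login() {
  if ! command -v az >/dev/null 2>&1; then
    echo "Azure CLI (az) not found on PATH" >&2
    return 1
  fi
  # az account show fails (or prints nothing) when there is no valid session.
  if ! SUBSCRIPTION_ID="$(az account show --query id --output tsv 2>/dev/null)" \
     || [ -z "$SUBSCRIPTION_ID" ]; then
    echo "Not logged in or session expired: run 'az login'" >&2
    return 1
  fi
  export SUBSCRIPTION_ID
}
```

Calling `require_az_login || exit 1` before the discovery scripts stops early with a clear message instead of failing inside a REST call.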
Extract the model name from the user's prompt. If the version is unknown, query the available versions:

PowerShell:

    .\scripts\query_capacity.ps1 -ModelName <model-name>

Bash:

    ./scripts/query_capacity.sh <model-name>

This lists the available versions. Use the latest version unless the user specifies otherwise.
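Picking the latest version can be sketched as a plain text sort, assuming the versions are date-style strings such as `2024-11-20` and have already been extracted one per line. The output format of `query_capacity.sh` is not specified here, so the extraction step is left out, and `pick_latest_version` is a hypothetical helper name.

```bash
# Sketch only: pick_latest_version is a hypothetical helper, not part of the skill's scripts.
# Reads version strings on stdin (one per line) and prints the greatest one.
# Assumes date-style versions (YYYY-MM-DD), which order correctly as plain text.
pick_latest_version() {
  LC_ALL=C sort -u | tail -n 1
}
```

For example, piping the three lines `2024-05-13`, `2024-08-06`, and `2024-11-20` into `pick_latest_version` prints `2024-11-20`. Version strings that are not date-style would need a different comparison.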
Run the full discovery script with model name, version, and minimum capacity target:
PowerShell:

    .\scripts\discover_and_rank.ps1 -ModelName <model-name> -ModelVersion <version> -MinCapacity <target>

Bash:

    ./scripts/discover_and_rank.sh <model-name> <version> <min-capacity>
💡 The script automatically queries capacity across ALL regions, cross-references with the user's existing projects, and outputs a ranked table sorted by: meets target → project count → available capacity.
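The sort order described above (meets target, then project count, then available capacity) can be sketched as a small shell function. This illustrates the ordering only and is not the code inside `discover_and_rank.sh`. The function name `rank_regions` and the tab-separated input layout are assumptions made for the example.

```bash
# Sketch only: rank_regions is a hypothetical helper, not the skill's actual script logic.
# Usage: rank_regions MIN_CAPACITY
# stdin:  region<TAB>available_capacity<TAB>project_count   (one row per region)
# stdout: region<TAB>available_capacity<TAB>meets_target<TAB>project_count, best first
rank_regions() {
  min_capacity="$1"
  tab="$(printf '\t')"
  # Prefix each row with its sort keys: meets-target flag, project count, capacity.
  awk -F'\t' -v min="$min_capacity" 'BEGIN { OFS = "\t" }
    { meets = ($2 + 0 >= min + 0) ? 1 : 0; print meets, $3, $2, $1 }' |
    # All three keys are numeric and sorted in descending order.
    sort -t "$tab" -k1,1nr -k2,2nr -k3,3nr |
    # Restore a readable column order.
    awk -F'\t' 'BEGIN { OFS = "\t" }
      { print $4, $3, ($1 == 1 ? "yes" : "no"), $2 }'
}
```

With a hypothetical target of 100 and the rows `westus3 90 1`, `eastus2 120 3`, and `swedencentral 100 0`, the function lists eastus2 first, then swedencentral, then westus3, because westus3 is the only region that misses the target.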
After discovery identifies candidate regions, validate that the user's subscription actually has available quota in each region. Model capacity (from Phase 3) shows what the platform can support, but subscription quota limits what this specific user can deploy.
PowerShell:

    # For each candidate region from discovery results:
    $usageData = az cognitiveservices usage list --location <region> --subscription $SUBSCRIPTION_ID -o json 2>$null | ConvertFrom-Json
    # Check quota for each SKU the model supports
    # Quota names follow pattern: OpenAI.<SKU>.<model-name>
    $usageEntry = $usageData | Where-Object { $_.name.value -eq "OpenAI.<SKU>.<model-name>" }
    if ($usageEntry) {
        $quotaAvailable = $usageEntry.limit - $usageEntry.currentValue
    } else {
        $quotaAvailable = 0  # No quota allocated
    }
Bash:

    # For each candidate region from discovery results:
    usage_json=$(az cognitiveservices usage list --location <region> --subscription "$SUBSCRIPTION_ID" -o json 2>/dev/null)
    # Extract quota for the specific SKU + model
    quota_available=$(echo "$usage_json" | jq -r --arg name "OpenAI.<SKU>.<model-name>" \
      '.[] | select(.name.value == $name) | .limit - .currentValue')
    # Treat a missing entry as no quota allocated, matching the PowerShell version
    quota_available=${quota_available:-0}
Annotate discovery results:
Add a "Quota Available" column to the ranked output from Phase 3:
| Region | Available Capacity | Meets Target | Projects | Quota Available |
|--------|-------------------|--------------|----------|-----------------|
| eastus2 | 120K TPM | ✅ | 3 | ✅ 80K |
| westus3 | 90K TPM | ✅ | 1 | ❌ 0 (at limit) |
| swedencentral | 100K TPM | ✅ | 0 | ✅ 100K |
Regions/SKUs where quotaAvailable = 0 should be marked with ❌ in the results. If no region has available quota, hand off to the quota skill for increase requests and troubleshooting.
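The marking rule can be sketched as a formatter for the "Quota Available" cell. The helper name `format_quota_cell` is hypothetical. The `K` suffix mirrors the example table and assumes the computed value is already expressed in thousands of TPM, which is not confirmed here.

```bash
# Sketch only: format_quota_cell is a hypothetical helper, not part of the skill's scripts.
# Usage: format_quota_cell QUOTA_AVAILABLE
# Prints the "Quota Available" cell for one region/SKU.
# Assumption: the value is in thousands of TPM, as the K suffix in the example table suggests.
format_quota_cell() {
  available="${1:-0}"
  # Drop any fractional part so the integer test below does not error out.
  available="${available%%.*}"
  if [ "$available" -gt 0 ] 2>/dev/null; then
    printf '✅ %sK\n' "$available"
  else
    # Zero, negative, empty, or non-numeric values are all treated as "no quota".
    printf '❌ 0 (at limit)\n'
  fi
}
```

If every region produces the ❌ cell, that is the signal to hand off to the quota skill instead of continuing to deployment.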
After the script outputs the ranked table (now annotated with quota info), present it to the user and ask which region they want to use.
Before handing off to preset or customize, always confirm the target project with the user. See the Project Selection rules in the parent router.
If the discovery table shows a sample project for the chosen region, suggest it as the default. Otherwise, query projects in that region and let the user pick.
| Error | Cause | Resolution |
|-------|-------|------------|
| "No capacity found" | Model not available or all at quota | Hand off to quota skill for increase requests and troubleshooting |
| Script auth error | az login expired | Re-run az login |
| Empty version list | Model not in region catalog | Try a different region: ./scripts/query_capacity.sh <model> "" eastus |
| "No projects found" | No AI Services resources | Guide to project/create skill or Azure Portal |