skills/microsoft-foundry/models/deploy-model/capacity/SKILL.md
Discovers available Azure OpenAI model capacity across regions and projects. Analyzes quota limits, compares availability, and recommends optimal deployment locations based on capacity requirements. USE FOR: find capacity, check quota, where can I deploy, capacity discovery, best region for capacity, multi-project capacity search, quota analysis, model availability, region comparison, check TPM availability. DO NOT USE FOR: actual deployment (hand off to preset or customize after discovery), quota increase requests (direct user to Azure Portal), listing existing deployments.
Finds available Azure OpenAI model capacity across all accessible regions and projects. Recommends the best deployment location based on capacity requirements.
| Property | Description |
|----------|-------------|
| Purpose | Find where you can deploy a model with sufficient capacity |
| Scope | All regions and projects the user has access to |
| Output | Ranked table of regions/projects with available capacity |
| Action | Read-only analysis — does NOT deploy. Hands off to preset or customize |
| Authentication | Azure CLI (az login) |
After discovery → hand off to preset or customize for actual deployment.
Pre-built scripts handle the complex REST API calls and data processing. Use these instead of constructing commands manually.
| Script | Purpose | Usage |
|--------|---------|-------|
| scripts/discover_and_rank.ps1 | Full discovery: capacity + projects + ranking | Primary script for capacity discovery |
| scripts/discover_and_rank.sh | Same as above (bash) | Primary script for capacity discovery |
| scripts/query_capacity.ps1 | Raw capacity query (no project matching) | Quick capacity check or version listing |
| scripts/query_capacity.sh | Same as above (bash) | Quick capacity check or version listing |
Confirm the Azure CLI session and the active subscription before running discovery:

```bash
az account show --query "{Subscription:name, SubscriptionId:id}" --output table
```
Extract model name from user prompt. If version is unknown, query available versions:
```powershell
.\scripts\query_capacity.ps1 -ModelName <model-name>
```

```bash
./scripts/query_capacity.sh <model-name>
```
This lists available versions. Use the latest version unless user specifies otherwise.
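One way to pick the latest version from a listing is to sort the version strings and keep the last one. The sketch below assumes date-style version strings (such as `2024-08-06`), which sort correctly as plain text. The helper name `latest_version` and the sample values are hypothetical, and the real output of `query_capacity.sh` may need parsing before it can be piped in.

```bash
#!/usr/bin/env bash
# Sketch: choose the newest version from a newline-separated list.
# Assumes date-style versions (YYYY-MM-DD) that sort lexicographically.
# latest_version is a hypothetical helper, not part of the skill's scripts.

latest_version() {
  # Read versions on stdin, drop blank lines, sort ascending, keep the last.
  grep -v '^[[:space:]]*$' | LC_ALL=C sort | tail -n 1
}

# Hypothetical sample values, not real script output.
sample_versions='2024-05-13
2024-11-20
2024-08-06'

selected=$(printf '%s\n' "$sample_versions" | latest_version)
echo "Selected version: $selected"
```

If the user names a specific version, skip this step and pass their value through unchanged.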
Run the full discovery script with model name, version, and minimum capacity target:
```powershell
.\scripts\discover_and_rank.ps1 -ModelName <model-name> -ModelVersion <version> -MinCapacity <target>
```

```bash
./scripts/discover_and_rank.sh <model-name> <version> <min-capacity>
```
💡 The script automatically queries capacity across ALL regions, cross-references with the user's existing projects, and outputs a ranked table sorted by: meets target → project count → available capacity.
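The sort order described above can be illustrated with the standard `sort` utility. This is only a sketch of the ordering, not the script's actual implementation. The CSV layout (region, meets-target flag, project count, available capacity in K TPM), the helper name `rank_regions`, and the sample rows are all assumptions.

```bash
#!/usr/bin/env bash
# Sketch of the ranking: meets target first, then project count,
# then available capacity, each in descending order.
# Assumed columns: region,meets_target(1|0),project_count,available_k_tpm

rank_regions() {
  # -t, splits fields on commas; each -k key is numeric (n) and reversed (r).
  sort -t, -k2,2nr -k3,3nr -k4,4nr
}

# Hypothetical sample rows.
sample_rows='swedencentral,1,0,100
westus3,1,1,90
eastus2,1,3,120
northeurope,0,5,20'

ranked=$(printf '%s\n' "$sample_rows" | rank_regions)
echo "$ranked"
```

In this sample, `northeurope` has the most projects but ranks last because it does not meet the capacity target, which is the first sort key.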
After discovery identifies candidate regions, validate that the user's subscription actually has available quota in each region. Model capacity (from Phase 3) shows what the platform can support, but subscription quota limits what this specific user can deploy.
```powershell
# For each candidate region from discovery results:
$usageData = az cognitiveservices usage list --location <region> --subscription $SUBSCRIPTION_ID -o json 2>$null | ConvertFrom-Json

# Check quota for each SKU the model supports
# Quota names follow pattern: OpenAI.<SKU>.<model-name>
$usageEntry = $usageData | Where-Object { $_.name.value -eq "OpenAI.<SKU>.<model-name>" }

if ($usageEntry) {
    $quotaAvailable = $usageEntry.limit - $usageEntry.currentValue
} else {
    $quotaAvailable = 0  # No quota allocated
}
```

```bash
# For each candidate region from discovery results:
usage_json=$(az cognitiveservices usage list --location <region> --subscription "$SUBSCRIPTION_ID" -o json 2>/dev/null)

# Extract quota for specific SKU+model
quota_available=$(echo "$usage_json" | jq -r --arg name "OpenAI.<SKU>.<model-name>" \
  '.[] | select(.name.value == $name) | .limit - .currentValue')

# Default to 0 when no matching entry exists (no quota allocated)
quota_available=${quota_available:-0}
```
Annotate discovery results:
Add a "Quota Available" column to the ranked output from Phase 3:
| Region | Available Capacity | Meets Target | Projects | Quota Available |
|--------|-------------------|--------------|----------|-----------------|
| eastus2 | 120K TPM | ✅ | 3 | ✅ 80K |
| westus3 | 90K TPM | ✅ | 1 | ❌ 0 (at limit) |
| swedencentral | 100K TPM | ✅ | 0 | ✅ 100K |
Regions/SKUs where quotaAvailable = 0 should be marked with ❌ in the results. If no region has available quota, hand off to the quota skill for increase requests and troubleshooting.
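The marking rule can be expressed as a small helper that produces the "Quota Available" cell. This is a sketch under stated assumptions: the function name `annotate_quota` is hypothetical, its inputs are the `limit` and `currentValue` fields of a usage entry expressed in K TPM, and a negative difference is treated as zero.

```bash
#!/usr/bin/env bash
# Sketch: turn a usage entry's limit and currentValue into the
# "Quota Available" cell. annotate_quota is a hypothetical helper.

annotate_quota() {
  local limit=${1:-0} current=${2:-0}
  local available=$(( limit - current ))
  # Assumption: treat over-consumption as zero available quota.
  if (( available < 0 )); then
    available=0
  fi
  if (( available > 0 )); then
    echo "✅ ${available}K"
  else
    echo "❌ 0 (at limit)"
  fi
}

# Hypothetical sample values.
annotate_quota 200 120
annotate_quota 90 90
```

With these sample inputs the two calls produce a ✅ cell with 80K remaining and a ❌ cell, in the same style as the example rows.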
After the script outputs the ranked table (now annotated with quota info), present it to the user and ask:
Before handing off to preset or customize, always confirm the target project with the user. See the Project Selection rules in the parent router.
If the discovery table shows a sample project for the chosen region, suggest it as the default. Otherwise, query projects in that region and let the user pick.
| Error | Cause | Resolution |
|-------|-------|------------|
| "No capacity found" | Model not available or all at quota | Hand off to quota skill for increase requests and troubleshooting |
| Script auth error | az login expired | Re-run az login |
| Empty version list | Model not in region catalog | Try a different region: ./scripts/query_capacity.sh <model> "" eastus |
| "No projects found" | No AI Services resources | Guide to project/create skill or Azure Portal |
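The "Script auth error" row suggests a pre-flight check before running discovery. A minimal sketch, assuming that `az account get-access-token` exits non-zero when there is no valid login; the function name `require_az_login` is hypothetical and is not part of the skill's scripts.

```bash
#!/usr/bin/env bash
# Sketch: fail fast with a clear hint when the Azure CLI session is unusable.
# require_az_login is a hypothetical helper.

require_az_login() {
  # The CLI must be available before a login check is possible.
  if ! command -v az >/dev/null 2>&1; then
    echo "Azure CLI not found; install it, then run: az login" >&2
    return 1
  fi
  # Requesting a token exercises the cached credentials.
  if ! az account get-access-token --output none >/dev/null 2>&1; then
    echo "Not logged in or session expired; re-run: az login" >&2
    return 1
  fi
}

if require_az_login; then
  echo "Azure CLI session OK"
fi
```

Calling this once before either discovery script makes an expired login show up as a clear message rather than as an error from deep inside the REST calls.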