.github/plugins/azure-skills/skills/microsoft-foundry/models/deploy-model/customize/SKILL.md
Interactive guided deployment flow for Azure OpenAI models with full customization control. Step-by-step selection of model version, SKU (GlobalStandard/Standard/ProvisionedManaged), capacity, RAI policy (content filter), and advanced options (dynamic quota, priority processing, spillover). USE FOR: custom deployment, customize model deployment, choose version, select SKU, set capacity, configure content filter, RAI policy, deployment options, detailed deployment, advanced deployment, PTU deployment, provisioned throughput. DO NOT USE FOR: quick deployment to optimal region (use preset).
Interactive guided workflow for deploying Azure OpenAI models with full customization control over version, SKU, capacity, content filtering, and advanced options.
| Property | Description |
|----------|-------------|
| Flow | Interactive step-by-step guided deployment |
| Customization | Version, SKU, Capacity, RAI Policy, Advanced Options |
| SKU Support | GlobalStandard, Standard, ProvisionedManaged, DataZoneStandard |
| Best For | Precise control over deployment configuration |
| Authentication | Azure CLI (az login) |
| Tools | Azure CLI, MCP tools (optional) |
Use this skill when you need precise control over deployment configuration: model version, SKU, capacity, RAI policy (content filter), and advanced options.
Alternative: Use preset for quick deployment to the best available region with automatic configuration.
| Feature | customize | preset |
|---------|-----------|--------|
| Focus | Full customization control | Optimal region selection |
| Version Selection | User chooses from available | Uses latest automatically |
| SKU Selection | User chooses (GlobalStandard/Standard/PTU) | GlobalStandard only |
| Capacity | User specifies exact value | Auto-calculated (50% of available) |
| RAI Policy | User selects from options | Default policy only |
| Region | Current region first, falls back to all regions if no capacity | Checks capacity across all regions upfront |
| Use Case | Precise deployment requirements | Quick deployment to best region |
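- Foundry project resource ID (format: /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project})
- Azure CLI, authenticated (az login)
- Optional: PROJECT_RESOURCE_ID environment variable set to the project resource ID

1. Verify Authentication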
2. Get Project Resource ID
3. Verify Project Exists
4. Get Model Name (if not provided)
5. List Model Versions → User Selects
6. List SKUs for Version → User Selects
7. Get Capacity Range → User Configures
7b. If no capacity: Cross-Region Fallback → Query all regions → User selects region/project
8. List RAI Policies → User Selects
9. Configure Advanced Options (if applicable)
10. Configure Version Upgrade Policy
11. Generate Deployment Name
12. Review Configuration
13. Execute Deployment & Monitor
If user accepts all defaults (latest version, GlobalStandard SKU, recommended capacity, default RAI policy, standard upgrade policy), deployment completes in ~5 interactions.
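Steps 2 and 3 depend on splitting the project resource ID into its parts. The following is a minimal sketch of that parsing, assuming the ID follows the format given in the prerequisites. The function name `parse_project_resource_id` and the case-insensitive matching are illustrative assumptions, not taken from the skill's reference scripts. The region is not part of the ID, so it would still have to come from az cognitiveservices account show.

```python
import re

# Pattern for the project resource ID format quoted in this document.
# Case-insensitive matching of the fixed segments is an assumption.
_PROJECT_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.CognitiveServices"
    r"/accounts/(?P<account>[^/]+)"
    r"/projects/(?P<project>[^/]+)/?$",
    re.IGNORECASE,
)


def parse_project_resource_id(resource_id: str) -> dict:
    """Split a project resource ID into subscription, resource group,
    account, and project. Raises ValueError if the shape does not match."""
    match = _PROJECT_ID.match(resource_id.strip())
    if match is None:
        raise ValueError(f"Not a project resource ID: {resource_id!r}")
    return match.groupdict()


if __name__ == "__main__":
    example = (
        "/subscriptions/0000/resourceGroups/my-rg/providers/"
        "Microsoft.CognitiveServices/accounts/my-account/projects/my-project"
    )
    print(parse_project_resource_id(example))
```

Rejecting malformed IDs before any CLI call keeps the later phases from running against the wrong account.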
⚠️ MUST READ: Before executing any phase, load references/customize-workflow.md for the full scripts and implementation details. The summaries below describe what each phase does — the reference file contains the how (CLI commands, quota patterns, capacity formulas, cross-region fallback logic).
| Phase | Action | Key Details |
|-------|--------|-------------|
| 1. Verify Auth | Check az account show; prompt az login if needed | Verify correct subscription is active |
| 2. Get Project ID | Read PROJECT_RESOURCE_ID env var or prompt user | ARM resource ID format required |
| 3. Verify Project | Parse resource ID, call az cognitiveservices account show | Extracts subscription, RG, account, project, region |
| 4. Get Model | List models via az cognitiveservices account list-models | User selects from available or enters custom name |
| 5. Select Version | Query versions for chosen model | Recommend latest; user picks from list |
| 6. Select SKU | Query model catalog + subscription quota, show only deployable SKUs | ⚠️ Never hardcode SKU lists — always query live data |
| 7. Configure Capacity | Query capacity API, validate min/max/step, user enters value | Cross-region fallback if no capacity in current region |
| 8. Select RAI Policy | Present content filter options | Default: Microsoft.DefaultV2 |
| 9. Advanced Options | Dynamic quota (GlobalStandard), priority processing (PTU), spillover | SKU-dependent availability |
| 10. Upgrade Policy | Choose: OnceNewDefaultVersionAvailable / OnceCurrentVersionExpired / NoAutoUpgrade | Default: auto-upgrade on new default |
| 11. Deployment Name | Auto-generate unique name, allow custom override | Validates format: ^[\w.-]{2,64}$ |
| 12. Review | Display full config summary, confirm before proceeding | User approves or cancels |
| 13. Deploy & Monitor | az cognitiveservices account deployment create, poll status | Timeout after 5 min; show endpoint + portal link |
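Phase 11 validates deployment names against the pattern ^[\w.-]{2,64}$ and generates a unique name. The sketch below shows one way that could look. The `-2`, `-3` suffix scheme and both function names are hypothetical, and restricting \w to ASCII is an assumption, since the document does not say which regex flavor applies.

```python
import re

# Pattern from the phase table; ASCII-only \w is an assumption.
NAME_PATTERN = re.compile(r"^[\w.-]{2,64}$", re.ASCII)


def is_valid_deployment_name(name: str) -> bool:
    """Return True if the name matches the documented format."""
    return NAME_PATTERN.fullmatch(name) is not None


def next_available_name(base: str, existing: list) -> str:
    """Return base, or base with a numeric suffix (hypothetical scheme)
    if base is already taken by an existing deployment."""
    if not is_valid_deployment_name(base):
        raise ValueError(f"Invalid deployment name: {base!r}")
    taken = set(existing)
    if base not in taken:
        return base
    suffix = 2
    while True:
        candidate = f"{base}-{suffix}"
        if len(candidate) > 64:
            raise ValueError("No unique name fits within 64 characters")
        if candidate not in taken:
            return candidate
        suffix += 1


if __name__ == "__main__":
    print(next_available_name("gpt-4o", ["gpt-4o", "gpt-4o-2"]))
```

In practice the `existing` list would come from az cognitiveservices account deployment list.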
| Error | Cause | Resolution |
|-------|-------|------------|
| Model not found | Invalid model name | List available models with az cognitiveservices account list-models |
| Version not available | Version not supported for SKU | Select different version or SKU |
| Insufficient quota | Capacity > available quota | Skill auto-searches all regions; fails only if no region has quota |
| SKU not supported | SKU not available in region | Cross-region fallback searches other regions automatically |
| Capacity out of range | Invalid capacity value | PREVENTED: Skill validates min/max/step at input (Phase 7) |
| Deployment name exists | Name conflict | Auto-incremented name generation |
| Authentication failed | Not logged in | Run az login |
| Permission denied | Insufficient permissions | Assign Cognitive Services Contributor role |
| Capacity query fails | API/permissions/network error | DEPLOYMENT BLOCKED: Will not proceed without valid quota data |
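The "Capacity out of range" row relies on min/max/step validation at input time. A minimal sketch of such a check follows, assuming valid values are the minimum plus a whole number of steps. The skill's actual rule lives in references/customize-workflow.md and may differ, and `validate_capacity` is a hypothetical name.

```python
from typing import Optional


def validate_capacity(requested: int, minimum: int, maximum: int,
                      step: int) -> Optional[str]:
    """Return None if the requested capacity is acceptable, otherwise
    a message explaining why it was rejected."""
    if step <= 0 or minimum > maximum:
        # Mirrors the "deployment blocked" rule: no valid range, no deployment.
        raise ValueError("Capacity range data is invalid")
    if requested < minimum:
        return f"Capacity {requested} is below the minimum of {minimum}"
    if requested > maximum:
        return f"Capacity {requested} is above the maximum of {maximum}"
    remainder = (requested - minimum) % step
    if remainder != 0:
        lower = requested - remainder
        upper = lower + step
        options = [str(lower)] + ([str(upper)] if upper <= maximum else [])
        return (f"Capacity {requested} is not on a step of {step}; "
                f"nearest valid values: {', '.join(options)}")
    return None


if __name__ == "__main__":
    print(validate_capacity(55, 10, 100, 10))
```

Returning a message instead of raising lets the interactive flow re-prompt the user with the nearest valid values.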
```bash
# Check deployment status
az cognitiveservices account deployment show --name <account> --resource-group <rg> --deployment-name <name>

# List all deployments
az cognitiveservices account deployment list --name <account> --resource-group <rg> -o table

# Check quota usage for the account
# (for regional quota, use: az cognitiveservices usage list --location <region>)
az cognitiveservices account list-usage --name <account> --resource-group <rg>

# Delete failed deployment
az cognitiveservices account deployment delete --name <account> --resource-group <rg> --deployment-name <name>
```
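Phase 13 polls deployment status and gives up after 5 minutes. The loop below sketches that behavior with an injected status getter, so it runs without Azure access. The set of terminal states, the 10-second interval, and the name `wait_for_deployment` are assumptions. A real getter might wrap the deployment show command above and read the provisioning state from its output.

```python
import time
from typing import Callable

# Assumed terminal provisioning states; confirm against actual CLI output.
TERMINAL_STATES = {"Succeeded", "Failed", "Canceled"}


def wait_for_deployment(get_state: Callable[[], str],
                        timeout_s: float = 300.0,
                        interval_s: float = 10.0,
                        sleep: Callable[[float], None] = time.sleep,
                        clock: Callable[[], float] = time.monotonic) -> str:
    """Poll get_state() until it returns a terminal state.
    Raises TimeoutError once timeout_s (default 5 minutes) has elapsed."""
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"Deployment still {state!r} after {timeout_s}s")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated status sequence standing in for real CLI calls.
    states = iter(["Creating", "Creating", "Succeeded"])
    print(wait_for_deployment(lambda: next(states), interval_s=0.0))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without waiting in real time.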
For SKU comparison tables, PTU sizing formulas, and advanced option details, load references/customize-guides.md.
SKU selection: GlobalStandard (production/HA) → Standard (dev/test) → ProvisionedManaged (high-volume/guaranteed throughput) → DataZoneStandard (data residency).
Capacity: TPM-based SKUs range from 1K (dev) to 100K+ (large production). PTU-based use formula: (Input TPM × 0.001) + (Output TPM × 0.002) + (Requests/min × 0.1).
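The PTU formula above can be written as a small helper for quick estimates. This is a sketch of the arithmetic only. The function name `estimate_ptu` is illustrative, and the result is a raw estimate that does not account for minimum PTU sizes or purchase increments, which the document does not specify here.

```python
def estimate_ptu(input_tpm: float, output_tpm: float,
                 requests_per_min: float) -> float:
    """Apply the document's formula:
    (Input TPM x 0.001) + (Output TPM x 0.002) + (Requests/min x 0.1)."""
    return (input_tpm * 0.001
            + output_tpm * 0.002
            + requests_per_min * 0.1)


if __name__ == "__main__":
    # 50,000 input TPM, 20,000 output TPM, 100 requests per minute:
    # 50 + 40 + 10 PTUs.
    print(round(estimate_ptu(50_000, 20_000, 100), 1))  # 100.0
```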
Advanced options: Dynamic quota (GlobalStandard only), priority processing (PTU only, extra cost), spillover (overflow to backup deployment).
Set the PROJECT_RESOURCE_ID environment variable to skip the project ID prompt.