skills/azure-capacity-management/SKILL.md
This skill should be used when the user asks about Azure capacity management, quota operations, or capacity planning for SaaS ISVs running workloads in their own Azure subscriptions under EA or MCA. Relevant queries include: how to increase VM quota across subscriptions, how quota groups work, how to create or share capacity reservation groups, what the difference is between capacity reservations and Azure Reservations or savings plans, how to request region access or zonal enablement, how logical and physical availability zones map across subscriptions, how to configure quota or budget or anomaly alerts, how AKS node pools interact with capacity reservations, how to manage non-compute quotas, how deployment stamps map to FinOps Planning & Estimating, Forecasting, Architecting & Workload Placement, Usage Optimization, and Governance, Policy & Risk, and how Azure capacity controls support estate-level governance. Also covers quota transfers, overallocation, SKU restrictions, CRG sharing, billing hierarchy, and subscription vending.
Estate-level capacity and quota management for SaaS ISVs operating workloads in subscriptions they own or control under an Enterprise Agreement (EA) or Microsoft Customer Agreement (MCA). This skill aligns with the ISV landing zone guidance and covers pure SaaS and stamp-based isolation patterns where customers are isolated through dedicated or shared deployment stamps inside the ISV's Azure estate.
Read the full Azure implementation reference at references/docs/operations/capacity-and-quotas/README.md.
Treat Azure capacity management as implementation detail under the canonical FinOps Framework. Capacity evidence commonly supports Planning & Estimating, Forecasting, Architecting & Workload Placement, Usage Optimization, Rate Optimization, Budgeting, Governance, Policy & Risk, and Automation, Tools & Services. Use Well-Architected capacity planning, reliable scaling, and workload supply chain guidance as Azure implementation guidance, not as a replacement framework.
| FinOps capability | Capacity question | Azure surfaces |
|------|-------------|----------------|
| Planning & Estimating | What capacity does the planned workload, scenario, or stamp need? | Workload requirements, scale-unit assumptions, Azure Monitor, estimates, FinOps budgets |
| Forecasting | When will demand exceed quota, region, SKU, zone, or reserved-capacity headroom? | Historical usage, growth trends, forecast breach dates, capacity planning models |
| Architecting & Workload Placement | Which regions, zones, SKUs, quota pools, or deployment patterns should change? | SKU availability, region access, quota groups, capacity reservation groups, CRG sharing, overallocation |
| Usage Optimization | Which allocated capacity, quota, or deployment pattern is underused or inefficient? | Utilization, headroom, rightsizing evidence, unused reserved capacity, demand signals |
| Rate Optimization | Where should capacity guarantees be coordinated with reservations or savings plans? | Benefit recommendations, commitment utilization, CRG utilization, pricing evidence from FinOps Hub |
| Governance, Policy & Risk | Which capacity risks need ownership, exception review, or escalation? | Approved regions and SKUs, owner metadata, risk thresholds, exception status |
| Automation, Tools & Services | Which controls expose capacity risk before deployment or scale events? | Quota alerts, budget alerts, anomaly alerts, CI/CD gates, workflow status |
Read references/docs/operations/capacity-planning/README.md for forecasting details and references/docs/operations/capacity-governance/README.md for the governance program design.
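The Forecasting question ("when will demand exceed headroom?") can be approximated with a straight-line projection. The sketch below is illustrative only: the function name, the linear growth model, and the sample numbers are assumptions, not part of the referenced capacity planning docs.

```python
from datetime import date, timedelta
from typing import Optional


def forecast_breach_date(current_usage: float, limit: float,
                         daily_growth: float, today: date) -> Optional[date]:
    """Estimate when usage reaches the limit, assuming linear growth.

    Returns None when usage is flat or shrinking (no breach is forecast).
    """
    if current_usage >= limit:
        return today  # already at or over the limit
    if daily_growth <= 0:
        return None
    days_left = (limit - current_usage) / daily_growth
    # Truncate so the estimate errs on the early (safer) side.
    return today + timedelta(days=int(days_left))


# Hypothetical example: 280 of 350 vCPUs used, growing by 2 vCPUs per day.
print(forecast_breach_date(280, 350, 2.0, date(2026, 3, 9)))  # 2026-04-13
```

A real model would likely account for seasonality and planned stamp rollouts; the linear version is only a first-pass signal for deciding when to file a quota request.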
Azure assigns default quota limits per subscription. EA subscriptions typically start with 350 cores; pay-as-you-go subscriptions start with 20 cores. Some VM series have offer restrictions that block deployment until you request access.
Key workflows:
CLI reference:
# List quota usage for a subscription
az quota usage list --scope /subscriptions/{sub-id}/providers/Microsoft.Compute/locations/{location}
# Request a quota increase
az quota create --resource-name "StandardDSv3Family" --scope /subscriptions/{sub-id}/providers/Microsoft.Compute/locations/{location} --limit-object value=500
Read references/docs/operations/quota/README.md for the complete quota operations reference.
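Once usage and limits are collected, deciding which VM families need an increase is simple arithmetic. The following is a minimal sketch: the `QuotaUsage` record shape, the 80% threshold, and the `ExampleFamily` name are hypothetical and do not mirror the exact JSON that `az quota usage list` returns.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QuotaUsage:
    family: str  # VM family name, e.g. "StandardDSv3Family"
    used: int    # vCPUs currently consumed
    limit: int   # vCPU quota limit for the family


def utilization_pct(usage: QuotaUsage) -> float:
    """Percentage of the limit in use; a zero limit is reported as 0%."""
    if usage.limit == 0:
        return 0.0
    return 100.0 * usage.used / usage.limit


def needs_increase(usages: List[QuotaUsage],
                   threshold_pct: float = 80.0) -> List[str]:
    """Return the families at or above the (assumed) utilization threshold."""
    return [u.family for u in usages if utilization_pct(u) >= threshold_pct]


# Hypothetical sample data for one subscription and region.
sample = [
    QuotaUsage("StandardDSv3Family", 300, 350),
    QuotaUsage("ExampleFamily", 40, 350),
]
print(needs_increase(sample))  # ['StandardDSv3Family']
```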
Quota groups are ARM objects that aggregate compute quota across eligible subscriptions at the management group scope. They reduce stranded VM-family headroom and let you request group-level increases.
Prerequisites: Register the Microsoft.Quota resource provider on each member subscription. The management group must exist before creating the quota group.
Limitations:
Lifecycle: Create the quota group under a management group, add subscriptions, then request group-level limit increases. Monitor allocation snapshots and transfer as demand shifts between subscriptions.
Read references/docs/operations/quota-groups/README.md for the complete reference including ARM lifecycle, transfer mechanics, and monitoring integration.
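To reason about transfers as demand shifts, it can help to plan the moves before touching the quota group. This sketch is plain arithmetic under assumptions: "headroom" is defined here as allocated limit minus expected demand per subscription, the subscription names are hypothetical, and the greedy strategy is one option among many. It does not call any Azure API.

```python
from typing import Dict, List, Tuple


def plan_transfers(headroom: Dict[str, int]) -> List[Tuple[str, str, int]]:
    """Return (from_sub, to_sub, vcpus) moves that cover deficits greedily."""
    # Largest donors and largest deficits are handled first.
    donors = sorted(([s, h] for s, h in headroom.items() if h > 0),
                    key=lambda d: -d[1])
    deficits = sorted(((s, -h) for s, h in headroom.items() if h < 0),
                      key=lambda d: -d[1])
    plan: List[Tuple[str, str, int]] = []
    for sub, need in deficits:
        for donor in donors:
            if need == 0:
                break
            give = min(donor[1], need)
            if give > 0:
                plan.append((donor[0], sub, give))
                donor[1] -= give
                need -= give
    return plan


# Hypothetical headroom (vCPUs) for one VM family across four subscriptions.
print(plan_transfers({"sub-a": 120, "sub-b": -80, "sub-c": 30, "sub-d": -60}))
# [('sub-a', 'sub-b', 80), ('sub-a', 'sub-d', 40), ('sub-c', 'sub-d', 20)]
```

If total deficits exceed total spare headroom, the plan covers only part of the need, which is the signal to request a group-level limit increase instead.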
Capacity reservation groups (CRGs) guarantee compute capacity for specific VM sizes in a region or availability zone. CRGs are capacity guarantees, not pricing commitments — unused reserved capacity is billed at the pay-as-you-go rate for the VM size.
Cost implications: Reserved capacity is billed whether or not VMs run against it. Pair CRGs with Azure Reservations or savings plans to get both capacity guarantee and pricing discount.
Sharing (preview): CRGs can be shared across subscriptions within the same tenant. The ODCR owner in the consumer subscription needs Microsoft.Compute/capacityReservationGroups/share/action. The VM owner in the consumer subscription needs Microsoft.Compute/capacityReservationGroups/read, Microsoft.Compute/capacityReservationGroups/deploy, Microsoft.Compute/capacityReservationGroups/capacityReservations/read, and Microsoft.Compute/capacityReservationGroups/capacityReservations/deploy. Portal support isn't available in preview; use CLI, PowerShell, or REST API.
Overallocation: Overallocation lets you deploy more VMs than the reserved quantity. Excess VMs don't have capacity guarantees but benefit from the reservation when capacity is available.
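The billing and overallocation rules above reduce to a small accounting exercise. A minimal sketch, assuming a flat hourly pay-as-you-go rate; the `ReservationUsage` type, the function name, and the 0.20 rate are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReservationUsage:
    guaranteed_vms: int     # VMs covered by the capacity guarantee
    overallocated_vms: int  # VMs deployed beyond the reserved quantity
    unused_reserved: int    # reserved instances with no VM running against them
    unused_cost: float      # pay-as-you-go cost of the unused instances


def account_reservation(reserved: int, deployed: int,
                        hourly_rate: float, hours: float) -> ReservationUsage:
    """Split deployed VMs into guaranteed and overallocated, and price the gap."""
    guaranteed = min(reserved, deployed)
    overallocated = max(deployed - reserved, 0)
    unused = max(reserved - deployed, 0)
    return ReservationUsage(guaranteed, overallocated, unused,
                            unused * hourly_rate * hours)


# Hypothetical: 10 reserved, 6 deployed, 0.20 per hour, 730 hours in the month.
print(account_reservation(10, 6, 0.20, 730))
# 12 deployed against 10 reserved: 2 VMs run without a capacity guarantee.
print(account_reservation(10, 12, 0.20, 730))
```

The unused cost is the portion a reservation or savings plan discount would help offset, which is why the capacity guarantee and the pricing commitment are evaluated together.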
Zone alignment: CRGs are zone-specific. Before sharing across subscriptions, verify logical-to-physical zone mapping with the Get-AzAvailabilityZoneMapping.ps1 script — logical zones can map to different physical zones across subscriptions.
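The zone check amounts to translating the CRG's logical zone into a physical zone in the owning subscription, then finding which logical zone points at that same physical zone in the consumer subscription. The sketch below assumes the mappings are already available as dictionaries; the dictionary shape and the sample zone names are illustrative and may not match the script's actual output.

```python
from typing import Dict, Optional


def consumer_logical_zone(owner_map: Dict[str, str],
                          consumer_map: Dict[str, str],
                          crg_logical_zone: str) -> Optional[str]:
    """Find the consumer's logical zone that shares the CRG's physical zone.

    Each map goes from logical zone ("1", "2", "3") to a physical zone name.
    Returns None if the consumer has no logical zone on that physical zone.
    """
    physical = owner_map[crg_logical_zone]
    for logical, phys in consumer_map.items():
        if phys == physical:
            return logical
    return None


# Hypothetical mappings for two subscriptions in the same region.
owner = {"1": "eastus-az1", "2": "eastus-az2", "3": "eastus-az3"}
consumer = {"1": "eastus-az3", "2": "eastus-az1", "3": "eastus-az2"}

# A CRG in the owner's logical zone 1 lines up with the consumer's zone 2.
print(consumer_logical_zone(owner, consumer, "1"))  # 2
```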
Read references/docs/operations/capacity-reservations/README.md for the complete reference including automation patterns (REST API, Bicep, Terraform).
AKS node pools consume VM quota and can associate with capacity reservation groups, but with constraints specific to the AKS lifecycle:
Microsoft.Compute/capacityReservationGroups/read permission on the CRG
Read references/docs/operations/aks-capacity/README.md for the complete reference including Bicep and Terraform examples.
Storage accounts, App Service plans, Cosmos DB throughput, Service Bus namespaces, Key Vault transactions, and other services have their own quota limits outside the compute quota system. Quota groups don't cover these — manage them through standard quota requests and service-specific scaling controls.
Read references/docs/operations/non-compute-quotas/README.md for service-specific quota references.
Three alert types cover the capacity governance space:
references/scripts/anomaly-alerts/Deploy-AnomalyAlert.ps1 or bulk deploy with references/scripts/anomaly-alerts/Deploy-BulkALZ.ps1.
Governance cadence: Monthly quota reviews, quarterly capacity planning cycles, and post-incident reviews when scaling events fail. Read references/docs/operations/monitoring-alerting/README.md for alert configuration details and references/docs/operations/capacity-governance/README.md for the governance program design.
| Script | Path | Purpose |
|--------|------|---------|
| Get-AzVMQuotaUsage.ps1 | references/scripts/quota/ | Multi-threaded quota analysis across subscriptions |
| Show-AzVMQuotaReport.ps1 | references/scripts/quota/ | Single-threaded quota reporting |
| Get-AzAvailabilityZoneMapping.ps1 | references/scripts/quota/ | Logical-to-physical zone mapping |
| Get-BenefitRecommendations.ps1 | references/scripts/rate/ | Reservation and savings plan recommendations |
| Deploy-AnomalyAlert.ps1 | references/scripts/anomaly-alerts/ | Deploy cost anomaly alerts |
| Deploy-BulkALZ.ps1 | references/scripts/anomaly-alerts/ | Bulk deploy anomaly alerts |
| Deploy-Budget.ps1 | references/scripts/budgets/ | Deploy individual budgets |
| Deploy-BulkBudgets.ps1 | references/scripts/budgets/ | Bulk deploy budgets |
| Suppress-AdvisorRecommendations.ps1 | references/scripts/advisor/ | Suppress Advisor recommendations |
| Serverless SQL workbook | references/scripts/serverless-sql-storage/ | Azure Monitor workbook for serverless SQL allocated vs. used storage; identifies databases worth shrinking to reclaim billing waste |
Read the README in each script directory for parameter requirements and prerequisites.
These are commonly confused — keep them separated:
The azure-capacity-manager agent is a capacity evidence specialist for FinOps workflows. It handles operational tasks like quota analysis and reservation evaluation, maps Azure capacity evidence back to FinOps capabilities, has access to the same references, and can run scripts and az commands for live operations.
| Domain | Reference path |
|--------|---------------|
| Azure capacity reference | references/docs/operations/capacity-and-quotas/README.md |
| Glossary | references/docs/operations/glossary.md |
| Quota operations | references/docs/operations/quota/README.md |
| Quota groups | references/docs/operations/quota-groups/README.md |
| Capacity reservations | references/docs/operations/capacity-reservations/README.md |
| AKS capacity | references/docs/operations/aks-capacity/README.md |
| Non-compute quotas | references/docs/operations/non-compute-quotas/README.md |
| Monitoring and alerting | references/docs/operations/monitoring-alerting/README.md |
| Capacity governance | references/docs/operations/capacity-governance/README.md |
| Capacity planning | references/docs/operations/capacity-planning/README.md |
| Billing (EA) | references/docs/billing/legacy/README.md |
| Billing (MCA) | references/docs/billing/modern/README.md |
| Deployment patterns | references/docs/deployment/README.md |
| Tools and scripts | references/docs/operations/tools-scripts/README.md |
| Quota scripts | references/scripts/quota/README.md |
| Anomaly alerts | references/scripts/anomaly-alerts/README.md |
| Budgets | references/scripts/budgets/README.md |
| Rate optimization | references/scripts/rate/README.md |
| Serverless SQL storage | references/scripts/serverless-sql-storage/README.md |
The capacity manager can send alerts and reports through SRE Agent's notification connectors. Use Teams for urgent capacity events and email for periodic reports.
Use Teams notifications for urgent capacity events that need immediate attention, following the Azure SRE Agent notification guidance. The SRE Agent sends messages through the Teams connector in HTML format, so keep the payload compact, scannable, and action-oriented, as described in Send notifications from Azure SRE Agent.
When to send Teams alerts:
Teams alert template:
Use this HTML template when you compose a Teams message. Include alert severity, the affected resource, current state, the recommended action, and links to the relevant Azure portal blades, following Send notifications from Azure SRE Agent.
<h3>⚠️ Quota utilization alert</h3>
<p><b>Severity:</b> warning<br>
<b>VM family:</b> Standard_D_v5<br>
<b>Region:</b> eastus<br>
<b>Subscription:</b> prod-001<br>
<b>Usage:</b> 85/100 vCPUs (85%)</p>
<p><b>Recommended action:</b> Request a quota increase before the next deployment cycle.</p>
<p>
<a href="https://portal.azure.com/#view/Microsoft_Azure_Capacity/QuotaMenuBlade">Open quota blade</a><br>
<a href="https://portal.azure.com/#view/HubsExtension/BrowseResource/resourceType/Microsoft.Compute%2FcapacityReservationGroups">Open capacity reservation groups</a>
</p>
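When the values come from live data, the template can be filled programmatically. A minimal sketch, assuming the message body is assembled as a plain string before it is handed to the connector; the function name is hypothetical and the portal links are left out for brevity. Values are HTML-escaped so that names containing `<` or `&` do not break the markup.

```python
from html import escape


def render_quota_alert(severity: str, vm_family: str, region: str,
                       subscription: str, used: int, limit: int,
                       action: str) -> str:
    """Build the HTML body for a quota utilization alert."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    pct = round(100 * used / limit)
    return (
        "<h3>⚠️ Quota utilization alert</h3>\n"
        f"<p><b>Severity:</b> {escape(severity)}<br>\n"
        f"<b>VM family:</b> {escape(vm_family)}<br>\n"
        f"<b>Region:</b> {escape(region)}<br>\n"
        f"<b>Subscription:</b> {escape(subscription)}<br>\n"
        f"<b>Usage:</b> {used}/{limit} vCPUs ({pct}%)</p>\n"
        f"<p><b>Recommended action:</b> {escape(action)}</p>"
    )


body = render_quota_alert(
    "warning", "Standard_D_v5", "eastus", "prod-001", 85, 100,
    "Request a quota increase before the next deployment cycle.",
)
print(body)
```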
Use Outlook email for periodic capacity reports and non-urgent recommendations, following the same Azure SRE Agent notification guidance. The SRE Agent sends email through the Outlook connector in HTML format, so structure the message for scanning first and for detailed follow-up second.
When to send email reports:
Email report template:
Use this HTML template for email reports. Include the report title, date range, a summary table of key metrics, highlighted findings, recommended actions with priority, and links to Azure portal resources or supporting documentation, as described in Send notifications from Azure SRE Agent.
<h2>Weekly capacity digest</h2>
<p><b>Period:</b> March 3–9, 2026<br>
<b>Subscriptions analyzed:</b> 47</p>
<h3>Key findings</h3>
<table border="1" cellpadding="8">
<tr><th>Region</th><th>VM family</th><th>Utilization</th><th>Action</th></tr>
<tr><td>eastus</td><td>Standard_D_v5</td><td>92%</td><td>Request increase</td></tr>
<tr><td>westus2</td><td>Standard_E_v5</td><td>78%</td><td>Monitor</td></tr>
</table>
<h3>Recommendations</h3>
<ol>
<li>Request quota increase for Standard_D_v5 in eastus (critical)</li>
<li>Review CRG utilization in westeurope—45% utilized, consider right-sizing</li>
<li>Savings plan expires April 15—evaluate renewal versus pay-as-you-go</li>
</ol>
<p>
<a href="https://portal.azure.com/#view/Microsoft_Azure_Capacity/QuotaMenuBlade">Open quota blade</a><br>
<a href="https://learn.microsoft.com/en-us/azure/quotas/how-to-guide-monitoring-alerting">Review quota monitoring guidance</a>
</p>
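The key-findings table is the part of the report most likely to be generated from data. The sketch below builds it from a list of rows and sorts by utilization so the most urgent entry appears first; the `Row` tuple shape and the function name are assumptions made for illustration.

```python
from html import escape
from typing import Iterable, Tuple

# (region, VM family, utilization percent, action)
Row = Tuple[str, str, int, str]


def render_findings_table(rows: Iterable[Row]) -> str:
    """Render the key-findings table, highest utilization first."""
    lines = [
        '<table border="1" cellpadding="8">',
        "<tr><th>Region</th><th>VM family</th>"
        "<th>Utilization</th><th>Action</th></tr>",
    ]
    for region, family, pct, action in sorted(rows, key=lambda r: r[2],
                                              reverse=True):
        lines.append(
            f"<tr><td>{escape(region)}</td><td>{escape(family)}</td>"
            f"<td>{pct}%</td><td>{escape(action)}</td></tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


table = render_findings_table([
    ("westus2", "Standard_E_v5", 78, "Monitor"),
    ("eastus", "Standard_D_v5", 92, "Request increase"),
])
print(table)
```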
Configure recurring notifications with Azure SRE Agent scheduled tasks, and pair the schedule with the connector flow from Send notifications from Azure SRE Agent.