.github/plugins/azure-skills/skills/azure-cost-optimization/SKILL.md
Identify Azure cost savings from usage and spending data. USE FOR: optimize Azure costs, reduce Azure spending/expenses, analyze Azure costs, find cost savings, generate cost optimization report, identify orphaned resources to delete, rightsize VMs, reduce waste, optimize Redis costs, optimize storage costs, AKS cost analysis add-on, namespace cost, cost spike, anomaly, budget alert, AKS cost visibility. DO NOT USE FOR: deploying resources (use azure-deploy), general Azure diagnostics (use azure-diagnostics), security issues (use azure-security)
npx skillsauth add microsoft/skills azure-cost-optimizationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Analyze Azure subscriptions to identify cost savings through orphaned resource cleanup, rightsizing, and optimization recommendations based on actual usage data.
Use this skill when the user asks to:
Follow these steps in conversation with the user:
Before starting, verify these tools and permissions are available:
Required Tools:
az login)costmanagement, resource-graphRequired Permissions:
Verification commands:
az --version
az account show
az extension show --name costmanagement
azqr version
Get Azure cost optimization best practices to inform recommendations:
// Use Azure MCP best practices tool
mcp_azure_mcp_get_azure_bestpractices({
intent: "Get cost optimization best practices",
command: "get_bestpractices",
parameters: { resource: "cost-optimization", action: "all" }
})
If the user specifically requests Redis cost optimization, use the specialized Redis skill:
📋 Reference: Azure Redis Cost Optimization
When to use Redis-specific analysis:
Key capabilities:
redis_list commandReport templates available:
Note: For general subscription-wide cost optimization (including Redis), continue with Step 2. For Redis-only focused analysis, follow the instructions in the Redis-specific reference document.
If performing Redis cost optimization, ask the user to select their analysis scope:
Prompt the user with these options:
Wait for user response, then proceed to Step 2.
If the user specifically requests AKS cost optimization, use the specialized AKS reference files:
When to use AKS-specific analysis:
Tool Selection:
mcp_azure_mcp_aks for AKS operations (list clusters, get node pools, inspect configuration) — it provides richer metadata and is consistent with AKS skill conventions in this repoaz aks and kubectl only when the specific operation cannot be performed via the MCP surfaceReference files (load only what is needed for the request):
Note: For general subscription-wide cost optimization (including AKS resource groups), continue with Step 2. For AKS-focused analysis, follow the instructions in the relevant reference file above.
If performing AKS cost optimization, ask the user to select their analysis scope:
Prompt the user with these options:
Wait for user response before proceeding to Step 2.
Run azqr to find orphaned resources (immediate cost savings):
📋 Reference: Azure Quick Review - Detailed instructions for running azqr scans
// Use Azure MCP extension_azqr tool
extension_azqr({
subscription: "<SUBSCRIPTION_ID>",
"resource-group": "<RESOURCE_GROUP>" // optional
})
What to look for in azqr results:
Note: The Azure Quick Review reference document includes instructions for creating filter configurations, saving output to the
output/folder, and interpreting results for cost optimization.
For efficient cross-subscription resource discovery, use Azure Resource Graph. See Azure Resource Graph Queries for orphaned resource detection and cost optimization patterns.
List all resources in the subscription using Azure MCP tools or CLI:
# Get subscription info
az account show
# List all resources
az resource list --subscription "<SUBSCRIPTION_ID>" --resource-group "<RESOURCE_GROUP>"
# Use MCP tools for specific services (preferred):
# - Storage accounts, Cosmos DB, Key Vaults: use Azure MCP tools
# - Redis caches: use mcp_azure_mcp_redis tool (see ./references/azure-redis.md)
# - Web apps, VMs, SQL: use az CLI commands
Get actual cost data from Azure Cost Management API (last 30 days):
Create cost query file:
Create temp/cost-query.json with:
{
"type": "ActualCost",
"timeframe": "Custom",
"timePeriod": {
"from": "<START_DATE>",
"to": "<END_DATE>"
},
"dataset": {
"granularity": "None",
"aggregation": {
"totalCost": {
"name": "Cost",
"function": "Sum"
}
},
"grouping": [
{
"type": "Dimension",
"name": "ResourceId"
}
]
}
}
Action Required: Calculate
<START_DATE>(30 days ago) and<END_DATE>(today) in ISO 8601 format (e.g.,2025-11-03T00:00:00Z).
Execute cost query:
# Create temp folder
New-Item -ItemType Directory -Path "temp" -Force
# Query using REST API (more reliable than az costmanagement query)
az rest --method post `
--url "https://management.azure.com/subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/Microsoft.CostManagement/query?api-version=2023-11-01" `
--body '@temp/cost-query.json'
Important: Save the query results to output/cost-query-result<timestamp>.json for audit trail.
Fetch current pricing from official Azure pricing pages using fetch_webpage:
// Validate pricing for key services
fetch_webpage({
urls: ["https://azure.microsoft.com/en-us/pricing/details/container-apps/"],
query: "pricing tiers and costs"
})
Key services to validate:
Important: Check for free tier allowances - many Azure services have generous free limits that may explain $0 costs.
Query Azure Monitor for utilization data (last 14 days) to support rightsizing recommendations:
# Calculate dates for last 14 days
$startTime = (Get-Date).AddDays(-14).ToString("yyyy-MM-ddTHH:mm:ssZ")
$endTime = Get-Date -Format "yyyy-MM-ddTHH:mm:ssZ"
# VM CPU utilization
az monitor metrics list `
--resource "<RESOURCE_ID>" `
--metric "Percentage CPU" `
--interval PT1H `
--aggregation Average `
--start-time $startTime `
--end-time $endTime
# App Service Plan utilization
az monitor metrics list `
--resource "<RESOURCE_ID>" `
--metric "CpuTime,Requests" `
--interval PT1H `
--aggregation Total `
--start-time $startTime `
--end-time $endTime
# Storage capacity
az monitor metrics list `
--resource "<RESOURCE_ID>" `
--metric "UsedCapacity,BlobCount" `
--interval PT1H `
--aggregation Average `
--start-time $startTime `
--end-time $endTime
Create a comprehensive cost optimization report in the output/ folder:
Use the create_file tool with path output/costoptimizereport<YYYYMMDD_HHMMSS>.md:
Report Structure:
# Azure Cost Optimization Report
**Generated**: <timestamp>
## Executive Summary
- Total Monthly Cost: $X (💰 ACTUAL DATA)
- Top Cost Drivers: [List top 3 resources with Azure Portal links]
## Cost Breakdown
[Table with top 10 resources by cost, including Azure Portal links]
## Free Tier Analysis
[Resources operating within free tiers showing $0 cost]
## Orphaned Resources (Immediate Savings)
[From azqr - resources that can be deleted immediately]
- Resource name with Portal link - $X/month savings
## Optimization Recommendations
### Priority 1: High Impact, Low Risk
[Example: Delete orphaned resources]
- 💰 ACTUAL cost: $X/month
- 📊 ESTIMATED savings: $Y/month
- Commands to execute (with warnings)
### Priority 2: Medium Impact, Medium Risk
[Example: Rightsize VM from D4s_v5 to D2s_v5]
- 💰 ACTUAL baseline: D4s_v5, $X/month
- 📈 ACTUAL metrics: CPU 8%, Memory 30%
- 💵 VALIDATED pricing: D4s_v5 $Y/hr, D2s_v5 $Z/hr
- 📊 ESTIMATED savings: $S/month
- Commands to execute
### Priority 3: Long-term Optimization
[Example: Reserved Instances, Storage tiering]
## Total Estimated Savings
- Monthly: $X
- Annual: $Y
## Implementation Commands
[Safe commands with approval warnings]
## Validation Appendix
### Data Sources and Files
- **Cost Query Results**: `output/cost-query-result<timestamp>.json`
- Raw cost data from Azure Cost Management API
- Audit trail proving actual costs at report generation time
- Keep for at least 12 months for historical comparison
- Contains every resource's exact cost over the analysis period
- **Pricing Sources**: [Links to Azure pricing pages]
- **Free Tier Allowances**: [Applicable allowances]
> **Note**: The `temp/cost-query.json` file (if present) is a temporary query template and can be safely deleted. All permanent audit data is in the `output/` folder.
Portal Link Format:
https://portal.azure.com/#@<TENANT_ID>/resource/subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<RESOURCE_GROUP>/providers/<RESOURCE_PROVIDER>/<RESOURCE_TYPE>/<RESOURCE_NAME>/overview
Save all cost query results for validation:
Use the create_file tool with path output/cost-query-result<YYYYMMDD_HHMMSS>.json:
{
"timestamp": "<ISO_8601>",
"subscription": "<SUBSCRIPTION_ID>",
"resourceGroup": "<RESOURCE_GROUP>",
"queries": [
{
"queryType": "ActualCost",
"timeframe": "MonthToDate",
"query": { },
"response": { }
}
]
}
Remove temporary query files and folder after the report is generated:
# Delete entire temp folder (no longer needed)
Remove-Item -Path "temp" -Recurse -Force -ErrorAction SilentlyContinue
Note: The
temp/cost-query.jsonfile is only needed during API execution. The actual query and results are preserved inoutput/cost-query-result*.jsonfor audit purposes.
The skill generates:
Cost Optimization Report (output/costoptimizereport<timestamp>.md)
Cost Query Results (output/cost-query-result<timestamp>.json)
az costmanagement query)az rest with JSON body, not az costmanagement querytools
KQL language expertise for writing correct, efficient Kusto Query Language queries. Covers syntax gotchas, join patterns, dynamic types, datetime pitfalls, regex patterns, serialization, memory management, result-size discipline, and advanced functions (geo, vector, graph). USE THIS SKILL whenever writing, debugging, or reviewing KQL queries — even simple ones — because the gotchas section prevents the most common errors that waste tool calls and cause expensive retry cascades. Trigger on: KQL, Kusto, ADX, Azure Data Explorer, Fabric Real-Time Intelligence, EventHouse, Log Analytics, log analysis, data exploration, time series, anomaly detection, summarize, where clause, join, extend, project, let statement, parse operator, extract function, any mention of pipe-forward query syntax.
development
Deploy, evaluate, and manage Foundry agents end-to-end: Docker build, ACR push, hosted/prompt agent create, container start, batch eval, prompt optimization, prompt optimizer workflows, agent.yaml, dataset curation from traces. USE FOR: deploy agent to Foundry, hosted agent, create agent, invoke agent, evaluate agent, run batch eval, optimize prompt, improve prompt, prompt optimization, prompt optimizer, improve agent instructions, optimize agent instructions, optimize system prompt, deploy model, Foundry project, RBAC, role assignment, permissions, quota, capacity, region, troubleshoot agent, deployment failure, create dataset from traces, dataset versioning, eval trending, create AI Services, Cognitive Services, create Foundry resource, provision resource, knowledge index, agent monitoring, customize deployment, onboard, availability. DO NOT USE FOR: Azure Functions, App Service, general Azure deploy (use azure-deploy), general Azure prep (use azure-prepare).
testing
Pre-deployment validation for Azure readiness. Run deep checks on configuration, infrastructure (Bicep or Terraform), RBAC role assignments, managed identity permissions, and prerequisites before deploying. WHEN: validate my app, check deployment readiness, run preflight checks, verify configuration, check if ready to deploy, validate azure.yaml, validate Bicep, test before deploying, troubleshoot deployment errors, validate Azure Functions, validate function app, validate serverless deployment, verify RBAC roles, check role assignments, review managed identity permissions, what-if analysis, validate Container Apps deployment.
testing
Check/manage Azure quotas and usage across providers. For deployment planning, capacity validation, region selection. WHEN: "check quotas", "service limits", "current usage", "request quota increase", "quota exceeded", "validate capacity", "regional availability", "provisioning limits", "vCPU limit", "how many vCPUs available in my subscription".