skills/azure-resource-health-diagnose/SKILL.md
Analyze Azure resource health, diagnose issues from logs and telemetry, and create a remediation plan for identified problems.
npx skillsauth add vuluu2k/skills azure-resource-health-diagnoseInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
This workflow analyzes a specific Azure resource to assess its health status, diagnose potential issues using logs and telemetry data, and develop a comprehensive remediation plan for any problems discovered.
azmcp-*) over direct Azure CLI when availableAction: Retrieve diagnostic and troubleshooting best practices Tools: Azure MCP best practices tool Process:
Action: Locate and identify the target Azure resource Tools: Azure MCP tools + Azure CLI fallback Process:
Resource Lookup:
azmcp-subscription-listaz resource list --name <resource-name> to find matching resourcesResource Type Detection:
Action: Evaluate current resource health and availability Tools: Azure MCP monitoring tools + Azure CLI Process:
Basic Health Check:
Service-Specific Health Indicators:
Action: Analyze logs and telemetry to identify issues and patterns Tools: Azure MCP monitoring tools for Log Analytics queries Process:
Find Monitoring Sources:
azmcp-monitor-workspace-list to identify Log Analytics workspacesazmcp-monitor-table-listExecute Diagnostic Queries:
Use azmcp-monitor-log-query with targeted KQL queries based on resource type:
General Error Analysis:
// Recent errors and exceptions
union isfuzzy=true
AzureDiagnostics,
AppServiceHTTPLogs,
AppServiceAppLogs,
AzureActivity
| where TimeGenerated > ago(24h)
| where Level == "Error" or ResultType != "Success"
| summarize ErrorCount=count() by Resource, ResultType, bin(TimeGenerated, 1h)
| order by TimeGenerated desc
Performance Analysis:
// Performance degradation patterns
Perf
| where TimeGenerated > ago(7d)
| where ObjectName == "Processor" and CounterName == "% Processor Time"
| summarize avg(CounterValue) by Computer, bin(TimeGenerated, 1h)
| where avg_CounterValue > 80
Application-Specific Queries:
// Application Insights - Failed requests
requests
| where timestamp > ago(24h)
| where success == false
| summarize FailureCount=count() by resultCode, bin(timestamp, 1h)
| order by timestamp desc
// Database - Connection failures
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SQL"
| where Category == "SQLSecurityAuditEvents"
| where action_name_s == "CONNECTION_FAILED"
| summarize ConnectionFailures=count() by bin(TimeGenerated, 1h)
Pattern Recognition:
Action: Categorize identified issues and determine root causes Process:
Issue Classification:
Root Cause Analysis:
Impact Assessment:
Action: Create a comprehensive plan to address identified issues Process:
Immediate Actions (Critical issues):
Short-term Fixes (High/Medium issues):
Long-term Improvements (All issues):
Implementation Steps:
Action: Present findings and get approval for remediation actions Process:
Display Health Assessment Summary:
🏥 Azure Resource Health Assessment
📊 Resource Overview:
• Resource: [Name] ([Type])
• Status: [Healthy/Warning/Critical]
• Location: [Region]
• Last Analyzed: [Timestamp]
🚨 Issues Identified:
• Critical: X issues requiring immediate attention
• High: Y issues affecting performance/reliability
• Medium: Z issues for optimization
• Low: N informational items
🔍 Top Issues:
1. [Issue Type]: [Description] - Impact: [High/Medium/Low]
2. [Issue Type]: [Description] - Impact: [High/Medium/Low]
3. [Issue Type]: [Description] - Impact: [High/Medium/Low]
🛠️ Remediation Plan:
• Immediate Actions: X items
• Short-term Fixes: Y items
• Long-term Improvements: Z items
• Estimated Resolution Time: [Timeline]
❓ Proceed with detailed remediation plan? (y/n)
Generate Detailed Report:
# Azure Resource Health Report: [Resource Name]
**Generated**: [Timestamp]
**Resource**: [Full Resource ID]
**Overall Health**: [Status with color indicator]
## 🔍 Executive Summary
[Brief overview of health status and key findings]
## 📊 Health Metrics
- **Availability**: X% over last 24h
- **Performance**: [Average response time/throughput]
- **Error Rate**: X% over last 24h
- **Resource Utilization**: [CPU/Memory/Storage percentages]
## 🚨 Issues Identified
### Critical Issues
- **[Issue 1]**: [Description]
- **Root Cause**: [Analysis]
- **Impact**: [Business impact]
- **Immediate Action**: [Required steps]
### High Priority Issues
- **[Issue 2]**: [Description]
- **Root Cause**: [Analysis]
- **Impact**: [Performance/reliability impact]
- **Recommended Fix**: [Solution steps]
## 🛠️ Remediation Plan
### Phase 1: Immediate Actions (0-2 hours)
```bash
# Critical fixes to restore service
[Azure CLI commands with explanations]
# Performance and reliability improvements
[Azure CLI commands with explanations]
# Architectural and preventive measures
[Azure CLI commands and configuration changes]
development
Vue 3 Composition API — <script setup>, reactivity (shallowRef/ref), props without destructure, computed, watch, provide/inject, and composables. Use when the project uses modern Vue 3 Composition API style.
development
Vue 3 Options API — data, props, computed, methods, watch, emits, provide/inject, lifecycle hooks, and mixins. Use when the project uses Options API style (Vue 2 legacy or explicit Vue 3 Options API preference).
tools
Best practices for mixing Ant Design Vue components with Tailwind CSS utility classes. Use this skill to keep styling consistent without custom CSS files.
development
Pinia state management for Vue 3 using Composition API (Setup Stores) — TypeScript-first, storeToRefs for reactivity, focused stores, and API calls in composables. Use when the project uses Vue 3 Composition API / <script setup>.