infrastructure/cloud-aws/aws-cost-optimization/SKILL.md
Reduce AWS spend with rightsizing, autoscaling, commitment planning, and storage lifecycle policies. Use when running FinOps reviews, lowering cloud bills, or improving cost-per-request metrics.
Apply practical FinOps controls to reduce AWS spend without sacrificing reliability or performance.
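Prerequisites: the AWS CLI configured (`aws configure`) and IAM permissions for `ce:*`, `budgets:*`, `ec2:Describe*`, `cloudwatch:PutMetricAlarm`, and `s3:PutLifecycleConfiguration`.
# Get cost and usage for the previous full month (February 2026), grouped by service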
aws ce get-cost-and-usage \
--time-period Start=2026-02-01,End=2026-03-01 \
--granularity MONTHLY \
--metrics "BlendedCost" "UnblendedCost" "UsageQuantity" \
--group-by Type=DIMENSION,Key=SERVICE
# Get cost forecast for the next 30 days
aws ce get-cost-forecast \
--time-period Start=2026-03-24,End=2026-04-24 \
--metric UNBLENDED_COST \
--granularity MONTHLY
# Get cost grouped by a specific tag (e.g., team)
aws ce get-cost-and-usage \
--time-period Start=2026-02-01,End=2026-03-01 \
--granularity MONTHLY \
--metrics "UnblendedCost" \
--group-by Type=TAG,Key=team
# Get rightsizing recommendations for EC2
aws ce get-rightsizing-recommendation \
--service "AmazonEC2" \
--configuration '{"RecommendationTarget":"SAME_INSTANCE_FAMILY","BenefitsConsidered":true}'
# Get Savings Plans purchase recommendation
aws ce get-savings-plans-purchase-recommendation \
--savings-plans-type COMPUTE_SP \
--term-in-years ONE_YEAR \
--payment-option NO_UPFRONT \
--lookback-period-in-days SIXTY_DAYS
# Get Savings Plans utilization
aws ce get-savings-plans-utilization \
--time-period Start=2026-02-01,End=2026-03-01 \
--granularity MONTHLY
# Get Reserved Instance utilization
aws ce get-reservation-utilization \
--time-period Start=2026-02-01,End=2026-03-01 \
--granularity MONTHLY
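The Cost Explorer commands above use fixed dates, which go stale as soon as the month rolls over. A minimal sketch of a rolling window follows, assuming GNU `date` (the `-d` flag; BSD/macOS `date` uses `-v` instead). `ce_time_period` is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical helper: print a --time-period value covering the last N days
# (default 30). Assumes GNU date; not part of the AWS CLI.
ce_time_period() {
  local days="${1:-30}"
  local start end
  start="$(date -u -d "${days} days ago" +%F)"
  end="$(date -u +%F)"   # Cost Explorer treats the End date as exclusive
  printf 'Start=%s,End=%s\n' "$start" "$end"
}

# Example usage (not executed here):
#   aws ce get-cost-and-usage \
#     --time-period "$(ce_time_period 30)" \
#     --granularity DAILY \
#     --metrics "UnblendedCost" \
#     --group-by Type=DIMENSION,Key=SERVICE
```

Because the End date is exclusive, using today's date as End covers usage through yesterday.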
# Create a monthly cost budget with email alert at 80% and 100%
aws budgets create-budget \
--account-id 123456789012 \
--budget '{
"BudgetName": "monthly-total",
"BudgetLimit": {"Amount": "5000", "Unit": "USD"},
"TimeUnit": "MONTHLY",
"BudgetType": "COST",
"CostFilters": {},
"CostTypes": {
"IncludeTax": true,
"IncludeSubscription": true,
"UseBlended": false
}
}' \
--notifications-with-subscribers '[
{
"Notification": {
"NotificationType": "ACTUAL",
"ComparisonOperator": "GREATER_THAN",
"Threshold": 80,
"ThresholdType": "PERCENTAGE"
},
"Subscribers": [{"SubscriptionType": "EMAIL", "Address": "[email protected]"}]
},
{
"Notification": {
"NotificationType": "ACTUAL",
"ComparisonOperator": "GREATER_THAN",
"Threshold": 100,
"ThresholdType": "PERCENTAGE"
},
"Subscribers": [{"SubscriptionType": "EMAIL", "Address": "[email protected]"}]
}
]'
# List all budgets
aws budgets describe-budgets --account-id 123456789012
# Enable Cost Anomaly Detection monitor for all services
aws ce create-anomaly-monitor \
--anomaly-monitor '{
"MonitorName": "all-services",
"MonitorType": "DIMENSIONAL",
"MonitorDimension": "SERVICE"
}'
# Create anomaly subscription (alert when impact > $50)
aws ce create-anomaly-subscription \
--anomaly-subscription '{
"SubscriptionName": "cost-alerts",
"MonitorArnList": ["arn:aws:ce::123456789012:anomalymonitor/monitor-id"],
"Subscribers": [{"Type": "EMAIL", "Address": "[email protected]"}],
"Threshold": 50,
"Frequency": "DAILY"
}'
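The subscription above needs the real monitor ARN in place of the `monitor-id` placeholder. One way to avoid copying it by hand is to capture `MonitorArn` from the `create-anomaly-monitor` response. This is a sketch only: the function name, its parameters, and the email argument are assumptions for illustration.

```shell
# Hypothetical wrapper: create the monitor, capture its ARN, then subscribe.
# Arguments: $1 = alert email address, $2 = dollar threshold (default 50).
create_cost_anomaly_alerts() {
  local email="$1" threshold="${2:-50}"
  local monitor_arn

  # create-anomaly-monitor returns the new monitor's ARN as MonitorArn
  monitor_arn="$(aws ce create-anomaly-monitor \
    --anomaly-monitor '{"MonitorName":"all-services","MonitorType":"DIMENSIONAL","MonitorDimension":"SERVICE"}' \
    --query MonitorArn --output text)" || return 1

  # Reuse the captured ARN instead of a hand-edited placeholder
  aws ce create-anomaly-subscription \
    --anomaly-subscription "{
      \"SubscriptionName\": \"cost-alerts\",
      \"MonitorArnList\": [\"${monitor_arn}\"],
      \"Subscribers\": [{\"Type\": \"EMAIL\", \"Address\": \"${email}\"}],
      \"Threshold\": ${threshold},
      \"Frequency\": \"DAILY\"
    }"
}
```

Running it twice would attempt to create a second monitor, so treat it as a one-time setup step.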
# Create alarm for estimated charges exceeding $4000 (billing metrics exist only in us-east-1)
aws cloudwatch put-metric-alarm \
--region us-east-1 \
--alarm-name "billing-alarm-4000" \
--alarm-description 'Alert when estimated charges exceed $4000' \
--metric-name EstimatedCharges \
--namespace AWS/Billing \
--statistic Maximum \
--period 21600 \
--threshold 4000 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 1 \
--dimensions Name=Currency,Value=USD \
--alarm-actions "arn:aws:sns:us-east-1:123456789012:billing-alerts" \
--treat-missing-data notBreaching
# List unattached EBS volumes (wasted storage spend)
aws ec2 describe-volumes \
--filters Name=status,Values=available \
--query "Volumes[].{ID:VolumeId,Size:Size,Created:CreateTime}" \
--output table
# Find old EBS snapshots (older than 90 days; the cutoff date below is 90 days before 2026-03-24)
aws ec2 describe-snapshots \
--owner-ids self \
--query "Snapshots[?StartTime<='2025-12-24'].{ID:SnapshotId,Size:VolumeSize,Date:StartTime}" \
--output table
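The snapshot query above compares against a fixed cutoff date, so it only means "older than 90 days" on the day it was written. A small sketch that builds the same query with a computed cutoff, assuming GNU `date`; `old_snapshot_query` is a hypothetical helper name.

```shell
# Hypothetical helper: build the --query expression for snapshots that
# started on or before N days ago (default 90). Assumes GNU date.
old_snapshot_query() {
  local days="${1:-90}"
  local cutoff
  cutoff="$(date -u -d "${days} days ago" +%F)"
  printf "Snapshots[?StartTime<='%s'].{ID:SnapshotId,Size:VolumeSize,Date:StartTime}\n" "$cutoff"
}

# Example usage (not executed here):
#   aws ec2 describe-snapshots \
#     --owner-ids self \
#     --query "$(old_snapshot_query 90)" \
#     --output table
```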
# List unused Elastic IPs (charged when not associated)
aws ec2 describe-addresses \
--query "Addresses[?AssociationId==null].{IP:PublicIp,AllocId:AllocationId}" \
--output table
# Check target health for one target group (no healthy targets suggests an idle load balancer)
aws elbv2 describe-target-health \
--target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/abc123
# Get daily average CPU utilization for one RDS instance over the past week
aws cloudwatch get-metric-statistics \
--namespace AWS/RDS \
--metric-name CPUUtilization \
--dimensions Name=DBInstanceIdentifier,Value=mydb \
--start-time 2026-03-17T00:00:00Z \
--end-time 2026-03-24T00:00:00Z \
--period 86400 \
--statistics Average
# Apply tiered lifecycle policy to reduce storage costs
aws s3api put-bucket-lifecycle-configuration \
--bucket my-data-bucket \
--lifecycle-configuration '{
"Rules": [
{
"ID": "TierDownOldData",
"Status": "Enabled",
"Filter": {"Prefix": ""},
"Transitions": [
{"Days": 30, "StorageClass": "STANDARD_IA"},
{"Days": 90, "StorageClass": "GLACIER"},
{"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
],
"NoncurrentVersionTransitions": [
{"NoncurrentDays": 30, "StorageClass": "GLACIER"}
],
"NoncurrentVersionExpiration": {"NoncurrentDays": 90}
},
{
"ID": "CleanupIncompleteUploads",
"Status": "Enabled",
"Filter": {"Prefix": ""},
"AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
}
]
}'
resource "aws_budgets_budget" "monthly" {
name = "monthly-total"
budget_type = "COST"
limit_amount = "5000"
limit_unit = "USD"
time_unit = "MONTHLY"
notification {
comparison_operator = "GREATER_THAN"
threshold = 80
threshold_type = "PERCENTAGE"
notification_type = "ACTUAL"
subscriber_email_addresses = ["[email protected]"]
}
notification {
comparison_operator = "GREATER_THAN"
threshold = 100
threshold_type = "PERCENTAGE"
notification_type = "ACTUAL"
subscriber_email_addresses = ["[email protected]"]
}
}
resource "aws_cloudwatch_metric_alarm" "billing" {
alarm_name = "billing-alarm-4000"
comparison_operator = "GreaterThanThreshold"
evaluation_periods = 1
metric_name = "EstimatedCharges"
namespace = "AWS/Billing"
period = 21600
statistic = "Maximum"
threshold = 4000
alarm_description = "Billing exceeds $4000"
alarm_actions = [aws_sns_topic.billing_alerts.arn]
dimensions = {
Currency = "USD"
}
}
# Stop all dev instances tagged Environment=dev (run via EventBridge + Lambda)
aws ec2 describe-instances \
--filters "Name=tag:Environment,Values=dev" "Name=instance-state-name,Values=running" \
--query "Reservations[].Instances[].InstanceId" \
--output text | xargs -n1 aws ec2 stop-instances --instance-ids
# Scale down dev ECS services to zero at night
aws ecs update-service \
--cluster dev-cluster \
--service dev-api \
--desired-count 0
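The stop command above has no counterpart for bringing dev instances back in the morning, and it errors when no instances match. A minimal sketch that handles both directions and the empty case is shown below. `set_dev_instances` is a hypothetical name, and the `Environment=dev` tag is taken from the command above.

```shell
# Hypothetical helper: stop running dev instances or start stopped ones.
# Usage: set_dev_instances stop|start
set_dev_instances() {
  local action="$1" state ids
  case "$action" in
    stop)  state="running" ;;
    start) state="stopped" ;;
    *) echo "usage: set_dev_instances stop|start" >&2; return 2 ;;
  esac

  # Select dev instances currently in the state we want to change
  ids="$(aws ec2 describe-instances \
    --filters "Name=tag:Environment,Values=dev" "Name=instance-state-name,Values=${state}" \
    --query "Reservations[].Instances[].InstanceId" \
    --output text)" || return 1

  if [ -z "$ids" ]; then
    echo "no ${state} dev instances found"
    return 0
  fi

  # Unquoted on purpose: word splitting passes each ID as its own argument.
  # shellcheck disable=SC2086
  aws ec2 "${action}-instances" --instance-ids $ids
}
```

A scheduler could call `set_dev_instances stop` in the evening and `set_dev_instances start` before working hours. The schedule itself is left out here.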
| Problem | Cause | Fix |
|---|---|---|
| Cost Explorer returns empty data | CE not enabled or < 24h old | Enable in Billing console, wait 24h |
| Budget alert not firing | SNS subscription not confirmed | Check email and confirm subscription |
| Rightsizing shows no recommendations | Not enough usage data | Wait 14 days for sufficient metrics |
| Savings Plans utilization low | Over-purchased or workload changed | Review and adjust SP coverage |
| Unattached EBS not showing | Wrong region queried | Loop through all active regions |
| Billing alarm never triggers | Billing metrics only in us-east-1 | Create alarm in us-east-1 region |
| CUR data missing in S3 | Report not configured or bucket policy wrong | Verify CUR setup in Billing console |
| Tag-based cost allocation empty | Tags not activated | Activate cost allocation tags in Billing |
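
For the "Unattached EBS not showing" row, the fix is to loop through all active regions. One possible shape for that loop is sketched here, using `aws ec2 describe-regions` to list the regions enabled for the account. The function name and the tab-separated output format are assumptions.

```shell
# Hypothetical helper: list unattached EBS volumes in every enabled region.
# Prints one line per volume: region, volume ID, size in GiB (tab-separated).
unattached_volumes_all_regions() {
  local region id size
  for region in $(aws ec2 describe-regions \
      --query "Regions[].RegionName" --output text); do
    aws ec2 describe-volumes \
      --region "$region" \
      --filters Name=status,Values=available \
      --query "Volumes[].[VolumeId,Size]" \
      --output text |
      while read -r id size; do
        # Skip blank lines from regions with no unattached volumes
        if [ -n "$id" ]; then
          printf '%s\t%s\t%s\n' "$region" "$id" "$size"
        fi
      done
  done
}
```

The same loop works for the snapshot and Elastic IP checks by swapping the inner command.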