skills/codex/cloudwatch-observability/SKILL.md
<!-- AUTO-GENERATED by export-skills.py — DO NOT EDIT -->
---
name: cloudwatch-observability
description: Amazon CloudWatch patterns for AI agent observability. Use when monitoring Bedrock agent invocations, tracking token usage, setting up alarms for agent failures, or analyzing agent performance via CloudWatch Logs Insights.
---

# Amazon CloudWatch for AI Agent Observability
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf): `npx skillsauth add frank-luongt/faos-skills-marketplace skills/codex/cloudwatch-observability`
Monitor AI agent performance, costs, and reliability using CloudWatch metrics, logs, and alarms.
Key CloudWatch metrics emitted by Amazon Bedrock:
| Metric | Namespace | Description |
|---|---|---|
| Invocations | AWS/Bedrock | Number of model invocations |
| InvocationLatency | AWS/Bedrock | End-to-end invocation time (ms) |
| InvocationClientErrors | AWS/Bedrock | Client-side (4xx) errors such as validation failures; throttles are reported separately as InvocationThrottles |
| InvocationServerErrors | AWS/Bedrock | 5xx errors |
| InputTokenCount | AWS/Bedrock | Input tokens consumed |
| OutputTokenCount | AWS/Bedrock | Output tokens generated |
| InvocationThrottles | AWS/Bedrock | Throttled requests |
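
As a minimal sketch of reading the metrics in the table above, the helper below builds a `get_metric_data` request for one model and flattens the response. The function names `build_bedrock_queries` and `fetch_bedrock_metrics`, the one-hour period, and the choice of statistics are illustrative assumptions, not part of the original skill. The CloudWatch client is passed in, so the helper can be exercised with a stub.

```python
from datetime import datetime, timedelta, timezone

# Metric names come from the table above. The statistic per metric is an
# assumption: Sum for counters, Average for latency.
BEDROCK_METRICS = {
    "Invocations": "Sum",
    "InvocationLatency": "Average",
    "InputTokenCount": "Sum",
    "OutputTokenCount": "Sum",
}


def build_bedrock_queries(model_id: str, period: int = 3600) -> list:
    """Build MetricDataQueries for one model (hypothetical helper)."""
    queries = []
    for i, (name, stat) in enumerate(BEDROCK_METRICS.items()):
        queries.append({
            "Id": f"m{i}",  # query Ids must start with a lowercase letter
            "Label": name,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Bedrock",
                    "MetricName": name,
                    "Dimensions": [{"Name": "ModelId", "Value": model_id}],
                },
                "Period": period,
                "Stat": stat,
            },
        })
    return queries


def fetch_bedrock_metrics(cloudwatch, model_id: str, hours: int = 24) -> dict:
    """Return {metric label: [values]} for the last `hours` hours.

    Pagination (NextToken) is ignored to keep the sketch short.
    """
    end = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_data(
        MetricDataQueries=build_bedrock_queries(model_id),
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
    )
    return {r["Label"]: r["Values"] for r in response["MetricDataResults"]}
```

With real credentials, this would be called as `fetch_bedrock_metrics(boto3.client("cloudwatch"), model_id)`, where `model_id` is whichever Bedrock model ID you invoke.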
```
# Find slowest agent invocations in the last 24h
fields @timestamp, @message
| filter @message like /agentId/
| parse @message '"invocationLatencyMs":*,' as latency
| sort latency desc
| limit 20
```

```
# Token usage by model over time
fields @timestamp
| filter @message like /inputTokenCount/
| parse @message '"modelId":"*"' as model
| parse @message '"inputTokenCount":*,' as input_tokens
| parse @message '"outputTokenCount":*,' as output_tokens
| stats sum(input_tokens) as total_input, sum(output_tokens) as total_output by model, bin(1h)
```

```
# Agent errors with reasoning trace
fields @timestamp, @message
| filter @message like /ERROR/ or @message like /ThrottlingException/
| sort @timestamp desc
| limit 50
```
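
Queries like the ones above can also be run programmatically. The following is a sketch, assuming a `boto3` CloudWatch Logs client is passed in as `logs`; the function name `run_insights_query` and the polling defaults are hypothetical. It starts the query with `start_query`, polls `get_query_results` until the query finishes, and converts each result row into a plain dict.

```python
import time


def run_insights_query(logs, log_group: str, query: str,
                       start_time: int, end_time: int,
                       poll_seconds: float = 1.0, max_polls: int = 60) -> list:
    """Run a Logs Insights query and return its rows as dicts (hypothetical helper)."""
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start_time,  # epoch seconds
        endTime=end_time,      # epoch seconds
        queryString=query,
    )["queryId"]

    for _ in range(max_polls):
        response = logs.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} pairs.
            return [{cell["field"]: cell["value"] for cell in row}
                    for row in response["results"]]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"query {query_id} did not complete after {max_polls} polls")
```

A call might look like `run_insights_query(boto3.client("logs"), log_group, query, now - 86400, now)`, where `log_group` is the log group your invocation logs are delivered to. Values come back as strings, so numeric fields such as `latency` need converting before arithmetic.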
```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when daily input-token usage for one model exceeds the budget
# (InputTokenCount only; output tokens are a separate metric)
cloudwatch.put_metric_alarm(
    AlarmName="bedrock-daily-token-budget",
    Namespace="AWS/Bedrock",
    MetricName="InputTokenCount",
    Statistic="Sum",
    Period=86400,  # 24 hours
    EvaluationPeriods=1,
    Threshold=10_000_000,  # 10M tokens
    ComparisonOperator="GreaterThanThreshold",
    # Placeholder ARN: replace the 12-digit account ID and topic with your own
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ai-ops-alerts"],
    Dimensions=[{"Name": "ModelId", "Value": "anthropic.claude-3-5-sonnet-20241022-v2:0"}],
)

# Alarm when server errors exceed 10 in each of two consecutive 5-minute periods
cloudwatch.put_metric_alarm(
    AlarmName="bedrock-agent-error-rate",
    Namespace="AWS/Bedrock",
    MetricName="InvocationServerErrors",
    Statistic="Sum",
    Period=300,  # 5 minutes
    EvaluationPeriods=2,
    Threshold=10,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ai-ops-alerts"],
)
```
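
The second alarm above fires on an absolute count of server errors. If a percentage is preferred, one option is a metric-math alarm. The sketch below only builds the keyword arguments for `put_metric_alarm`. The helper name `error_rate_alarm_kwargs`, the alarm name, and the 5% default are hypothetical. The denominator assumes that `Invocations` counts only successful requests; if it counts all requests in your account, the expression should be `100 * errors / invocations` instead.

```python
def error_rate_alarm_kwargs(model_id: str, sns_topic_arn: str,
                            threshold_pct: float = 5.0) -> dict:
    """Build put_metric_alarm kwargs for a server-error percentage alarm (sketch)."""
    dimensions = [{"Name": "ModelId", "Value": model_id}]

    def metric(query_id: str, name: str) -> dict:
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Bedrock",
                    "MetricName": name,
                    "Dimensions": dimensions,
                },
                "Period": 300,  # 5 minutes
                "Stat": "Sum",
            },
            "ReturnData": False,  # used only as input to the expression
        }

    return {
        "AlarmName": "bedrock-server-error-rate",  # hypothetical name
        "Metrics": [
            metric("errors", "InvocationServerErrors"),
            metric("invocations", "Invocations"),
            {
                "Id": "error_rate",
                # Assumption: Invocations excludes failed requests (see lead-in).
                "Expression": "100 * errors / (invocations + errors)",
                "Label": "ServerErrorRatePercent",
                "ReturnData": True,  # the alarm evaluates this series
            },
        ],
        "EvaluationPeriods": 2,
        "Threshold": threshold_pct,
        "ComparisonOperator": "GreaterThanThreshold",
        # Periods with no datapoint (for example, no traffic) count as healthy.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
```

The result would be passed as `cloudwatch.put_metric_alarm(**error_rate_alarm_kwargs(model_id, topic_arn))`. When `Metrics` is supplied, the top-level `MetricName`, `Namespace`, `Statistic`, and `Period` arguments are omitted.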
```python
import boto3

cloudwatch = boto3.client("cloudwatch")


def publish_agent_metrics(agent_name: str, metrics: dict):
    """Publish custom agent metrics to CloudWatch."""
    cloudwatch.put_metric_data(
        Namespace="FAOS/AgentOps",
        MetricData=[
            {
                "MetricName": "ToolCallCount",
                "Value": metrics["tool_calls"],
                "Unit": "Count",
                "Dimensions": [{"Name": "AgentName", "Value": agent_name}],
            },
            {
                "MetricName": "ResolutionRate",
                "Value": metrics["resolved_pct"],
                "Unit": "Percent",
                "Dimensions": [{"Name": "AgentName", "Value": agent_name}],
            },
            {
                "MetricName": "SessionDuration",
                "Value": metrics["duration_ms"],
                "Unit": "Milliseconds",
                "Dimensions": [{"Name": "AgentName", "Value": agent_name}],
            },
        ],
    )
```
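
`publish_agent_metrics` expects a dict with the keys `tool_calls`, `resolved_pct`, and `duration_ms`. One way to produce that dict is a small per-session collector, sketched below. The class `AgentSessionStats` and its method names are hypothetical and not part of the original skill. Defining "resolution rate" as resolved tasks over total tasks is also an assumption.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AgentSessionStats:
    """Collects per-session counters for one agent run (hypothetical helper)."""
    started_at: float = field(default_factory=time.monotonic)
    tool_calls: int = 0
    tasks_total: int = 0
    tasks_resolved: int = 0

    def record_tool_call(self) -> None:
        self.tool_calls += 1

    def record_task(self, resolved: bool) -> None:
        self.tasks_total += 1
        if resolved:
            self.tasks_resolved += 1

    def to_metrics(self) -> dict:
        """Return the dict shape that publish_agent_metrics reads."""
        if self.tasks_total:
            resolved_pct = 100.0 * self.tasks_resolved / self.tasks_total
        else:
            resolved_pct = 0.0  # no tasks recorded yet
        return {
            "tool_calls": self.tool_calls,
            "resolved_pct": resolved_pct,
            "duration_ms": (time.monotonic() - self.started_at) * 1000.0,
        }
```

At the end of a session, the collector would be handed over as `publish_agent_metrics(agent_name, stats.to_metrics())`.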
Avoid relying on `@message` full-text search where structured filters would work; parse the fields first and filter on them.