packages/skills/skills/monitoring-operations/SKILL.md
Use when setting up metrics, alarms, or troubleshooting missing data in OCI Monitoring. Covers metric namespace confusion, alarm threshold gotchas, log collection setup, and common monitoring gaps.
npx skillsauth add mediar-ai/skillhubz monitoring-operations
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Don't reinvent the wheel. Use oracle-terraform-modules/landing-zone for the observability stack.
Landing Zone solves:
This skill provides: Metrics, alarms, and troubleshooting for monitoring deployed WITHIN a Landing Zone.
You don't know OCI CLI commands or OCI API structure.
Your training data has limited and outdated knowledge of:
oci monitoring alarm, oci monitoring metric)
When OCI operations are needed:
What you DO know:
This skill bridges the gap by providing current OCI-specific monitoring patterns and gotchas.
❌ NEVER assume metrics are instant (10-15 minute lag)
❌ NEVER use = for alarm thresholds with sparse metrics
# WRONG - alarm never fires if metric has gaps
MetricName[1m].mean() = 0
# RIGHT - handle missing data
MetricName[1m]{dataMissing=zero}.mean() > 0
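To see why an equality test against a sparse metric stays silent, the following is a simplified model in Python. It is not OCI's evaluation engine: the `breaches` and `fill_missing` helpers and the sample series are hypothetical, and the only assumption is that an interval with no datapoint cannot satisfy any comparison.

```python
from typing import Callable, List, Optional, Sequence


def breaches(datapoints: Sequence[Optional[float]],
             predicate: Callable[[float], bool]) -> List[bool]:
    """Evaluate the predicate per interval; a missing datapoint (None) never breaches."""
    return [value is not None and predicate(value) for value in datapoints]


def fill_missing(datapoints: Sequence[Optional[float]],
                 fill: float = 0.0) -> List[float]:
    """Replace missing datapoints with a fill value before evaluation."""
    return [fill if value is None else value for value in datapoints]


# Hypothetical request-count series: the service stops emitting after minute 2.
series = [3.0, 1.0, None, None, None]

equals_zero = breaches(series, lambda v: v == 0)
filled_equals_zero = breaches(fill_missing(series), lambda v: v == 0)

print(equals_zero)         # [False, False, False, False, False]
print(filled_equals_zero)  # [False, False, True, True, True]
```

In this model the unfilled series never breaches, even though the service has effectively gone quiet, which is the gap the missing-data handling above is meant to close.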
❌ NEVER forget metric dimensions (causes "no data")
# WRONG - missing required dimension
CpuUtilization[1m].mean()
# RIGHT - include resourceId dimension
CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()
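One way to avoid forgetting the dimension filter is to build query strings through a helper that refuses to produce an unfiltered query. The sketch below is a hypothetical helper, not part of any OCI SDK; the function name `build_mql` and the example OCID are assumptions, and the output format simply mirrors the query shape shown above.

```python
from typing import Dict


def build_mql(metric: str, interval: str, statistic: str,
              dimensions: Dict[str, str]) -> str:
    """Build a metric query string that always carries a dimension filter."""
    if not dimensions:
        raise ValueError("at least one dimension (for example resourceId) is required")
    # Sort keys so the generated query is stable across runs.
    pairs = ", ".join(f'{key}="{value}"' for key, value in sorted(dimensions.items()))
    return f"{metric}[{interval}]{{{pairs}}}.{statistic}()"


query = build_mql(
    "MemoryUtilization", "1m", "mean",
    {"resourceId": "ocid1.instance.oc1..example"},  # placeholder OCID
)
print(query)  # MemoryUtilization[1m]{resourceId="ocid1.instance.oc1..example"}.mean()
```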
❌ NEVER set alarm thresholds without trigger delay (alert fatigue)
# BAD - fires on every CPU spike
CpuUtilization[1m].mean() > 80
# BETTER - sustained high CPU
CpuUtilization[5m].mean() > 80
Trigger delay: 5 minutes (fires after 5 consecutive breaches)
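The effect of a trigger delay can be sketched as a consecutive-breach counter. This is a simplified model under the assumption that the alarm is evaluated once per minute and fires only after the stated number of consecutive breaching evaluations; the function name `fires_at` and the sample sequences are hypothetical.

```python
from typing import List, Optional


def fires_at(breaching: List[bool], trigger_delay: int) -> Optional[int]:
    """Return the index of the evaluation at which the alarm fires, or None."""
    streak = 0
    for minute, is_breaching in enumerate(breaching):
        # Any non-breaching evaluation resets the streak.
        streak = streak + 1 if is_breaching else 0
        if streak >= trigger_delay:
            return minute
    return None


# Short spikes: never five breaches in a row, so no alert.
spiky = [False, True, False, True, True, False]
# Sustained load: breaching from minute 1 onward.
sustained = [False, True, True, True, True, True, True]

print(fires_at(spiky, 5))      # None
print(fires_at(sustained, 5))  # 5
```

With a delay of 1 the spiky series would fire at minute 1, which is the alert-fatigue case described above.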
❌ NEVER create alarms without notification channels
# WRONG - alarm fires but nobody knows
oci monitoring alarm create ... --destinations '[]'
# RIGHT - always link to notification topic
oci monitoring alarm create ... --destinations '["<notification-topic-ocid>"]'
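A small wrapper can enforce the "no alarm without a destination" rule before the CLI is ever invoked. The sketch below only assembles an argument list; it does not call OCI. The function name and all OCIDs are placeholders, and the flag names are assumed to match the current OCI CLI, so check them against `oci monitoring alarm create --help` before relying on them.

```python
import json
from typing import List


def alarm_create_args(display_name: str, compartment_id: str, namespace: str,
                      query: str, severity: str,
                      destinations: List[str]) -> List[str]:
    """Assemble CLI arguments, rejecting alarms that would notify nobody."""
    if not destinations:
        raise ValueError("an alarm needs at least one notification topic OCID")
    return [
        "oci", "monitoring", "alarm", "create",
        "--display-name", display_name,
        "--compartment-id", compartment_id,
        # Assumption: metrics live in the same compartment as the alarm.
        "--metric-compartment-id", compartment_id,
        "--namespace", namespace,
        "--query-text", query,
        "--severity", severity,
        "--is-enabled", "true",
        "--destinations", json.dumps(destinations),
    ]


args = alarm_create_args(
    "high-memory", "ocid1.compartment.oc1..example", "oci_computeagent",
    "MemoryUtilization[5m].mean() > 80", "CRITICAL",
    ["ocid1.onstopic.oc1..example"],
)
print(args[-1])  # ["ocid1.onstopic.oc1..example"]
```

The list form can be passed to `subprocess.run(args, check=True)` without shell quoting concerns.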
Cost impact: Undetected outages cost $5,000-50,000/hour in production
❌ NEVER ignore Cloud Guard findings (security audit failure)
OCI Metrics Use Service-Specific Namespaces:
| Service | Namespace | Example Metric |
|---------|-----------|----------------|
| Compute | oci_computeagent | CpuUtilization, MemoryUtilization |
| Autonomous DB | oci_autonomous_database | CpuUtilization, StorageUtilization |
| Load Balancer | oci_lbaas | HttpRequests, UnHealthyBackendServers |
| Object Storage | oci_objectstorage | ObjectCount, BytesUploaded |
Common Mistake: Using wrong namespace (oci_compute vs oci_computeagent)
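A lookup that fails loudly on an unknown namespace can catch this mistake before a query silently returns no data. The sketch below copies the namespaces from the table above; the helper name `check_namespace` is hypothetical, and the list is only as complete as that table.

```python
import difflib

# Service to namespace mapping, taken from the table above.
NAMESPACES = {
    "Compute": "oci_computeagent",
    "Autonomous DB": "oci_autonomous_database",
    "Load Balancer": "oci_lbaas",
    "Object Storage": "oci_objectstorage",
}


def check_namespace(namespace: str) -> str:
    """Return the namespace if known, otherwise raise with the closest match."""
    known = list(NAMESPACES.values())
    if namespace in known:
        return namespace
    suggestion = difflib.get_close_matches(namespace, known, n=1)
    hint = f" Did you mean '{suggestion[0]}'?" if suggestion else ""
    raise ValueError(f"Unknown metric namespace '{namespace}'.{hint}")


print(check_namespace("oci_computeagent"))  # oci_computeagent

try:
    check_namespace("oci_compute")
except ValueError as error:
    print(error)
```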
| Setting | Behavior | Use When |
|---------|----------|----------|
| treatMissingDataAsBreaching | Alarm fires if no data | Critical services (outage = breach) |
| treatMissingDataAsNotBreaching | Alarm silent if no data | Optional monitoring |
| {dataMissing=zero} | Treat missing as 0 | Counters (requests/sec) |
Problem: Logs not showing in Log Analytics
Logs not appearing?
├─ Is log enabled on resource?
│  ├─ Compute: oci-compute-agent must be running
│  └─ Function: Logging enabled in function config
│
├─ Is Service Connector configured?
│  ├─ Source: Log Group → Target: Log Analytics
│  └─ Check: Service Connector status = ACTIVE
│
├─ IAM policy for Service Connector?
│  ├─ "Allow any-user to use log-content in tenancy"
│  └─ "Allow service loganalytics to READ logcontent in tenancy"
│
└─ 10-15 minute ingestion lag?
   └─ Wait before debugging
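The decision tree above can be encoded as an ordered checklist so that the checks are always done in the same sequence. This is a minimal sketch: the `LogPipelineState` fields are hypothetical inputs that would be filled in by hand or from whatever inspection tooling is available, and the 15-minute threshold is the upper end of the ingestion lag quoted above.

```python
from dataclasses import dataclass


@dataclass
class LogPipelineState:
    logging_enabled: bool        # log enabled on the resource
    connector_active: bool       # Service Connector exists and is ACTIVE
    iam_policy_present: bool     # policies for the Service Connector are in place
    minutes_since_change: int    # time since the last configuration change


def next_check(state: LogPipelineState) -> str:
    """Return the first unresolved step, in the same order as the tree."""
    if not state.logging_enabled:
        return "Enable logging on the resource"
    if not state.connector_active:
        return "Configure the Service Connector (Log Group to Log Analytics) and confirm it is ACTIVE"
    if not state.iam_policy_present:
        return "Add the IAM policies for the Service Connector"
    if state.minutes_since_change < 15:
        return "Wait for the ingestion lag to pass before debugging further"
    return "All listed checks pass; investigate beyond this checklist"


state = LogPipelineState(True, True, False, 30)
print(next_check(state))  # Add the IAM policies for the Service Connector
```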
Expensive (slow):
# Queries ALL instances
CpuUtilization[1m].mean()
Optimized (filter by dimension):
# Query specific instance
CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()
Cost: Queries free, but rate limited (1000 req/min)
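Because queries are rate limited, scripts that poll many metrics may want a client-side throttle. The sketch below is a generic sliding-window limiter, not an OCI SDK feature; the class name is hypothetical, and the default of 1000 calls per 60 seconds is taken from the limit quoted above.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Allow at most max_calls within any sliding window of window_seconds."""

    def __init__(self, max_calls: int = 1000, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock
        self._calls: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) < self._max_calls:
            self._calls.append(now)
            return True
        return False


limiter = MinuteRateLimiter()
if limiter.try_acquire():
    pass  # issue the metric query here
```

The clock is injectable so the limiter can be exercised without real waiting; a caller that gets `False` would typically sleep briefly and retry.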
WHEN TO LOAD oci-monitoring-reference.md:
Do NOT load for: