acedergren/monitoring-operations

Name: monitoring-operations
Author: acedergren

skills/monitoring-operations/SKILL.md

npx skillsauth add acedergren/agentic-tools monitoring-operations

OCI Monitoring and Observability - Expert Knowledge

NEVER Do This

NEVER debug "missing metrics" within the first 15 minutes

Metrics are published every 1–5 minutes
Processing delay adds another 5–10 minutes
Total lag from event to visible metric: 10–15 minutes
Premature debugging creates false investigations
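
The waiting rule above can be expressed as a small guard. This is only a sketch: the function name `metric_may_still_arrive` and the constant are hypothetical, and the 15-minute figure is taken from the worst case quoted above rather than from any measured value.

```python
from datetime import datetime, timedelta

# Assumption: 15 minutes is the worst-case lag quoted above.
MAX_METRIC_LAG = timedelta(minutes=15)


def metric_may_still_arrive(event_time: datetime, now: datetime) -> bool:
    """Return True while a missing metric can still be explained by normal lag."""
    return now - event_time < MAX_METRIC_LAG


event = datetime(2026, 3, 26, 12, 0)
print(metric_may_still_arrive(event, event + timedelta(minutes=8)))   # True
print(metric_may_still_arrive(event, event + timedelta(minutes=20)))  # False
```

Only once the guard returns False is it worth starting an investigation.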

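NEVER use == for alarm thresholds with sparse metrics

# WRONG - alarm never fires when metric has data gaps
# (note that the MQL equality operator is ==, not =)
MetricName[1m].mean() == 0

# RIGHT - handle missing data explicitly
MetricName[1m]{dataMissing=zero}.mean() > 0

# ALSO - detect the gap itself with the MQL absence operator
MetricName[1m].absent()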
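
To see why an equality test stays silent on a sparse metric, here is a toy model in Python. It is not how OCI evaluates alarms internally; `equals_zero` and `fill_gaps_with_zero` are hypothetical helpers, and `None` stands in for an interval with no datapoint.

```python
from typing import List, Optional


def equals_zero(samples: List[Optional[float]]) -> List[bool]:
    """Per-interval result of `value == 0`.

    A gap (None) has no value to compare, so it can never match.
    """
    return [s is not None and s == 0 for s in samples]


def fill_gaps_with_zero(samples: List[Optional[float]]) -> List[float]:
    """Treat every missing interval as an explicit 0."""
    return [0.0 if s is None else s for s in samples]


series = [3.0, None, None, 2.0]  # two intervals with no data
print(equals_zero(series))                       # [False, False, False, False]
print(equals_zero(fill_gaps_with_zero(series)))  # [False, True, True, False]
```

The comparison only starts matching once the gaps have been given an explicit value.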

NEVER omit the resourceId dimension in metric queries

# WRONG - no dimension filter, so every resource in the compartment is returned
CpuUtilization[1m].mean()

# RIGHT - filter by instance OCID
CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()

Querying without dimensions returns data for ALL resources, which is usually not what's intended, and is rate-limited at 1000 req/min.
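
One way to make the filter hard to forget is to build query strings through a helper that refuses to emit an unfiltered query. The sketch below is illustrative only: `build_mql` is a hypothetical function, not part of any OCI SDK, and it only assembles text in the shape shown above.

```python
from typing import Dict


def build_mql(metric: str, interval: str, statistic: str,
              dimensions: Dict[str, str]) -> str:
    """Assemble an MQL string, refusing to build one without a dimension filter."""
    if not dimensions:
        raise ValueError("refusing to build an unfiltered metric query")
    filters = ", ".join(f'{key}="{value}"' for key, value in sorted(dimensions.items()))
    return f"{metric}[{interval}]{{{filters}}}.{statistic}()"


query = build_mql("CpuUtilization", "1m", "mean", {"resourceId": "<instance-ocid>"})
print(query)  # CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()
```

Passing an empty dictionary raises `ValueError` instead of silently producing a compartment-wide query.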

NEVER set alarm thresholds without a trigger delay

# BAD - fires on every transient CPU spike (alert fatigue)
CpuUtilization[1m].mean() > 80

# BETTER - fires only on sustained breach
CpuUtilization[5m].mean() > 80
# + set trigger delay to 5 minutes (pending duration PT5M): the breach must persist that long before the alarm fires
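
The effect of a trigger delay can be illustrated with a simplified simulation. It models the delay as a required run of consecutive one-minute breaches, which is an assumption made for illustration and not a description of the OCI evaluation engine; `first_firing_minute` is a hypothetical name.

```python
from typing import List, Optional


def first_firing_minute(values: List[float], threshold: float,
                        pending_minutes: int) -> Optional[int]:
    """Return the index of the minute at which the alarm would fire, or None."""
    streak = 0
    for minute, value in enumerate(values):
        # A non-breaching minute resets the run of consecutive breaches.
        streak = streak + 1 if value > threshold else 0
        if streak >= pending_minutes:
            return minute
    return None


spike = [50, 95, 60, 55, 50, 52, 51]      # one transient spike
sustained = [50, 85, 88, 90, 92, 91, 60]  # five breaching minutes in a row

print(first_firing_minute(spike, 80, 1))      # 1 (fires on the spike)
print(first_firing_minute(spike, 80, 5))      # None (spike ignored)
print(first_firing_minute(sustained, 80, 5))  # 5 (fires after five minutes)
```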

NEVER create alarms without notification destinations

# WRONG - alarm fires but nobody is notified
oci monitoring alarm create ... --destinations '[]'

# RIGHT - always link to a notification topic
oci monitoring alarm create ... --destinations '["<notification-topic-ocid>"]'

Cost impact: undetected production outages = $5,000–50,000+/hour.
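
A pre-flight check can catch an empty destination list before the alarm is created. The following is a minimal sketch over a plain dictionary; the `destinations` key mirrors the CLI flag above, but `validate_alarm` and the payload shape are hypothetical and not an OCI SDK type.

```python
from typing import Dict, List


def validate_alarm(alarm: Dict) -> List[str]:
    """Return a list of problems found in an alarm definition (empty list = OK)."""
    problems = []
    destinations = alarm.get("destinations") or []
    if not destinations:
        problems.append("no notification destinations")
    elif any(not isinstance(d, str) or not d.strip() for d in destinations):
        problems.append("blank notification destination")
    return problems


silent = {"display_name": "cpu-high", "destinations": []}
wired = {"display_name": "cpu-high", "destinations": ["<notification-topic-ocid>"]}

print(validate_alarm(silent))  # ['no notification destinations']
print(validate_alarm(wired))   # []
```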

NEVER ignore Cloud Guard findings

Cloud Guard detects misconfigurations before they become incidents
Wire it: Cloud Guard → Notifications → email/Slack/PagerDuty
Unresolved findings fail CIS/SOC2/HIPAA audits

Metric Namespace Reference

OCI uses service-specific namespaces — using the wrong namespace returns no data with no error.

| Service        | Namespace               | Key Metrics                           |
|----------------|-------------------------|---------------------------------------|
| Compute        | oci_computeagent        | CpuUtilization, MemoryUtilization     |
| Autonomous DB  | oci_autonomous_database | CpuUtilization, StorageUtilization    |
| Load Balancer  | oci_lbaas               | HttpRequests, UnHealthyBackendServers |
| Object Storage | oci_objectstorage       | ObjectCount, BytesUploaded            |

Common mistake: using oci_compute instead of oci_computeagent. Metrics in the agent namespace are only emitted while Oracle Cloud Agent is running on the instance.
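
Because a wrong namespace fails silently, it can help to validate the name locally before querying. The lookup below is a sketch that covers only the four services listed in this section; `NAMESPACES` and `check_namespace` are hypothetical names.

```python
# Only the namespaces listed in this section; a real list would be longer.
NAMESPACES = {
    "compute": "oci_computeagent",
    "autonomous_db": "oci_autonomous_database",
    "load_balancer": "oci_lbaas",
    "object_storage": "oci_objectstorage",
}


def check_namespace(namespace: str) -> str:
    """Return the namespace unchanged if it is known, otherwise raise ValueError."""
    if namespace not in NAMESPACES.values():
        known = ", ".join(sorted(NAMESPACES.values()))
        raise ValueError(f"unknown namespace {namespace!r}; known: {known}")
    return namespace


print(check_namespace("oci_computeagent"))  # oci_computeagent
try:
    check_namespace("oci_compute")
except ValueError as err:
    print(err)
```

Here the typo `oci_compute` raises an error locally instead of returning an empty result from the service.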

Alarm Missing Data Handling

| Setting                        | Behavior                       | Use When                             |
|--------------------------------|--------------------------------|--------------------------------------|
| treatMissingDataAsBreaching    | Alarm fires if no data arrives | Critical services (silence = outage) |
| treatMissingDataAsNotBreaching | Alarm silent if no data        | Optional or intermittent monitoring  |
| {dataMissing=zero} in MQL      | Treats gaps as 0 value         | Request counters, throughput metrics |

Log Collection Troubleshooting

Logs not appearing in Log Analytics?
│
├─ Is logging enabled on the resource?
│  ├─ Compute: is Oracle Cloud Agent running? (systemctl status oracle-cloud-agent)
│  └─ Functions: is logging enabled in function configuration?
│
├─ Is Service Connector configured and ACTIVE?
│  ├─ Source: Log Group → Target: Log Analytics
│  └─ Check status: oci sch service-connector get --service-connector-id <ocid>
│
├─ IAM policy for Service Connector?
│  ├─ "Allow any-user to use log-content in tenancy"
│  ├─ "Allow service loganalytics to READ logcontent in tenancy"
│  └─ Missing EITHER policy causes silent failure
│
└─ 10–15 minute ingestion lag?
   └─ Wait before concluding logs are missing
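
The tree above is an ordered checklist, so it can be walked mechanically. In this sketch the fact names (`logging_enabled` and so on) and the `next_step` function are hypothetical; the caller is assumed to have gathered the facts by hand or with the commands shown in the tree.

```python
from typing import Dict

# Checks in the same order as the tree; each pairs a fact with the advice
# to give when that fact is false or unknown.
CHECKS = [
    ("logging_enabled", "Enable logging on the resource"),
    ("connector_active", "Configure the Service Connector and confirm it is ACTIVE"),
    ("iam_policies_present", "Add the missing Service Connector IAM policy"),
    ("waited_15_minutes", "Wait out the 10-15 minute ingestion lag"),
]


def next_step(facts: Dict[str, bool]) -> str:
    """Return the advice for the first check that does not pass."""
    for key, advice in CHECKS:
        if not facts.get(key, False):
            return advice
    return "All checks pass; the logs are genuinely missing"


facts = {"logging_enabled": True, "connector_active": True}
print(next_step(facts))  # Add the missing Service Connector IAM policy
```

Unknown facts count as failures, so the walk stops at the first thing that has not been verified.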

Metric Query Performance

Unfiltered queries scan ALL resources in the compartment, which is slow and consumes rate limit budget.

# Expensive: scans all instances
CpuUtilization[1m].mean()

# Optimized: filter to specific instance
CpuUtilization[1m]{resourceId="<instance-ocid>"}.mean()

Rate limit: 1000 metric queries/minute per tenancy. A dashboard with many unfiltered widgets can exhaust this.
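
A quick back-of-the-envelope calculation shows how a dashboard can reach that limit. The numbers below are made up for illustration, the 1000/minute figure is the one quoted above, and `queries_per_minute` is a hypothetical helper.

```python
RATE_LIMIT_PER_MINUTE = 1000  # figure quoted above, treated here as an assumption


def queries_per_minute(widgets: int, queries_per_widget: int,
                       refresh_seconds: int) -> float:
    """Estimate the query rate generated by one open dashboard."""
    refreshes_per_minute = 60 / refresh_seconds
    return widgets * queries_per_widget * refreshes_per_minute


fast = queries_per_minute(widgets=40, queries_per_widget=5, refresh_seconds=10)
slow = queries_per_minute(widgets=40, queries_per_widget=5, refresh_seconds=30)

print(fast, fast > RATE_LIMIT_PER_MINUTE)  # 1200.0 True
print(slow, slow > RATE_LIMIT_PER_MINUTE)  # 400.0 False
```

In this example, lengthening the refresh interval from 10 to 30 seconds brings the same dashboard back under the budget.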

Progressive Loading Reference

Load references/oci-monitoring-reference.md when:

Need the complete list of OCI service metric namespaces and metric names
Writing complex MQL expressions (composites, functions, grouping)
Implementing composite alarm conditions
Setting up Log Analytics workspace, APM, or Service Connector Hub in detail

Do NOT load for alarm threshold patterns, namespace gotchas, or log troubleshooting — this file covers those.

Use when setting up OCI metrics, alarms, or log collection, or troubleshooting missing data and silent alarms. Covers metric namespace naming, MQL dimension requirements, alarm missing-data handling, Service Connector IAM gaps, and Cloud Guard integration. KEYWORDS: monitoring, alarm, metric, MQL, namespace, log, Service Connector, Log Analytics, Cloud Guard, missing data, oci_computeagent.

8 stars

testing

Updated Mar 26, 2026

npx skillsauth add acedergren/agentic-tools monitoring-operations

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

4 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status  | Scanner         | Description                                | Reported % |
|---------|-----------------|--------------------------------------------|------------|
| Clean   | Trivy           | Container and dependency vulnerability scanner | 95% |
| Clean   | Semgrep         | Static code analysis for vulnerabilities   | 95%        |
| Clean   | mcp-scan (Snyk) | Model Context Protocol security validation | 95%        |
| Clean   | VirusTotal      | Multi-engine malware detection             | 95%        |
| Skipped | Snyk (dep)      | Open source security scanning              | 50%        |
| Skipped | Socket.dev      | Supply chain security analysis             | 50%        |
| Skipped | CrowdStrike     | Advanced threat intelligence               | 50%        |
| Skipped | OSV-Scanner     | Open Source Vulnerability database check   | 50%        |
| Skipped | OWASP Dep-Check |                                            | 50%        |

Last scanned: Mar 27, 2026, 4:32 AM (78.2s, 1 file scanned)

Install manually

# Clone the repo
git clone https://github.com/acedergren/agentic-tools.git

# Copy into Claude Code skills folder (global)
cp -r agentic-tools/skills/monitoring-operations ~/.claude/skills/

Repository

acedergren/agentic-tools

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT