Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

davepoon/ops-fires

Name: ops-fires
Author: davepoon

plugins/claude-ops/skills/ops-fires/SKILL.md

npx skillsauth add davepoon/buildwithclaude ops-fires


OPS ► FIRES

Runtime Context

Before executing, load available context:

- Daemon health: Read ${CLAUDE_PLUGIN_DATA_DIR:-$HOME/.claude/plugins/data/ops-ops-marketplace}/daemon-health.json
  - Check infra-monitor service status. If it is not running, pre-gathered infra data may be stale.
  - If action_needed is not null, surface it immediately as a potential fire.
- Secrets: AWS credentials are required for ECS/CloudWatch queries.

Secret Resolution
- First: check $AWS_ACCESS_KEY_ID / $AWS_PROFILE env vars
- Then: doppler secrets get AWS_ACCESS_KEY_ID --plain (if doppler configured in prefs)
- Then: use password_manager_config.query_cmd from preferences
- Sentry token: $SENTRY_AUTH_TOKEN → Doppler SENTRY_AUTH_TOKEN → vault
Preferences: Read ${CLAUDE_PLUGIN_DATA_DIR}/preferences.json for secrets_manager config to know which vault to query.
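
The lookup order above could be scripted roughly as follows. This is a minimal sketch, not the plugin's implementation: the function name resolve_secret and the SECRET_QUERY_CMD variable (a stand-in for password_manager_config.query_cmd, whose real format is not shown here) are assumptions made for illustration.

```shell
# Sketch: resolve a secret by name, trying each source in order.
# resolve_secret and SECRET_QUERY_CMD are hypothetical names for illustration.
resolve_secret() {
  name=$1

  # 1. Exported environment variable, if set and non-empty.
  value=$(printenv "$name" 2>/dev/null) || value=""
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
    return 0
  fi

  # 2. Doppler, only when the CLI is installed.
  if command -v doppler >/dev/null 2>&1; then
    value=$(doppler secrets get "$name" --plain 2>/dev/null) || value=""
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  fi

  # 3. Password-manager command; the secret name is passed to it as $1.
  if [ -n "${SECRET_QUERY_CMD:-}" ]; then
    value=$(sh -c "$SECRET_QUERY_CMD" resolve_secret "$name" 2>/dev/null) || value=""
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  fi

  # Nothing found: return non-zero so the caller can report the gap.
  return 1
}
```

A caller would typically capture the output and treat a non-zero exit status as "secret unavailable" rather than continuing with an empty value.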

CLI/API Reference

aws CLI

| Command | Usage | Output |
|---------|-------|--------|
| aws ecs list-services --cluster <name> --query 'serviceArns' | ECS services | ARN list |
| aws ecs describe-services --cluster <name> --services <arn> --query 'services[0].{status:status,running:runningCount,desired:desiredCount}' | Service health | JSON |
| aws logs tail /ecs/<service> --since 1h --format short | ECS logs | Log lines (use with Monitor for live) |

gh CLI (GitHub)

| Command | Usage | Output |
|---------|-------|--------|
| gh run list --limit 20 --json status,conclusion,name,headBranch,createdAt | Recent CI runs | JSON array |
| gh run view <id> --repo <repo> --log-failed | Failed CI logs | Log output |

sentry-cli / Sentry API

| Command | Usage | Output |
|---------|-------|--------|
| sentry-cli issues list --project <slug> --status unresolved | Unresolved issues | Issue list |
| curl -H "Authorization: Bearer $SENTRY_AUTH_TOKEN" "https://sentry.io/api/0/projects/<org>/<proj>/issues/?query=is:unresolved" | API fallback when MCP unavailable | JSON array |

Agent Teams support

If CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 is set, use Agent Teams when dispatching multiple fix agents simultaneously. This enables:

- Fix agents share findings (e.g., the API agent discovers the DB is the root cause → the infra agent pivots to a DB fix)
- You can prioritize: "CRITICAL ECS issue first, then CI failures"
- Real-time progress: agents report as they find root causes, so you can merge fixes in the optimal order

Team setup (only when flag is enabled, dispatch phase):

TeamCreate("fire-fixers")
Agent(team_name="fire-fixers", name="fix-[service]", ...)

If the flag is NOT set, use standard parallel subagents.
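
The flag check above amounts to a single branch. A minimal sketch, assuming that only the exact value 1 enables Agent Teams; dispatch_mode and its two output labels are hypothetical names, not part of the plugin:

```shell
# Sketch: choose the dispatch mode from the experimental flag.
# dispatch_mode and its output labels are hypothetical names.
dispatch_mode() {
  # Assumption: only the literal value "1" counts as enabled.
  if [ "${CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS:-}" = "1" ]; then
    echo "agent-teams"
  else
    echo "parallel-subagents"
  fi
}
```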

Pre-gathered infrastructure data

${CLAUDE_PLUGIN_ROOT}/bin/ops-infra 2>/dev/null || echo '{"clusters":[],"error":"infra check failed"}'

CI failures (last 24h)

${CLAUDE_PLUGIN_ROOT}/bin/ops-ci 2>/dev/null || echo '[]'

External projects health

${CLAUDE_PLUGIN_ROOT}/bin/ops-external 2>/dev/null || echo '[]'
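
The three commands above share one pattern: discard stderr and, if the helper exits non-zero, emit an empty JSON payload so that later parsing still receives valid input. A sketch of that pattern as a reusable function (run_or_default is a hypothetical name, not something the plugin ships):

```shell
# Sketch: run a command with stderr silenced; on failure print a default payload.
# run_or_default is a hypothetical helper name.
run_or_default() {
  default=$1
  shift
  # "$@" is the command and its arguments; the fallback prints only when it fails.
  "$@" 2>/dev/null || printf '%s\n' "$default"
}
```

For example, `run_or_default '[]' "${CLAUDE_PLUGIN_ROOT}/bin/ops-ci"` behaves like the CI line above. One caveat of the pattern: a command that prints partial output and then fails leaves that output in front of the default, so the result may not be valid JSON in that case.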

Your task

Analyze the pre-gathered data — including external projects. Then run parallel checks:

- ECS health: parse infra data for unhealthy services, stopped tasks, failed deployments.
- Sentry: if Sentry MCP is connected, query recent unresolved errors. Otherwise note it's unavailable.
- CI: parse CI data for failing pipelines, broken main/dev branches.
- GitHub Actions: gh run list --limit 20 --json status,conclusion,name,headBranch,createdAt 2>/dev/null
- External projects: parse ops-external data. Flag auth_expired as HIGH (credential rotation needed), unreachable/degraded as MEDIUM, not_configured as LOW.
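
The external-project rule in the last check maps a status string to a severity. A small sketch of that mapping; external_severity is a hypothetical function name, and the NONE result for any other status is an assumption, since the text only covers the four statuses listed:

```shell
# Sketch: map an ops-external status string to a severity label.
# external_severity is a hypothetical helper name.
external_severity() {
  case "$1" in
    auth_expired)         echo "HIGH" ;;    # credential rotation needed
    unreachable|degraded) echo "MEDIUM" ;;
    not_configured)       echo "LOW" ;;
    *)                    echo "NONE" ;;    # assumption: any other status is not a fire
  esac
}
```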

Classify each issue by severity:

| Severity | Criteria                                          |
| -------- | ------------------------------------------------- |
| CRITICAL | Service down, DB unreachable, auth broken         |
| HIGH     | Elevated error rate, deploy stuck, CI main broken |
| MEDIUM   | Non-critical service degraded, flaky tests        |
| LOW      | Warning-level, non-urgent                         |

Output format

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 OPS ► FIRES DASHBOARD — [timestamp]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

CRITICAL
[service] — [issue] — [since]

HIGH
[service] — [issue] — [since]

MEDIUM
[service] — [issue] — [since]

ECS HEALTH
[cluster] [service] [desired/running] [status]

CI STATUS
[repo] [branch] [workflow] [status] [last run]

SENTRY (top errors, 24h)
[error] [count] [first seen] [project]

EXTERNAL PROJECTS
[alias] [source] [status] [details — e.g. auth_expired, unreachable]

──────────────────────────────────────────────────────

Use batched AskUserQuestion calls (max 4 options each). Only show relevant actions (e.g., skip dispatch options if no issues found):

AskUserQuestion call 1:

  [Dispatch fix agent for [top critical issue]]
  [Dispatch fix agent for [second issue]]
  [View logs for [service]]
  [More...]

AskUserQuestion call 2 (only if "More..."):

  [Open Sentry dashboard]
  [Open GitHub Actions]
  [All clear — nothing to do]

If no fires: show "ALL SYSTEMS OPERATIONAL" with last-checked timestamps.
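
The batching rule above (at most four options per call, with "More..." leading to the next call) can be sketched as follows. batch_options and the printed "call N:" layout are illustrative assumptions; the real AskUserQuestion tool takes structured input, not this text.

```shell
# Sketch: split option labels into batches of at most 4 entries per call.
# batch_options and the "call N:" output layout are hypothetical.
batch_options() {
  call=1
  # More than 4 left: show 3 real options and reserve the 4th slot for "More...".
  while [ "$#" -gt 4 ]; do
    printf 'call %d: [%s] [%s] [%s] [More...]\n' "$call" "$1" "$2" "$3"
    shift 3
    call=$((call + 1))
  done
  # The remaining 1 to 4 options fit in a final call without "More...".
  if [ "$#" -gt 0 ]; then
    printf 'call %d:' "$call"
    printf ' [%s]' "$@"
    printf '\n'
  fi
}
```

With six options this produces a first call of three options plus "More..." and a second call with the remaining three, which matches the shape of the two calls shown above.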

Dispatch fix agent

When the user chooses to fix an issue, use AskUserQuestion to confirm the scope before dispatching:

Dispatch fix agent for: [issue title]
  Severity: [CRITICAL/HIGH/MEDIUM]
  Repo: [repo]
  Error: [brief description]
  
  The agent will:
  - Investigate root cause in [repo]
  - Create feature branch with fix
  - Open PR for review

  [Dispatch agent]  [Show me the logs first]  [Skip — I'll fix manually]

On confirmation, spawn an Agent with:

- The error details and logs
- Access to the relevant repo
- An instruction to create a feature branch, fix the issue, and open a PR
- An instruction to report back when done or blocked

Use the agents/infra-monitor.md agent definition for infra issues.

If $ARGUMENTS contains a project alias, filter to that project's services only.

Native tool usage

Monitor — live service health

Use Monitor to stream ECS task logs or GitHub Actions runs when investigating fires:

Monitor(command: "aws logs tail /ecs/<service> --follow --since 5m")

Tasks — incident tracking

Use TaskCreate for each active fire. Update with TaskUpdate as fires are investigated/fixed/escalated.

WebFetch — status pages

When diagnosing fires, use WebFetch to check the AWS status page (https://health.aws.amazon.com/health/status), the Vercel status page, or third-party API status pages.

WebSearch — known outage patterns

Use WebSearch to find if the error pattern matches a known AWS/infrastructure issue (e.g., "ECS task stopped CannotPullContainerError" → known ECR throttling).


Production incidents dashboard. Reads ECS health, Sentry errors, CI failures. Offers to dispatch fix agents for active fires.

2,899 stars

testing

Updated May 10, 2026


npx skillsauth add davepoon/buildwithclaude ops-fires

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report:

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: May 10, 2026, 2:58 AM (209.9s, 1 file scanned)

SKILL.md

name: ops-fires
description: Production incidents dashboard. Reads ECS health, Sentry errors, CI failures. Offers to dispatch fix agents for active fires.
argument-hint: [project-alias|all]
effort: medium
maxTurns: 30




Install manually


# Clone the repo
git clone https://github.com/davepoon/buildwithclaude.git

# Copy into Claude Code skills folder (global)
cp -r buildwithclaude/plugins/claude-ops/skills/ops-fires ~/.claude/skills/


Repository: davepoon/buildwithclaude

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT
