nixopus/incident-response

Name: incident-response
Author: nixopus

skills/incident-response/SKILL.md

npx skillsauth add nixopus/agent incident-response

Incident Response

Event Classification

Classify the incident by severity before acting:

| Severity | Criteria | Response time | Action |
|---|---|---|---|
| Critical | App completely down, all users affected | Immediate | Diagnose + attempt auto-fix + notify |
| High | App degraded, errors for some users | Within minutes | Diagnose + attempt auto-fix + notify |
| Medium | Non-user-facing failure (build failed, deploy failed) | Within session | Diagnose + fix suggestion + notify |
| Low | Warning, non-critical issue detected | Informational | Notify only |
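The severity table can be read as a lookup from severity to response policy. The sketch below is one way to encode it; the `ResponsePolicy` type, its field names, and `policy_for` are illustrative assumptions, not part of the skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsePolicy:
    """One row of the severity table (field names are hypothetical)."""
    response_time: str
    diagnose: bool
    auto_fix: bool      # attempt an automatic fix
    suggest_fix: bool   # only suggest a fix, do not apply it
    notify: bool


# Values copied from the severity table above.
POLICIES = {
    "critical": ResponsePolicy("immediate", True, True, False, True),
    "high": ResponsePolicy("within minutes", True, True, False, True),
    "medium": ResponsePolicy("within session", True, False, True, True),
    "low": ResponsePolicy("informational", False, False, False, True),
}


def policy_for(severity: str) -> ResponsePolicy:
    """Look up the policy for a severity name, ignoring case and padding."""
    return POLICIES[severity.strip().lower()]
```

For example, `policy_for("Medium")` yields a policy that diagnoses and suggests a fix but does not attempt an auto-fix.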

Severity signals

| Signal | Severity |
|---|---|
| Container exited, restart_count > 3 | Critical |
| HTTP probe returns 502/503/504 | Critical |
| Build failed | Medium |
| Container OOM-killed once | High |
| Health endpoint returns unhealthy | High |
| Deployment succeeded but no traffic | High |
| SSL certificate expiring soon | Medium |
| Disk usage > 85% | Medium |
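A minimal sketch of how these signals might be mapped to a severity. The dictionary keys (`container_exited`, `oom_killed`, and so on) are hypothetical names for illustration, and picking the highest matching severity when several signals fire is an assumption the table does not state.

```python
from enum import IntEnum
from typing import Any, Mapping, Optional


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def classify(signals: Mapping[str, Any]) -> Optional[Severity]:
    """Return the highest severity matched by the signal table, or None."""
    matched = []
    if signals.get("container_exited") and signals.get("restart_count", 0) > 3:
        matched.append(Severity.CRITICAL)
    if signals.get("http_status") in (502, 503, 504):
        matched.append(Severity.CRITICAL)
    if signals.get("build_failed"):
        matched.append(Severity.MEDIUM)
    if signals.get("oom_killed"):
        matched.append(Severity.HIGH)
    if signals.get("health") == "unhealthy":
        matched.append(Severity.HIGH)
    if signals.get("deployed_without_traffic"):
        matched.append(Severity.HIGH)
    if signals.get("ssl_expiring_soon"):
        matched.append(Severity.MEDIUM)
    if signals.get("disk_usage_percent", 0) > 85:
        matched.append(Severity.MEDIUM)
    # Assumption: when several signals match, the most severe one wins.
    return max(matched) if matched else None
```

The thresholds are strict, as in the table: a restart count of exactly 3 or disk usage of exactly 85% does not match.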

Incident Workflow

1. GATHER

Collect context about the affected resource:

get_application — app details, current deployment, configured port
get_application_deployments — recent deployment history
get_deployment_logs — if build/deploy failed
list_containers → get_container — container status
get_container_logs — runtime errors
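The gathering order above can be sketched as a plan of tool names. Only the tool names come from the skill; the `gather_plan` helper and its flag are hypothetical, and the arguments each tool takes are deliberately left out because the skill does not specify them.

```python
from typing import List


def gather_plan(build_or_deploy_failed: bool) -> List[str]:
    """Return the tool names to call, in order, for the GATHER step."""
    plan = ["get_application", "get_application_deployments"]
    # Deployment logs are only relevant when a build or deploy failed.
    if build_or_deploy_failed:
        plan.append("get_deployment_logs")
    # Container status first, then runtime logs.
    plan += ["list_containers", "get_container", "get_container_logs"]
    return plan
```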

2. DIAGNOSE

Delegate to the diagnostic agent with full context:

Include: application ID, deployment ID, error message, container status
The diagnostic agent uses failure-diagnosis skill for pattern matching
Wait for diagnosis result: root cause + whether it's code-fixable
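One way to make "full context" checkable before delegating is a small request type that reports missing fields. This is a sketch under assumptions: the `DiagnosisRequest` and `DiagnosisResult` types and their field names are hypothetical, chosen to mirror the items listed above.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class DiagnosisRequest:
    """Context handed to the diagnostic agent (hypothetical shape)."""
    application_id: str
    deployment_id: str
    error_message: str
    container_status: str

    def missing_fields(self) -> List[str]:
        """Names of fields that are empty, in declaration order."""
        return [name for name, value in asdict(self).items() if not value]


@dataclass(frozen=True)
class DiagnosisResult:
    """What the workflow waits for: a root cause and a fixability flag."""
    root_cause: str
    code_fixable: bool
```

A caller could refuse to delegate while `missing_fields()` is non-empty and go back to the GATHER step instead.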

3. DECIDE

Based on diagnosis:

| Root cause type | Action |
|---|---|
| Code error (syntax, missing dep, config) | Auto-fix via PR |
| Dockerfile issue (wrong base image, missing file) | Auto-fix via PR |
| Environment variable missing or wrong | Notify user: env vars need manual input |
| Infrastructure (server resources, Docker daemon) | Notify user: requires manual intervention |
| Database connection failed | Notify user: check database status and credentials |
| External service down | Notify user: nothing to fix on our side |
| Unknown | Notify user with gathered evidence |
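The decision table can be sketched as a dictionary lookup. The root-cause keys (`code_error`, `env_var`, and so on) are hypothetical labels for the table rows, and falling back to the "unknown" row for unrecognised types is an assumption.

```python
from enum import Enum
from typing import Tuple


class Action(Enum):
    AUTO_FIX_PR = "auto-fix via PR"
    NOTIFY_USER = "notify user"


# Each entry: (action, note for the user). Notes are taken from the table.
DECISIONS = {
    "code_error": (Action.AUTO_FIX_PR, ""),
    "dockerfile_issue": (Action.AUTO_FIX_PR, ""),
    "env_var": (Action.NOTIFY_USER, "env vars need manual input"),
    "infrastructure": (Action.NOTIFY_USER, "requires manual intervention"),
    "database_connection": (Action.NOTIFY_USER, "check database status and credentials"),
    "external_service": (Action.NOTIFY_USER, "nothing to fix on our side"),
    "unknown": (Action.NOTIFY_USER, "include gathered evidence"),
}


def decide(root_cause_type: str) -> Tuple[Action, str]:
    """Map a root cause type to an action; unrecognised types count as unknown."""
    return DECISIONS.get(root_cause_type, DECISIONS["unknown"])
```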

4. FIX (if code-fixable)

Delegate to the GitHub agent:

Branch: auto-fix/<short-description> (e.g. auto-fix/missing-prisma-schema)
Read the problematic file, generate the minimal fix
Commit with message: fix: <description of what was fixed>
Open PR from fix branch into default branch
Never merge — return PR URL for user approval
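The branch and commit conventions above can be sketched with two small helpers. The function names and the 40-character slug limit are assumptions for illustration; the skill only fixes the `auto-fix/` prefix and the `fix:` commit prefix.

```python
import re


def fix_branch_name(description: str, max_len: int = 40) -> str:
    """Build an auto-fix/<short-description> branch name from free text."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    slug = slug[:max_len].rstrip("-")
    if not slug:
        raise ValueError("description must contain letters or digits")
    return f"auto-fix/{slug}"


def commit_message(description: str) -> str:
    """Build a commit message in the form 'fix: <description>'."""
    return f"fix: {description.strip()}"
```

For example, `fix_branch_name("Missing Prisma schema")` returns `auto-fix/missing-prisma-schema`, matching the example branch name above.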

5. NOTIFY

Send to all configured notification channels:

If fix PR created:

Failure detected for [app name].
Root cause: [one-line summary].
Auto-fix PR: [pr_url]
Review and merge to trigger redeploy.

If no fix possible:

Failure detected for [app name].
Root cause: [one-line summary].
Recommended action: [specific next step].

If diagnosis inconclusive:

Issue detected for [app name].
Findings: [what was observed].
Unable to determine root cause automatically.
Please investigate: [specific things to check].
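The three templates can be produced by one formatter that picks a template from what is known. This is a sketch; the `build_notification` function and its parameter names are hypothetical, and treating a missing root cause as "diagnosis inconclusive" is an assumption.

```python
from typing import Optional


def build_notification(
    app_name: str,
    root_cause: Optional[str] = None,
    pr_url: Optional[str] = None,
    next_step: Optional[str] = None,
    findings: Optional[str] = None,
    checks: Optional[str] = None,
) -> str:
    """Render one of the three notification templates as plain text."""
    if root_cause is None:
        # Diagnosis inconclusive: report findings and what to check.
        return "\n".join([
            f"Issue detected for {app_name}.",
            f"Findings: {findings}.",
            "Unable to determine root cause automatically.",
            f"Please investigate: {checks}.",
        ])
    lines = [f"Failure detected for {app_name}.", f"Root cause: {root_cause}."]
    if pr_url:
        lines += [f"Auto-fix PR: {pr_url}", "Review and merge to trigger redeploy."]
    else:
        lines.append(f"Recommended action: {next_step}.")
    return "\n".join(lines)
```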

6. VERIFY (after user merges fix)

If the fix PR is merged and a new deployment triggers:

Run post-deploy-verification checks
If healthy: notify "Issue resolved after fix merge"
If still failing: escalate — "Fix did not resolve the issue, further investigation needed"

Rules

Never merge PRs automatically — always require user approval
Never push to main/master — always use fix branches
Do not retry the same fix more than once
Maximum 3 auto-fix attempts per incident before escalating to user
Include all relevant context when delegating to sub-agents
Every response must end with a concrete result or completed action
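The two retry rules (never repeat the same fix, at most 3 attempts per incident) can be enforced with a small per-incident guard. This is a sketch: the `FixAttemptGuard` class is hypothetical, and identifying a fix by a string `fix_id` (for instance a hash of the proposed diff) is an assumption.

```python
class FixAttemptGuard:
    """Tracks auto-fix attempts for a single incident."""

    MAX_ATTEMPTS = 3  # from the rule: maximum 3 auto-fix attempts per incident

    def __init__(self) -> None:
        self._tried = []

    def may_attempt(self, fix_id: str) -> bool:
        """True if this fix is new and the attempt budget is not exhausted."""
        return fix_id not in self._tried and len(self._tried) < self.MAX_ATTEMPTS

    def record(self, fix_id: str) -> None:
        """Record an attempt, or raise if the rules require escalation."""
        if not self.may_attempt(fix_id):
            raise RuntimeError("auto-fix not allowed: escalate to user")
        self._tried.append(fix_id)
```

A workflow would call `may_attempt` before delegating to the GitHub agent and escalate to the user when it returns `False`.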

Anti-Patterns

Fixing symptoms instead of root cause: If the container OOM-kills, don't just increase memory — investigate the memory consumer
Auto-fixing infrastructure issues: Server-level problems (disk full, Docker daemon down) can't be fixed via code PR
Notifying without actionable information: "Something went wrong" is useless — always include what failed, why, and what to do
Cascading fixes: If fix A causes failure B, stop and escalate — don't chain auto-fixes

Related Skills

failure-diagnosis — Pattern tables for identifying root causes
rollback-strategy — When to rollback vs fix forward
post-deploy-verification — Verify fix worked after merge

Event Context

Your prompt contains the full incident context formatted by the event pipeline. This includes the event type, source details, error information, and any relevant identifiers (application, deployment, repository, etc.). Use all provided context to drive your investigation.

Safety Rules

Never merge PRs. Always return the PR URL for user approval.
Never push to main/master. Always create a fix branch.
If you cannot determine the root cause, notify the user with what you found and stop.
Do not retry the same fix more than once. Maximum 3 auto-fix attempts per incident before escalating.
Include all relevant context identifiers when delegating to the diagnostic agent or the GitHub agent.
After delegation returns, immediately process the result. Never say work is "underway".
Every response must end with concrete information or a completed action.

Structured incident response workflow — severity classification, diagnosis delegation, auto-fix decisions, notification, and post-incident review. Use when an automated failure event is received or when the user reports a production incident.

1 star

testing

Updated Apr 22, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add nixopus/agent incident-response

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 22, 2026, 11:58 PM (170.0s, 1 file scanned)

SKILL.md

name: incident-response
description: Structured incident response workflow: severity classification, diagnosis delegation, auto-fix decisions, notification, and post-incident review. Use when an automated failure event is received or when the user reports a production incident.
version: 1.0

Install manually

# Clone the repo
git clone https://github.com/nixopus/agent.git

# Create the Claude Code skills folder (global) if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into it
cp -r agent/skills/incident-response ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

nixopus/agent

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT