claude/skills/incident-response/SKILL.md
Run a full incident response workflow for an active incident. Covers investigation, blast radius via Snowflake, Slack channel triage, fix implementation, Jira ticket, draft PR, Notion debrief, and Datadog monitor review. Use when asked to "run incident response", "we have an incident", "investigate this error", or given a Sentry URL with urgency context.
End-to-end incident response: gather context → investigate root cause → measure blast radius → implement fix → document everything.
The incident Slack channel is the live feed of the investigation. Check it:
At each re-check, only surface messages newer than the previous check. If a new finding is directly relevant to the current step (e.g. someone just identified the root cause while you're about to investigate), incorporate it and note that you did so.
Track a last_checked timestamp after each read and use it to filter the next fetch.
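The `last_checked` filter can be sketched as below. This is a minimal illustration, assuming each message is a dict with a Slack-style `ts` string (epoch seconds with a fractional part). The actual shape returned by `slack_read_channel` may differ, and `new_messages` is a hypothetical helper name.

```python
from typing import Iterable


def new_messages(messages: Iterable[dict], last_checked: float) -> tuple[list[dict], float]:
    """Return messages newer than last_checked (oldest first) and the updated marker."""
    fresh = sorted(
        (m for m in messages if float(m["ts"]) > last_checked),
        key=lambda m: float(m["ts"]),
    )
    # Advance the marker only when something new arrived.
    newest = float(fresh[-1]["ts"]) if fresh else last_checked
    return fresh, newest


# Hypothetical channel contents, deliberately out of order.
channel = [
    {"ts": "1700000100.000200", "text": "Rollback started"},
    {"ts": "1700000000.000100", "text": "Seeing 500s on invoice export"},
    {"ts": "1700000200.000300", "text": "Root cause: null tax code"},
]

fresh, last_checked = new_messages(channel, last_checked=1700000050.0)
print([m["text"] for m in fresh])

# A second check with the updated marker surfaces nothing new.
again, last_checked_again = new_messages(channel, last_checked)
```

Returning the updated marker alongside the filtered messages keeps the read and the bookkeeping in one step, so a re-check cannot forget to advance `last_checked`.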
At least one of the following entry points must be provided. Ask for any that are missing:
| Input | Example | Notes |
|---|---|---|
| Sentry issue URL | https://sentry.io/organizations/.../issues/123/ | Primary entry point |
| Datadog monitor/alert URL | https://app.datadoghq.com/monitors/456 | Use if no Sentry issue exists yet |
| Slack incident channel | #inc-2026-01-invoice-failures | Use if alerted via Slack without a direct link |
At least one entry point is required — if none is provided, ask before proceeding.
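One way to tell the three entry points apart is by hostname or a leading `#`. The sketch below is an assumption based only on the example values in the table; `classify_entry_point` and `sentry_issue_id` are hypothetical helper names, and self-hosted Sentry or regional Datadog domains would need extra cases.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def classify_entry_point(value: str) -> str:
    """Classify user input as 'sentry', 'datadog', 'slack', or 'unknown'."""
    value = value.strip()
    if value.startswith("#"):
        return "slack"
    host = urlparse(value).netloc.lower()
    if host == "sentry.io" or host.endswith(".sentry.io"):
        return "sentry"
    if host == "datadoghq.com" or host.endswith(".datadoghq.com"):
        return "datadog"
    # Anything else: ask the user before proceeding.
    return "unknown"


def sentry_issue_id(url: str) -> Optional[str]:
    """Pull the numeric issue ID out of a Sentry issue URL, if present."""
    match = re.search(r"/issues/(\d+)", urlparse(url).path)
    return match.group(1) if match else None


print(classify_entry_point("https://sentry.io/organizations/.../issues/123/"))
print(classify_entry_point("https://app.datadoghq.com/monitors/456"))
print(classify_entry_point("#inc-2026-01-invoice-failures"))
print(sentry_issue_id("https://sentry.io/organizations/.../issues/123/"))
```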
Also ask for anything not already provided:
- Repo (e.g. travelperk/billing-service)
- Notion debrief URL, or "new" to create one (can be skipped if not yet created)

Entry point routing:
Tell the user: "Running incident response — I'll confirm each step as I go."
Before touching the codebase, read the incident channel. Other investigators may have already found the root cause, affected customers, or a mitigation path.
Use the Slack MCP (slack_read_channel) to fetch recent messages from the incident channel.
Extract and summarise:
Confirm to user: "Read #channel — here's what the team has found so far: ..."
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
Use the Sentry MCP to fetch the issue at the provided URL.
Capture:
Confirm to user: "Fetched Sentry issue: <title> — <N> events, <N> users, first seen <date>."
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
Read the stack trace. Identify the exact file, line, and condition triggering the error.
Clone or navigate to the repo and read the relevant source files. Do not guess — confirm the bug in the code before proposing a fix.
gh repo clone <repo> /tmp/incident-<issue-id> 2>/dev/null || git -C /tmp/incident-<issue-id> pull
Cross-reference with Slack findings from Step 1. If a team member already identified the root cause, validate it in the code rather than starting from scratch.
Confirm to user: "Root cause identified: <file>:<line> — <one sentence explanation>."
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
Query Snowflake to determine how many records and customers are affected.
Tailor the query to the root cause — e.g. if the bug corrupts invoices created after a certain date, query for invoices in that state.
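As an illustration of the shape such a query might take, the sketch below uses Python's built-in `sqlite3` as a stand-in for Snowflake. The `invoices` table, its columns, the `corrupted` status, and the cutoff date are all hypothetical; the real schema, SQL dialect details, and parameter placeholder style would need adapting.

```python
import sqlite3

# In-memory stand-in for the warehouse, with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (id INTEGER, customer_id INTEGER, status TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [
        (1, 10, "corrupted", "2026-01-05"),
        (2, 10, "corrupted", "2026-01-06"),
        (3, 11, "corrupted", "2026-01-07"),
        (4, 12, "ok", "2026-01-07"),
        (5, 13, "corrupted", "2025-12-30"),  # predates the hypothetical cutoff
    ],
)

# Count affected records and distinct customers since the cutoff date.
BLAST_RADIUS_SQL = """
SELECT COUNT(*)                    AS affected_records,
       COUNT(DISTINCT customer_id) AS affected_customers,
       MIN(created_at)             AS first_affected
FROM invoices
WHERE status = ?
  AND created_at >= ?
"""

records, customers, since = conn.execute(
    BLAST_RADIUS_SQL, ("corrupted", "2026-01-01")
).fetchone()
print(f"Blast radius: {records} records, {customers} customers affected since {since}.")
```

Counting records and distinct customers in a single query keeps the two numbers consistent with each other, which matters when both are quoted in the ticket and the PR.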
Save both the query and results. Include:
If Snowflake MCP is unavailable, tell the user: "Snowflake not connected — please run this query manually: <query>" and continue.
Confirm to user: "Blast radius: <N> records, <N> customers affected since <date>."
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
Search Datadog for monitors or dashboards related to the affected service.
Look for:
If Datadog MCP is unavailable, note which service/metric to check manually and continue.
Confirm to user: "Datadog checked — <summary of monitor state / any gaps identified>."
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
Use the Atlassian MCP (createJiraIssue) to create a bug ticket:
[Incident] <concise description>
## Sentry issue
<sentry URL>
Frequency: <event count> events, <N> users affected
First seen: <date> | Last seen: <date>
## Root cause
<file>:<line> — <explanation>
## Blast radius
<N> records, <N> customers affected
<Snowflake query used>
## Slack incident channel
<link to channel>
## Stack trace (excerpt)
<most relevant frames>
Note the ticket number for the branch name and PR.
Confirm to user: "Jira ticket created: APP-<N> — <link>."
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
On a new branch APP-<ticket>-<short-description>:
If uncertain about the fix, implement the most likely approach and flag uncertainty in the PR.
Confirm to user: "Fix implemented on branch <branch-name>."
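The `APP-<ticket>-<short-description>` branch name could be derived from the ticket number and a free-text description along these lines. This is only a sketch: `branch_name` is a hypothetical helper, and the 40-character cap on the description part is an assumption, not something this skill specifies.

```python
import re


def branch_name(ticket_number: int, description: str, max_len: int = 40) -> str:
    """Build APP-<ticket>-<short-description> with a git-friendly slug."""
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # Truncate and drop any hyphen left dangling by the cut.
    slug = slug[:max_len].rstrip("-")
    return f"APP-{ticket_number}-{slug}"


print(branch_name(1234, "Fix null tax code on invoice export!"))
# prints APP-1234-fix-null-tax-code-on-invoice-export
```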
Slack check: fetch messages newer than `last_checked`. Surface any new findings before proceeding.
gh pr create --draft \
--title "fix: <description> (APP-<ticket>)" \
--body "<body>"
PR body:
## What
<one sentence: what bug this fixes>
## Why
<root cause in plain English>
## Blast radius
<N> records, <N> customers affected since <date>
## Sentry
<sentry URL> — <N> events, <N> users
## Jira
https://travelperk.atlassian.net/browse/APP-<ticket>
## Fix
<what changed and why it works>
## Risks
<anything uncertain, edge cases, areas for reviewer attention>
Sandbox note: All `gh` commands require `dangerouslyDisableSandbox: true`.
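If the PR is opened from a script rather than typed by hand, passing the arguments as a list avoids shell-quoting problems with a multi-line body. The sketch below builds the same `gh pr create --draft` invocation shown above; `draft_pr_command` and `open_draft_pr` are hypothetical helper names, and the description, ticket number, and body text are made-up examples.

```python
import subprocess


def draft_pr_command(description: str, ticket: int, body: str) -> list[str]:
    """Build the argv for a draft PR; no shell is involved, so no quoting is needed."""
    return [
        "gh", "pr", "create", "--draft",
        "--title", f"fix: {description} (APP-{ticket})",
        "--body", body,
    ]


def open_draft_pr(description: str, ticket: int, body: str) -> None:
    """Run the command; requires an authenticated gh CLI inside a git repo."""
    subprocess.run(draft_pr_command(description, ticket, body), check=True)


# Hypothetical body with two of the template sections filled in.
body = "\n".join([
    "## What",
    "Handle a null tax code during invoice export.",
    "## Risks",
    "Behaviour for legacy invoices is untested.",
])

cmd = draft_pr_command("handle null tax code", 1234, body)
print(cmd[5])
```

Only `draft_pr_command` runs here; `open_draft_pr` is defined but not called, so nothing is sent to GitHub until it is invoked deliberately.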
Confirm to user: "Draft PR opened: <link>."
Slack check: fetch messages newer than `last_checked`. Incorporate any final timeline details or follow-up actions posted by the team into the debrief.
Use the Notion MCP to update the debrief page at the provided URL (or create one if "new" was specified).
Include:
Write in blameless, conversational language. Do not make claims not supported by evidence from the codebase, Sentry, or Slack.
If Notion MCP is unavailable, output the debrief content as markdown so the user can paste it manually.
Confirm to user: "Notion debrief updated: <link>."
Once all steps are complete, output:
## Incident response complete
- Root cause: <one sentence>
- Blast radius: <N> records, <N> customers
- Jira: APP-<N> — <link>
- PR: <link>
- Notion debrief: <link>
Remaining actions:
- <any manual steps needed (e.g. Snowflake query to run, monitor to update)>