engineering/skills/incident-response/SKILL.md
Triage and manage production incidents. Trigger with "we have an incident", "production is down", "something is broken", "there's an outage", "SEV1", or when the user describes a production issue needing immediate response.
npx skillsauth add grailautomation/claude-plugins incident-responseInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Guide incident response from detection through resolution and postmortem.
| Level | Criteria | Response Time | |-------|----------|---------------| | SEV1 | Service down, all users affected | Immediate, all-hands | | SEV2 | Major feature degraded, many users affected | Within 15 min | | SEV3 | Minor feature issue, some users affected | Within 1 hour | | SEV4 | Cosmetic or low-impact issue | Next business day |
Provide clear, factual updates at regular cadence. Include: what's happening, who's affected, what we're doing, when the next update is.
Blameless. Focus on systems and processes. Include timeline, root cause analysis (5 whys), what went well, what went poorly, and action items with owners and due dates.
documentation
Write a feature spec or PRD from a problem statement or feature idea
development
Synthesize qualitative and quantitative user research into structured insights and opportunity areas. Use when analyzing interview notes, survey responses, support tickets, or behavioral data to identify themes, build personas, or prioritize opportunities.
research
Synthesize user research from interviews, surveys, and feedback into structured insights
data-ai
Generate a stakeholder update tailored to audience and cadence