skills/incident-hotfix/SKILL.md
Use this skill for incident response and hotfix deployment. Invoke when production issues occur requiring immediate attention.
npx skillsauth add gentamura/dotfiles incident-hotfixInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Structured incident response, hotfix, and postmortem process.
| Level | Impact | Response Time | Examples | |-------|--------|---------------|----------| | P0 | System down | Immediate | Complete outage, data loss | | P1 | Major feature broken | < 1 hour | Auth broken, payments failing | | P2 | Feature degraded | < 4 hours | Slow performance, partial outage | | P3 | Minor issue | < 24 hours | UI bug, non-critical error |
Choose one:
Rollback (safest)
Hotfix (targeted)
Feature Flag (quick)
# From production/main branch
git checkout main
git pull origin main
git checkout -b hotfix/issue-description
Rules for hotfix code:
bun run lint:fix
bun run build
bun run test
# Test the specific fix
# Verify in staging if time permits
# Merge to main
git checkout main
git merge hotfix/issue-description
# Tag and deploy
git tag -a v1.2.4 -m "Hotfix: issue description"
git push origin main --tags
# Merge hotfix to develop branch
git checkout develop
git merge hotfix/issue-description
git push origin develop
Write a postmortem within 48 hours of resolution.
# Incident Postmortem: [Title]
## Summary
| Field | Value |
|-------|-------|
| Date | YYYY-MM-DD |
| Duration | X hours Y minutes |
| Severity | P0/P1/P2/P3 |
| Author | [Name] |
## Impact
- [Number of users affected]
- [Revenue impact if applicable]
- [Other business impact]
## Timeline (UTC)
| Time | Event |
|------|-------|
| HH:MM | Issue first detected |
| HH:MM | Team alerted |
| HH:MM | Root cause identified |
| HH:MM | Fix deployed |
| HH:MM | Issue resolved |
## Root Cause
[Clear, technical explanation of what caused the incident]
## Detection
How was the incident detected?
- [ ] Monitoring alert
- [ ] Customer report
- [ ] Internal discovery
Could we have detected it earlier?
- [Analysis]
## Resolution
What fixed the issue?
- [Description of fix]
Was it a rollback or hotfix?
- [Details]
## Lessons Learned
### What Went Well
- [Point 1]
- [Point 2]
### What Went Wrong
- [Point 1]
- [Point 2]
### Where We Got Lucky
- [Point 1]
## Action Items
| Action | Owner | Due Date | Status |
|--------|-------|----------|--------|
| [Action 1] | [Name] | YYYY-MM-DD | [ ] |
| [Action 2] | [Name] | YYYY-MM-DD | [ ] |
## Prevention
How do we prevent this class of issue?
- [ ] Add monitoring for [X]
- [ ] Add test for [Y]
- [ ] Improve process for [Z]
🚨 [P1] Investigating issues with [service]
Impact: [Brief description]
Status: Investigating
ETA: Unknown
Updates to follow.
🔄 [P1] Update on [service] issue
Status: Root cause identified, fix in progress
Impact: [Updated impact]
ETA: [Time estimate]
Next update in [X] minutes.
✅ [P1] Resolved: [service] issue
Duration: [X hours Y minutes]
Resolution: [Brief description]
Postmortem to follow within 48 hours.
tools
Use this skill to break down requirements into user stories, acceptance criteria, and actionable tasks. Invoke when starting a new feature or receiving new requirements.
devops
Use this skill for release preparation and execution. Invoke when deploying to staging or production environments.
development
Use this skill to review pull requests against coding standards and best practices. Invoke when reviewing code changes before merge.
tools
Use this skill to create a pull request from current local changes. Invoke when the user asks to create a branch, commit, push, and open a PR.