skills/mav-bp-alerting/SKILL.md
Alerting conventions for fatal errors in applications. Covers severity levels, alert context, centralised notification, and project alerting guidance. Applied when writing error handling or reviewing code.
Ensure fatal errors trigger alerts that reach operations teams. Alerting is not logging — logs record what happened, alerts demand attention.
Before applying these standards, load the project-specific alerting implementation:
digraph lookup {
"docs/maverick/skills/alerting/SKILL.md exists?" [shape=diamond];
"Read and use alongside these standards" [shape=box];
"Invoke upskill" [shape=box];
"Read generated skill" [shape=box];
"docs/maverick/skills/alerting/SKILL.md exists?" -> "Read and use alongside these standards" [label="yes"];
"docs/maverick/skills/alerting/SKILL.md exists?" -> "Invoke upskill" [label="no"];
"Invoke upskill" -> "Read generated skill";
"Read generated skill" -> "Read and use alongside these standards";
}
If docs/maverick/skills/alerting/SKILL.md does not exist, invoke the upskill skill with:
- Search pattern: `sendAlert|publish|PagerDuty|Opsgenie|alertService|notify|Sentry\.capture`
- File patterns: `**/alert*.*`, `**/notify*.*`

| Severity | When to alert | Response expectation | Example |
| ---------- | --------------------------------------------------- | -------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| critical | System down, data loss, complete service failure | Immediate — wake someone up | Database connection pool exhausted, payment processing completely down, data corruption detected |
| high | Degraded service, partial failures, repeated errors | Within the hour — investigate promptly | Third-party API returning errors for 5+ minutes, error rate exceeds threshold, authentication service intermittent |
| warning | Emerging issues, threshold approaching | Next business day — monitor and plan | Disk usage above 80%, API rate limit at 90%, certificate expiring in 7 days |
Every alert must include sufficient context for the responder to begin investigation without needing to search for information.
| Field | Purpose | Example |
| ------------- | ---------------------------------------------------------- | -------------------------------------------------------------------- |
| severity | Urgency classification | critical, high, warning |
| service | Which service raised the alert | payment-service, auth-api |
| environment | Which environment | production, staging |
| message | Human-readable description of what happened and its impact | Payment processing failed — all checkout requests returning errors |
| timestamp | When it happened (ISO 8601 UTC) | 2024-01-15T10:30:00.000Z |
| context | Structured metadata for investigation | Error count, window, affected endpoint |
| logPointer | Where to find detailed logs | Log group, filter query, or dashboard link |
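The fields above could be modelled as a small value type so that an incomplete alert is rejected before it is sent. This is a minimal sketch, assuming Python; the `Alert` and `Severity` names, the `to_payload` method, and the empty-field check are illustrative choices, not part of any specific alerting library.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any


class Severity(str, Enum):
    """The three severity levels from the severity table."""
    CRITICAL = "critical"
    HIGH = "high"
    WARNING = "warning"


def _utc_now_iso() -> str:
    # ISO 8601 UTC with millisecond precision, e.g. 2024-01-15T10:30:00.000Z
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record carrying every field from the context table."""
    severity: Severity
    service: str
    environment: str
    message: str
    log_pointer: str
    context: dict[str, Any] = field(default_factory=dict)
    timestamp: str = field(default_factory=_utc_now_iso)

    def __post_init__(self) -> None:
        # Reject alerts a responder could not act on.
        for name in ("service", "environment", "message", "log_pointer"):
            if not getattr(self, name).strip():
                raise ValueError(f"alert field '{name}' must not be empty")

    def to_payload(self) -> dict[str, Any]:
        # Serialise using the field names from the table (logPointer, not log_pointer).
        payload = asdict(self)
        payload["severity"] = Severity(self.severity).value
        payload["logPointer"] = payload.pop("log_pointer")
        return payload
```

Making `log_pointer` a required field means the responder always has a starting point for the investigation, which is the intent of the rule above.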
Alert at the outermost error boundary, not inside individual functions.
Do not alert inside low-level functions (database queries, HTTP calls). Let errors propagate to the boundary where impact and scope can be assessed.
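As a rough illustration of this rule, the sketch below lets a low-level function raise without alerting and handles the failure once at the request boundary, logging before alerting. Everything here is hypothetical: `fetch_order`, `handle_checkout`, `DatabaseError`, the injected `send_alert` callable, and the field values are placeholders for whatever the project actually uses.

```python
import logging
from datetime import datetime, timezone
from typing import Callable

logger = logging.getLogger("payment-service")


class DatabaseError(Exception):
    """Hypothetical low-level failure."""


def fetch_order(order_id: str) -> dict:
    # Low-level function: raises and lets the error propagate. No alerting here.
    raise DatabaseError(f"connection pool exhausted while loading order {order_id}")


def handle_checkout(order_id: str, send_alert: Callable[[dict], None]) -> dict:
    """Outermost boundary for one request: the only place that alerts."""
    try:
        order = fetch_order(order_id)
        return {"status": 200, "order": order}
    except Exception as exc:
        # Log first so the investigation trail exists, then alert.
        logger.exception("checkout failed for order %s", order_id)
        send_alert({
            "severity": "critical",
            "service": "payment-service",
            "environment": "production",
            "message": f"Checkout failed: {exc}",
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "context": {"orderId": order_id, "errorType": type(exc).__name__},
            "logPointer": "logger=payment-service",
        })
        return {"status": 500, "error": "internal error"}
```

Passing `send_alert` in as a parameter keeps the boundary testable and leaves the choice of notification service to the project.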
Do not flood the alerting service with duplicate alerts for the same ongoing issue.
How deduplication is implemented depends on your language and runtime; check the project skill for the specific technology in use.
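One possible shape for deduplication is a time-window suppressor keyed by the issue. The sketch below assumes Python and a single process; the `AlertDeduplicator` class, the key format, and the 300-second default window are assumptions for illustration, and a multi-instance service would need shared state instead of an in-memory dict.

```python
import time
from typing import Callable


class AlertDeduplicator:
    """Hypothetical in-memory suppressor: one alert per key per time window."""

    def __init__(
        self,
        window_seconds: float = 300.0,  # assumed default, tune per project
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._clock = clock
        self._last_sent: dict[str, float] = {}

    def should_send(self, key: str) -> bool:
        """Return True if no alert with this key was sent inside the window."""
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._window:
            return False  # same ongoing issue, suppress the duplicate
        self._last_sent[key] = now
        return True


dedup = AlertDeduplicator()
key = "payment-service:production:db-pool-exhausted"  # illustrative key format
first = dedup.should_send(key)   # first occurrence of the issue
second = dedup.should_send(key)  # immediate repeat of the same issue
```

The window restarts only when an alert is actually sent, so a continuing failure produces one alert per window rather than going silent indefinitely.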
Frontend applications generally do not send alerts directly.
If the project requires frontend-initiated alerts (e.g., critical workflow failures), route them through a backend API rather than calling the alerting service from the browser.
Alerting and logging are complementary but separate concerns:
| Concern | Logging | Alerting |
| --------- | ------------------------------------------ | ------------------------------------------------------ |
| Purpose | Record what happened | Demand attention |
| Volume | Every error, warn, debug | Only critical/high issues |
| Audience | Developers investigating after the fact | Operations team responding now |
| Transport | Log aggregation (CloudWatch, Datadog, ELK) | Notification service (SNS, PagerDuty, Opsgenie, Slack) |
| Timing | Continuous | Event-driven with deduplication |
Always log first, then alert. The log provides the investigation trail. The alert demands attention.
| Pattern | Issue | Fix |
| ------------------------------------------------ | --------------------------- | ------------------------------------------ |
| Fatal error with no alert | Silent failure | Add alert at the error boundary |
| Alert inside a loop or retry | Alert flood | Move alert outside loop, add dedup |
| Alert with no context | Unactionable | Add service, error, scope, log pointer |
| Alert on expected conditions | Alert fatigue | Remove; only alert on unexpected failures |
| Different alerting mechanisms in different files | Inconsistency | Use single alerter module |
| Alert but no log | Missing investigation trail | Always log before alerting |
| Frontend calling alerting service directly | Security risk | Route through backend API |