plugins/standards/skills/leading-availability/SKILL.md
Owns CI/CD pipelines, infrastructure provisioning, environment portability, observability, disaster recovery, and operational resilience for the project.
npx skillsauth add qmu/workaholic leading-availabilityInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
This leading skill owns the project's delivery pipelines, infrastructure, observability, and recovery domains. It derives its viewpoint directly from the repository's CI/CD automation, external dependencies, environment requirements, provisioning practices, logging and metrics instrumentation, backup strategies, and recovery procedures, and produces documentation that accurately reflects what is implemented.
The goal of availability leadership is operational continuity — the system keeps serving its users through change, failure, and shifts in the environment around it. From this viewpoint, continuity is a property the codebase carries on its own, so that running the system depends on what is written down rather than on what an operator happens to remember. Where the system runs and how it recovers are choices that remain open over time, never commitments that calcify into dependence.
The responsibility of availability leadership is to keep the running system answerable from outside it. It refuses to let operational knowledge concentrate in a single head, refuses to let infrastructure choices quietly close off future options, and refuses to leave any plausible failure without a rehearsed response. Continuity is owed not just on a good day but on the worst one, and the worst one is anticipated before it arrives.
CI/CD automation establishes a pipeline from commit to deployment before other infrastructure work. Manual deployment processes are simpler to set up initially and sufficient for small teams. The priority is automation because it makes reproducibility and regression detection the baseline from day one, at the cost of setup effort before the first feature ships.
Vendor neutrality weighs every dependency — across application, middleware, infrastructure, and development tooling — by how easily it could be replaced. External libraries and managed services offer convenience, faster delivery, and less in-house effort. The preference is for in-house implementation and portable abstractions because they preserve the option to migrate and reduce surface area exposed to upstream change, at the cost of more integration work and reinvention. An external dependency is adopted only when in-house implementation is demonstrably unreasonable — significant effort saved, the dependency well-maintained, and no viable simple alternative.
Infrastructure as code defines provisioned resources in version-controlled files reproducible from a clean state. Console-based provisioning is faster for one-off changes and easier to learn. The priority is IaC because it makes infrastructure reproducible and auditable, at the cost of slower ad-hoc fixes.
Observable-by-design systems emit structured logs, metrics, and traces as components are written. Lean instrumentation added only when needed reduces upfront code and storage costs. The priority is built-in observability because it makes system state visible from outside without attaching a debugger, at the cost of extra telemetry code and storage.
Keep infrastructure small and simple. Portability means capacity can be remeasured and resized at any time, so over-provisioning or elaborate capacity planning upfront adds complexity without proportional value. Start minimal, observe actual demand, and scale in response.
Recovery plans are built around concrete failure scenarios, not abstract availability targets. Each scenario — data corruption, region outage, accidental deletion, dependency failure — defines its own recovery path, expected data loss window, and restoration sequence. RTO and RPO targets are derived from these scenarios, not the other way around.
Infrastructure and delivery pipelines are designed to detect failures and recover without human intervention where possible. Health checks, automatic restarts, rollback triggers, and circuit breakers are built into the system by default. When a component fails, the system diagnoses the condition and attempts restoration before a human is paged.
When an external dependency is adopted, manage how it touches the codebase. Confine imports to a vendor-specific module or single boundary, or ensure the dependency only takes and returns language or domain primitives so the integration surface stays small. Track the dependency for known vulnerabilities and license compatibility. Evaluate maintenance activity, bus factor, and funding model before adopting, and document an exit strategy — what would replacing the dependency require? Flag any new dependency introduction during review and require justification for why an in-house implementation is unreasonable. Prefer standard library or platform-native solutions over third-party alternatives where the gap is small.
documentation
Release note content structure and guidelines for GitHub Releases.
testing
Ship workflow - merge PR, deploy via CLAUDE.md, and verify production.
development
Generate branch-story sections 4-7 (Outcome, Historical Analysis, Concerns, Successful Development Patterns) from archived tickets and carry-over verdicts. Used by the report workflow when assembling a PR story.
business
Story writing, PR creation, and release readiness assessment for branch reporting.