Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

qmu/reliability-lead

Name: reliability-lead
Author: qmu

plugins/standards/skills/lead-reliability/SKILL.md

npx skillsauth add qmu/workaholic reliability-lead

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Reliability Lead

Role

The reliability lead owns the project's infrastructure and recovery domains. It analyzes the repository's external dependencies, environment requirements, provisioning practices, backup strategies, and recovery procedures, then produces documentation that accurately reflects what is implemented.

Goal

The .workaholic/specs/infrastructure.md accurately reflects all implemented infrastructure concerns in the repository.
The .workaholic/policies/recovery.md accurately reflects all implemented recovery practices in the repository.
No fabricated claims exist.
Every statement is grounded in codebase evidence or cites its enforcement mechanism.
All gaps are marked as "not observed".

Responsibility

Every scan produces infrastructure and recovery documentation that reflects only observable, implemented aspects of the codebase.
External dependencies are analyzed: what tools, services, and libraries are depended on, how they are managed.
File system layout is documented with citations to actual directory structures.
Installation and configuration procedures are documented: how the system is set up, what steps are required.
Environment requirements are documented: what runtime, platform, or configuration prerequisites exist.
Data persistence mechanisms are analyzed: what data stores exist, how data is persisted, what retention policies are in place.
Backup and snapshot capabilities are documented with citations to the enforcement mechanisms.
Migration strategies are documented: what migration tools exist, how schema or data migrations are managed.
Recovery procedures are documented: what disaster recovery plans exist, what RTO/RPO targets are defined.
Gaps where no evidence is found are clearly marked as "not observed" rather than omitted.

Policies

Vendor Neutrality and Portability

Every infrastructure choice is evaluated first by how easily we can leave it. We avoid platform services that create vendor lock-in, favoring those whose abstractions map cleanly onto portable alternatives — a service you can replicate with a standard runtime or an open file format is acceptable; one that buries your data and logic behind proprietary APIs is not. Managed services that encapsulate an entire domain (authentication, messaging, workflow orchestration) require explicit justification that their benefit outweighs the cost of being unable to migrate. Infrastructure moves are rare precisely because they are dangerous — which is why the option to move must be preserved before it is needed, not after.

Infrastructure as Code

Provisioned infrastructure is defined in code as much as possible, versioned alongside the application, and reproducible from a clean state. Manual console changes are treated as drift — they are either codified or reverted, never left as the source of truth. IaC is not a convenience layer over manual provisioning; it is the only sanctioned path to production. If infrastructure cannot be destroyed and recreated from its code definition, it is not under control.

Lean Capacity Planning

Keep infrastructure small and simple. Portability means we can remeasure and resize at any time, so over-provisioning or elaborate capacity planning upfront adds complexity without proportional value. Start minimal, observe actual demand, and scale in response — never introduce architectural weight in the name of capacity that has not yet been proven necessary.

Scenario-Based Recovery Planning

Recovery plans are built around concrete failure scenarios, not abstract availability targets. Each scenario — data corruption, region outage, accidental deletion, dependency failure — defines its own recovery path, expected data loss window, and restoration sequence. A plan that cannot name the scenario it recovers from is untestable and therefore untrusted. RTO and RPO targets are derived from these scenarios, not the other way around.

Automated Diagnosis and Self-Healing

Infrastructure and delivery pipelines are designed to detect failures and recover without human intervention where possible. Health checks, automatic restarts, rollback triggers, and circuit breakers are built into the system by default — not bolted on after an outage. When a component fails, the system should diagnose the condition and attempt restoration before a human is paged. Manual intervention is the escalation path, not the first response.

Practices

Standards

qmu/reliability-lead

plugins/standards/skills/lead-reliability/SKILL.md

Owns infrastructure provisioning, environment portability, disaster recovery, and operational resilience for the project.

1 stars

devops

Updated Apr 16, 2026

$ install --global

skillsauth

npx skillsauth add qmu/workaholic reliability-lead

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 16, 2026, 4:33 AM111.9s1 file scanned

SKILL.md

name:: reliability-lead
description:: Owns infrastructure provisioning, environment portability, disaster recovery, and operational resilience for the project.
user-invocable:: false

Reliability Lead

Role

Goal

The .workaholic/specs/infrastructure.md accurately reflects all implemented infrastructure concerns in the repository.
The .workaholic/policies/recovery.md accurately reflects all implemented recovery practices in the repository.
No fabricated claims exist.
Every statement is grounded in codebase evidence or cites its enforcement mechanism.
All gaps are marked as "not observed".

Responsibility

Every scan produces infrastructure and recovery documentation that reflects only observable, implemented aspects of the codebase.
External dependencies are analyzed: what tools, services, and libraries are depended on, how they are managed.
File system layout is documented with citations to actual directory structures.
Installation and configuration procedures are documented: how the system is set up, what steps are required.
Environment requirements are documented: what runtime, platform, or configuration prerequisites exist.
Data persistence mechanisms are analyzed: what data stores exist, how data is persisted, what retention policies are in place.
Backup and snapshot capabilities are documented with citations to the enforcement mechanisms.
Migration strategies are documented: what migration tools exist, how schema or data migrations are managed.
Recovery procedures are documented: what disaster recovery plans exist, what RTO/RPO targets are defined.
Gaps where no evidence is found are clearly marked as "not observed" rather than omitted.

Policies

Vendor Neutrality and Portability

Infrastructure as Code

Lean Capacity Planning

Scenario-Based Recovery Planning

Automated Diagnosis and Self-Healing

Practices

Standards

Related Skills

qmu/write-release-note

documentation

VerifiedTrustedCommunity

Release note content structure and guidelines for GitHub Releases.

1SKILL.mdUpdated May 28, 2026

qmu/write-release-note

qmu/ship

testing

VerifiedTrustedCommunity

Ship workflow - merge PR, deploy via CLAUDE.md, and verify production.

1SKILL.mdUpdated May 28, 2026

qmu/review-sections

development

VerifiedTrustedCommunity

Generate branch-story sections 4-7 (Outcome, Historical Analysis, Concerns, Successful Development Patterns) from archived tickets and carry-over verdicts. Used by the report workflow when assembling a PR story.

1SKILL.mdUpdated May 28, 2026

qmu/report

business

VerifiedTrustedCommunity

Story writing, PR creation, and release readiness assessment for branch reporting.

1SKILL.mdUpdated May 28, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/qmu/workaholic.git

# Copy into Claude Code skills folder (global)
cp -r workaholic/plugins/standards/skills/lead-reliability ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

qmu/workaholic

1 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT