release-manager/SKILL.md
Manages post-Gate-2 release activities with Agile V rigor. Rollout plans, rollback procedures, sign-off checklists. Use after Human Gate 2 for production deployment.
npx skillsauth add agile-v/agile_v_skills release-managerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You operate after Human Gate 2 (Acceptance). Goal: Controlled Production Deployment.
Artifacts verified → approved → safe deployment to production with traceability + reversibility.
Position: Stage 1-4 → [Gate 2] → Stage 5: Acceptance → RELEASE (You) → OPERATE (observability-planner) Checkpoint Type: Human-Action (physical deployment, prod infrastructure, user-facing release)
Traceability: Release notes cite REQ-XXXX · Rollout steps cite TC-XXXX · Rollback triggers are measurable · Post-release checks map to REQ Done Criteria
# Release Plan: Cycle N to Production
## Summary
**Cycle:** CN · **Version:** v2.3.0 · **Date:** [Target] · **Type:** Major/Minor/Patch/Hotfix
**Strategy:** [Big Bang/Phased/Canary/Blue-Green]
## Requirements in Scope
| REQ-ID | Feature | Priority | Verified |
|---|---|---|---|
| REQ-0012 | OAuth login | CRITICAL | ✓ (TC-0034, TC-0035) |
| REQ-0015 | Dashboard perf | HIGH | ✓ (TC-0042) |
**Total:** [N] | **Critical:** [X] | **High:** [Y]
## Pre-Release Checklist
**Gate 2 Complete:** [ ] All REQs verified (VALIDATION_SUMMARY.md) · [ ] No CRITICAL/MAJOR defects · [ ] ATM complete · [ ] VSR signed
**Infrastructure:** [ ] Env provisioned · [ ] DB migrations tested · [ ] Secrets rotated · [ ] Monitoring configured (observability-planner) · [ ] Alerts active
**Artifacts:** [ ] Build from verified code (Git SHA) · [ ] Signed (checksum) · [ ] Rollback ready (prev version)
**Approvals:** [ ] Product Owner · [ ] Eng Lead · [ ] Security/Compliance · [ ] Business stakeholder
**Communication:** [ ] Release notes drafted · [ ] Customer comm ready · [ ] Internal notified · [ ] On-call confirmed
## Deployment Window
| Window | Start | End | Availability Target | Rationale |
|---|---|---|---|---|
| Primary | [DateTime TZ] | [DateTime TZ] | 99.9% | Low-traffic period |
| Rollback | [DateTime TZ] | [DateTime TZ] | N/A | If issues |
**Freeze Period:** [Dates] — No deploys except critical hotfixes
# Release Notes: Version X.Y.Z
## What's New
**New Features:** · [Feature] (REQ-XXXX): [User benefit + how to use]
**Improvements:** · [What improved] (REQ-YYYY): [UX impact]
**Bug Fixes:** · [What fixed] (REQ-ZZZZ): [User impact]
**Security:** · [High-level only, no exploit details] (REQ-AAAA)
**Technical:** · [Performance, compatibility] (REQ-BBBB)
## Breaking Changes
[API/data/workflow changes] · [Change] (REQ-CCCC): [What breaks] → [Migration path]
## Deprecated
[Feature] (REQ-DDDD): Deprecated vX.Y.Z, removed vX+1.0.0 → [Alternative]
## Known Issues
[Issue] (CR-XXXX): [Workaround if any]
## Upgrade Instructions
1. Backup DB · 2. Run migration · 3. Restart services
## Rollback Instructions
1. Stop new version · 2. Restore backup · 3. Redeploy prev version
## Traceability
**Cycle:** CN · **Git:** [SHA] · **Build:** BUILD_MANIFEST.md · **Validation:** VALIDATION_SUMMARY.md · **ATM:** ATM.md
Tone: User-facing, non-technical. Benefits, not implementation.
## Rollout: Phased Canary
### Phase 1: 1% Traffic (1 hour)
**Validation Gates:** [ ] Error rate <0.5% (baseline 0.2%) · [ ] p95 latency <500ms (baseline 300ms) · [ ] No CRITICAL alerts
**Monitoring:** `sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))` · `histogram_quantile(0.95, ...)`
**Rollback Trigger:** Error >1% OR latency >1s OR CRITICAL alert
**Decision:** Gates pass → Phase 2 | Fail → Rollback
### Phase 2: 10% (2h) → Phase 3: 50% (4h) → Phase 4: 100% (24h)
[Same validation structure]
**Alt Strategies:** Big Bang (100% immediate, high risk) · Blue-Green (parallel, switch 100%) · Feature Flags (deploy code, enable progressively)
## Rollback Plan
### Triggers (Measurable)
- Error rate >1% for >5 min · p95 latency >1s for >5 min · CRITICAL alert (DB failure, OOM, security breach) · Service unavailable · Data integrity violation
### Procedure (Target: <10 min)
1. **Stop Deploy** (if in-progress): `kubectl rollout pause deployment/app`
2. **Revert:** `kubectl rollout undo deployment/app` → Verify: `kubectl rollout status`
3. **Verify Success:** [ ] Error <0.5% · [ ] Latency <500ms · [ ] No CRITICAL alerts · [ ] Health check `/health` 200
4. **Notify:** "Deployment vX.Y.Z rolled back due to [trigger]. Service restored."
5. **Root Cause:** Log CAPA-XXXX · Investigate · Generate CR-XXXX if REQ violated
### Data Rollback (if DB migrations)
**Backup:** [Location, timestamp] · **Script:** [Link] · **Procedure:** Stop app · Restore backup · Restart prev version · Verify integrity
**Rule:** Data rollback extends window to [X] minutes
### Post-Rollback
[ ] Update RISK_REGISTER.md · [ ] CAPA_LOG.md (root cause + action) · [ ] CR-XXXX if REQ gap · [ ] Post-mortem <48h
## Post-Release Validation
**Functional:** For each REQ in scope:
[ ] REQ-XXXX: [Validation: smoke test, manual, monitoring query]
**Performance:** [ ] Dashboard TTI ≤3s (REQ-0015, PERF-0001) · [ ] API p95 <500ms (REQ-0022)
**Accessibility:** [ ] axe DevTools 0 violations (A11Y REQs)
**Security:** [ ] SSL valid · [ ] Security headers (CSP, HSTS) · [ ] No secrets in client code
**Business:** [ ] Conversion within 5% baseline (first 24h) · [ ] Engagement stable (DAU, session duration)
**Monitoring:** [ ] Dashboards active (observability-planner) · [ ] Alerts firing correctly · [ ] On-call ready
**Sign-Off:** [ ] Eng Lead (functional) · [ ] PO (business metrics) · [ ] Release Manager (monitoring)
**Status:** PASS/FAIL · **Date:** [Date] · **Next Review:** [24/48h]
## INC-XXXX: [Title]
**Severity:** CRITICAL/MAJOR · **Detected:** [DateTime] · **Resolved:** [DateTime] · **Duration:** [15 min]
**Impact:** [Checkout unavailable, 500 users]
**Root Cause:** [N+1 query → DB timeout]
**Affected REQ:** REQ-XXXX
**Resolution:** [Rollback; hotfix deployed]
**Follow-Up:**
- CAPA-XXXX: [Add integration test for timeout]
- CR-XXXX: [Update REQ-0020: specify retry behavior]
- RISK-XXXX: [Update RISK_REGISTER: external API timeout risk]
Feed to next cycle: All CRs from incidents → input to Cycle N+1 planning
Release Plan specifies which REQs from which cycles included.
As Human-Action checkpoint:
Do not automate beyond approved scope. Production deployments require human approval at each phase.
Produce:
"Verified" becomes "Deployed" with same rigor as pre-production.
development
The Verification Agent — challenges Build Agent artifacts via independent verification. Executes tests against artifacts. Use to audit code, schematics, or firmware against requirements.
development
# Skill: system-understanding-agent ## Purpose Use this skill when Agile V is applied to an existing codebase, documentation set, or knowledge base. The skill consumes Understand Anything outputs and creates a concise, reviewable system overview that gives agents sufficient context before modifying code. This is **Gate 0** of the integrated Agile V lifecycle. No requirements should be generated, and no code should be built, until this skill has run and the system overview has been reviewed.
development
# Skill: regression-selection-agent ## Purpose Select and prioritize regression tests based on the impact map and graph dependency relationships. This skill ensures that existing tests are identified, prioritized, and run after a change, and that gaps in test coverage are flagged before the Red Team step. --- ## Trigger conditions Use this skill when: - Existing behavior must not break (regression risk). - An impact map is available. - The change affects shared modules, services, or APIs.
development
# Skill: impact-analysis-agent ## Purpose Identify the likely impact of a proposed change before implementation. This skill maps the change request to graph nodes, identifies affected files, functions, APIs, and tests, and produces a reviewable impact map that gates the Build Agent's context. --- ## Trigger conditions Use this skill when: - A change request targets an existing system. - The change could affect multiple files or modules. - Regression risk exists (the change touches shared c