skills/skills/dlc-audit/SKILL.md
AI-DLC compliance audit — deep process adherence assessment with foundation bootstrapping, 0-10 numeric scoring, and maturity rating
Deep assessment of AI-DLC (AI-Driven Development Lifecycle) process adherence. Goes beyond document existence checks to evaluate whether practices are actually being followed. Scores 9 dimensions on a 0-10 numeric scale with letter-grade equivalents and an overall maturity rating.
/motherhen checks "do the docs exist?"; /dlc-audit checks "are we actually following the practices?"
User invokes /dlc-audit with an optional action argument.
| Action | Description |
|--------|-------------|
| assess (default) | Run the full 9-dimension compliance assessment |
| quick | Lightweight assessment — reads doc headers, git metadata, and file existence only. Skips deep content analysis. Lower token cost, faster results. |
| init | Detect missing foundational docs and create skeleton templates |
| full | Run init first (create anything missing), then assess. Prompts for confirmation before creating any files. |
| compliance | Run assess, then map scores to EU AI Act requirements |
Parse $ARGUMENTS to determine the action. If empty or unrecognized, default to assess.
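The dispatch rule above can be sketched as a small function. This is only an illustration: `parse_action` and `VALID_ACTIONS` are hypothetical names, and treating the argument as case-insensitive is an assumption the skill text does not state.

```python
# Hypothetical helper; the names are illustrative and not part of the skill.
VALID_ACTIONS = ("assess", "quick", "init", "full", "compliance")


def parse_action(arguments: str) -> str:
    """Return the requested action, falling back to 'assess'."""
    # Assumption: matching is case-insensitive and only the first token counts.
    tokens = arguments.strip().lower().split()
    # Empty input and unrecognized input both default to assess.
    if tokens and tokens[0] in VALID_ACTIONS:
        return tokens[0]
    return "assess"
```

Calling `parse_action("")` or `parse_action("bogus")` yields `"assess"`, which mirrors the default described above.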
When action is quick, limit Phase 1 to file existence checks and git metadata only; do not read file contents beyond the first 10 lines (enough for frontmatter/headers). In Phase 3, score based on document presence, git history patterns, and structural signals rather than deep content analysis. This reduces token consumption by ~70% at the cost of scoring precision (±1 point per dimension). Report the dashboard with a note: "⚡ Quick mode: scores are approximate. Run full `assess` for precise scoring."
Read project state files to understand the current repo:
Project identity (read if they exist):
- CLAUDE.md: architecture, conventions, deployment state
- CHANGELOG.md: version history
- pyproject.toml or package.json: project metadata
- SECURITY.md: security controls

Git state (run these):
# Recent commits (bolt cadence visibility)
git log --oneline -20
# Working tree status
git status --short
# Latest tag
git describe --tags --abbrev=0 2>/dev/null
# Captain's log recency
ls -lt docs/captains_log/ captain-logs/ 2>/dev/null | head -5
Check for the 14 foundational documents:
Search for each document at multiple possible paths. The document is considered FOUND if it exists at ANY of the listed paths (projects may organize differently):
| # | Document | Search Paths (check all, first match wins) |
|---|----------|---------------------------------------------|
| 1 | Requirements | docs/REQUIREMENTS.md, REQUIREMENTS.md |
| 2 | Traceability Matrix | docs/TRACEABILITY-MATRIX.md, TRACEABILITY-MATRIX.md |
| 3 | User Stories | docs/USER-STORIES.md, USER-STORIES.md |
| 4 | AI Context File | CLAUDE.md |
| 5 | Security Controls | SECURITY.md, docs/SECURITY.md |
| 6 | PM Framework | docs/pm/FRAMEWORK.md, docs/PM-FRAMEWORK.md, PM-FRAMEWORK.md |
| 7 | Solo+AI Workflow Guide | docs/standards/SOLO-AI-WORKFLOW-GUIDE.md, docs/SOLO-AI-WORKFLOW-GUIDE.md, SOLO-AI-WORKFLOW-GUIDE.md |
| 8 | CI/CD Deployment Proposal | docs/standards/CICD-DEPLOYMENT-PROPOSAL.md, docs/CICD-DEPLOYMENT-PROPOSAL.md, CICD-DEPLOYMENT-PROPOSAL.md |
| 9 | Multi-Developer Guide | docs/standards/MULTI-DEVELOPER-GUIDE.md, docs/MULTI-DEVELOPER-GUIDE.md, MULTI-DEVELOPER-GUIDE.md |
| 10 | Infrastructure Playbook | docs/standards/INFRASTRUCTURE-PLAYBOOK.md, docs/INFRASTRUCTURE-PLAYBOOK.md, INFRASTRUCTURE-PLAYBOOK.md |
| 11 | Cost Management Guide | docs/standards/COST-MANAGEMENT-GUIDE.md, docs/COST-MANAGEMENT-GUIDE.md, COST-MANAGEMENT-GUIDE.md |
| 12 | Security Review Protocol | docs/standards/SECURITY-REVIEW-PROTOCOL.md, docs/SECURITY-REVIEW-PROTOCOL.md, SECURITY-REVIEW-PROTOCOL.md |
| 13 | Ops Readiness Checklist | docs/standards/OPS-READINESS-CHECKLIST.md, docs/OPS-READINESS-CHECKLIST.md, OPS-READINESS-CHECKLIST.md |
| 14 | AI-DLC Case Study | docs/standards/CALLHERO-AI-DLC-CASE-STUDY.md, docs/AI-DLC-CASE-STUDY.md, AI-DLC-CASE-STUDY.md |
Use Glob to check existence of each. Record which exist (and at which path) and which are missing. For non-CallHero repos, check for AI-DLC-CASE-STUDY.md (generic) instead of CALLHERO-AI-DLC-CASE-STUDY.md. Always count foundation as /14.
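The "first match wins" lookup can be sketched as follows. This is a minimal illustration under stated assumptions: `locate_documents` and `SEARCH_PATHS` are hypothetical names, only two rows of the table are reproduced, and the skill itself uses Glob rather than this code.

```python
from pathlib import Path

# Two rows copied from the table above; the other rows follow the same shape.
SEARCH_PATHS = {
    "Requirements": ["docs/REQUIREMENTS.md", "REQUIREMENTS.md"],
    "Security Controls": ["SECURITY.md", "docs/SECURITY.md"],
}


def locate_documents(repo_root, search_paths=SEARCH_PATHS):
    """Map each document name to the first existing path, or None if missing."""
    root = Path(repo_root)
    found = {}
    for name, candidates in search_paths.items():
        # First match wins: stop at the first candidate that exists as a file.
        found[name] = next((p for p in candidates if (root / p).is_file()), None)
    return found
```

A `None` value marks a document as MISSING; any other value is the path to record in the Foundation Status table.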
Codebase health baseline (skip for docs-only repos):
If the repo contains application code (not just documentation), assess codebase health as an AI-readiness indicator.
Code execution warning: The test collection and dependency audit commands below import modules and make network calls. On untrusted repositories, these commands could execute arbitrary code. For untrusted repos, skip test collection and dep scanning — use file-based counting only.
# Test suite existence and count (SAFE: file-based counting, no code execution)
# Each count runs on its own: chaining with || never falls through, because wc -l always exits 0
find tests/ \( -name "test_*.py" -o -name "*_test.py" \) 2>/dev/null | wc -l
find . -not -path './node_modules/*' \( -name "*.test.ts" -o -name "*.test.js" -o -name "*.spec.ts" \) 2>/dev/null | wc -l
find . -name "*_test.go" 2>/dev/null | wc -l
# Lint/type-check config existence
ls .pre-commit-config.yaml .eslintrc* tsconfig.json pyproject.toml mypy.ini .golangci.yml 2>/dev/null
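# Dependency vulnerability quick check (requires network; add timeout)
# Pick the scanner explicitly: a pipeline's exit status is that of tail/head, so || fallbacks would never run
if command -v pip-audit >/dev/null 2>&1; then
  timeout 30 pip-audit --desc 2>/dev/null | tail -5
elif command -v npm >/dev/null 2>&1; then
  timeout 30 npm audit --json 2>/dev/null | head -20
else
  echo "No dep scanner found"
fi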
# Code complexity indicator (file count — excludes generated/vendored dirs)
find . -not -path './.git/*' -not -path './node_modules/*' -not -path './vendor/*' \
-not -path './dist/*' -not -path './build/*' -not -path './.venv/*' -not -path './__pycache__/*' \
\( -name '*.py' -o -name '*.ts' -o -name '*.js' -o -name '*.go' \) | wc -l
Record results as a health snapshot: test file count, lint config present (Y/N), known vulnerabilities count, codebase size. This feeds into D1 scoring.
Skip this phase if action is assess or quick.
If any of the 14 foundational documents are MISSING, list them with their target paths and ask for user confirmation before creating any files. Wait for explicit approval.
Once confirmed, create a skeleton template at the first path listed for that document (the preferred location). These templates contain structure and TODO markers — never generate fake content.
Each template should:
- Include `<!-- TODO: ... -->` instructions explaining what to fill in

If Requirements is missing:
Create docs/REQUIREMENTS.md with sections: Purpose, Functional Requirements (REQ-001 template), Non-Functional Requirements (NFR-001 template), Security Requirements (REQ-SEC-001 template). Include a note about ID numbering convention.
If Traceability Matrix is missing:
Create docs/TRACEABILITY-MATRIX.md with a table template: | REQ ID | User Story ID | Spec Section | Code Module | Test ID | Deploy Version | Status | and instructions for maintaining bidirectional traceability.
If User Stories is missing:
Create docs/USER-STORIES.md with the format: As a [role], I want [capability], so that [benefit] plus acceptance criteria template (Given/When/Then).
If AI Context File is missing:
Create CLAUDE.md with sections: What This Project Does, Architecture, Project Structure, Dev Environment, Conventions, Current Status. Note that this is the persistent AI context file.
If Security Controls is missing:
Create SECURITY.md with sections: Security Model, Authentication, Authorization, Data Protection, Vulnerability Reporting, Audit History.
If PM Framework is missing:
Create docs/pm/FRAMEWORK.md (and directory) with sections: Sprint Model (bolt cadence), Sizing Convention (S/M/L/XL), Backlog Management, Blocker Tracking, Retrospectives.
If Solo+AI Workflow Guide is missing:
Create docs/standards/SOLO-AI-WORKFLOW-GUIDE.md with sections: Bolt-Driven Development, Five Questions Pattern, Context Hygiene, Captain's Logs, Session Lifecycle.
If CI/CD Deployment Proposal is missing:
Create docs/standards/CICD-DEPLOYMENT-PROPOSAL.md with sections: Current State (manual deploy), Proposed Pipeline, Environments, Rollback Strategy, Prerequisites.
If Multi-Developer Guide is missing:
Create docs/standards/MULTI-DEVELOPER-GUIDE.md with sections: Branch Strategy, Code Review Process, Shared Context Management, Onboarding Checklist.
If Infrastructure Playbook is missing:
Create docs/standards/INFRASTRUCTURE-PLAYBOOK.md with sections: Cloud Provider, IaC Tooling, Networking, IAM Patterns, Monitoring, Disaster Recovery.
If Cost Management Guide is missing:
Create docs/standards/COST-MANAGEMENT-GUIDE.md with sections: Budget, Cost Monitoring, Alert Thresholds, Kill Switch, Cost Review Cadence.
If Security Review Protocol is missing:
Create docs/standards/SECURITY-REVIEW-PROTOCOL.md with sections: Review Cadence, Five-Persona Review Process, Finding Severity Levels, Disposition Workflow, Review Archive.
If Ops Readiness Checklist is missing:
Create docs/standards/OPS-READINESS-CHECKLIST.md with sections: Monitoring, Alerting, Runbooks, Incident Response, Backup & Recovery, scored checklist format.
If Case Study is missing (CallHero repos only):
Create docs/standards/CALLHERO-AI-DLC-CASE-STUDY.md with sections: Executive Summary, Project Overview, AI-DLC Phase Mapping, Shortcomings, Lessons Learned. For non-CallHero repos, create a generic docs/AI-DLC-CASE-STUDY.md assessment template instead.
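The creation step described above (structure and TODO markers only, written to the preferred path) can be sketched as below. This is a hedged illustration: `create_skeleton` is a hypothetical helper, and the exact heading layout and TODO wording are assumptions, not the skill's actual templates.

```python
from pathlib import Path


def create_skeleton(repo_root, relative_path, title, sections):
    """Write a TODO-only skeleton; return False if the file already exists."""
    target = Path(repo_root) / relative_path
    # Create parent directories such as docs/pm/ when they are absent.
    target.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"# {title}", ""]
    for section in sections:
        # Structure plus a TODO marker only; no generated content.
        lines += [f"## {section}", "", f"<!-- TODO: fill in {section} -->", ""]
    try:
        # Mode "x" raises if the file exists, so existing files are never overwritten.
        with open(target, "x", encoding="utf-8") as handle:
            handle.write("\n".join(lines))
    except FileExistsError:
        return False
    return True
```

Opening with mode `"x"` makes the no-overwrite guarantee a property of the write itself rather than of a separate existence check.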
After creating templates, report what was created:
### Foundation Bootstrap
Created [N] foundational documents:
- docs/REQUIREMENTS.md (skeleton)
- docs/standards/COST-MANAGEMENT-GUIDE.md (skeleton)
- ...
All documents contain TODO markers. Fill in project-specific content before running `assess`.
If all documents already exist, report:
### Foundation Status: Complete
All foundational documents exist. Proceeding to assessment.
Skip this phase if action is init.
Evaluate 9 dimensions of AI-DLC process adherence. For each dimension, read the relevant files, run checks, and assign a numeric score (0-10) with a letter-grade equivalent.
| Score | Letter | Label |
|-------|--------|-------|
| 9-10 | A | Exemplary |
| 7-8 | B | Mature |
| 5-6 | C | Developing |
| 3-4 | D | Minimal |
| 0-2 | F | Not Started |
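The score-to-letter mapping can be expressed as a lookup over band floors. A minimal sketch, assuming integer dimension scores; `letter_grade` is a hypothetical name.

```python
def letter_grade(score: int) -> tuple[str, str]:
    """Map an integer 0-10 dimension score to (letter, label)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    # Each entry is (lowest score in the band, letter, label), highest band first.
    bands = [
        (9, "A", "Exemplary"),
        (7, "B", "Mature"),
        (5, "C", "Developing"),
        (3, "D", "Minimal"),
        (0, "F", "Not Started"),
    ]
    for floor, letter, label in bands:
        if score >= floor:
            return letter, label
```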
What it measures: Project bootstrap quality, context file completeness, governance setup, repository structure, and codebase health readiness for AI-assisted development.
What to check:
- CLAUDE.md: does it cover project identity, architecture, conventions, and current state?

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No context file, or generic/boilerplate. No governance model. No consistent repo structure. |
| 3-4 | Context file exists but incomplete (missing conventions, terminology, or architecture). Governance mentioned but not detailed. Basic repo structure. |
| 5-6 | Context file covers project identity, structure, and conventions. Governance model selected and documented. Repo structure consistent. Linting configured. |
| 7-8 | Context file is comprehensive, specific, and actionable. Pre-commit hooks installed. CI pipeline runs lint, test, and security. PM framework initialized. For existing codebases: codebase health baseline documented (test coverage, lint status, known vulnerabilities). |
| 9-10 | Context file is a living document updated regularly. All 14 templates initialized. CI/CD skeleton passes. Context file enables orientation in under 5 minutes. Governance model reviewed and appropriate. Codebase health assessment complete: technical debt documented, dependency vulnerabilities at zero critical, lint passing clean. |
What it measures: Requirement completeness, architecture decision quality, and initial security review.
What to check:
- CLAUDE.md: are architecture decisions documented with rationale and trade-offs?

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No formal requirements. Architecture implicit or undocumented. No ADRs. |
| 3-4 | Requirements exist but lack structure (no IDs, no priority, no traceability). Architecture described informally. |
| 5-6 | Requirements have unique IDs (REQ-001). At least one ADR exists. Initial threat model drafted. Technology stack documented. |
| 7-8 | All requirements numbered, prioritized, categorized. Multiple ADRs with trade-off analysis. Threat model covers major components. Security requirements exist. |
| 9-10 | Requirements complete, testable, stakeholder-approved. ADRs reference industry standards. Threat model is component-level. Human sign-off recorded. |
What it measures: User story quality, technical specification depth, Five Questions usage, and traceability matrix initialization.
What to check:
- .olympus/workflows/, docs/intents/, or equivalent artifact hierarchy directories

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No user stories. No technical specification. No traceability matrix. |
| 3-4 | User stories exist but lack acceptance criteria. Spec is informal. Matrix missing or stub. |
| 5-6 | User stories have acceptance criteria. Tech spec covers main components. Matrix links REQ to Story to Spec. Five Questions used on some features. |
| 7-8 | All stories satisfy INVEST criteria. Tech spec includes API contracts, data models, error handling. Matrix has no orphan rows. Five Questions used on all feature areas. |
| 9-10 | Specs precise enough for unambiguous construction. Validation gates documented with findings. All edge cases and abuse cases specified. Human sign-off on PRD and tech spec. Artifact hierarchy (IDEA → INTENT → UNIT → BOLT) elaborated with conformance scores >= 90%. Dependency graph generated. |
What it measures: Bolt discipline, test-paired development, captain's log practice, code quality, and AI-generated code verification.
What to check:
- git log --oneline -50: is bolt cadence visible? (commit messages referencing bolts, sprints, or structured work units)
- git log --oneline --name-only -20: do code commits also touch test files?
- find tests/ -name "test_*.py" -o -name "*_test.py" 2>/dev/null | wc -l or equivalent
- git log -20 | grep -i "co-authored-by\|ai-generated\|claude\|copilot": are AI contributions tagged? (search full messages, because --oneline prints only subject lines and omits trailers such as Co-Authored-By)

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No bolt structure. Tests absent or afterthought. No captain's logs. Inconsistent code quality. |
| 3-4 | Some bolt structure but inconsistent. Tests exist for some features. Captain's logs sporadic. |
| 5-6 | Consistent bolt workflow (plan, execute, review, retro). Tests paired with code for most bolts. Captain's logs for most bolts. Commit messages follow conventions. |
| 7-8 | Every bolt has a captain's log. Test delta positive every sprint. Bolt metrics tracked. T-shirt sizing used and calibrated. Traceability matrix updated per bolt. AI-generated code is identifiable (Co-Authored-By trailers, commit conventions, or metadata). |
| 9-10 | Exemplary bolt discipline: clear scope, accurate estimation, paired tests, retro insights captured. Test coverage >80%. Commits reference requirement IDs. Context file updated during construction. Zero XL bolts. Multi-agent execution applied where appropriate. The Ascent verification loop consistently followed. Tiered verification policy in place: critical paths get full human review, standard code gets AI-assisted review (/five-persona-review), low-risk changes use automated checks. Review bandwidth allocated by risk tier. |
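One way to quantify the "do code commits also touch test files?" check is to parse git log output into a pairing ratio. This is a sketch under stated assumptions: it expects text produced by `git log --format='@@%h' --name-only -20` (the `@@` marker is an arbitrary choice), `pairing_ratio` is a hypothetical helper, and the filename patterns are taken from the find commands in this document.

```python
import re

# Test-file patterns taken from the find commands in this document.
TEST_FILE = re.compile(
    r"(^|/)(test_[^/]*\.py|[^/]*_test\.py|[^/]*_test\.go"
    r"|[^/]*\.test\.(ts|js)|[^/]*\.spec\.ts)$"
)
CODE_FILE = re.compile(r"\.(py|ts|js|go)$")


def pairing_ratio(log_text: str) -> float:
    """Fraction of code-touching commits that also touch a test file."""
    commits, current = [], None
    for line in log_text.splitlines():
        if line.startswith("@@"):
            # A marker line starts a new commit.
            current = []
            commits.append(current)
        elif line.strip() and current is not None:
            current.append(line.strip())

    def is_code(path):
        # Source files only; test files are counted separately.
        return bool(CODE_FILE.search(path)) and not TEST_FILE.search(path)

    code_commits = [files for files in commits if any(is_code(f) for f in files)]
    if not code_commits:
        return 0.0
    paired = [files for files in code_commits if any(TEST_FILE.search(f) for f in files)]
    return len(paired) / len(code_commits)
```

The ratio is only a signal for the rubric, not a score by itself; how it maps onto the 0-10 bands remains a judgment call.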
What it measures: Five-persona review execution, finding management, OWASP coverage, and security controls.
What to check:
- docs/security/, docs/reviews/, security review docs for dated audit files: when was last review?
- SECURITY.md: are findings tracked with dispositions (fixed, accepted, deferred)?
- git log --since="30 days ago" --oneline -- "*.py" "*.js" "*.ts": code changes since last review?

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No security review conducted. No finding tracking. Default configurations in production. |
| 3-4 | Some security review but informal. Findings noted but not tracked with severity or status. Some controls missing. |
| 5-6 | Five-persona review conducted at least once. Findings have IDs and severity. Critical findings resolved. Basic cloud security controls in place. |
| 7-8 | Five-persona review conducted per phase. All Critical and High findings resolved. OWASP Top 10 checklist complete. IAM audit performed. Encryption verified. Dependency scan passes. |
| 9-10 | Five-persona review integrated into every bolt. Finding lifecycle tracked from discovery to verification. Won't Fix decisions documented with compensating controls. Quarterly re-reviews scheduled. Security findings trend downward. Compliance mapping (NIST, ISO) documented. |
What it measures: Ops readiness checklist score, monitoring, alerting, runbooks, deployment automation, and resilience patterns.
What to check:
Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No ops readiness assessment. No monitoring. Manual deployments. No runbooks. |
| 3-4 | Some monitoring. Semi-automated deployment. Runbooks sketchy or missing. No DLQs or circuit breakers. |
| 5-6 | Ops readiness checklist scored. Health checks on main services. Basic alerting. Automated deployment. Rollback procedure exists. |
| 7-8 | Ops readiness score >=75%. Structured logging with correlation IDs. Alarms on critical paths with linked runbooks. DLQs on async processors. Load testing completed. Rollback tested. |
| 9-10 | Ops readiness score >=90%. Every critical path has health check, metric, alarm, and runbook. Performance meets defined targets (p50, p95, p99). Circuit breakers and retry logic. Zero-downtime deployment with automated smoke tests. Canary testing active. |
What it measures: Budget awareness, cost monitoring, dashboards, kill switches, and ongoing optimization.
What to check:
Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No cost awareness. No budget. No monitoring of spend. No kill switches. |
| 3-4 | Budget exists informally. Cost checked occasionally. No automated alerts. |
| 5-6 | Cost baseline documented. Budget alarms configured. Basic cost dashboard. AI/ML costs tracked. |
| 7-8 | Cost dashboard with per-service breakdown. Budget alarms at 50%, 80%, 100%. Kill switches implemented. Cost-per-transaction tracked. Cost analyst persona findings addressed. |
| 9-10 | Cost management proactive. Kill switches tested and verified. Cost projections documented. Trends reviewed monthly. Decommissioned resources cleaned up. FinOps integrated into bolt planning. |
What it measures: Learning system adoption, context file maintenance, drift detection, retrospective practice, and framework self-improvement.
What to check:
- git log --oneline -- CLAUDE.md: how often is it updated?

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No evolution activities. Context file unchanged since creation. No retrospectives. |
| 3-4 | Occasional retrospectives. Context file updated sporadically. No drift detection. |
| 5-6 | Regular bolt retros. Context file updated with major learnings. Some drift detection (dependency updates). Quarterly security re-review scheduled. |
| 7-8 | Five-phase learning loop active (passive feedback, pattern extraction, preference learning, context injection, agent discovery). Drift detection across infrastructure, config, process, dependencies. Quarterly retros with action items. Patterns documented. |
| 9-10 | Learning system continuous and measurable. Context file accuracy audited quarterly. Decommissioning procedures followed. Metrics dashboard current with positive trends. Case study documented. Team demonstrates measurable improvement over time. Automated learning system with 30-day pattern decay, preference tracking, and agent discovery integration. |
What it measures: Whether humans are steering development decisions or just accepting AI output, and whether review ceremony scales with both trust level AND task complexity.
What to check:
- /cost-estimate) feed into gate decisions? METR research found that experienced developers were 19% slower with AI tools on real tasks in mature codebases, so ceremony should increase for high-complexity work regardless of trust level.

Scoring rubric:
| Score | Criteria |
|-------|----------|
| 0-2 | No evidence of human decision-making. AI appears to be running on autopilot. |
| 3-4 | Some human decisions visible but sparse. Most logs read like AI output. Deploys appear automated without review. |
| 5-6 | Human decisions visible in some logs. Security reviews have dispositions. Deploys are human-initiated. Some architecture choices show human judgment. |
| 7-8 | Captain's logs clearly show human voice and decision-making. Security findings human-triaged with rationale. Deploy approval gates in place. Five Questions Pattern used. Architecture decisions have human rationale. Review ceremony scales with trust level. |
| 9-10 | Exemplary human-AI partnership. Logs show clear division of labor. Human overrides documented with rationale. AI suggestions rejected when appropriate. Human owns scope, priorities, and final approval. Evidence of the human teaching the AI (context file improvements from human insight). Trust-adaptive gates implemented with ceremony scaling across BOTH trust level and task complexity: high-complexity or novel-domain bolts trigger elevated oversight regardless of trust level. Learning Paradox embraced. |
To improve scoring consistency, use these reference points:
| Signal | Typical Score Range |
|--------|-------------------|
| Document exists but is a stub/template with TODOs | 2-3 |
| Document exists with real content but is stale (>6 months unchanged) | 3-5 |
| Document exists, has real content, and was updated within 90 days | 5-7 |
| Document exists, has real content, is current, and shows evidence of active use (referenced in commits, logs, or reviews) | 7-9 |
| Practice is documented AND verified (audit trail, review records, measurable outcomes) | 9-10 |
Anchoring principle: A score of 5 means "the practice exists and is followed sometimes." A score of 7 means "the practice is consistently followed with evidence." A score of 9 means "the practice is exemplary with measurable results." When in doubt, score conservatively — it's better to undercount and improve than to overcount and stagnate.
Format results as a compliance report. Present BOTH the numeric score and letter grade for each dimension:
## AI-DLC Compliance Report
### Generated: <date>
### Repository: <repo name from CLAUDE.md or git remote>
### Foundation Status
| # | Document | Status | Path |
|---|----------|--------|------|
| 1 | Requirements | EXISTS / CREATED / MISSING | <actual path found> |
| 2 | Traceability Matrix | EXISTS / CREATED / MISSING | <actual path found> |
| ... | ... | ... | ... |
[N]/14 foundational documents present.
### Process Adherence (9 Dimensions)
| # | Dimension | Score | Grade | Details |
|---|-----------|-------|-------|---------|
| 1 | Foundation & Context | X/10 | A-F | [key criteria met; key criteria missed] |
| 2 | Requirements & Architecture | X/10 | A-F | [key criteria met; key criteria missed] |
| 3 | Specification & Elaboration | X/10 | A-F | [key criteria met; key criteria missed] |
| 4 | Construction Process | X/10 | A-F | [key criteria met; key criteria missed] |
| 5 | Security Posture | X/10 | A-F | [key criteria met; key criteria missed] |
| 6 | Operational Readiness | X/10 | A-F | [key criteria met; key criteria missed] |
| 7 | Cost Management | X/10 | A-F | [key criteria met; key criteria missed] |
| 8 | Evolution & Learning | X/10 | A-F | [key criteria met; key criteria missed] |
| 9 | Human-AI Collaboration | X/10 | A-F | [key criteria met; key criteria missed] |
The Details column should cite specific rubric criteria from the scored band and the next band up. For example: "Context file comprehensive ✓, CI pipeline ✓; missing PM framework for 9+"
**Overall Score: X.X / 10**
**Maturity Rating: [Rating]**
Calculate the overall score as the unweighted average of all nine dimensions:
Overall Score = (D1 + D2 + D3 + D4 + D5 + D6 + D7 + D8 + D9) / 9
| Overall Score | Rating | Description |
|---------------|--------|-------------|
| 0.0 - 2.9 | Foundational | The project lacks essential AI-DLC structure. Start with Phase 0 and build up. |
| 3.0 - 4.9 | Developing | Core framework elements are in place but significant gaps remain. Focus on lowest-scoring dimensions. |
| 5.0 - 6.9 | Operational | The project follows AI-DLC practices consistently. Address remaining gaps to reach maturity. |
| 7.0 - 8.9 | Optimized | The project demonstrates mature, thorough AI-DLC adoption. Continuous improvement is active. |
| 9.0 - 10.0 | Exemplary | The project is a reference implementation. All dimensions are strong and measurable. |
The overall letter grade uses the same boundaries as per-dimension grading for consistency:
| Overall Score | Letter |
|---------------|--------|
| 9.0 - 10.0 | A |
| 7.0 - 8.9 | B |
| 5.0 - 6.9 | C |
| 3.0 - 4.9 | D |
| 0.0 - 2.9 | F |
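The overall score, maturity rating, and letter grade can be computed together. A minimal sketch: `overall_assessment` is a hypothetical name, and rounding to one decimal before banding is an assumption made so that values such as 2.95 land inside a table band.

```python
def overall_assessment(scores):
    """Return (overall, rating, letter) for nine 0-10 dimension scores."""
    if len(scores) != 9:
        raise ValueError("expected exactly nine dimension scores")
    # Unweighted mean; rounding to one decimal is an assumption (see lead-in).
    overall = round(sum(scores) / 9, 1)
    # Each entry is (lowest overall score in the band, rating, letter).
    bands = [
        (9.0, "Exemplary", "A"),
        (7.0, "Optimized", "B"),
        (5.0, "Operational", "C"),
        (3.0, "Developing", "D"),
        (0.0, "Foundational", "F"),
    ]
    for floor, rating, letter in bands:
        if overall >= floor:
            return overall, rating, letter
```

For example, dimension scores of 8, 6, 7, 6, 5, 7, 6, 6, 7 sum to 58, giving 58 / 9 ≈ 6.4, which falls in the Operational band with letter C.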
After the dashboard, list specific actions sorted by urgency. Use the score to determine priority. For each action item, recommend the specific skill that can address it using the Skill-to-Dimension mapping below.
Before recommending skills, check which are actually installed:
# Check for SKILL.md-based skills
ls ~/.claude/skills/*/SKILL.md 2>/dev/null | sed 's|.*/skills/||;s|/SKILL.md||'
Also note which skills are available in the current session's system prompt. When recommending a skill that is NOT installed, append "(not installed — install first)" to the recommendation. Never recommend a skill without indicating whether it's available.
When recommending actions, reference the skill that directly addresses each dimension:
| Dimension | Primary Skills | Usage |
|-----------|---------------|-------|
| D1 Foundation & Context | /dlc-audit init, /init-project, /motherhen | Bootstrap missing docs, scaffold project, check freshness |
| D2 Requirements & Architecture | /arch-audit | Multi-persona architectural review with Mermaid diagrams |
| D3 Specification & Elaboration | /five-persona-review, /cost-estimate | Adversarial spec review, bolt sizing |
| D4 Construction Process | /pm, /captainslog, /bolt-review, /cost-estimate | Sprint management, decision records, end-of-bolt review, sizing |
| D5 Security Posture | /security-audit, /five-persona-review | 9-category security audit, 5-persona adversarial review |
| D6 Operational Readiness | /prodstatus, /budget | Production health dashboard, infrastructure cost analysis |
| D7 Cost Management | /budget init, /budget review, /cost-estimate | Cost baseline, cost monitoring, effort estimation |
| D8 Evolution & Learning | /motherhen, /changelog, /captainslog | Drift detection, version management, session continuity |
| D9 Human-AI Collaboration | /captainslog, /bolt-review | Decision records with human voice, comprehensive review |
### Action Items
**Critical (score 0-2 — fix immediately):**
- [Dimension]: [specific action with file path and what to change] → Run `[skill]`
**High (score 3-4 — address this sprint):**
- [Dimension]: [specific action] → Run `[skill]`
**Improvement (score 5-6 — address soon):**
- [Dimension]: [specific action] → Run `[skill]`
**Polish (score 7-8 — optional refinement):**
- [Dimension]: [specific action] → Run `[skill]`
**Exemplary (score 9-10 — maintain):**
- [Dimension]: [brief confirmation of what's working well]
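The grouping of dimensions into priority tiers can be sketched as follows. `prioritize` is a hypothetical helper, and it assumes integer scores between 0 and 10.

```python
def prioritize(dimension_scores):
    """Group dimensions into the five priority tiers, lowest scores first."""
    tiers = {"Critical": [], "High": [], "Improvement": [], "Polish": [], "Exemplary": []}
    # Upper bound of each tier, matching the score ranges in the template above.
    bounds = [(2, "Critical"), (4, "High"), (6, "Improvement"), (8, "Polish"), (10, "Exemplary")]
    for name, score in sorted(dimension_scores.items(), key=lambda item: item[1]):
        tier = next(label for upper, label in bounds if score <= upper)
        tiers[tier].append(name)
    return tiers
```

Sorting before bucketing keeps the lowest-scoring dimension at the top of each tier, which matches the "lowest scores first" order used when working through fixes.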
For dimensions scoring below 5, include a structured improvement plan with skill recommendations:
### Improvement Plan
| Dimension | Current | Target | Actions | Skill | Timeline |
|-----------|---------|--------|---------|-------|----------|
| D5 Security | 3/10 | 6/10 | Run five-persona review, set up finding tracking | `/security-audit` then `/five-persona-review` | 2 weeks |
| D7 Cost | 2/10 | 5/10 | Document cost baseline, configure budget alarms | `/budget init` | 1 week |
Then offer the user options:
If the user chooses to fix, work through items by priority (lowest scores first), updating scores as each is resolved.
These 14 documents are the artifacts that demonstrate AI-DLC adoption. Each contributes to one or more assessment dimensions.
| # | Document | Dimensions | Created In |
|---|----------|-----------|------------|
| 1 | CLAUDE.md (Context File) | D1 Foundation, D8 Evolution | Phase 0, updated continuously |
| 2 | PM-FRAMEWORK.md | D1 Foundation, D4 Construction | Phase 0 |
| 3 | REQUIREMENTS.md | D2 Requirements | Phase 1 |
| 4 | SECURITY.md | D2 Requirements, D5 Security | Phase 1 |
| 5 | USER-STORIES.md | D3 Specification | Phase 2 |
| 6 | TRACEABILITY-MATRIX.md | D3 Specification, D4 Construction | Phase 2, updated through Phase 5 |
| 7 | CICD-DEPLOYMENT-PROPOSAL.md | D6 Ops Readiness | Phase 2 |
| 8 | INFRASTRUCTURE-PLAYBOOK.md | D6 Ops Readiness | Phase 2 |
| 9 | COST-MANAGEMENT-GUIDE.md | D7 Cost Management | Phase 2 |
| 10 | SOLO-AI-WORKFLOW-GUIDE.md | D4 Construction | Phase 3 |
| 11 | MULTI-DEVELOPER-GUIDE.md | D4 Construction | Phase 3 |
| 12 | SECURITY-REVIEW-PROTOCOL.md | D5 Security | Phase 4 |
| 13 | OPS-READINESS-CHECKLIST.md | D6 Ops Readiness | Phase 4 |
| 14 | AI-DLC-CASE-STUDY.md | D8 Evolution | Phase 6 |
| Documents Present | Quick Rating |
|-------------------|-------------|
| 0-3 | Foundational: significant framework gaps |
| 4-7 | Developing: core structure exists |
| 8-11 | Operational: most framework artifacts in place |
| 12-14 | Optimized/Exemplary: comprehensive adoption |
Skip this phase unless action is compliance.
After completing the standard assessment (Phases 1-5), map the 9 dimension scores to EU AI Act requirements. This helps teams use their AI-DLC audit as evidence of regulatory compliance.
| EU AI Act Requirement | Article(s) | AI-DLC Dimension(s) | What to Check |
|-----------------------|-----------|---------------------|---------------|
| Risk management system | Art. 9 | D2 (Requirements), D9 (Human-AI Collaboration) | Risk tiers documented? Trust-adaptive gates in place? |
| Data governance | Art. 10 | D3 (Specification) | Data models specified? Data flow documented? |
| Technical documentation | Art. 11 | D1 (Foundation), D3 (Specification) | 14 foundational docs present? Specs traceable? |
| Record-keeping / Logging | Art. 12 | D4 (Construction), D8 (Evolution) | Captain's logs maintained? Git audit trail structured? |
| Transparency | Art. 13 | D4 (Construction), D9 (Human-AI Collaboration) | AI contributions identifiable? Human decisions documented? |
| Human oversight | Art. 14 | D9 (Human-AI Collaboration) | Human decision gates at every phase? Human overrides documented? |
| Accuracy, robustness, cybersecurity | Art. 15 | D5 (Security), D6 (Ops Readiness) | Five-persona review conducted? Ops readiness scored? |
| Quality management system | Art. 17 | D1 (Foundation), D4 (Construction) | PM framework? Bolt discipline? CI pipeline? |
| Post-market monitoring | Art. 72 | D6 (Ops Readiness), D8 (Evolution) | Monitoring active? Drift detection? Quarterly reviews? |
| Corrective actions | Art. 20 | D5 (Security), D8 (Evolution) | Finding lifecycle tracked? Retrospectives with action items? |
Present after the standard assessment dashboard:
### EU AI Act Compliance Mapping
| Requirement | Article | Mapped Dimensions | Score Avg | Status |
|-------------|---------|-------------------|-----------|--------|
| Risk management | Art. 9 | D2 (X), D9 (X) | X.X | EVIDENCE PRESENT / GAP / NOT ASSESSED |
| Data governance | Art. 10 | D3 (X) | X.X | EVIDENCE PRESENT / GAP / NOT ASSESSED |
| ... | ... | ... | ... | ... |
**Status labels:**
- **EVIDENCE PRESENT** (score >= 6): Project-level practices provide evidence toward this requirement. This is NOT a compliance certification — see Compliance References below.
- **GAP** (score 4-5): Partial coverage. Practices exist but are insufficient as compliance evidence.
- **NOT ASSESSED** (score 0-3): Significant compliance risk. No meaningful evidence exists.
### Compliance Summary
- Articles with full coverage: X/10
- Articles with gaps: X/10
- Articles not assessed: X/10
- **Recommendation:** [brief next-step guidance]
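The Score Avg and Status columns can be derived from the dimension scores as sketched below. `article_status` and `ARTICLE_DIMENSIONS` are hypothetical names, and only two rows of the mapping are reproduced. The status labels list integer ranges, so placing in-between averages such as 5.5 or 3.5 in the lower band is an assumption.

```python
# Two rows copied from the mapping table above; other articles follow the same shape.
ARTICLE_DIMENSIONS = {
    "Art. 9": ["D2", "D9"],
    "Art. 10": ["D3"],
}


def article_status(dimension_scores, dimensions):
    """Average the mapped dimension scores and assign a status label."""
    average = round(sum(dimension_scores[d] for d in dimensions) / len(dimensions), 1)
    # Assumption: averages between the listed integer bands fall into the lower band.
    if average >= 6:
        status = "EVIDENCE PRESENT"
    elif average >= 4:
        status = "GAP"
    else:
        status = "NOT ASSESSED"
    return average, status
```

With D2 at 6 and D9 at 8, `article_status` reports an average of 7.0 and EVIDENCE PRESENT for Art. 9; as the status labels state, that is evidence toward the requirement, not a compliance certification.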
When presenting results, note that:
- init mode never overwrites existing files. It only creates files that are missing.
- For non-CallHero repos, check for AI-DLC-CASE-STUDY.md (generic) instead of CALLHERO-AI-DLC-CASE-STUDY.md. If neither exists in init/full mode, create the generic version. Always count foundation as /14 regardless of repo.

## AI-DLC Compliance Report
### Generated: 2026-03-03
### Repository: <repo name>
### Foundation Status
| # | Document | Status | Path |
|---|----------|--------|------|
| 1 | Requirements | EXISTS | docs/REQUIREMENTS.md |
| 2 | Traceability Matrix | MISSING | — |
| ... | ... | ... | ... |
11/14 foundational documents present.
### Process Adherence (9 Dimensions)
| # | Dimension | Score | Grade | Details |
|---|-----------|-------|-------|---------|
| 1 | Foundation & Context | 8/10 | B | Context file comprehensive ✓, CI pipeline ✓; missing codebase health baseline for 9+ |
| 2 | Requirements & Architecture | 6/10 | C | REQ IDs ✓, ADRs exist ✓; requirements not prioritized, no threat model |
| ... | ... | ... | ... | ... |
**Overall Score: 6.4 / 10**
**Maturity Rating: Operational**
### Action Items
**Improvement (score 5-6):**
- D2: Add priority levels to requirements → Run `/arch-audit`