skills/adr-backfill/SKILL.md
Backfill missing ADRs from git history and documentation
Reconstruct Architecture Decision Records from a repository's git history and documentation. You're doing archaeology — finding the decisions buried in commits, specs, and docs, then surfacing them as ADRs.
An ADR is warranted when someone made a choice that shaped the system. Not every commit is a decision. Your job is to find the ones that were.
1. Classify files: Distinguish architectural files (where decisions manifest) from supportive files (tests, utils). Persist this classification.
2. Identify candidate commits: Find commits that touch architectural files with structural changes (not just edits).
3. Cluster into decisions: Group related commits that represent a single decision being implemented.
4. Fill gaps: Pull in minor commits (typo fixes, forgotten files) that belong to a cluster but were filtered out.
5. For each cluster: Analyze intent, ask the user for context, generate the ADR.
6. Scan complementary sources: Check specs/ and docs/ for decisions not captured by commits.
7. Enrich ADRs: Add cross-references, diagrams, and implementation notes from related documentation.
8. Order chronologically: Renumber ADRs to maintain chronological sequence.
9. Update ADR index: Keep specs/architecture/ADR/README.md in sync after any ADR is added, removed, or renumbered.
Maintain state in files so work isn't lost if the conversation ends.
Consider all files, both present and deleted. A deletion may itself reveal an architectural decision.
Tier 0: Dependency manifests (highest signal)
requirements.txt, pyproject.toml, package.json, go.mod, Cargo.toml

Tier 1: Infrastructure & deployment
Dockerfile, docker-compose*.yml, CI configs, terraform, k8s manifests

Tier 2: Domain structure

Tier 3: Interface contracts

- specs/: Design documents that may contain decisions preceding implementation
- docs/: Usage guides and criteria that may encode decisions-as-code

Some files blur the line. When uncertain, ask: "If this changed significantly, would a senior engineer want to know why before approving?" If yes, it's architectural.
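As a rough illustration, tier assignment can be sketched as pattern matching on file names. Only the Tier 0 and Tier 1 names come from the list above. The `*.tf` pattern for terraform, the `TIER_PATTERNS` table, and the `classify_tier` helper are assumptions made for this sketch. CI configs, k8s manifests, and Tiers 2 and 3 are left out because their patterns depend on the project.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical pattern table. Tier 0 and Tier 1 names are taken from the text;
# "*.tf" is an assumed stand-in for "terraform". Tiers 2 and 3 are omitted
# on purpose because their patterns are project-specific.
TIER_PATTERNS = {
    0: ["requirements.txt", "pyproject.toml", "package.json", "go.mod", "Cargo.toml"],
    1: ["Dockerfile", "docker-compose*.yml", "*.tf"],
}


def classify_tier(path: str) -> Optional[int]:
    """Return the lowest-numbered tier whose pattern matches the file name, else None."""
    name = PurePosixPath(path).name
    for tier in sorted(TIER_PATTERNS):
        if any(fnmatchcase(name, pattern) for pattern in TIER_PATTERNS[tier]):
            return tier
    return None
```

A `None` result only means that no pattern matched. Such files still need the "would a senior engineer want to know why" check before being treated as supportive.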
Not every change to an architectural file is a decision. Look for:
A 200-line refactor in a util is less ADR-worthy than a one-line addition of celery to requirements.txt.
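To make the celery example concrete, here is a minimal sketch that compares two versions of a requirements.txt and reports newly added packages. The function names are hypothetical and the parsing is deliberately simplified (it ignores option lines and environment markers). The signal shape mirrors the `dependency_added` entry in the state file example, and the weight of 1.0 is copied from that example, not derived.

```python
import re


def package_names(requirements_text: str) -> set[str]:
    """Extract lower-cased package names from requirements.txt content."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and option lines like "-r base.txt"
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def dependency_signals(before: str, after: str) -> list[dict]:
    """Return one 'dependency_added' signal per package present only in `after`."""
    added = package_names(after) - package_names(before)
    return [
        {"type": "dependency_added", "weight": 1.0, "details": {"package": name}}
        for name in sorted(added)
    ]
```

The two versions of the file could be obtained with `git show <sha>~1:requirements.txt` and `git show <sha>:requirements.txt`.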
Assign candidate numbers in strict chronological order (earliest commit date first). This numbering carries through to the final ADR sequence ID. When presenting candidates grouped by confidence tier, sort candidates within each group by ascending number (i.e. chronologically).
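The numbering rule can be sketched as follows. The `Candidate` class, the group labels, and the 0.8 and 0.5 thresholds are assumptions for illustration; the text does not define where confidence tiers begin.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Candidate:
    title: str
    first_commit_date: date
    confidence: float
    number: int = 0  # assigned by assign_numbers


def assign_numbers(candidates: list[Candidate]) -> list[Candidate]:
    """Number candidates from 1 in strict chronological order (earliest first)."""
    ordered = sorted(candidates, key=lambda c: c.first_commit_date)
    for number, candidate in enumerate(ordered, start=1):
        candidate.number = number
    return ordered


def group_by_confidence(candidates: list[Candidate], high: float = 0.8,
                        medium: float = 0.5) -> dict[str, list[Candidate]]:
    """Group by confidence tier (hypothetical thresholds), ascending number within each group."""
    groups: dict[str, list[Candidate]] = {"high": [], "medium": [], "low": []}
    for c in sorted(candidates, key=lambda c: c.number):
        label = "high" if c.confidence >= high else "medium" if c.confidence >= medium else "low"
        groups[label].append(c)
    return groups
```

Because numbers are assigned before grouping, a candidate keeps the same number whichever confidence group it is presented in.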
When uncertain about boundaries, ask the user.
Classification ambiguity
"I'm unsure whether src/lib/kafka_client.py is architectural (core infrastructure) or supportive (utility). It's used by 12 services. How do you see it?"
Cluster boundaries
"Commits abc123–def456 span 3 weeks and touch both the new auth system and the database migration. Should these be one decision or two?"
Intent unclear
"This cluster adds Redis, but I can't tell if it's for caching, session storage, or as a Celery broker. What was the driver?"
Low confidence
"I found 4 commits that add logging configuration. Is this ADR-worthy or just housekeeping?"
Always ask before generating an ADR. Present your analysis, let the user confirm or correct.
Before generating each ADR, ask for:
If PR descriptions or commit messages already contain this, confirm rather than re-ask.
Git commits are the primary source but not the only one. Some decisions are better documented in specs or docs than in commit messages.
Check specs/ for:
A spec file often documents a decision that spans multiple commits or was made before coding began. When you find a spec that describes a decision:
Example: specs/platform-detection.md documented the three-module detection architecture before implementation — this warranted its own ADR even without a clear commit cluster.
Check docs/ for:
Docs may not warrant their own ADR but can enrich commit-based ADRs with:
Example: docs/SELECTION-CRITERIA.md documented the scoring system as a deliberate choice — this warranted an ADR capturing the "criteria as code" decision.
After generating commit-based ADRs:
Gate:
"I found these decisions in specs/docs that aren't captured by commit-based ADRs:
- Platform detection architecture (specs/platform-detection.md)
- Sector taxonomy resolution (specs/sector-taxonomy.md)

Which should become ADRs?"
After initial ADR generation, enrich with cross-references and context from related documentation.
Build a reference graph:
Use explicit references: "With the ATS extractor architecture established (ADR-0003, ADR-0008)..."
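One possible way to build such a graph is to scan each ADR body for explicit `ADR-NNNN` mentions. This sketch assumes four-digit identifiers, as in the example above. The `build_reference_graph` helper and its input shape (ADR id mapped to ADR text) are hypothetical.

```python
import re

# Matches explicit references such as "ADR-0003".
ADR_REF = re.compile(r"\bADR-(\d{4})\b")


def build_reference_graph(adr_texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each ADR id to the set of other ADR ids its text mentions."""
    graph: dict[str, set[str]] = {}
    for adr_id, text in adr_texts.items():
        refs = {f"ADR-{num}" for num in ADR_REF.findall(text)}
        refs.discard(adr_id)  # ignore self-references
        graph[adr_id] = refs
    return graph
```

Since ADR numbers change when ADRs are reordered, the graph should be rebuilt after any renumbering.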
For each ADR, check if related docs contain:
Add these as inline content within relevant sections, not just as references.
Example enrichment:
### Architecture
**6-Phase Pipeline Vision** (designed from day one):
\`\`\`
Phase 0: Company Discovery → company-inventory.md
Phase 1: Job Discovery → career_jobs/*.json
...
\`\`\`
When the user provides collaboration context, add an "Implementation Notes" section capturing:
Example:
### Implementation Notes
**Collaboration model:** Paired with Claude on specs, implementation, and tests.
Human wrote ~20 LoC to demonstrate the `@register_extractor` decorator mechanism
and the base class pattern — faster than explaining. Every line reviewed with
many requested changes.
**Subsequent findings:** Architecture review identified technical debt:
- REQUEST_TIMEOUT duplication (10+ files, values 10-30s)
- Intra-module duplication in `extract_job_listings.py`
ADRs should be numbered chronologically by decision date, not generation order.
If spec-derived ADRs fall chronologically between commit-based ADRs:
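- Rename files using `git mv` for rename tracking
- Update titles (`# 15 - ...` → `# 2 - ...`)
- Update cross-references (`ADR-0005` → `ADR-0008`)

To avoid conflicts when numbers swap:

- Move files to temporary names first (`git mv 0002-*.md temp-0002-*.md`)
- Then move them to their final names (`git mv temp-0015-*.md 0002-*.md`)

`git log` of spec file

Use MADR format unless the user specifies otherwise. Place in specs/architecture/ADR/ or the project's existing ADR location.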
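The two-phase rename can be sketched as a planner that produces `git mv` commands without running them. `plan_renames` and its arguments are hypothetical, and the file names in the usage note are illustrative. The sketch covers file renames only; titles and cross-references inside the files still need to be updated separately.

```python
def plan_renames(files: dict[int, str], new_numbers: dict[int, int]) -> list[list[str]]:
    """Plan `git mv` commands that renumber ADR files via temporary names.

    files: current number -> file name, e.g. {2: "0002-celery-integration.md"}
    new_numbers: current number -> new number
    All moves to temp names come first, so swapped numbers never collide.
    """
    to_temp, to_final = [], []
    for old, new in sorted(new_numbers.items()):
        if old == new:
            continue  # nothing to do for ADRs that keep their number
        name = files[old]
        slug = name.split("-", 1)[1]  # part after the numeric prefix
        temp = f"temp-{name}"
        to_temp.append(["git", "mv", name, temp])
        to_final.append(["git", "mv", temp, f"{new:04d}-{slug}"])
    return to_temp + to_final
```

Each planned command can then be executed with `subprocess.run(cmd, check=True)` from inside the ADR directory, once the plan has been reviewed.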
---
*Reconstructed from commits {first_sha}..{last_sha} ({date_range})*
---
*Reconstructed from {source_file} ({date_range})*
After generating, renumbering, or removing ADRs, update specs/architecture/ADR/README.md.
The index is a markdown table with two columns: ADR (linked title) and Decision (one-sentence outcome).
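A minimal sketch of rendering that table is shown below. The `render_index` helper and the `(file name, title, decision)` tuple shape are assumptions for this sketch, and any decision sentences passed in are the caller's own wording.

```python
def render_index(entries: list[tuple[str, str, str]]) -> str:
    """Render the ADR index table from (file name, title, decision) tuples.

    Sorting by file name yields numeric order because of the zero-padded prefix.
    """
    lines = ["| ADR | Decision |", "| --- | --- |"]
    for file_name, title, decision in sorted(entries):
        safe_decision = decision.replace("|", "\\|")  # keep table cells intact
        lines.append(f"| [{title}]({file_name}) | {safe_decision} |")
    return "\n".join(lines) + "\n"
```

Regenerating the whole table from the ADR files on every change is simpler than patching individual rows, and it keeps the index consistent after renumbering.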
A good backfilled ADR:
A bad backfilled ADR:
Persist your work so it survives conversation boundaries:
- specs/architecture/ADR/adr-backfill-state.yml: classification, clusters, processing progress
- specs/architecture/ADR/adr-backfill-clusters/: one file per cluster with analysis and user input

Check for existing state at the start. Offer to resume or restart.
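The start-of-session check might look like the sketch below. It only detects whether state exists; parsing the YAML is left out to keep the example free of third-party dependencies. `find_existing_state` and its return shape are hypothetical.

```python
from pathlib import Path

STATE_FILE = "adr-backfill-state.yml"
CLUSTER_DIR = "adr-backfill-clusters"


def find_existing_state(adr_dir: str) -> dict:
    """Report whether a state file exists and which per-cluster files are present."""
    root = Path(adr_dir)
    clusters = root / CLUSTER_DIR
    cluster_files = sorted(p.name for p in clusters.iterdir()) if clusters.is_dir() else []
    return {"has_state": (root / STATE_FILE).is_file(), "cluster_files": cluster_files}
```

If `has_state` is true, the user should be offered the choice to resume or restart before anything is overwritten.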
When the user invokes this skill:
specs/architecture/ADR/README.md index

```yaml
# adr-backfill-state.yml
version: 1
repository: "[email protected]:org/repo.git"
started_at: "2024-01-15T10:30:00Z"
last_updated: "2024-01-15T14:22:00Z"

file_classification:
  # path → {tier: int | null, category: string, override: bool}
  "requirements.txt": {tier: 0, category: "dependency_manifest"}
  "src/core/domain.py": {tier: 2, category: "domain_structure"}
  "tests/test_domain.py": {tier: null, category: "test"}

commits:
  # sha → annotation
  "abc123":
    pruned: false
    highest_tier: 0
    signals:
      - {type: "dependency_added", weight: 1.0, details: {package: "celery"}}
    architectural_files: ["requirements.txt"]
    processed: true
  "def456":
    pruned: true
    processed: true

clusters:
  # Commit-based cluster
  - id: 17
    title: "Celery Integration"
    commit_shas: ["abc123", "abc124", "abc125"]
    gap_commits: ["abc123a"]
    date_range: "2024-01-10 to 2024-01-12"
    confidence: 0.85
    status: "adr_generated"  # pending | analyzed | user_enriched | adr_generated | skipped
    analysis:
      intent_hypothesis: "Introduce async task processing"
      solution_detected: "Celery with Redis broker"
      pr_metadata:
        number: 142
        title: "Add background job processing"
        description: "..."
    user_context:
      problem: "Synchronous email sending was blocking requests..."
      alternatives: "RQ, Dramatiq, custom threading"
      rationale: "Celery has best ecosystem support..."
      limitations: "Redis becomes SPOF..."
    adr_path: "specs/architecture/ADR/0017-celery-integration.md"

  # Spec/doc-derived cluster (supplementary ADR)
  - id: 18
    title: "Platform Detection Architecture"
    commit_shas: []  # Empty for spec-derived ADRs
    source: "specs/platform-detection.md"  # Source file instead of commits
    date_range: "2024-01-08 to 2024-01-15"
    confidence: 0.9
    status: "adr_generated"
    analysis:
      intent_hypothesis: "Design comprehensive ATS detection system"
      solution_detected: "Three-module architecture with fallback strategies"
    adr_path: "specs/architecture/ADR/0012-platform-detection-architecture.md"

processing_cursor:
  step: "cluster_analysis"  # classification | pruning | filtering | clustering | gap_filling | cluster_analysis | gap_analysis | enrichment | ordering
  cluster_index: 17
  substep: "user_enrichment"

configuration:
  max_time_gap_days: 7
  adr_output_dir: "specs/architecture/ADR"
  adr_format: "madr"  # madr | nygard | custom
  git_remote: "origin"
  pr_platform: "github"  # github | gitlab | bitbucket | none
```
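The `max_time_gap_days` setting suggests time-based clustering. The sketch below is one interpretation, assuming the value is the largest allowed gap between consecutive commits in the same cluster; the skill text does not spell this out. It groups by date only, whereas a real pass would also weigh which architectural files each commit touches. `cluster_by_time_gap` is a hypothetical helper.

```python
from datetime import date, timedelta


def cluster_by_time_gap(commits: list[tuple[str, date]],
                        max_time_gap_days: int = 7) -> list[list[str]]:
    """Group commit SHAs into clusters, splitting wherever the gap exceeds the limit.

    commits: (sha, commit date) pairs in any order.
    """
    clusters: list[list[str]] = []
    previous = None
    for sha, day in sorted(commits, key=lambda c: c[1]):
        # Start a new cluster on the first commit or after a long quiet period.
        if previous is None or (day - previous) > timedelta(days=max_time_gap_days):
            clusters.append([])
        clusters[-1].append(sha)
        previous = day
    return clusters
```

Clusters produced this way are only a first cut. Boundaries that look doubtful should still go to the user, as described in the cluster-boundary question above.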
# [seq id] - [title]
## Context and Problem Statement
...
## Considered Options
1. [option name] - [description]
2. [option name] - [description]
## Decision Outcome
Chose **Option N**: ...
### Architecture
[diagrams, tables, module descriptions from enrichment]
### Rationale
...
### Implementation Notes
[collaboration patterns, technical decisions, debt identified — when provided by user]
### Consequences
**Positive:**
- ...
**Limitations accepted:**
- ...
---
*Reconstructed from commits {first_sha}..{last_sha} ({date_range})*