skills/spec-backfill/SKILL.md
Backfill missing specifications, reconcile spec/code drift, maintain changelog.
npx skillsauth add liza-mas/liza spec-backfillInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Reconstruct specifications from a repository's codebase, existing documentation, and git history. You're doing archaeology — finding the features buried in code and docs, then surfacing them as structured specs.
A spec is warranted when code implements user-facing functionality or business logic. Not every module needs a spec. Your job is to find the functional boundaries and document them.
Maintain state in files so work isn't lost if the conversation ends.
specs/spec-backfill-state.yaml) — if present, offer to resume from cursor position (see references/session-state-schema.md)specs/spec-mapping.yaml) — determines phase:
Three modes of operation. The agent runs the phase that matches the situation.
| Phase | When | Steps |
|-------|------|-------|
| First run | No spec-mapping.yaml exists | Classify → Map → Gaps → Generate → Verify |
| Incremental | Mapping exists, detect drift | Detect stale → Reconcile → Generate if needed |
| Targeted | User picks specific files/modules | Classify targets → Map → Generate → Verify |
1. CLASSIFY CODE
- Scan codebase, classify each path:
functional (needs spec) | architectural (needs ADR) | utility (needs neither)
- Judgment call: "If a PM asked 'what does this do for users?',
would there be a meaningful answer?" If yes → functional.
- Show classification to user for validation
2. MAP CODE TO SPECS
- For each functional path: does a spec exist? Check alignment.
- No spec → gap. Spec conflicts with code → conflict.
- Conflict types: code-ahead | spec-ahead | semantic-mismatch | stale-spec
- Output: coverage report per references/report-format.md
3. ASK — surface unknowns, prioritize gaps
- "I found N code paths without specs. Which are highest priority?"
- Never generate without confirmation.
4. GENERATE SPECS (per gap, one at a time)
- Extract observable behavior from code
- Identify inputs, outputs, side effects, business rules
- Ask user for: purpose, user story, acceptance criteria
- Generate spec only after confirmation
- Templates: references/spec-templates.md
5. VERIFY
- All functional code has spec mapping
- No relevant legacy content dropped (diff old vs new, surface differences)
- No internal contradictions
- Cross-references valid (ADRs, code paths)
- Archive superseded specs to _archive after confirmation
6. PERSIST
- Update spec-mapping.yaml (including SHAs) per references/mapping-schema.md
- Update changelog
- Output report per references/report-format.md
1. DETECT STALENESS — compare stored SHAs against current HEAD
- code_sha stale, spec_sha current → code-ahead-of-spec
- code_sha current, spec_sha stale → spec edited (intentional or drift?)
- Both stale → full reconciliation needed
- New code files → candidates for classification
- Deleted code files → flag for mapping cleanup
- Nothing stale/new → report "all current" and stop
2. SURFACE stale mappings and new files to user
3. RECONCILE each stale mapping
- Re-read code and spec, assess alignment
- Update status: aligned | gap | conflict
4. GENERATE / UPDATE if needed (same as First Run step 4)
5. PERSIST (same as First Run step 6)
1. User specifies files or modules
2. CLASSIFY targets (if not already in mapping)
3. MAP → GENERATE → VERIFY → PERSIST (same as First Run steps 2–6, scoped to targets)
specs/build/
0.md # Vision — why the product exists
1.1.md # Epic — large user-facing goal
1.1.1.md # User Story — specific user need
Build captures intent — what we set out to do and when.
specs/functional/
0.md # Product description — what it is today
1.1.md # Domain — bounded context
1.1.1.md # Feature — specific capability
Functional captures current state — what exists now and how it's organized.
An epic becomes a domain when all its stories are implemented (code mapped). Ask user to confirm before creating a functional entry from build.
Hierarchical: 1.1.md = first child of root, 1.1.1.md = first child of 1.1, 1.2.md = sibling of 1.1.
Code classification ambiguity
"I'm unsure whether
src/lib/pricing_engine.pyis functional or utility. It computes discounts based on customer tier. How do you see it?"
Spec mapping unclear
"Code in
src/orders/fulfillment.pyhandles shipping, returns, and tracking. One feature or three?"
Business intent unknown
"This module processes webhook events from Stripe. I can see what it does but not why."
Conflict resolution
"Spec says max 5 projects. Code enforces max 10. Which is correct?"
Gap prioritization
"I found 12 code paths without specs. Which are highest priority?"
Always ask before generating a spec.
Transition: inferred → confirmed only after explicit user approval.
A good backfilled spec:
A bad backfilled spec:
Prefer scripts/ over raw file editing for mapping updates — LLM YAML editing is fragile.
| Script | Purpose |
|--------|---------|
| scripts/detect-stale.sh [mapping-file] | Compare stored SHAs against HEAD. Outputs one line per entry: CURRENT, CODE_AHEAD, SPEC_EDITED, BOTH_STALE, DELETED, NEW |
| scripts/update-mapping.sh <mapping-file> <code-path> [options] | Update a single mapping entry. Supports --status, --code-sha auto, --spec-sha auto, --spec-path, --classify, --remove |
Both require yq (mikefarah/yq). Run detect-stale.sh first in Incremental phase, then update-mapping.sh per entry after reconciliation.
First-run classification on large codebases (>50 files): delegate scanning to a subagent (via your preferred subagent mechanism — e.g. generic-subagent skill, Task tool, or equivalent). The main agent retains classification decisions but the subagent does the file-by-file reading and initial categorization.
Delegation triggers:
2 intermediate tool calls whose outputs aren't needed in final delivery
Consult on demand, not upfront:
| Reference | When needed |
|-----------|-------------|
| references/mapping-schema.md | Persisting or reading mapping state |
| references/session-state-schema.md | Resuming across conversation boundaries |
| references/spec-templates.md | Generating specs (step 4) |
| references/report-format.md | Producing output reports |
development
Coordinate Pairing-mode doer/reviewer sessions through a Markdown blackboard. Use when the user invokes /adversarial-pairing with role and blackboard-path arguments or asks multiple pairing agents to coordinate plan review, implementation, staged code review, and follow-up review rounds without Liza multi-agent mode.
data-ai
Analyze Liza agents logs
development
Code Review Protocol
tools
Analyze Liza `.liza/agent-prompts/` and `.liza/agent-outputs/` from a context-engineering perspective: prompt payload shape, context budget use, cacheability, duplicated or missing context, instruction hierarchy, tool-output pressure, role-specific context fit, and prompt-output feedback loops. Use when diagnosing agent context bloat, prompt drift, poor agent handoffs, repeated misunderstandings, excessive tool output, or whether Liza agents received the right information at the right time.