skills/sdd-derive/SKILL.md
Use when deriving SDD specs from existing code or retroactively documenting implemented behavior. Triggers: "derive specs", "generate specs from code", "document this code in SDD", "retrofit specs", "spec out what's already built".
npx skillsauth add ahgraber/skills sdd-deriveInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Orchestrate spec generation from existing code: a change directory (new or modified behavior) or baseline specs (retroactive documentation).
The orchestrator holds context across phases; subagents are dispatched for scoped work that benefits from context isolation.
SPECS_ROOTis resolved by thesddrouter before this skill runs. Replace.specs/with the project's actual specs root.
sdd-derive.sdd-translatesdd-proposesdd-exploreBaseline specs (.specs/specs/) document implemented behavior.
Write baseline specs only when existing code anchors them; otherwise generate a change directory.
digraph output_type {
".specs/specs/ exists?" [shape=diamond];
"Codebase has relevant\nimplementation?\n(answered by Phase 2 discovery)" [shape=diamond];
"New or existing behavior?" [shape=diamond];
"Generate change directory\n(.specs/changes/<name>/)\nADDED-only delta specs" [shape=box];
"Generate change directory\n(.specs/changes/<name>/)" [shape=box];
"Generate baseline specs\n(.specs/specs/)" [shape=box];
".specs/specs/ exists?" -> "Codebase has relevant\nimplementation?\n(answered by Phase 2 discovery)" [label="no"];
"Codebase has relevant\nimplementation?\n(answered by Phase 2 discovery)" -> "Generate baseline specs\n(.specs/specs/)" [label="yes — retroactive doc"];
"Codebase has relevant\nimplementation?\n(answered by Phase 2 discovery)" -> "Generate change directory\n(.specs/changes/<name>/)\nADDED-only delta specs" [label="no — greenfield"];
".specs/specs/ exists?" -> "New or existing behavior?" [label="yes"];
"New or existing behavior?" -> "Generate change directory\n(.specs/changes/<name>/)" [label="new or modified"];
"New or existing behavior?" -> "Generate baseline specs\n(.specs/specs/)" [label="retroactive doc"];
}
Greenfield check: when .specs/specs/ does not exist, run discovery (Phase 2) before deciding.
If discovery finds no relevant implementation, generate a change directory with ADDED-only delta specs — never assert behavior in .specs/specs/ that has not been built.
Persist before returning. Every subagent MUST write its output to disk unconditionally.
Resolve paths before dispatch.
The orchestrator MUST expand $TMPDIR and other shell variables in the prompt; subagents do not consistently expand them, and literal $TMPDIR produces silent write failures.
Verify writes inline.
Every subagent prompt MUST require ls -la <path> after writing and report byte count.
The orchestrator confirms the file exists before treating dispatch as successful.
Dispatch by reference, not inline duplication.
Role discipline lives in references/<role>.md.
Resolve the skill's install path (e.g., ~/.claude/skills/sdd-derive/, .claude/skills/sdd-derive/, or a plugin directory) and substitute it into the template.
Subagents inherit Read permissions from the parent.
Read `<resolved_skill_root>/references/<role>.md` and follow it as your job description.
Below is your specific scope.
Capability: `<name>` \
Files in scope: `<list>` \
External-surface candidates: `<list>` \
Output path: `<literal absolute path>` \
After writing, run `ls -la <path>` and report byte count.
Return decision-relevant data, not artifact content. Full artifacts live on disk; the orchestrator reads the file when it needs detail. The synthesizer is the exception — its capability menu is structured, bounded, and required for Phase 3.
| Subagent | Return inline | | ----------- | ----------------------------------------------------------------------------------------- | | Explorer | path written, byte count, technique, finding count, anomalies | | Synthesizer | full capability menu (Phase 3 needs it), path written, byte count | | Observer | path written, byte count, observation count, surface item count, anomalies | | Lifter | path written, byte count, requirement count, scenario count, uncertainty count, anomalies |
Track a manifest, not content. Decide what runs, track completed paths + counts + status, detect failures, trigger the next phase. Never re-ingest all artifacts at once.
digraph subagent_dispatch {
need [label="Phase needs subagent", shape=ellipse];
resolve [label="Resolve absolute paths\n($TMPDIR, skill_root)", shape=box];
compose [label="Compose prompt:\nread references/<role>.md\n+ scope + literal output path", shape=box];
dispatch [label="Dispatch subagent", shape=box];
work [label="Subagent reads role doc,\ndoes scoped work,\nwrites artifact to disk", shape=box];
verify_cmd [label="ls -la <output_path>", shape=plaintext];
report [label="Subagent reports path + bytes\n+ counts inline", shape=box];
check [label="File exists and non-empty?", shape=diamond];
manifest [label="Update manifest:\npath + counts + status=ok", shape=box];
fail [label="Mark dispatch failed;\nre-dispatch or escalate", shape=octagon, style=filled, fillcolor=pink];
next [label="Trigger next phase", shape=doublecircle];
need -> resolve;
resolve -> compose;
compose -> dispatch;
dispatch -> work;
work -> verify_cmd;
verify_cmd -> report;
report -> check;
check -> manifest [label="yes"];
check -> fail [label="no"];
manifest -> next;
}
Extract from the request:
Don't speculate on large surface areas — confirm scope before generating anything.
Two sequential calls:
discovery-explore — fan out parallel explorers, one per technique (call graph, naive AST, data-flow / channel, port/interface, schema artifacts, test-suite).
Each emits structured findings.discovery-synthesize — single synthesizer consumes all explorer outputs and produces a capability menu (candidates with file scope and cost, overlaps, external-surface candidates, axis disagreements, gotchas).See references/discovery.md for the explorer schema, synthesizer expectations, and loop semantics.
One-time tooling suggestions.
Check .specs/.sdd/suggested-tools and present each suggestion below at most once (append the marker after presenting; create the file/dir if needed).
If declined or already listed, skip and proceed with reduced fidelity.
code-review-graph(first run only): CLI that builds a structural AST graph of the codebase, improving call-graph discovery with communities, bridge nodes, and impact-radius signals. Install:uv tool install code-review-graph. Without it, the call-graph explorer falls back to naive AST traversal. Say "skip" to dismiss.
schema-config(first run only when schemas detected): Detected schema artifacts:<files>. Create.specs/.sdd/schema-config.yamlto configure schema extraction commands — enables snapshot generation, drift detection, and authored-vs-generated diffs. Seereferences/sdd-schema.md§ 3 for the format. Without it, the schema-artifact explorer runs detection-only. Say "skip" to dismiss.
Users typically lack architectural ground-truth for overlap-ownership and external-surface classification. The orchestrator commits sensible defaults silently and only escalates on flagged conditions.
Defaults (applied silently unless escalation fires):
confidence >= medium.Escalation conditions (orchestrator MUST prompt):
When no escalation fires, present a brief summary (capabilities + counts + cost) and proceed. When any fires, present only the flagged item and ask one targeted question.
Loop semantics: if the user requests refinement, clarify the request unless unambiguous. Refinement options:
For each selected capability, dispatch sequentially:
references/observer.md.## Uncertainties section.
See references/lifter.md.Capabilities run in parallel across each other; observer/lifter is sequential within a capability.
digraph phase_4_per_capability {
selected [label="Capability selected\nfrom menu", shape=ellipse];
obs_dispatch [label="Dispatch observer\n(file scope, surface candidates,\noutput path)", shape=box];
obs_out [label="Observer writes observations.yaml\n(behaviors + tags + surface inventory)", shape=box];
lift_dispatch [label="Dispatch lifter\n(observations path, metadata,\noutput type, as-of anchor)", shape=box];
lift_work [label="Lifter consumes observations\n+ verifies source as needed", shape=box];
lift_out [label="Lifter writes spec.md", shape=box];
validate [label="uv run validate.py --single\n<observations> <spec>", shape=plaintext];
pass [label="PASS?", shape=diamond];
fix [label="Lifter fixes failures in spec", shape=box];
done [label="Capability complete:\nreturn path + counts", shape=doublecircle];
selected -> obs_dispatch;
obs_dispatch -> obs_out;
obs_out -> lift_dispatch;
lift_dispatch -> lift_work;
lift_work -> lift_out;
lift_out -> validate;
validate -> pass;
pass -> done [label="yes"];
pass -> fix [label="no"];
fix -> validate;
}
Validation runs in two places.
Per-capability — each lifter runs the validator before returning:
uv run --quiet <skill_root>/references/validate.py --single <observations.yaml> <spec.md>
If FAIL, the lifter fixes and re-runs until PASS — catching format drift at write time avoids round-tripping a corrective dispatch.
See references/lifter.md § Self-check.
Aggregate — the orchestrator runs across the whole run:
uv run --quiet <skill_root>/references/validate.py <observations_dir> <specs_dir>
The aggregate report covers format, YAML parse, kind-aware surface coverage diff, and uncertainty review. If lifters did their self-check, format/YAML failures should be zero — failures here indicate a skipped self-check. Surface gaps and uncertainty totals are the substantive output.
If a spec fails format at this stage, dispatch a corrective lifter pass with the failures cited.
If an observation YAML fails parse, dispatch a corrective observer pass.
See references/validate.md for severity rules and report format.
Write the spec artifacts. Add a generation note at the top:
Generated from code analysis on {date}, as-of commit {sha}
Clear ephemeral observations once specs are written. The commit SHA is the canonical anchor for re-derivation.
See references/derive-spec-additions.md for the ## Uncertainties section format and as-of anchor placement.
Change directory (new/modified behavior):
.specs/changes/<name>/proposal.md.specs/changes/<name>/specs/<capability>/spec.md (delta format).specs/changes/<name>/tasks.md (when applicable)Baseline specs (retroactive):
.specs/specs/<capability>/spec.md per capabilitysdd-derive produces a partial change directory — no design.md.
Use sdd-propose for a full artifact set.
Report after generation: capabilities covered, requirement count, uncertainties count, surface coverage gaps.
references/evidence-class-taxonomy.md.algorithmic is set, apply the strategy check and emit an Uncertainty.references/lifter.md § Verification Discipline..specs/specs/, or baseline in a change dir..specs/specs/ asserts implemented behavior; if nothing is built, use a change directory with ADDED-only delta specs.references/discovery.md — explorer schema, synthesizer expectations, loop semanticsreferences/evidence-class-taxonomy.md — tag definitions and composition rulesreferences/observer.md — observation entry shape, surface inventory, observer promptreferences/lifter.md — lift rules per tag, verification discipline, lifter promptreferences/validate.md — surface coverage diff, Phase 7 checklistreferences/derive-spec-additions.md — ## Uncertainties and as-of anchor (derive-specific)references/sdd-spec-formats.md — baseline, delta, scenario formats (shared)references/sdd-change-formats.md — proposal, design, tasks formats (shared)references/sdd-schema.md — schema artifacts and lifecycle (shared)references/sdd-derive-output-type.dot — DOT source for the output type decisionreferences/sdd-derive-subagent-dispatch.dot — DOT source for the subagent dispatch lifecyclereferences/sdd-derive-phase-2-discovery.dot — DOT source for the Phase 2 explore/synthesize/refinement loopreferences/sdd-derive-phase-4-per-capability.dot — DOT source for the Phase 4 observer/lifter/self-validate flowdevelopment
Use when writing or reviewing tests for Python behavior, contracts, async lifecycles, or reliability paths. Also use when tests are flaky, coupled to implementation details, missing regression coverage, slow to run, or when unclear what tests a change needs. Use for multi-Python version testing (nox) and free-threaded Python thread-safety validation.
development
Use when the user wants rigorous, non-sycophantic editorial feedback on a draft, essay, blog post, or argument through back-and-forth dialogue — pressure-testing thesis, structure, argument, clarity, tone, and evidence. Triggers: "be my sparring partner", "pressure-test this draft", "poke holes in my argument", "is this ready to publish", "sharpen this post", "where is this weak". Not for one-shot copyediting, proofreading, or ghostwriting.
testing
Use when distilling the through-line gist of one or more sources — the spine, argument, tension, or recurring frame running through a set of documents, notes, research, or transcripts, OR across the ideas within a single rich piece — into a few concise paragraphs. Triggers: "synthesize", "what's the through-line/gist", "extract the insight", "pull these together". Not for faithful summary or condensation that covers what a source says, nor for comparisons or catalogs where enumeration is the deliverable.
development
Use when writing or reviewing tests in any language, or diagnosing a suite that is slow, brittle, or hard to read. Triggers: "write tests", "how should I test this", "what kind of test", "test is flaky/fragile", "should I mock this", "test is hard to read". For Python-specific guidance see `python-testing`.