bundles/ai-agents/skills/spec-first/SKILL.md
Enforces a spec → plan → execute → verify loop before writing code, preventing "looks right" failures. Activates on "build X", "implement...", "add a feature that...", or any multi-file/unclear-requirements request. Creates spec.md, todo.md, and decisions.md as durable artifacts.
`npx skillsauth add shipshitdev/library spec-first`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Inputs:
Outputs:
- spec-[feature-name].md content for .agents/memory/.
- todo.md checklist with per-step verification commands.
Creates/Modifies:
- .agents/memory/spec-[feature-name].md (spec artifact).
- .agents/memory/decisions-[feature-name].md (decision log).
External Side Effects:
Confirmation Required:
Delegates To:
- task-prd-creator for PRD-style issue creation.
- executing-plans for Stage D autonomous execution.

A structured workflow for LLM-assisted coding that delays implementation until decisions are explicit.
Delay implementation until tradeoffs are explicit — Use conversation to clarify constraints, compare options, surface risks. Only then write code.
Treat the model like a junior engineer with infinite typing speed — Provide structure: clear interfaces, small tasks, explicit acceptance criteria. Code is cheap; understanding and correctness are scarce.
Specs beat prompts — For anything non-trivial, create a durable artifact (spec file) that can be re-fed, diffed, and reused across sessions.
Generated code is disposable; tests are not — Assume rewrites. Design for easy replacement: small modules, minimal coupling, clean seams, strong tests.
The model is over-confident; reality is the judge — Everything important gets verified by execution: tests, linters, typecheckers, reproducible builds.
Stage A - Framing
Goal: Decide before you implement.
Prompts that work:
Output: Decision notes in .agents/memory/decisions-[feature-name].md
Stage B - Spec
Goal: Turn decisions into unambiguous requirements.
File: .agents/memory/spec-[feature-name].md
# [Feature Name] Spec
## Purpose
One paragraph: what this is for.
## Non-Goals
Explicitly state what you are NOT building.
## Interfaces
Inputs/outputs, data types, file formats, API endpoints, CLI commands.
## Key Decisions
Libraries, architecture, persistence choices, constraints.
## Edge Cases and Failure Modes
Timeouts, retries, partial failures, invalid input, concurrency, idempotency.
## Acceptance Criteria
Bullet list of testable statements. Avoid "should be fast."
Prefer: "processes 1k items under 2s on M1 Mac."
## Test Plan
Unit/integration boundaries, fixtures, golden files, what must be mocked.
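One way to keep a spec from going stale is to check mechanically that every section of the template is present and filled in. The following is a minimal sketch, not part of the skill itself: the function name `missing_sections` is hypothetical, and it assumes the spec uses the `##` headings exactly as written in the template above.

```python
import re

# Section headings copied from the spec template above.
REQUIRED_SECTIONS = [
    "Purpose",
    "Non-Goals",
    "Interfaces",
    "Key Decisions",
    "Edge Cases and Failure Modes",
    "Acceptance Criteria",
    "Test Plan",
]


def missing_sections(spec_text: str) -> list[str]:
    """Return template sections that are absent or have an empty body.

    Hypothetical helper: it only understands "## Heading" lines and treats
    everything up to the next "## " line as that section's body.
    """
    bodies: dict[str, str] = {}
    current = None
    for line in spec_text.splitlines():
        match = re.match(r"^##\s+(.+?)\s*$", line)
        if match:
            current = match.group(1)
            bodies[current] = ""
        elif current is not None:
            bodies[current] += line + "\n"
    return [name for name in REQUIRED_SECTIONS if not bodies.get(name, "").strip()]
```

Running this before Stage C gives a cheap gate: an empty result means every section has at least some content, though it says nothing about whether that content is any good.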
Stage C - TODO
Goal: Stepwise checklist where each step has a verification command.
Tracking: a GitHub Issue per feature — the checklist below is the issue body.
# [Feature Name] TODO
- [ ] Add project scaffolding (build/run/test commands)
  Verify: `bun run build && bun run test`
- [ ] Implement module X with interface Y
  Verify: `bun run test -- --grep "module X"`
- [ ] Add tests for edge cases A/B/C
  Verify: `bun run test -- --grep "edge cases"`
- [ ] Wire integration
  Verify: `bun run integration`
- [ ] Add docs
  Verify: `bun run docs && open docs/index.html`
Each item must be independently checkable. This prevents "looks right" progress.
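The "independently checkable" rule can be enforced by parsing the checklist and flagging any item that lacks a `Verify:` line. The sketch below assumes the checklist format shown above (a `- [ ]` item followed by a backtick-quoted `Verify:` command); `TodoItem`, `parse_todo`, and `unverifiable` are hypothetical names, not something the skill ships.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TodoItem:
    title: str
    done: bool
    verify: Optional[str]  # command from the "Verify:" line, if one follows


# "- [ ] title" or "- [x] title"
ITEM_RE = re.compile(r"^\s*-\s*\[( |x|X)\]\s+(.*\S)\s*$")
# "Verify: `command`", optionally written as a nested bullet
VERIFY_RE = re.compile(r"^\s*(?:-\s*)?Verify:\s*`(.+)`\s*$")


def parse_todo(text: str) -> list[TodoItem]:
    """Parse checklist items and attach the Verify command that follows each."""
    items: list[TodoItem] = []
    for line in text.splitlines():
        item = ITEM_RE.match(line)
        if item:
            items.append(TodoItem(item.group(2), item.group(1) != " ", None))
            continue
        verify = VERIFY_RE.match(line)
        if verify and items and items[-1].verify is None:
            items[-1].verify = verify.group(1)
    return items


def unverifiable(items: list[TodoItem]) -> list[str]:
    """Titles of items with no verification command."""
    return [item.title for item in items if item.verify is None]
```

A non-empty result from `unverifiable` is a signal to go back to the checklist before execution starts, since those steps can only be judged by eye.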
Stage D - Execution
Goal: Small diffs, frequent verification, controlled context.
Rules:
For large codebases:
Goal: Force the model to try to break its own work.
Prompts:
Goal: Keep the system easy to delete and rewrite.
Heuristics:
Durable spec + decisions live in .agents/memory/ (not project root); the stepwise todo is tracked as a GitHub Issue:
.agents/memory/
├── spec-[feature-name].md # what/why/constraints
└── decisions-[feature-name].md # tradeoffs, rejected options, assumptions
GitHub Issue (one per feature) # steps + verification commands (checklist body)
Naming: Use the feature/task name (e.g., user-auth, api-refactor) as the filename suffix and the issue title.
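The naming convention is simple enough to compute. As a minimal sketch, assuming the feature name is lowercased and non-alphanumeric runs become hyphens (that slug rule is an assumption, as is the helper name `artifact_paths`):

```python
import re
from pathlib import PurePosixPath

MEMORY_DIR = PurePosixPath(".agents/memory")


def artifact_paths(feature_name: str) -> dict[str, str]:
    """Derive the spec path, decisions path, and issue title for a feature.

    The slug rule (lowercase, hyphen-separated) is an assumption made for
    illustration; the document only gives examples such as "user-auth".
    """
    slug = re.sub(r"[^a-z0-9]+", "-", feature_name.lower()).strip("-")
    if not slug:
        raise ValueError("feature name must contain a letter or digit")
    return {
        "spec": str(MEMORY_DIR / f"spec-{slug}.md"),
        "decisions": str(MEMORY_DIR / f"decisions-{slug}.md"),
        "issue_title": slug,
    }
```

For example, `artifact_paths("User Auth")` yields `.agents/memory/spec-user-auth.md` and `.agents/memory/decisions-user-auth.md`, with `user-auth` as the issue title.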
Why memory/ + Issues:
.agents/memory/ (the source of truth)
Before running autonomous/agentic execution, verify:
| Dimension | Question | If No... |
|-----------|----------|----------|
| Intent | Do you have acceptance criteria and a test harness? | Don't run agent |
| Memory | Do you have durable artifacts (spec/todo) so it can resume? | It will thrash |
| Planning | Can it produce/update a plan with checkpoints? | It will improvise badly |
| Authority | Is what it can do restricted (edit, test, commit)? | Too risky |
| Control Flow | Does it decide next step based on tool output? | It's just generating blobs |
| Tools | Does it have minimum necessary tooling and nothing extra? | Attack surface too large |
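The readiness table can be treated as a pre-flight gate. The sketch below encodes the six dimensions and their "If No..." consequences as data and reports every dimension that is not answered yes; the function name `readiness_blockers` and the choice to treat a missing answer as "no" are assumptions for illustration.

```python
# Dimension -> consequence of answering "no", copied from the table above.
READINESS_CHECKS = {
    "Intent": "Don't run agent",
    "Memory": "It will thrash",
    "Planning": "It will improvise badly",
    "Authority": "Too risky",
    "Control Flow": "It's just generating blobs",
    "Tools": "Attack surface too large",
}


def readiness_blockers(answers: dict[str, bool]) -> list[str]:
    """List 'Dimension: consequence' for each dimension not answered yes.

    An unanswered dimension counts as "no" (an assumption: fail closed).
    """
    return [
        f"{dimension}: {consequence}"
        for dimension, consequence in READINESS_CHECKS.items()
        if not answers.get(dimension, False)
    ]
```

An empty list means all six questions were answered yes; anything else names what to fix before handing control to the agent.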
Approve at meaningful checkpoints (end of todo item, after test suite passes), not every micro-step.
Authoritarian (for correctness):
Edit these files: [paths]
Interface: [exact signatures]
Acceptance criteria: [list]
Required tests: [list]
Don't change anything else.
Options and tradeoffs (for design):
Give me 3 options and a recommendation.
Make the recommendation conditional on constraints A/B/C.
Context discipline (for large codebases):
Only use the files I provided.
If you need more context, ask for a specific file and explain why.
Make it provable:
Add a test that fails on the buggy version and passes on the correct one.
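To make the "Make it provable" prompt concrete, here is a small illustrative sketch. Everything in it is hypothetical: `chunk_buggy` and `chunk_fixed` stand in for the before and after versions of some function, and `fails` is a tiny helper that reports whether a check raises `AssertionError`.

```python
def chunk_buggy(items, size):
    # Bug: the range stops early, so a trailing partial chunk is dropped.
    return [items[i:i + size] for i in range(0, len(items) - size + 1, size)]


def chunk_fixed(items, size):
    # Fix: iterate over every start index, including the last partial chunk.
    return [items[i:i + size] for i in range(0, len(items), size)]


def check_keeps_trailing_chunk(chunk):
    """Regression check: five items in chunks of two must keep the final [5]."""
    assert chunk([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


def fails(check, implementation) -> bool:
    """Return True if the check raises AssertionError for this implementation."""
    try:
        check(implementation)
    except AssertionError:
        return True
    return False
```

The point of the pattern is that the check is demonstrated against both versions: `fails(check_keeps_trailing_chunk, chunk_buggy)` should be true and `fails(check_keeps_trailing_chunk, chunk_fixed)` should be false. A test that passes on both versions proves nothing about the fix.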
When this skill activates, produce:
SPEC-FIRST WORKFLOW
STAGE A - FRAMING:
[3 approaches with tradeoffs]
[Recommendation]
STAGE B - SPEC:
[Draft spec.md content]
STAGE C - TODO:
[Draft todo.md with verification commands]
Ready to proceed to Stage D (execution)?