.codex/skills/dev/SKILL.md
Subagent-driven TDD implementation per task from /plan task list
Implement each task from the /plan task list using a subagent-driven loop: implementer → spec compliance reviewer → code quality reviewer per task.
This command reads the task list created by /plan and implements each task using a three-stage subagent loop. TDD is enforced inside each implementer subagent.
/dev
# Find task list and design doc
ls docs/plans/
Read:
- docs/plans/YYYY-MM-DD-<slug>-tasks.md (extract ALL task text upfront)
- docs/plans/YYYY-MM-DD-<slug>-design.md (including the ambiguity policy section)

Create an empty decisions log at the start of every /dev session:
# docs/plans/YYYY-MM-DD-<slug>-decisions.md
Format for each entry:
## Decision N
**Date**: YYYY-MM-DD
**Task**: Task N — <title>
**Gap**: [what the spec didn't cover]
**Score**: [filled checklist total]
**Route**: PROCEED / SPEC-REVIEWER / BLOCKED
**Choice made**: [if PROCEED: what was decided and why]
**Status**: RESOLVED / PENDING-DEVELOPER-INPUT
<HARD-GATE: /dev start>
Do NOT write any code until ALL confirmed:
1. git branch --show-current output is NOT main or master
2. git worktree list shows the worktree path for this feature
3. Task list file confirmed to exist (use Read tool — do not assume)
4. Decisions log file created
</HARD-GATE>
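The gate items that can be checked mechanically could be scripted roughly as follows. This is a minimal sketch, not part of the skill: the function names `branch_allowed` and `dev_start_gate` are hypothetical, and item 2 (the worktree check) is left to a manual look at the output of `git worktree list`.

```shell
# Hypothetical helpers sketching the /dev start gate; not part of forge.

# Succeeds only when the branch name is non-empty and is neither main nor master (item 1).
branch_allowed() {
  case "$1" in
    ""|main|master) return 1 ;;
    *) return 0 ;;
  esac
}

# Checks items 1, 3 and 4 for a branch name, a task list path and a decisions log path.
# Prints the first failing item and returns 1, or prints "gate passed" and returns 0.
dev_start_gate() {
  if ! branch_allowed "$1"; then
    echo "blocked: branch is '$1'"; return 1
  fi
  if [ ! -f "$2" ]; then
    echo "blocked: task list not found: $2"; return 1
  fi
  if [ ! -f "$3" ]; then
    echo "blocked: decisions log not found: $3"; return 1
  fi
  echo "gate passed"
}

# Example call inside the feature worktree (the variables are placeholders):
#   dev_start_gate "$(git branch --show-current)" "$TASKS_FILE" "$DECISIONS_FILE"
```

Passing the branch name in as an argument keeps the check testable without a repository; the real value would come from `git branch --show-current`.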
Before starting the per-task loop, check for cross-developer conflicts:
# Auto-sync to get latest team state (non-blocking)
forge sync || true
# Check for conflicts with the current beads issue
bash scripts/conflict-detect.sh --issue <beads-id>
If exit code 2 (validation error): show error message, abort — do not show conflict prompt.
If exit code 1 (conflicts found):
- n: exit cleanly, no side effects
- y: log override via `bd comments add <id> "Conflict override: proceeding despite overlap with <conflicting-issues>"`, then continue to Per-Task Loop

If exit code 0: proceed silently to Per-Task Loop.
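The three exit codes and the y/n answer can be read as a small decision table. The sketch below is one way to express it; `conflict_action` and its action names are hypothetical, and treating unknown exit codes as an abort is an assumption, not something the skill specifies.

```shell
# Hypothetical helper: maps the conflict-detect.sh exit code, plus the developer's
# y/n answer when conflicts were found, to the next action.
conflict_action() {
  case "$1" in
    0) echo "proceed" ;;                          # no conflicts: continue silently
    1) case "${2:-}" in
         y) echo "log-override-and-proceed" ;;    # bd comments add ..., then continue
         n) echo "exit" ;;                        # exit cleanly, no side effects
         *) echo "ask" ;;                         # no answer yet: show the conflict prompt
       esac ;;
    2) echo "abort" ;;                            # validation error: no conflict prompt
    *) echo "abort" ;;                            # assumption: unknown codes are errors
  esac
}

# Example wiring (not run here):
#   bash scripts/conflict-detect.sh --issue "$ISSUE_ID"
#   conflict_action "$?" "$ANSWER"
```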
Repeat for each task in the task list, in order:
Provide the subagent with:
The implementer subagent:
<HARD-GATE: TDD enforcement (inside implementer subagent)>
Do NOT write any production code until:
1. A FAILING test exists for that code
2. The test has been run and output shows it FAILING
3. The failure reason matches the expected missing behavior
If code was written before its test: delete it. Start with the test.
"The test would obviously fail" is not evidence. Run it and show the output.
</HARD-GATE>
If the implementer encounters something not specified in the design doc, STOP and fill this checklist BEFORE deciding how to proceed:
Gap: [describe exactly what the spec doesn't cover]
Score each dimension (0=No / 1=Possibly / 2=Yes):
[ ] 1. Files affected beyond the current task?
[ ] 2. Changes a function signature or public export?
[ ] 3. Changes a shared module used by other tasks?
[ ] 4. Changes or touches persistent data or schema?
[ ] 5. Changes user-visible behavior not discussed in design doc?
[ ] 6. Affects auth, permissions, or data exposure?
[ ] 7. Hard to reverse without cascading changes to other files?
TOTAL: ___ / 14
Mandatory overrides — any of these = automatically BLOCKED:
[ ] Security dimension (6) scored 2
[ ] Schema migration or data model change
[ ] Removes or changes an existing public API endpoint
[ ] Affects a task that is already implemented and committed
Score routing:
Log the decision entry before continuing.
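The arithmetic behind the checklist is simple enough to sketch. The helpers below are hypothetical and cover only what the checklist states directly: the seven-score total out of 14 and the security override on dimension 6. The routing thresholds and the other three overrides are not encoded, since they need judgment or values not restated here.

```shell
# Hypothetical helper: sums the seven dimension scores (each 0, 1 or 2)
# and prints the total in the checklist's "N / 14" form.
gap_total() {
  if [ "$#" -ne 7 ]; then
    echo "expected 7 scores, got $#" >&2
    return 2
  fi
  total=0
  for s in "$@"; do
    case "$s" in
      0|1|2) total=$((total + s)) ;;
      *) echo "invalid score: $s" >&2; return 2 ;;
    esac
  done
  echo "$total / 14"
}

# Succeeds when dimension 6 (auth, permissions, data exposure) scored 2,
# which the mandatory overrides treat as automatically BLOCKED.
security_override() {
  [ "$6" = "2" ]
}

# Example: a gap touching a shared module and visible behavior, each scored 1.
#   gap_total 0 0 1 0 1 0 0
```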
After the implementer finishes the task, dispatch a spec compliance reviewer subagent.
Provide:
- git diff for this task's commits

Reviewer checks:
If spec issues found: implementer fixes → re-review → repeat until ✅
<HARD-GATE: spec before quality>
Do NOT dispatch code quality reviewer until spec compliance reviewer returns ✅ for this task.
Running quality review before spec compliance is the wrong order.
</HARD-GATE>
After spec ✅, dispatch a code quality reviewer subagent.
Provide:
git diff)

Reviewer checks:
If quality issues found: implementer fixes → re-review → repeat until ✅
<HARD-GATE: task completion>
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE.
Do NOT mark task complete or move to next task until ALL confirmed in this session:
1. Spec compliance reviewer returned ✅
2. Code quality reviewer returned ✅
3. Identify what command proves this task is done (e.g. `bun test`, a CLI invocation, a script run).
4. Run it fresh — show the actual output. "Last run was fine" is not evidence.
5. Tests run fresh — actual output shows passing.
6. Implementer has committed (git log shows the commit).
7. `bash scripts/beads-context.sh update-progress <id> <task-num> <total> "<title>" <commit-sha> <test-count> <gate-count>` ran successfully (exit code 0). If it fails: STOP. Show error. Do not proceed to next task.
Forbidden phrases (these are not evidence):
- "should pass"
- "looks good"
- "seems to work"
</HARD-GATE>
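A completion report can be screened for the forbidden phrases before it is accepted. The sketch below is illustrative only: `has_forbidden_phrase` is a hypothetical helper, and matching case-insensitively is an assumption. Passing the screen proves nothing by itself; fresh command output is still required.

```shell
# Hypothetical helper: succeeds (exit 0) when the given text contains one of the
# forbidden phrases listed in the task completion gate.
has_forbidden_phrase() {
  printf '%s\n' "$1" | grep -qiE 'should pass|looks good|seems to work'
}

# Example:
#   if has_forbidden_phrase "$REPORT"; then
#     echo "rejected: replace the claim with fresh command output"
#   fi
```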
Mark task complete. Move to next task.
After all tasks are complete (or BLOCKED):
Dispatch a final code reviewer for the full implementation:
If any decisions have Status=PENDING-DEVELOPER-INPUT:
⏸️ /dev blocked — developer input needed
The following decisions were deferred during implementation:
Decision 1: [gap description]
Task: Task N — <title>
Score: 11/14 (mandatory override: schema change)
Options considered: [A] vs [B]
Recommendation: [A] because [reason]
Blocked tasks: Task 6, Task 7 (depend on this decision)
Decision 2: ...
Please review and respond. After decisions are resolved, the implementer
will complete the blocked tasks and re-run spec + quality review.
Wait for developer input. After decisions resolved: implement blocked tasks → spec review → quality review → complete.
<HARD-GATE: /dev exit>
Do NOT declare /dev complete until:
1. All tasks are marked complete OR have BLOCKED status with PENDING-DEVELOPER-INPUT
2. BLOCKED decisions have been surfaced to developer and are awaiting input
3. Final code reviewer has approved (or issues fixed and re-reviewed)
4. All decisions in decisions log have Status of RESOLVED or PENDING-DEVELOPER-INPUT
5. No unresolved spec or quality issues remain
</HARD-GATE>
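Item 4 of the exit gate can be checked against the decisions log itself. The following is a sketch under the assumption that each entry carries a `**Status**:` line exactly as in the entry format; `statuses_valid` and `pending_count` are hypothetical names.

```shell
# Hypothetical helper: succeeds when every **Status** line in the log is either
# RESOLVED or PENDING-DEVELOPER-INPUT. A log with no entries also passes.
statuses_valid() {
  ! grep '^\*\*Status\*\*:' "$1" \
    | grep -qvE '^\*\*Status\*\*: (RESOLVED|PENDING-DEVELOPER-INPUT)[[:space:]]*$'
}

# Hypothetical helper: prints how many decisions are still waiting on the developer.
pending_count() {
  grep -c '^\*\*Status\*\*: PENDING-DEVELOPER-INPUT' "$1" || true
}

# Example (the variable is a placeholder for the decisions log path):
#   statuses_valid "$DECISIONS_FILE" && echo "pending: $(pending_count "$DECISIONS_FILE")"
```

A non-zero pending count does not fail the gate; it means the blocked decisions still have to be surfaced to the developer.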
bash scripts/beads-context.sh validate <id>
bash scripts/beads-context.sh stage-transition <id> dev validate \
--summary "<N tasks done, M decision gates fired>" \
--decisions "<key spec gaps and how they were resolved>" \
--artifacts "<changed source files and test files>" \
--next "<validation priorities — lint issues, type concerns>"
The frequency of decision gates is a plan quality metric:
Document the gate count in the final commit message.
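The gate count can be read off the decisions log rather than tallied by hand. This sketch assumes one `## Decision N` heading per gate, as in the entry format; `gate_count_line` is a hypothetical helper.

```shell
# Hypothetical helper: counts "## Decision N" headings in the decisions log and
# prints a line suitable for the final commit message.
gate_count_line() {
  n=$(grep -c '^## Decision ' "$1" || true)
  echo "Decision gates fired: $n"
}

# Example (the variable is a placeholder for the decisions log path):
#   gate_count_line "$DECISIONS_FILE"
```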
✓ Task 1: Types and interfaces — COMPLETE
Spec: ✅ Quality: ✅ Tests: 4/4 passing Commit: abc1234
Decision gates: 0
✓ Task 2: Validation logic — COMPLETE
Spec: ✅ Quality: ✅ Tests: 8/8 passing Commit: def5678
Decision gates: 1 (PROCEED, score 2 — documented in decisions log)
✓ Task 3: API endpoint — COMPLETE
Spec: ✅ Quality: ✅ Tests: 6/6 passing Commit: ghi9012
Decision gates: 0
✓ Final code review: ✅ (coherent, consistent, correctly integrated)
✓ Decisions log: docs/plans/2026-02-26-stripe-billing-decisions.md
- Decision 1: RESOLVED (score 2, proceeded with conservative choice)
- Decision gates fired: 1 (plan quality: Good)
✓ Beads updated: forge-xyz → implementation complete
Ready for /validate
Utility: /status → Understand current context before starting
Stage 1: /plan → Design intent → research → branch + worktree + task list
Stage 2: /dev → Implement each task with subagent-driven TDD (you are here)
Stage 3: /validate → Type check, lint, tests, security — all fresh output
Stage 4: /ship → Push + create PR
Stage 5: /review → Address GitHub Actions, Greptile, SonarCloud
Stage 6: /premerge → Update docs, hand off PR to user
Stage 7: /verify → Post-merge CI check on main