plugins/tdd-pipeline/skills/tdd-orchestrate/SKILL.md
Enforce strict red-green-refactor TDD for any task. Use when implementing features, fixing bugs, adding functionality, or building new modules. Routes to the full 7-agent pipeline for new modules with 3+ behaviors, or runs inline red-green-refactor for bug fixes and small changes. Triggers on: "use TDD", "fix this bug", "add a feature", "implement", "run the pipeline", "TDD pipeline", "build a module with TDD", or any coding task in a project with TDD in its CLAUDE.md. Agents inherit the session model; pass `--model <name>` to pin one for the run.
Strict red-green-refactor for every code change.
Task: $0
If no task was provided, ask the user what to implement or fix. One sentence is enough.
Agents inherit your session model by default. Sonnet often fails the reviewer gates and loops many times, which costs more wall-clock and tokens than Opus one-shotting the stage — so prefer a capable model for this pipeline.
To pin a model regardless of session, pass `--model <name>` in the task (e.g. `/tdd-orchestrate --model opus parser`). Strip the flag from the task text before briefing agents, then pass `model: <name>` to every Agent dispatch in this skill (inline and full pipeline alike). With no flag, omit `model:` so each agent inherits the session model.
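The flag handling described above could be sketched roughly as follows. This is a minimal illustration, not part of the skill: `split_model_flag` and `dispatch_kwargs` are hypothetical helper names, and the sketch assumes the flag and its value are separated by whitespace.

```python
from typing import Dict, Optional, Tuple


def split_model_flag(task: str) -> Tuple[str, Optional[str]]:
    """Return (task text without the flag, model name or None).

    Hypothetical helper: the skill describes the behavior, not this function.
    """
    tokens = task.split()
    kept = []
    model = None
    i = 0
    while i < len(tokens):
        if tokens[i] == "--model" and i + 1 < len(tokens):
            model = tokens[i + 1]
            i += 2  # skip the flag and its value
            continue
        kept.append(tokens[i])
        i += 1
    return " ".join(kept), model


def dispatch_kwargs(model: Optional[str]) -> Dict[str, str]:
    # With no flag, omit the key entirely so the agent inherits the session model.
    return {"model": model} if model else {}
```

With the example from the text, `split_model_flag("--model opus parser")` yields the task `"parser"` and the model `"opus"`, and every later dispatch would receive `model: opus`.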
Use the full pipeline (see "Pipeline" section) when:
Use inline red-green-refactor when:
Most tasks are inline. Default to inline unless the scope clearly warrants the full pipeline.
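As a rough sketch of the routing rule, using only the threshold stated in the skill description (new modules with 3+ behaviors go to the full pipeline; bug fixes and small changes stay inline). The function name and its two inputs are assumptions made for illustration; the real decision is a judgment call by the orchestrator.

```python
def choose_route(is_new_module: bool, behavior_count: int) -> str:
    """Hypothetical routing helper: default to inline unless scope is clearly large."""
    if is_new_module and behavior_count >= 3:
        return "pipeline"
    return "inline"
```

A bug fix in existing code stays inline under this sketch no matter how many behaviors it touches, which matches the "default to inline" guidance.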
Inline uses two agents and two commits. It skips the reviewer stages and the stub/RED-gate dance, because inline targets existing code where the test fails against the bug directly (no stub to typecheck against).
You are still a dispatcher. Do NOT write source or test files yourself. The orchestrator's job here is:
Use SendMessage to continue agents across stages when useful (see "Continuation Strategy" below).
Read the relevant source and test files just enough to write a precise agent brief. Cite `path:line` for the target. Identify the existing test pattern in the file so the agent matches it. Do NOT write a plan document.
Dispatch subagent_type: tdd-pipeline:test-writer
with:
Run the project's test command yourself, confirm the new test fails for the right reason (not a compile error), then commit the test with a message like `Test ... (RED)` or `Add failing test for ...`.
Watch for default-value traps. If a test asserts a falsy value (false, nil, 0, "") that the existing buggy code already returns, the test passes for the wrong reason. Reject and re-dispatch with a brief naming the specific input that should produce a non-default result.
The test must fail for the RIGHT reason. A compile error is not a valid red state — surface it and ask the test-writer to fix the stubs.
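One way to picture the difference between a valid and an invalid red state is sketched below. Python has no separate compile step, so this sketch assumes that any exception other than `AssertionError` (for example a `NameError` from a missing symbol) plays the role of a compile error. `classify_red` is a hypothetical name.

```python
from typing import Callable


def classify_red(test_fn: Callable[[], None]) -> str:
    """Classify one test run (hypothetical helper, not part of the skill)."""
    try:
        test_fn()
    except AssertionError:
        return "red"      # failed on an assertion: a valid RED
    except Exception:
        return "broken"   # e.g. NameError or ImportError: not a valid RED
    return "passes"       # no failure at all: the test does not catch the bug
```

Only the `"red"` outcome would justify committing the test; `"broken"` goes back to the test-writer and `"passes"` suggests the test is checking the wrong thing.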
Dispatch subagent_type: tdd-pipeline:implementer
with:
Run the project's test command yourself, confirm all tests pass (not just the new one), then commit with a message describing the fix.
If the code is clear and clean, skip this step. If you refactor, dispatch a fresh implementer with a brief naming the specific cleanup; do not extend the GREEN agent's scope.
If the task involves multiple bugs, repeat from step 1 for each. One bug = one RED + GREEN pair.
Inline does NOT use the test-reviewer or code-reviewer stages. The full test suite is your safety net. If a bug is genuinely subtle (API design, security boundary, concurrency), promote it to the full pipeline instead — don't bolt reviewers onto inline.
These are real mistakes from past sessions. Each one wastes significant time.
Don't test duplicated logic. Your test must import and call the production code. If your test reimplements the logic it's supposed to verify, it proves nothing.
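A small illustration of the difference, assuming a hypothetical production function `slugify` (defined inline here only so the sketch is self-contained; in a real project it would be imported from the source module):

```python
def slugify(title: str) -> str:
    """Stand-in for production code (hypothetical)."""
    return "-".join(title.lower().split())


def test_duplicated_logic() -> None:
    # Anti-pattern: re-implements the logic and never calls slugify(),
    # so it keeps passing even if slugify() is broken.
    title = "Hello World"
    expected = "-".join(title.lower().split())
    assert expected == "hello-world"


def test_calls_production_code() -> None:
    # Calls the real function, so a regression in slugify() fails the test.
    assert slugify("Hello World") == "hello-world"
```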
Don't skip the red step. Every test must fail before you write implementation code. Writing the test and implementation together means you don't know whether the test catches regressions.
Don't get stuck planning. The plan is: write a failing test. If you've spent more than 2 minutes without creating or editing a test file, you're stalling.
Don't assert default values against stubs. If your test checks that a function returns false and the stub returns false by default, the test passes without any implementation. Either test for a truthy/non-default value, use inputs that force a non-default result, or make the stub return a deliberately wrong value so the test fails until real logic exists.
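The trap can be shown with a tiny stub. Everything here is illustrative: `is_valid_port` is a hypothetical function, and the stub body stands in for whatever default the language or the test-writer produces.

```python
def is_valid_port(port: int) -> bool:
    """Stub: signature only, returns the type's default (hypothetical example)."""
    return False


def weak_test() -> None:
    # Passes against the stub: False is what the stub already returns.
    assert is_valid_port(70000) is False


def strong_test() -> None:
    # Fails against the stub: a valid port must produce True.
    assert is_valid_port(8080) is True
```

Against the stub, `weak_test` passes with no implementation at all, while `strong_test` raises `AssertionError` until real logic exists, which is the red state the pipeline needs.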
Don't test the wrong file descriptor. When testing TTY behavior, verify which fd (stdin, stdout, stderr) the code actually checks.
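As a sketch of what this looks like in a test, assume a hypothetical function `use_color` that decides on colored output by checking stderr. The fake stream below only overrides `isatty()`; the names are illustrative, not from any project.

```python
import io


class FakeStream(io.StringIO):
    """In-memory stream whose isatty() result is controlled by the test."""

    def __init__(self, tty: bool) -> None:
        super().__init__()
        self._tty = tty

    def isatty(self) -> bool:
        return self._tty


def use_color(stderr: io.TextIOBase) -> bool:
    """Hypothetical production code: decides color from stderr, not stdout."""
    return stderr.isatty()


def test_color_follows_stderr() -> None:
    # Only the stream the code actually checks is varied here; toggling
    # a fake stdout instead would leave this behavior untested.
    assert use_color(FakeStream(tty=True)) is True
    assert use_color(FakeStream(tty=False)) is False
```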
Don't mock what you can call. If the real code is available and fast, call it. Mocks diverge from production and hide bugs.
Use this section when the routing decision above chose the full pipeline. For inline tasks, ignore everything below.
If the task specifies a module name, use it. Otherwise, ask the user for the module name and behavior list.
Example: /tdd-orchestrate parser
You are a PURE DISPATCHER. You NEVER write code.
Violations you MUST NOT commit:
What you DO:
If you catch yourself about to use Write or Edit on a source or test file, STOP. Dispatch an agent.
These are violations of the full pipeline. If inline is the correct routing (see "Decide" above), several are legitimate inline behaviors; promote to the full pipeline only when scope justifies it.
In all cases: if you'd be using Write or Edit yourself on a source or test file, STOP. Dispatch an agent.
Use the Agent tool with one of the plugin's four role-specific agent types. The role instructions are already baked into each agent's system prompt — do NOT read or inject skill content; pass only the module name, behavior list, type signatures, and other dispatch inputs.
Agent types:
- `subagent_type: tdd-pipeline:test-writer`: writes tests and minimal type stubs
- `subagent_type: tdd-pipeline:test-reviewer`: reviews tests (read-only)
- `subagent_type: tdd-pipeline:implementer`: writes implementation code to pass tests
- `subagent_type: tdd-pipeline:code-reviewer`: reviews implementation (read-only)

The test-writer and implementer agents bundle the file/shell/quality briefing; the reviewers do not, since reviewers never write files.
If the task carried a `--model <name>` flag (see "Model"), pass `model: <name>` on every Agent dispatch below. Otherwise omit it and let agents inherit the session model.
Default to SendMessage, not fresh Agent dispatches, inside any fix loop.
When a reviewer reports NEEDS_FIXES, the just-finished writer agent still exists. Continuing it with SendMessage preserves all the context it has already built up — the file layout it learned, the design doc it read, the tests it just wrote. A fresh Agent dispatch re-pays all that cost.
Use SendMessage when:
Use a fresh Agent dispatch when:
Each completion notification reports the agent ID and explicitly says "use SendMessage with to: '<id>' to continue this agent." Capture and use it.
Agents pay a cold-start cost: they read CLAUDE.md, grep the codebase, re-discover the layout you already know. Every fact you inline in the brief is a tool call the agent doesn't have to make.
Inline rather than reference, within reason:
- `path:line` for known targets. The agent goes straight there; no grep dance.
- `std.Io.File.stdout()`, not `std.io.getStdOut()`").

Don't inline indiscriminately:

- `path:line:line` span and one sentence suffice.

Cap exploration:
Tell agents "don't read more than N files; if you can't find what you need, report back." Prevents 30-tool-call discovery hikes when your brief was incomplete.
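A brief that follows this advice might be assembled along these lines. This is only a sketch: `build_brief`, its parameters, and the example path are hypothetical, and the actual brief is free-form text written by the orchestrator.

```python
from typing import List


def build_brief(target: str, facts: List[str], max_files: int) -> str:
    """Assemble an agent brief (hypothetical helper, not part of the skill)."""
    lines = [f"Target: {target}"]            # path:line so the agent skips the grep
    lines += [f"- {fact}" for fact in facts]  # facts the agent would otherwise rediscover
    lines.append(
        f"Don't read more than {max_files} files; "
        "if you can't find what you need, report back."
    )
    return "\n".join(lines)
```

For example, `build_brief("src/parser.py:42", ["Tests use the table-driven pattern"], 5)` produces a three-line brief whose last line is the exploration cap.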
Trust agent verification:
If the agent verified the tests pass and reported the counts, spot-check by running the test command once yourself — don't ask the agent to re-verify. Trust but verify.
Read the project's CLAUDE.md for test commands, file paths, and language-specific context. Every value below marked with (CLAUDE.md) must come from there.
Dispatch subagent_type: tdd-pipeline:test-writer
with:
The agent writes the test file and type stubs to the source file path (CLAUDE.md). Stubs contain only signatures -- no real logic.
Dispatch subagent_type: tdd-pipeline:test-reviewer
with:
Fix loop: if NEEDS_FIXES, use SendMessage to continue the original test-writer agent with the reviewer's feedback as the fix list (see "Continuation Strategy"). The writer already has the design and file context — a fresh dispatch re-pays that cost. Then re-dispatch the test-reviewer (clean perspective on the fixed tests). Max 3 rounds, then escalate to user.
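The control flow of this fix loop can be sketched as below. The two callbacks are stand-ins: `review` represents a fresh test-reviewer dispatch that returns a fix list (empty meaning approved), and `continue_writer` represents continuing the original writer via SendMessage. The sketch assumes a "round" means one fix followed by one re-review; the names are hypothetical.

```python
from typing import Callable, List

MAX_ROUNDS = 3


def fix_loop(
    review: Callable[[], List[str]],
    continue_writer: Callable[[List[str]], None],
) -> str:
    """Return "approved" or "escalate" (hypothetical sketch of the loop)."""
    fixes = review()                # initial review
    rounds = 0
    while fixes:                    # non-empty fix list means NEEDS_FIXES
        if rounds == MAX_ROUNDS:
            return "escalate"       # hand the problem to the user
        continue_writer(fixes)      # same writer agent, context preserved
        fixes = review()            # fresh reviewer, clean perspective
        rounds += 1
    return "approved"
```

The asymmetry is deliberate: the writer is continued so it keeps its context, while the reviewer is re-dispatched so it judges the fixed tests without bias from the previous round.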
Run the module test command (CLAUDE.md).
Only proceed when tests compile and all fail.
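The gate condition can be written down as a small check. This is a sketch under assumptions: `red_gate` is a hypothetical name, the caller is assumed to have already parsed the test command's output into a compile flag and a per-test pass/fail map, and treating an empty test run as a blocker is an added assumption rather than something the skill states.

```python
from typing import Dict, List


def red_gate(compiled: bool, passed: Dict[str, bool]) -> List[str]:
    """Return reasons to block; an empty list means proceed to the implementer."""
    if not compiled:
        return ["tests do not compile"]
    if not passed:
        return ["no tests ran"]  # assumption: an empty run is not a valid RED
    # Any test that already passes against the stubs is suspect.
    return [f"{name} passes against the stub" for name, ok in passed.items() if ok]
```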
Dispatch subagent_type: tdd-pipeline:implementer
with:
The agent replaces the stub source file with the real implementation to make all tests pass.
Run these checks yourself (do NOT dispatch an agent):
If any check fails: use SendMessage to continue the implementer with the specific failure (see "Continuation Strategy"). Do NOT waste a reviewer dispatch and do NOT spawn a fresh implementer — the one that just finished still has the file loaded.
Dispatch subagent_type: tdd-pipeline:code-reviewer
with:
Fix loop: if NEEDS_FIXES, use SendMessage to continue the original implementer agent with the reviewer's feedback as the fix list (see "Continuation Strategy"). The implementer already has the test file and implementation context loaded — a fresh dispatch re-pays that cost. Then re-dispatch the code-reviewer (clean perspective on the fixed code). Max 3 rounds, then escalate to user.
After code reviewer approves: