michaelalber/tdd-agent

Name: tdd-agent
Author: michaelalber

skills/team/tdd-agent/SKILL.md

npx skillsauth add michaelalber/ai-toolkit tdd-agent

TDD Agent (Autonomous Mode)

"Make it work, make it right, make it fast — in that order." — Kent Beck

Core Philosophy

The TDD Agent operates autonomously through the complete TDD cycle. Unlike pair programming, the AI drives all phases. Stricter guardrails apply because no human is catching mistakes in real time.

Non-Negotiable Constraints:

Every implementation MUST have a failing test first
Every test MUST be verified to fail before implementation
Every refactoring MUST maintain green tests
Every phase transition MUST be explicitly logged
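The logging constraint can be sketched as a small guard that refuses out-of-order or unlogged transitions. This is an illustrative sketch only: `PhaseLog`, `ALLOWED`, and the log-entry fields are hypothetical names, and the allowed-transition table is an assumed reading of the RED/GREEN/REFACTOR cycle, not the skill's actual log template (which lives in references/guardrails.md).

```python
from datetime import datetime, timezone

# Assumed reading of the cycle: RED -> GREEN -> REFACTOR, then either
# another REFACTOR step or the next RED.
ALLOWED = {
    "RED": {"GREEN"},
    "GREEN": {"REFACTOR"},
    "REFACTOR": {"REFACTOR", "RED"},
}


class PhaseLog:
    """Hypothetical helper: records every phase transition with its evidence."""

    def __init__(self, start="RED"):
        self.phase = start
        self.entries = []

    def transition(self, new_phase, evidence):
        if new_phase not in ALLOWED[self.phase]:
            raise ValueError(f"illegal transition {self.phase} -> {new_phase}")
        if not evidence.strip():
            raise ValueError("a transition needs logged evidence (test output)")
        self.entries.append({
            "from": self.phase,
            "to": new_phase,
            "evidence": evidence,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.phase = new_phase


log = PhaseLog()
log.transition("GREEN", "test_adds_two_numbers FAILED: expected 5, got None")
log.transition("REFACTOR", "1 passed")
```

Skipping a phase (for example REFACTOR straight to GREEN) raises `ValueError` in this sketch, so a missing log entry cannot go unnoticed.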

Kent Beck's 12 Test Desiderata (agent responsibilities) and the per-phase knowledge-base lookup protocol live in references/knowledge-lookups.md — consult it at session start and before each phase transition.

Knowledge Base Lookups

Use search_knowledge (grounded-code-mcp) to ground decisions in authoritative references. Search before each phase transition (RED→GREEN→REFACTOR) and cite the source path in phase logs. The full query→trigger table and the Kent Beck desiderata are in references/knowledge-lookups.md.

Workflow

Drive each behavior through three phases. Run and verify after every step — never assume.

RED — Write Failing Test
1. Identify smallest testable behavior
2. Write test for that behavior
3. RUN the test suite
4. VERIFY the new test fails
5. VERIFY failure is for the expected reason
6. Only then, proceed to GREEN

GREEN — Minimal Implementation
1. Review the failing test
2. Identify minimal code to pass
3. Implement ONLY what's needed
4. RUN the test suite
5. VERIFY all tests pass
6. Only then, proceed to REFACTOR
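A minimal sketch of the GREEN steps, assuming a single existing test: the hardcoded return value is the simplest code that passes, and a later test would force a more general implementation. `add`, `AddTest`, and `all_green` are hypothetical names, and `unittest` is used only as an example runner.

```python
import unittest


def add(a, b):
    # Hardcoded on purpose: nothing more is needed to pass the one existing test.
    return 5


class AddTest(unittest.TestCase):
    def test_adds_two_numbers(self):
        self.assertEqual(add(2, 3), 5)


def all_green(test_case_class):
    """Run the tests and return True only if at least one ran and none failed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun > 0 and result.wasSuccessful()


green = all_green(AddTest)
print(green)
```

Requiring `testsRun > 0` guards against the case where an empty or misconfigured suite is mistaken for a passing one.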

REFACTOR — Improve Structure
1. Confirm all tests pass
2. Identify ONE improvement
3. Make the change
4. RUN the test suite
5. VERIFY all tests still pass
6. If red, REVERT immediately
7. Repeat or proceed to next RED
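The revert-on-red rule can be sketched as a function that keeps a candidate change only when the suite stays green. Here `run_tests` is a stand-in for a real test-suite run and all function names are hypothetical; in a real session the revert would more likely be a version-control operation.

```python
from typing import Callable, Tuple

Impl = Callable[[int, int], int]


def run_tests(impl: Impl) -> bool:
    """Stand-in for a full test-suite run; True when every case passes."""
    cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
    return all(impl(*args) == expected for args, expected in cases)


def refactor_step(current: Impl, candidate: Impl) -> Tuple[Impl, str]:
    """Try ONE change; keep it only if the suite stays green, else revert."""
    if not run_tests(current):
        raise RuntimeError("refactoring must start from green")
    if run_tests(candidate):
        return candidate, "kept"
    return current, "reverted"


def original(a, b):
    return a + b


def good_refactor(a, b):
    return sum((a, b))


def broken_refactor(a, b):
    return a - b  # changes behavior, so the suite goes red


impl, first_outcome = refactor_step(original, good_refactor)
impl, second_outcome = refactor_step(impl, broken_refactor)
print(first_outcome, second_outcome)
```

The broken candidate is discarded rather than patched, which mirrors step 6: a red refactor is reverted, never fixed forward.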

Run the RED/GREEN/REFACTOR self-check at each transition; stop and correct if any item fails. The full self-check lists, the mandatory phase-log templates, and the explicit-reasoning template are in references/guardrails.md. A complete multi-iteration worked example (user-service feature) and a minimal Calculator walkthrough are in references/autonomous-protocol.md.

State Block

<tdd-state>
phase: [RED | GREEN | REFACTOR]
iteration: N
feature: [description]
current_test: [test name or none]
tests_passing: [true | false]
test_count: N
failing_count: N
last_verified: [timestamp or "just now"]
</tdd-state>

Each iteration closes with an updated <tdd-state> block and a mandatory phase-log entry.
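One way to keep the block consistent is to render it from a single data structure. The sketch below is an assumption-laden illustration: `TddState` and `render` are hypothetical names, and deriving `tests_passing` from the counts is a design choice made here, not something the skill specifies.

```python
from dataclasses import dataclass

PHASES = ("RED", "GREEN", "REFACTOR")


@dataclass
class TddState:
    phase: str
    iteration: int
    feature: str
    current_test: str
    test_count: int
    failing_count: int
    last_verified: str

    @property
    def tests_passing(self) -> bool:
        # Assumption: "passing" means at least one test ran and none failed.
        return self.test_count > 0 and self.failing_count == 0

    def render(self) -> str:
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase}")
        fields = [
            ("phase", self.phase),
            ("iteration", self.iteration),
            ("feature", self.feature),
            ("current_test", self.current_test),
            ("tests_passing", str(self.tests_passing).lower()),
            ("test_count", self.test_count),
            ("failing_count", self.failing_count),
            ("last_verified", self.last_verified),
        ]
        body = "\n".join(f"{key}: {value}" for key, value in fields)
        return f"<tdd-state>\n{body}\n</tdd-state>"


state = TddState("RED", 1, "add two numbers", "test_adds_two_numbers", 1, 1, "just now")
block = state.render()
print(block)
```

Computing `tests_passing` instead of storing it means the flag cannot drift out of sync with `failing_count`.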

Output Template

Phase logs (RED / GREEN / REFACTOR markdown templates) — references/guardrails.md.
Explicit reasoning (options → reasoning → choice at each decision point) — references/guardrails.md.
Session init, worked iteration example, completion summary — references/autonomous-protocol.md.

Guardrails

Four hard gates — see references/guardrails.md for implementation detail, violation responses, and the severity table.

No Implementation Without Failure Proof — verify a test exists, was just run, output shows failure, and the failure is logged. If any is missing, stop.
Verify Before Claiming — never claim a test passes or fails without running it and showing actual output.
Minimal Means Minimal — during GREEN, if a simpler or hardcoded solution would pass, simplify.
Rollback on Red — if tests fail during REFACTOR, revert immediately; never fix a broken refactor forward.
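The first gate lists four conditions that must all hold before implementation starts. A minimal sketch of that check follows; `FailureProof`, `missing_proof`, and `may_implement` are hypothetical names, and the real gate, violation responses, and severity table are defined in references/guardrails.md.

```python
from dataclasses import dataclass


@dataclass
class FailureProof:
    test_exists: bool
    was_just_run: bool
    output_shows_failure: bool
    failure_logged: bool


def missing_proof(proof: FailureProof) -> list:
    """Return the names of the conditions that are not yet satisfied."""
    return [name for name, ok in vars(proof).items() if not ok]


def may_implement(proof: FailureProof) -> bool:
    """Gate: implementation is allowed only when nothing is missing."""
    return not missing_proof(proof)


complete = FailureProof(True, True, True, True)
unlogged = FailureProof(True, True, True, False)
print(may_implement(complete), missing_proof(unlogged))
```

Returning the names of the missing conditions, rather than a bare boolean, gives the agent something concrete to report when it stops.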

The AI discipline rules (Trust Nothing, Be Boringly Predictable, Fail Loudly, Prefer Smaller Steps) are in references/guardrails.md.

Integration with Other Skills

This skill is an operating mode of the canonical tdd loop, not a replacement for it.

tdd — The canonical inner loop this mode drives. Defines the two critical test properties (behavioral, structure-insensitive), the per-cycle self-check, the GREEN strategies (Fake It / Obvious / Triangulation, with per-language idioms in its references/), and the REFACTOR smell catalog (the tdd skill's references/code-smells.md and references/refactoring-catalog.md). Load those on demand during GREEN/REFACTOR.
evaluate-tests — Run after the session to audit test quality and TDD compliance (commit-history scorecard, anti-pattern detection).

Error Recovery

Common recovery cases (tests won't run, wrong test failure, can't make test pass, state confusion) and their step-by-step protocols are in references/autonomous-protocol.md. In every case: fix infrastructure before writing implementation, examine the actual error not the expected one, and reconstruct the state block from a full test-suite run when state is unclear.

Fully autonomous TDD with strict guardrails. Use when you want the AI to drive the entire RED-GREEN-REFACTOR cycle independently while maintaining TDD discipline.

1 star

development

Updated Jun 27, 2026

Install this skill globally with one command:

npx skillsauth add michaelalber/ai-toolkit tdd-agent

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean.

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Clean (95%): Trivy, container and dependency vulnerability scanner
Clean (95%): Semgrep, static code analysis for vulnerabilities
Clean (95%): mcp-scan (Snyk), Model Context Protocol security validation
Skipped (50%): Snyk (dep), open source security scanning
Skipped (50%): Socket.dev, supply chain security analysis
Skipped (50%): VirusTotal, multi-engine malware detection
Skipped (50%): CrowdStrike, advanced threat intelligence
Skipped (50%): OSV-Scanner, Open Source Vulnerability database check
Skipped (50%): OWASP Dep-Check

Last scanned: Jun 27, 2026, 6:29 AM (102.4s, 4 files scanned)

SKILL.md

name: tdd-agent
audience: team
description: Fully autonomous TDD with strict guardrails. Use when you want the AI to drive the entire RED-GREEN-REFACTOR cycle independently while maintaining TDD discipline.

Related Skills

michaelalber/grilling

development

Interviews the user relentlessly about a plan, decision, or idea — one question at a time, each with a recommended answer. Shared engine behind "grill-me" and "grill-with-docs". Use on any "grill" trigger phrase or to stress-test thinking. Do NOT use to build the plan; it ends at shared understanding, not implementation.

1 SKILL.md, updated Jul 23, 2026

michaelalber/grill-with-docs

testing

Runs a relentless interview to sharpen a plan or design, capturing the decisions as ADRs and a glossary along the way. Use when the user wants to be grilled AND wants the session to leave durable domain documentation behind. Do NOT use for a throwaway stress-test with no artifacts; use grill-me instead.

1 SKILL.md, updated Jul 23, 2026

michaelalber/vue-security-review

tools

OWASP-based security review of Vue/TypeScript front-ends. Detects framework (Vite/Vue CLI/Nuxt), entry points, and data flows; scans the OWASP Top 10 (2025) mapped to Vue client-side risks (raw-HTML XSS via v-html, URL/protocol injection, bundled secrets, insecure token storage, dependency CVEs, missing CSP, open redirects, router guard bypass); emits an exec summary plus graded findings. Use to audit Vue for vulnerabilities. Not for architecture grading (vue-architecture-checklist).

1 SKILL.md, updated Jul 20, 2026

michaelalber/vue-modernization-analyzer

tools

Analyzes legacy Vue codebases and produces actionable modernization plans. Primary migration paths include Options API to Composition API, Vue 2 to Vue 3, Vue CLI to Vite, JavaScript to TypeScript, Vue Test Utils/Karma/Mocha to Vitest + Vue Testing Library, legacy Vuex to Pinia, and removed-in-Vue-3 pattern cleanup (filters, event bus, `$listeners`). Does NOT perform the migration — assesses, quantifies risk, and plans.

1 SKILL.md, updated Jul 20, 2026

Install manually

# Clone the repo
git clone https://github.com/michaelalber/ai-toolkit.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r ai-toolkit/skills/team/tdd-agent ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

michaelalber/ai-toolkit

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT