skills/tdd-refactor/SKILL.md
Safely refactor code while keeping tests green (REFACTOR phase). Use when tests pass and you want to improve code structure without changing behavior. Do NOT use when tests are red; do NOT use to fix bugs or add new behavior — return to RED phase first.
npx skillsauth add michaelalber/ai-toolkit tdd-refactor
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
"Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior." — Martin Fowler
The REFACTOR phase improves code quality while maintaining behavior. Tests are your safety net — if they stay green, the behavior is preserved.
The Green-to-Green Rule:
Use search_knowledge (grounded-code-mcp) to ground decisions in authoritative references.
| Query | When to Call |
|-------|--------------|
| search_knowledge("refactoring code smells catalog Martin Fowler") | When identifying smells — authoritative smell catalog and refactoring recipes |
| search_knowledge("extract method extract class refactoring patterns") | When choosing a refactoring — specific mechanics and safe-step guidance |
| search_knowledge("SOLID principles single responsibility open closed") | When evaluating design after refactoring — check alignment with SOLID |
| search_knowledge("duplication DRY principle abstraction") | When addressing duplication — confirms when extraction is appropriate vs. premature |
| search_knowledge("cyclomatic complexity maintainability index metrics") | When using quality metrics to guide refactoring scope |
Protocol: Search when a smell is identified but the correct refactoring recipe is uncertain. Cite the source path in your response.
| Property | REFACTOR Phase Application |
|----------|---------------------------|
| Structure-insensitive | Tests should survive refactoring |
| Behavioral | Tests verify behavior, not implementation |
| Fast | Quick feedback for rapid refactor cycles |
| Isolated | Changes shouldn't cascade test failures |
| Inspiring | Confidence to make changes |
BEFORE any refactoring:
Pre-Flight Verification:
├── All tests passing?
│   ├── NO → STOP. Still in GREEN phase.
│   └── YES → Continue
├── Tests cover the code being refactored?
│   ├── NO → Consider adding characterization tests first
│   └── YES → Continue
└── Working state committed/saved?
    ├── NO → Commit or stash current state
    └── YES → Proceed to refactor
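The pre-flight checks above can be read as an ordered series of gates. The following is a minimal sketch of that ordering; the `PreFlight` class and `preflight_decision` function are hypothetical names introduced for illustration and are not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class PreFlight:
    """Hypothetical snapshot of the three pre-flight questions."""
    tests_passing: bool
    code_covered: bool
    state_saved: bool


def preflight_decision(check: PreFlight) -> str:
    """Return the next action, asking the questions in the order listed above."""
    if not check.tests_passing:
        return "STOP: still in GREEN phase"
    if not check.code_covered:
        return "Consider adding characterization tests first"
    if not check.state_saved:
        return "Commit or stash current state"
    return "Proceed to refactor"
```

The order matters in this sketch: a red test suite stops the process before coverage or saved state is even considered.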
1. Identify smell or improvement
2. Plan smallest step
3. Make the change
4. Run tests
5. If GREEN: commit (optional) or continue
6. If RED: revert immediately
7. Repeat until satisfied
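Steps 3 to 6 of the loop above can be sketched as a single function. This is an illustrative sketch only: `refactor_step` and its three callables are hypothetical, and in practice `run_tests` would invoke your test runner and `revert` would restore the last saved state.

```python
from typing import Callable


def refactor_step(
    apply_change: Callable[[], None],
    run_tests: Callable[[], bool],
    revert: Callable[[], None],
) -> bool:
    """Apply one small change and keep it only if the tests stay green."""
    apply_change()
    if run_tests():
        return True  # GREEN: keep the change (committing is optional)
    revert()  # RED: undo immediately, then retry with a smaller step
    return False


# Usage with a toy in-memory "codebase" (purely illustrative).
codebase = {"function_name": "calc"}


def rename_function() -> None:
    codebase["function_name"] = "calculate_sum"


def undo_rename() -> None:
    codebase["function_name"] = "calc"


kept = refactor_step(rename_function, run_tests=lambda: True, revert=undo_rename)
rejected = refactor_step(rename_function, run_tests=lambda: False, revert=undo_rename)
```

In the second call the simulated test run fails, so the rename is undone and the codebase returns to its previous state.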
| Smell | Description | Refactoring |
|-------|-------------|-------------|
| Duplication | Same code in multiple places | Extract Method/Function |
| Long Method | Method doing too much | Extract Method |
| Long Parameter List | Too many parameters | Introduce Parameter Object |
| Feature Envy | Method uses another class's data extensively | Move Method |
| Data Clumps | Same data groups appear together | Extract Class |
| Primitive Obsession | Overuse of primitives | Replace with Value Object |
| Divergent Change | One class changed for multiple reasons | Extract Class |
| Shotgun Surgery | One change requires many class edits | Move Method/Field |
| Magic Numbers | Unexplained numeric literals | Extract Constant |
| Dead Code | Unused code | Delete it |
Before:
```python
def process_order(order):
    # validate
    if order.total < 0:
        raise ValueError("Invalid total")
    if not order.items:
        raise ValueError("No items")

    # calculate tax
    tax = order.total * 0.1

    # apply discount
    if order.total > 100:
        discount = order.total * 0.05
    else:
        discount = 0

    return order.total + tax - discount
```
After (one step at a time):
```python
def process_order(order):
    validate_order(order)
    tax = calculate_tax(order)
    discount = calculate_discount(order)
    return order.total + tax - discount


def validate_order(order):
    if order.total < 0:
        raise ValueError("Invalid total")
    if not order.items:
        raise ValueError("No items")


def calculate_tax(order):
    return order.total * 0.1


def calculate_discount(order):
    if order.total > 100:
        return order.total * 0.05
    return 0
```
Key: Extract one function at a time, run tests between each.
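One way to confirm that the extraction above preserved behavior is to run both versions against the same inputs and compare results, including raised errors. The sketch below assumes a hypothetical `Order` dataclass (the original does not define one) and renames the two versions to `process_order_before` and `process_order_after` so they can coexist in one file.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """Hypothetical stand-in for the order object (not defined in the original)."""
    total: float
    items: list = field(default_factory=lambda: ["widget"])


def process_order_before(order):
    """The original version, before any extraction."""
    if order.total < 0:
        raise ValueError("Invalid total")
    if not order.items:
        raise ValueError("No items")
    tax = order.total * 0.1
    if order.total > 100:
        discount = order.total * 0.05
    else:
        discount = 0
    return order.total + tax - discount


def validate_order(order):
    if order.total < 0:
        raise ValueError("Invalid total")
    if not order.items:
        raise ValueError("No items")


def calculate_tax(order):
    return order.total * 0.1


def calculate_discount(order):
    if order.total > 100:
        return order.total * 0.05
    return 0


def process_order_after(order):
    """The refactored version, built from the extracted helpers."""
    validate_order(order)
    return order.total + calculate_tax(order) - calculate_discount(order)


def outcome(func, order):
    """Return the result, or the error message if the call raises ValueError."""
    try:
        return func(order)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Sample inputs chosen to hit both validation errors and both discount branches.
SAMPLES = [Order(50), Order(100), Order(200), Order(-1), Order(10, items=[])]


def behavior_preserved() -> bool:
    """True if both versions agree on every sample, including error cases."""
    return all(
        outcome(process_order_before, o) == outcome(process_order_after, o)
        for o in SAMPLES
    )
```

A comparison like this is a temporary aid during the refactoring; once the old version is deleted, the regular test suite takes over as the safety net.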
Before:
```typescript
function calc(x: number, y: number): number {
  return x + y;
}
```
After:
```typescript
function calculateSum(firstNumber: number, secondNumber: number): number {
  return firstNumber + secondNumber;
}
```
Key: Use IDE refactoring tools when available. Run tests after.
Before:
```python
if user.age >= 18 and user.country == "US" and user.has_id:
    allow_purchase()
```
After:
```python
is_adult = user.age >= 18
is_us_resident = user.country == "US"
has_valid_id = user.has_id

if is_adult and is_us_resident and has_valid_id:
    allow_purchase()
```
Before:
```typescript
const basePrice = product.price;
const result = basePrice * quantity;
return result;
```
After:
```typescript
return product.price * quantity;
```
Key: Only inline if it improves clarity.
Before:
```python
def calculate_pay(employee):
    if employee.type == "hourly":
        return employee.hours * employee.rate
    elif employee.type == "salaried":
        return employee.salary / 12
    elif employee.type == "contractor":
        return employee.invoice_amount
```
After:
```python
class HourlyEmployee:
    def calculate_pay(self):
        return self.hours * self.rate


class SalariedEmployee:
    def calculate_pay(self):
        return self.salary / 12


class Contractor:
    def calculate_pay(self):
        return self.invoice_amount
```
Key: This is a larger refactoring. Do it in steps, with tests between each.
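One possible intermediate step, sketched here under assumptions, is to keep the original `calculate_pay(employee)` function as a thin wrapper that delegates to the new classes, so existing callers and tests stay green while call sites are migrated. The dataclass fields below are assumptions added to make the sketch runnable; the original classes do not show their constructors.

```python
from dataclasses import dataclass


@dataclass
class HourlyEmployee:
    hours: float  # assumed field
    rate: float   # assumed field

    def calculate_pay(self):
        return self.hours * self.rate


@dataclass
class SalariedEmployee:
    salary: float  # assumed field

    def calculate_pay(self):
        return self.salary / 12


@dataclass
class Contractor:
    invoice_amount: float  # assumed field

    def calculate_pay(self):
        return self.invoice_amount


def calculate_pay(employee):
    """Transitional wrapper: existing callers and tests keep using this name."""
    return employee.calculate_pay()
```

Once every caller uses `employee.calculate_pay()` directly and the tests still pass, the wrapper can be removed as a separate, final step.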
Run all tests, confirm passing.
Pick the smallest, most obvious improvement:
Apply the refactoring. Be surgical.
Execute the full test suite immediately.
Continue improving, or decide "good enough" and exit to next RED.
### REFACTOR Phase
**Starting state**: All tests passing (N tests)
**Smell identified**: [e.g., Duplication in calculate methods]
**Refactoring**: [e.g., Extract Method - create shared calculation helper]
**Change made**:
[brief description or code diff]
**Verification**:
- Tests run: Yes
- Result: All passing (N tests)
- Behavior preserved: Yes
**Continue refactoring?**: [Yes - next smell | No - code is clean]
<tdd-state>
phase: REFACTOR (or RED if done)
...
</tdd-state>
Never make a change that breaks tests:
During refactoring:
Resist combining refactorings:
If tests pass, the refactoring is valid. If tests fail, the refactoring is invalid (or tests are implementation-coupled).
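To illustrate the "implementation-coupled" case, the sketch below contrasts a behavioral check with one that reaches into a private helper. All names (`Cart`, `CartInlined`, `_tax`, and the two check functions) are hypothetical and exist only for this example.

```python
class Cart:
    """Version before refactoring: tax lives in a private helper."""

    def __init__(self, total):
        self.total = total

    def _tax(self):
        return self.total * 0.1

    def grand_total(self):
        return self.total + self._tax()


class CartInlined:
    """Version after inlining the helper: same behavior, different structure."""

    def __init__(self, total):
        self.total = total

    def grand_total(self):
        return self.total + self.total * 0.1


def behavioral_check(cart_cls) -> bool:
    """Passes for any version that returns the right grand total."""
    return abs(cart_cls(100).grand_total() - 110) < 1e-9


def coupled_check(cart_cls) -> bool:
    """Depends on the private helper, so it breaks when the helper is inlined."""
    try:
        return abs(cart_cls(100)._tax() - 10) < 1e-9
    except AttributeError:
        return False
```

Here `behavioral_check` passes for both classes, while `coupled_check` fails for `CartInlined` even though behavior is unchanged. A failure of that kind points at the test, not at the refactoring.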
Refactoring can continue indefinitely. Stop when:
See reference files for language-specific patterns:
Related skills:
- tdd-cycle: Invokes this skill after GREEN is confirmed; transitions back to RED when REFACTOR is complete
- tdd-agent: Calls this skill during the REFACTOR phase of its autonomous cycle
- tdd-pair: Both partners use this skill together during the shared REFACTOR step in ping-pong rounds
- tdd-implementer: Precedes this skill; the GREEN phase produces working but potentially unclean code that this skill improves
- tdd-verify: Can audit whether refactorings preserved behavioral tests or inadvertently introduced implementation coupling

| Error | Why It's Wrong | Correct Approach |
|-------|----------------|------------------|
| Refactoring with red tests | No safety net | Fix tests first |
| Multiple changes at once | Can't isolate failures | One at a time |
| Fixing bugs during refactor | Changes behavior | Write test for bug first |
| Adding features during refactor | Changes behavior | Complete cycle, then new RED |
| Not running tests after each change | Delayed feedback | Test after every change |
If tests fail after a refactoring:
First instinct: REVERT
git checkout or undo
Analyze why:
If test is implementation-coupled:
Try again with smaller steps: