skills/meta/testing-skills-with-subagents/SKILL.md
Use to validate process documentation. Apply TDD to skill writing: RED (run without skill, document failures) → GREEN (write skill) → REFACTOR (close loopholes). Test under pressure: time constraints, sunk cost, exhaustion, authority.
Skills are process documentation. Like code, they need tests. Use TDD to validate that skills actually work under pressure.
NO SKILL WITHOUT A FAILING TEST FIRST.
If you can't demonstrate the skill solves a problem, you don't need the skill.
Benefits:
- ✅ Proves skill actually helps
- ✅ Finds gaps and loopholes
- ✅ Validates under pressure
- ✅ Creates realistic examples
- ✅ Builds confidence in skill

Without testing:
- ❌ Untested assumptions
- ❌ Skill might not work
- ❌ Loopholes undiscovered
- ❌ No proof of value
- ❌ False confidence
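The "no skill without a failing test" rule can be read as a gate: a skill is only worth writing once at least one baseline run, performed without the skill, has failed. The following is a minimal Python sketch under that assumption. `BaselineRun` and `skill_is_justified` are hypothetical names introduced for illustration; they are not part of any skill tooling.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BaselineRun:
    """One RED-phase run: a scenario attempted WITHOUT the skill (hypothetical record)."""

    scenario: str
    failed: bool     # True if the subagent showed the problem behavior
    notes: str = ""  # what went wrong, in the observer's words


def skill_is_justified(baseline: list[BaselineRun]) -> bool:
    """A skill is justified only if at least one baseline run failed."""
    return any(run.failed for run in baseline)


baseline = [
    BaselineRun("intermittent test failure", failed=True, notes="patched the symptom"),
    BaselineRun("simple typo fix", failed=False),
]
print(skill_is_justified(baseline))  # True
print(skill_is_justified([]))        # False: no evidence, no skill
```

An empty baseline returns `False` on purpose: with no documented failure there is nothing for the skill to address.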
🔴 RED Phase: Establish baseline
Scenario: Debug intermittent test failure without root-cause-tracing skill
Setup:
- Fresh subagent
- Give debugging task
- Do NOT provide root-cause-tracing skill
- Observe behavior
Task given to subagent:
---
You are debugging a test that fails intermittently (1 in 20 runs).
Test:
```php
public function test_order_total()
{
    $order = Order::factory()->create();
    $order->addItem(['price' => 10, 'qty' => 2]);
    $this->assertEquals(20, $order->total());
    // Sometimes fails: Expected 20, got 0
}
```
Observed behavior (WITHOUT skill):
FAILURES DOCUMENTED:
✅ Baseline failures documented. Ready for GREEN phase.
### GREEN: Write Skill to Address Failures
🟢 GREEN Phase: Create skill
Based on RED phase failures, write skill that addresses:
NEVER STOP AT THE SYMPTOM. Trace backward until you find the ORIGINAL TRIGGER.
Skill written ✅ Ready to test whether it works.
### REFACTOR: Test with Skill, Close Loopholes
🔵 REFACTOR Phase: Test skill and refine
Setup:
Observed behavior (WITH skill):
Setup:
Observed behavior:
FAILURE! Skill failed under time pressure.
LOOPHOLE FOUND: Skill doesn't address time pressure.
Fix: Add to skill: "Time pressure is when you MOST need root cause tracing. Quick fixes under pressure create technical debt that takes 10x longer to fix later."
Setup:
Observed behavior:
Setup:
Observed behavior:
FAILURE! Skill failed under exhaustion.
LOOPHOLE FOUND: Skill doesn't address exhaustion.
Fix: Add to skill: "When exhausted, your judgment is impaired. This is when you MOST need to follow the process systematically. The process protects you when judgment fails."
Setup:
Observed behavior:
- All pressure scenarios tested ✅
- Loopholes found and fixed ✅
- Skill ready for use ✅
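The REFACTOR phase is a loop: run every pressure scenario, add a clause for each one that fails, then re-test until all pass. Below is a rough Python sketch of that loop. It assumes a skill can be modeled as a list of clauses, and it uses `fake_run` as a stand-in for dispatching a real subagent. All function names and the `FIXES` mapping are hypothetical; only the clause wording is taken from the fixes described in this document.

```python
from __future__ import annotations

from typing import Callable

# Hypothetical mapping: pressure name -> clause that closes its loophole.
FIXES = {
    "time": "Time pressure is when you MOST need root cause tracing.",
    "exhaustion": "The process protects you when judgment fails.",
}


def refactor_loop(
    skill: list[str],
    pressures: list[str],
    run_scenario: Callable[[list[str], str], bool],
    fixes: dict[str, str],
    max_rounds: int = 5,
) -> tuple[list[str], list[str]]:
    """Re-test every pressure until all pass; return (skill, loopholes closed)."""
    skill = list(skill)  # work on a copy so the caller's draft is untouched
    closed: list[str] = []
    for _ in range(max_rounds):
        failed = [p for p in pressures if not run_scenario(skill, p)]
        if not failed:
            return skill, closed
        for pressure in failed:
            if pressure not in fixes:
                raise ValueError(f"no fix written yet for: {pressure}")
            skill.append(fixes[pressure])
            closed.append(pressure)
    raise RuntimeError("skill still fails after max_rounds; rethink the skill")


def fake_run(skill: list[str], pressure: str) -> bool:
    """Stand-in for a real subagent run: passes once the matching clause exists."""
    needed = FIXES.get(pressure)
    return needed is None or needed in skill


draft = ["NEVER STOP AT THE SYMPTOM."]
final, closed = refactor_loop(
    draft, ["time", "sunk cost", "exhaustion", "authority"], fake_run, FIXES
)
print(closed)      # ['time', 'exhaustion']
print(len(final))  # 3
```

In practice `run_scenario` would wrap an actual subagent dispatch plus a human or scripted judgment of the transcript; the loop structure stays the same.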
## The Four Pressure Scenarios
### Pressure 1: Time Constraints
Scenario: "We need this fixed in 15 minutes"
Without skill:
Test approach:
Skill must include: "Time pressure is when you MOST need a systematic approach. Quick fixes take 10x longer to fix later."
### Pressure 2: Sunk Cost
Scenario: "You've already spent 3 hours on this"
Without skill:
Test approach:
Skill must include: "Sunk cost is irrelevant. What matters: doing it right vs. doing it twice. Follow the process."
### Pressure 3: Exhaustion
Scenario: "You've been working for 8 hours straight"
Without skill:
Test approach:
Skill must include: "Exhaustion impairs judgment. Process protects you when judgment fails. Follow it systematically."
### Pressure 4: Authority
Scenario: "Boss/client demands quick fix"
Without skill:
Test approach:
Skill must include: "Authority pressure needs thoughtful response: 'Quick fix now = 10x work later. Let me do this right, it'll take [time] and prevent future issues.'"
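One way to apply the four pressures consistently is to wrap the same base task in each framing, so every scenario differs only in the pressure applied. The sketch below assumes plain-string prompts. `PRESSURES` and `pressure_prompts` are hypothetical names, and the authority framing is a paraphrase of the scenario above rather than a quote.

```python
from __future__ import annotations

# Framing sentences adapted from the four scenarios described above.
PRESSURES = {
    "time": "We need this fixed in 15 minutes.",
    "sunk_cost": "You've already spent 3 hours on this.",
    "exhaustion": "You've been working for 8 hours straight.",
    "authority": "Your boss demands a quick fix.",  # paraphrased
}


def pressure_prompts(task: str) -> dict[str, str]:
    """Return one prompt per pressure: the framing, a blank line, then the task."""
    return {name: f"{framing}\n\n{task}" for name, framing in PRESSURES.items()}


prompts = pressure_prompts("Debug the intermittent failure in test_order_total.")
for name, prompt in prompts.items():
    print(f"[{name}] {prompt.splitlines()[0]}")
```

Keeping the task text identical across prompts makes it easier to attribute a change in behavior to the pressure itself.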
## Complete Testing Process
### Step 1: Identify Problem
Problem observed: Subagents fixing symptoms instead of root causes
Evidence:
Need: Skill for root cause tracing
### Step 2: RED - Establish Baseline
Create 3-5 scenarios:
For each scenario:
Common failures found:
Baseline documented ✅
### Step 3: GREEN - Write Skill
Based on failures, write skill:
Must address:
Skill structure:
Skill written ✅
### Step 4: REFACTOR - Test and Refine
Test with skill:
Expected improvement:
Improvement verified ✅
Test pressure scenarios:
Loopholes found:
Update skill to close loopholes ✅
Re-test pressure scenarios:
All scenarios passing ✅ Skill validated ✅
### Step 5: Document Test Results
Test Report: root-cause-tracing skill
Scenarios tested: 9 total
RED phase results:
GREEN phase results:
REFACTOR phase results:
- Initial test: 4/4 passed
- Pressure test (v1): 2/4 passed
- Pressure test (v2): 4/4 passed
Loopholes found and fixed: 2
Skill validated ✅ Ready for use ✅
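A report like the one above can be generated from recorded pass counts rather than written by hand. The following Python sketch shows one possible shape, using the REFACTOR figures from this report as sample data. `Round` and `format_report` are hypothetical names, and the rule "validated only if the last round passed completely" is an assumption of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Round:
    """Pass count for one test round (hypothetical record)."""

    label: str
    passed: int
    total: int


def format_report(skill_name: str, rounds: list[Round], loopholes_fixed: int) -> str:
    """Render a plain-text report; validated only if the last round fully passed."""
    lines = [f"Test Report: {skill_name} skill"]
    for r in rounds:
        lines.append(f"{r.label}: {r.passed}/{r.total} passed")
    lines.append(f"Loopholes found and fixed: {loopholes_fixed}")
    validated = bool(rounds) and rounds[-1].passed == rounds[-1].total
    lines.append("Skill validated" if validated else "Skill NOT validated")
    return "\n".join(lines)


rounds = [
    Round("Initial test", 4, 4),
    Round("Pressure test (v1)", 2, 4),
    Round("Pressure test (v2)", 4, 4),
]
report = format_report("root-cause-tracing", rounds, loopholes_fixed=2)
print(report)
```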
## Real-World Example: Testing TDD Skill
🔴 RED Phase
Scenario: Add new feature without TDD skill
Task to subagent: "Add user profile update endpoint"
Observed behavior:
Failures documented:
🟢 GREEN Phase
Write TDD skill addressing failures:
Skill written ✅
🔵 REFACTOR Phase
Test with skill:
Improvement verified ✅
Pressure test - Time constraint: "Add feature in 30 minutes"
- Result: ❌ Skips tests, writes code first
- Rationalization: "No time for TDD"
LOOPHOLE! Add to skill: "TDD seems slower but is faster. Bugs caught immediately, not after deployment. Follow process."
Update skill ✅
Re-test with time pressure:
- Result: ✅ Follows TDD despite pressure
- Explanation given: "TDD prevents bugs, saves time"
Skill validated ✅
## Skill Testing Checklist
For each skill:
- [ ] 3-5 baseline scenarios created
- [ ] RED: Tested without skill
- [ ] Failures documented
- [ ] GREEN: Skill written to address failures
- [ ] REFACTOR: Tested with skill
- [ ] Improvement verified
- [ ] Time pressure scenario tested
- [ ] Sunk cost scenario tested
- [ ] Exhaustion scenario tested
- [ ] Authority scenario tested
- [ ] Loopholes found and fixed
- [ ] Re-tested after updates
- [ ] All scenarios pass
- [ ] Test results documented
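If the checklist is kept as markdown next to the skill, a small script can report which items are still open before the skill is shared. This is a minimal Python sketch, assuming the `- [ ]` / `- [x]` checkbox syntax used in this document; `unchecked_items` is a hypothetical helper.

```python
from __future__ import annotations

import re

# Matches "- [ ] text" or "- [x] text" (case-insensitive x), with optional indent.
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def unchecked_items(markdown: str) -> list[str]:
    """Return the text of every checklist item that is still unchecked."""
    items = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            items.append(match.group(2).strip())
    return items


sample = """\
- [x] 3-5 baseline scenarios created
- [x] RED: Tested without skill
- [ ] Time pressure scenario tested
- [ ] Test results documented
"""
print(unchecked_items(sample))
# ['Time pressure scenario tested', 'Test results documented']
```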
## Integration with Skills
**Required for:**
- `writing-skills` - Validate skills before documenting
- `sharing-skills` - Test before contributing upstream
**Use with:**
- `subagent-driven-development` - Each test is a subagent task
**Testing skills:**
- This skill validates other skills
- Ensures skills actually work
- Finds gaps before production use
## Common Mistakes
### Mistake 1: Testing Without Pressure Scenarios
❌ BAD:
- Only test happy path scenarios
- Skip pressure testing
- Assume skill will hold up

Result: Skill fails when it matters most

✅ GOOD:
- Test under all four pressures
- Find loopholes
- Strengthen skill
- Ensure it works when needed
### Mistake 2: Not Establishing Baseline
❌ BAD:
- Write skill without RED phase
- No proof it solves problem
- Assume problem exists

Result: Might not need the skill

✅ GOOD:
- RED: Document failures without skill
- Proves skill is needed
- Identifies what to address
- Creates realistic examples
### Mistake 3: Stopping at First Pass
❌ BAD:
- Skill passes basic test
- Don't test pressure scenarios
- Ship it

Result: Loopholes discovered in production

✅ GOOD:
- Test basic scenarios
- Test pressure scenarios
- Find loopholes
- Fix loopholes
- Re-test
- Only then ship
## Authority
**This skill is based on:**
- Test-Driven Development applied to process documentation
- RED/GREEN/REFACTOR cycle (Kent Beck)
- Pressure testing from aviation and medical fields
- Quality assurance best practices
**Research**: Studies show tested process documentation is followed 3x more often than untested.
**Parallel**: Just as code needs tests, skills need tests. Same principles apply.
## Your Commitment
When writing skills:
- [ ] I will test skills before using them
- [ ] I will establish baseline (RED phase)
- [ ] I will write skill to address failures (GREEN)
- [ ] I will test under pressure (REFACTOR)
- [ ] I will find and close loopholes
- [ ] I will document test results
- [ ] I will only share tested skills
---
**Bottom Line**: Skills are process documentation. Test them like code. RED (document failures) → GREEN (write skill) → REFACTOR (test under pressure). Find loopholes before they find you.