plugins/dev/skills/discipline/systematic-debugging/SKILL.md
Use when debugging failures, errors, or unexpected behavior. Covers root cause investigation, data flow tracing, hypothesis-driven debugging, and fix verification to prevent trial-and-error approaches.
Iron Law: "NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST"
Use this skill when:
Detect these patterns that indicate skipping root cause investigation:
Symptom: What you observe (test fails, error thrown, wrong output)
Root Cause: Why it happens (null value, wrong condition, missing await)
Example:
Symptom: "TypeError: Cannot read property 'name' of undefined"
Root Cause: API returns null when user not found, but code expects object
Bad approach: Add user?.name (fixes symptom, not cause)
Good approach: Add validation if (!user) throw new NotFoundError() (fixes cause)
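The same distinction can be sketched in Python. This is only an illustration: the in-memory `USERS` store, `NotFoundError`, and both function names are hypothetical and do not come from the example above.

```python
class NotFoundError(Exception):
    """Hypothetical error type, raised when a user id is unknown."""


# Hypothetical stand-in for the API or database lookup.
USERS = {1: {"name": "Alice"}}


def display_name_masked(user_id: int) -> str:
    # Symptom fix: hides the missing user and returns a misleading value.
    user = USERS.get(user_id)
    return user["name"] if user else ""


def display_name_checked(user_id: int) -> str:
    # Cause fix: a missing user is reported where it is first detected.
    user = USERS.get(user_id)
    if user is None:
        raise NotFoundError(f"user {user_id} not found")
    return user["name"]
```

The masked version keeps running with an empty name, so the failure resurfaces later and further from its origin; the checked version fails at the point where the missing user is first seen.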
Principle: Follow data from source to error point
Steps:
Example:
Error: "Expected 'active' but got 'inactive'"
Location: user.test.ts:42 - expect(user.status).toBe('active')
Data: user.status = 'inactive'
Trace: user.status ← updateUser() ← API response ← database
Divergence: Database has status='inactive' (expected 'active')
Root Cause: Test setup didn't create user with active status
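A rough Python sketch of the tracing idea, assuming the data path can be modeled as an ordered list of stages. The stage names, the lambdas, and `first_divergence` are hypothetical and only mirror the trace above.

```python
from typing import Any, Callable, Optional

Stage = tuple[str, Callable[[Any], Any]]


def first_divergence(
    value: Any,
    stages: list[Stage],
    is_expected: Callable[[Any], bool],
) -> Optional[str]:
    """Run value through each stage; return the first stage whose output fails the check."""
    for name, stage in stages:
        value = stage(value)
        print(f"{name}: {value!r}")  # record the value at every hop
        if not is_expected(value):
            return name
    return None


stages = [
    ("database", lambda _: {"status": "inactive"}),  # hypothetical seeded row
    ("api_response", lambda row: dict(row)),
    ("update_user", lambda user: user),
]
culprit = first_divergence(None, stages, lambda u: u["status"] == "active")
print("First divergence:", culprit)
```

Here the check already fails at the `database` stage, which matches the conclusion that the test setup, not the code under test, produced the wrong status.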
Principle: Form hypothesis, test with evidence, refine
Process:
Example:
Symptom: API request times out after 30s
Hypothesis 1: Database query is slow
Prediction: Should see long query time in logs
Test: Add query timing logs
Result: Queries complete in <100ms ✗ Hypothesis rejected
Hypothesis 2: Network connection is hanging
Prediction: Should see connection delay, not query delay
Test: Add request timing logs (connect time vs. query time)
Result: Connection takes 31s, query never runs ✓ Hypothesis confirmed
Root Cause: Firewall blocks connection, causing timeout
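The hypothesis loop above could be recorded in code roughly as follows. This is a sketch under stated assumptions: the `Hypothesis` class, `evaluate`, and the `timings` values are hypothetical stand-ins for real timing logs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hypothesis:
    claim: str
    prediction: str
    check: Callable[[], bool]  # returns True when the prediction is observed


def evaluate(hypotheses: list[Hypothesis]) -> list[tuple[str, bool]]:
    """Test each hypothesis in order and stop at the first confirmed one."""
    results = []
    for h in hypotheses:
        confirmed = h.check()
        results.append((h.claim, confirmed))
        if confirmed:
            break
    return results


# Hypothetical measurements standing in for real timing logs.
timings = {"query_ms": 80, "connect_s": 30.0}

results = evaluate([
    Hypothesis("Database query is slow", "query time > 1000 ms",
               lambda: timings["query_ms"] > 1000),
    Hypothesis("Network connection is hanging", "connect time >= 30 s",
               lambda: timings["connect_s"] >= 30),
])
for claim, confirmed in results:
    print(f"{claim}: {'confirmed' if confirmed else 'rejected'}")
```

Writing the prediction down before running the check keeps each test falsifiable, so a rejected hypothesis still narrows the search.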
Principle: Verify fix addresses root cause, not just symptom
Checklist:
Objective: Identify where actual diverges from expected
Steps:
Example (TypeScript):
// Error: "Expected user email, got undefined"
// Stack trace: user-service.ts:42
// Phase 1: Trace data flow
console.log('1. API response:', response); // { data: { user: {...} } }
console.log('2. Extracted user:', response.data); // { user: {...} }
console.log('3. User object:', response.data.user); // { id: 1, name: 'Alice' }
console.log('4. Email field:', response.data.user.email); // undefined
// Divergence found: response.data.user has no email field
Objective: Determine why actual differs from expected
Questions:
Example (Python):
# Expected: parse_csv() returns list of dicts with 'email' key
# Actual: parse_csv() returns list of dicts without 'email' key
# Check input CSV file
with open('users.csv') as f:
    print(f.readline()) # id,name,phone ← Missing 'email' column!
# Divergence: CSV file format changed, missing 'email' column
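The header check above can be turned into a small reusable helper. A minimal sketch, assuming the CSV content is available as text; `missing_columns` and the sample rows are hypothetical.

```python
import csv
import io


def missing_columns(csv_text: str, required: set[str]) -> set[str]:
    """Return the required column names that the CSV header does not contain."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return required - header


sample = "id,name,phone\n1,Alice,555-0100\n"
print(missing_columns(sample, {"id", "name", "email"}))  # {'email'}
```

Running a check like this at the input boundary reports the format change directly, instead of letting it surface later as a missing key.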
Objective: Form testable hypothesis about why divergence occurred
Hypothesis Template:
"I believe [divergence] occurs because [root cause].
If this is true, I should see [evidence].
I can test this by [action]."
Example (Go):
// Divergence: user.Email is empty string when fetched from cache
// Hypothesis 1: Cache serialization drops empty fields
// Evidence: Other empty fields (phone, address) also missing
// Test: Check cached JSON structure
// Result: {"id":1,"name":"Alice"} ← Empty fields missing ✓
// Root Cause: JSON serialization omits empty fields (omitempty tag)
Objective: Confirm fix addresses root cause
Verification Steps:
Example (TypeScript):
// Root Cause: JSON serialization omits fields with undefined values
// Before fix:
JSON.stringify({ id: 1, email: undefined }) // {"id":1}
// Fix: Filter out undefined before serialization
const user = { id: 1, email: undefined }; // same object as in the "before" case
const filtered = Object.fromEntries(
  Object.entries(user).filter(([_, v]) => v !== undefined)
);
// Verification:
expect(filtered).toEqual({ id: 1 }); // ✓ Correct behavior
expect(JSON.stringify(filtered)).toBe('{"id":1}'); // ✓ Serialized correctly
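One way to make this verification repeatable is to keep the original failing input as a regression test. A minimal Python sketch, assuming a hypothetical `serialize_user` helper and using `None` as the stand-in for `undefined`:

```python
import json


def serialize_user(user: dict) -> str:
    """Hypothetical fix: drop unset (None) fields explicitly before serializing."""
    filtered = {key: value for key, value in user.items() if value is not None}
    return json.dumps(filtered, separators=(",", ":"))


def test_unset_email_is_omitted():
    # The exact input that triggered the original bug report.
    assert serialize_user({"id": 1, "email": None}) == '{"id":1}'


def test_present_email_is_kept():
    # Guard against the fix dropping real data.
    expected = '{"id":1,"email":"a@example.com"}'
    assert serialize_user({"id": 1, "email": "a@example.com"}) == expected
```

The first test should fail before the fix and pass after it; the second checks that the fix did not introduce a new problem.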
Strategy: Identify assertion, trace data, find divergence
// Test fails: expect(result).toBe(5)
test('calculates total', () => {
  const result = calculateTotal([1, 2, 2]);
  console.log('Input:', [1, 2, 2]); // Input data
  console.log('Expected:', 5); // Expected result
  console.log('Actual:', result); // Actual result (6)
  console.log('Divergence:', result - 5); // Difference (1)
  expect(result).toBe(5);
});
// Trace: calculateTotal() sums array incorrectly
// Root Cause: Off-by-one error in loop (includes index 0 twice)
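The body of `calculateTotal()` is not shown, so the following Python sketch is only one hypothetical way such a bug could look: the total is seeded with the first element and the loop then counts it again.

```python
def calculate_total_buggy(values: list[int]) -> int:
    # Hypothetical bug: the total is seeded with values[0],
    # then the loop adds values[0] a second time.
    total = values[0]
    for i in range(len(values)):
        total += values[i]
    return total


def calculate_total(values: list[int]) -> int:
    # Fix: start from zero so every element is counted exactly once.
    total = 0
    for value in values:
        total += value
    return total


print(calculate_total_buggy([1, 2, 2]))  # 6
print(calculate_total([1, 2, 2]))        # 5
```

The difference between the two results equals the first element, which is the kind of evidence the logged divergence of 1 points to.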
Strategy: Profile execution, identify bottleneck
import time
def slow_function():
    # Phase 1: Time each section separately to identify the slow one
    t0 = time.perf_counter()
    data = fetch_data() # 0.1s
    t1 = time.perf_counter()
    print(f"Fetch: {t1 - t0:.2f}s")
    processed = process_data(data) # 5.2s ← Bottleneck!
    t2 = time.perf_counter()
    print(f"Process: {t2 - t1:.2f}s")
    save_data(processed) # 0.05s
    print(f"Save: {time.perf_counter() - t2:.2f}s")
# Root Cause: process_data() has O(n²) algorithm
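Manual timers work for a handful of sections; for deeper call trees the standard-library profiler gives the same answer per function. A minimal sketch using `cProfile` and `pstats`, where `fetch_data` and `process_data` are small hypothetical stand-ins for the real functions:

```python
import cProfile
import io
import pstats


def fetch_data() -> list[int]:
    return list(range(200))  # hypothetical stand-in for real I/O


def process_data(data: list[int]) -> int:
    # Deliberately quadratic so it dominates the profile.
    return sum(1 for a in data for b in data if a == b)


def slow_function() -> int:
    return process_data(fetch_data())


profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Print the five entries with the highest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time puts the function that contains the bottleneck near the top, along with everything it calls.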
Strategy: Trace data mutations, find unexpected write
// Symptom: User email changes unexpectedly
// Phase 1: Add logging to all mutation points
func UpdateUser(user *User) {
	log.Printf("Before: %+v", user)
	user.Email = normalizeEmail(user.Email)
	log.Printf("After normalize: %+v", user)
	db.Save(user)
	log.Printf("After save: %+v", user)
}
// Logs show: Email changes in normalizeEmail()
// Root Cause: normalizeEmail() lowercases the entire address instead of only the domain
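When the mutation points are not known in advance, the writes themselves can be recorded. A rough Python sketch, assuming a hypothetical `TracedUser` model that logs every attribute assignment:

```python
class TracedUser:
    """Hypothetical model that records every attribute write."""

    def __init__(self, email: str) -> None:
        # Bypass the tracing hook while creating the log itself.
        object.__setattr__(self, "writes", [])
        self.email = email

    def __setattr__(self, name: str, value: object) -> None:
        old = getattr(self, name, None)
        self.writes.append((name, old, value))
        object.__setattr__(self, name, value)


user = TracedUser("Alice@Example.com")
user.email = user.email.lower()  # the unexpected write being hunted

for name, old, new in user.writes:
    print(f"{name}: {old!r} -> {new!r}")
```

The recorded old and new values show which write changed the field, after which the caller of that write can be traced as usual.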
Strategy: Read stack trace, identify throw location, trace backwards
// Error: "TypeError: Cannot read property 'length' of null"
// Stack trace:
// at validateInput (validator.ts:12)
// at handleSubmit (form.ts:45)
// at onClick (button.tsx:8)
// Phase 1: Find throw location (validator.ts:12)
function validateInput(input: string) {
  if (input.length < 3) { // ← Line 12, input is null
    throw new Error('Too short');
  }
}
// Phase 2: Trace backwards (form.ts:45)
function handleSubmit() {
  const input = getInputValue(); // Returns null when field empty
  validateInput(input); // ← Passes null to validateInput
}
// Root Cause: getInputValue() returns null, but validateInput expects string
// Fix: Add null check or change return type to empty string
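The same backwards walk can be done programmatically from a caught exception. A minimal Python sketch using the standard `traceback` module; `validate_input`, `handle_submit`, and `call_chain` are hypothetical analogues of the functions above.

```python
import traceback


def validate_input(text):
    if len(text) < 3:  # raises TypeError when text is None
        raise ValueError("Too short")


def handle_submit():
    value = None  # hypothetical: the empty field yields None
    validate_input(value)


def call_chain(exc: BaseException) -> list[str]:
    """Return function names from the outermost caller to the throw location."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


try:
    handle_submit()
except TypeError as exc:
    chain = call_chain(exc)
    print("Throw location:", chain[-1])
    print("Trace backwards:", " <- ".join(reversed(chain)))
```

The last frame is where the error was thrown; each earlier frame is a caller that may have supplied the bad value.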
Strategy: Identify last working state, compare changes
# Find last working commit
git bisect start
git bisect bad HEAD # Current state (broken)
git bisect good v1.2.0 # Last known working version
# Test each commit git checks out, then mark it with `git bisect good` or `git bisect bad`
# Git bisect identifies commit abc123 as first bad commit
git show abc123 # Shows changes
git bisect reset # Return to the original HEAD
# Root Cause: Commit abc123 changed API response format
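To illustrate why bisecting needs only a few test runs, here is a small Python sketch of the underlying binary search. It is not how git implements bisect; the history list, the `broken` set, and `first_bad_commit` are hypothetical.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary-search an ordered history whose first commit is good and last is bad."""
    low, high = 0, len(commits) - 1  # commits[low] is good, commits[high] is bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_good(commits[mid]):
            low = mid
        else:
            high = mid
    return commits[high]


# Hypothetical history, oldest first; the regression landed in "abc123".
history = ["v1.2.0", "a1", "b2", "abc123", "d4", "HEAD"]
broken = {"abc123", "d4", "HEAD"}
print(first_bad_commit(history, lambda commit: commit not in broken))  # abc123
```

Each check halves the remaining range, so a history of n commits needs roughly log2(n) test runs.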
Symptom:
// Test fails: "Expected button to be disabled"
test('disables submit when invalid', () => {
  render(<Form />);
  const button = screen.getByRole('button');
  expect(button).toBeDisabled(); // ✗ Fails, button is enabled
});
Phase 1: Trace Data Flow
// Add logging to Form component
function Form() {
  const [isValid, setIsValid] = useState(false);
  console.log('isValid:', isValid); // false (expected)
  return (
    <button disabled={!isValid}>Submit</button>
    // disabled={!false} → disabled={true} → Button should be disabled
  );
}
Phase 2: Identify Divergence
// Check actual DOM state
const button = screen.getByRole('button');
console.log('Disabled attribute:', button.disabled); // false (actual)
console.log('Button HTML:', button.outerHTML);
// <button>Submit</button> ← Missing 'disabled' attribute!
Phase 3: Hypothesize Root Cause
// Hypothesis: disabled={!isValid} is not setting attribute
// Evidence: Check if React is updating DOM correctly
// Test: Add explicit disabled={true} to verify React works
return <button disabled={true}>Submit</button>;
// Result: Button is NOW disabled ✓
// Conclusion: disabled={!isValid} expression is wrong
Phase 4: Verify Fix
// First guess: disabled={!isValid} is wrong. But !false is true, so the expression itself is fine.
// Re-trace: Where does isValid get its value at render time?
// Check component rendering
console.log('Rendering with isValid:', isValid);
// Logs: "Rendering with isValid: true" ← FOUND IT!
// Root Cause: The rendered Form seeds isValid from an initialValid prop that defaults to true,
// so the test never started from the invalid state assumed in Phase 1.
// (useState applies its initial value on the first render, so render timing is not the cause.)
// Fix: Pass the initial state explicitly in the test
test('disables submit when invalid', () => {
  render(<Form initialValid={false} />);
  expect(screen.getByRole('button')).toBeDisabled(); // ✓ Now works
});
Symptom:
import requests
# API request times out after 30 seconds
response = requests.get('https://api.example.com/users', timeout=30)
# Raises: requests.exceptions.Timeout
Phase 1: Trace Data Flow
import time
start = time.time()
try:
    response = requests.get('https://api.example.com/users', timeout=30)
    print(f"Request took: {time.time() - start:.2f}s")
except requests.exceptions.Timeout:
    print(f"Timeout after: {time.time() - start:.2f}s") # 30.01s
Phase 2: Identify Divergence
# Expected: Request completes in <1s (normal API response time)
# Actual: Request times out at 30s
# Divergence: Something delays request for 30+ seconds
# Hypothesis: Network issue, DNS resolution, or connection hang
# Test: Add connection timing
response = requests.get(
    'https://api.example.com/users',
    timeout=(5, 30) # (connect timeout, read timeout)
)
# Result: Raises timeout after 5s ← Connection timeout!
Phase 3: Hypothesize Root Cause
# Hypothesis: DNS resolution fails or connection refused
# Test: Try IP address instead of domain
response = requests.get('http://192.168.1.100/users')
# Result: Works! ✓
# Root Cause: DNS resolution for api.example.com fails
# Verification: Check DNS
import socket
socket.gethostbyname('api.example.com') # Raises: gaierror (DNS failure)
Phase 4: Verify Fix
# Fix: Use IP address or fix DNS configuration
# Updated code:
import os
API_HOST = os.getenv('API_HOST', '192.168.1.100')
response = requests.get(f'http://{API_HOST}/users')
# Verification:
assert response.status_code == 200 # ✓ Works
assert response.elapsed.total_seconds() < 1 # ✓ Fast
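To check the DNS hypothesis in isolation, name resolution can be timed separately from the HTTP request. A minimal sketch, assuming a hypothetical `time_dns` helper with an injectable resolver so the failure case can be simulated without a network:

```python
import socket
import time
from typing import Callable


def time_dns(
    host: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> tuple[bool, float]:
    """Resolve host and return (succeeded, seconds spent in DNS)."""
    start = time.perf_counter()
    try:
        resolve(host)
        ok = True
    except socket.gaierror:
        ok = False
    return ok, time.perf_counter() - start


def failing_resolver(host: str) -> str:
    # Hypothetical stand-in that simulates a broken DNS setup.
    raise socket.gaierror(socket.EAI_NONAME, "Name or service not known")


ok, seconds = time_dns("api.example.com", resolve=failing_resolver)
print(f"DNS ok={ok}, took {seconds:.3f}s")
```

With the default resolver, a failed or slow lookup here points at DNS before any connection is attempted, which separates it from firewall or server-side causes.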
Symptom:
// User email is corrupted after update
user := User{ID: 1, Email: "John.Doe@Example.COM"}
UpdateUser(&user)
fmt.Println(user.Email) // Prints: "john.doe@example.com" (unexpected lowercase)
Phase 1: Trace Data Flow
func UpdateUser(user *User) {
	log.Printf("Before: %+v", user) // Email: "John.Doe@Example.COM"
	user.Email = normalizeEmail(user.Email)
	log.Printf("After normalize: %+v", user) // Email: "john.doe@example.com"
	db.Save(user)
	log.Printf("After save: %+v", user) // Email: "john.doe@example.com"
}
// Divergence: normalizeEmail() changes case
Phase 2: Identify Divergence
// Expected: Email preserves original case
// Actual: Email is lowercased
// Divergence: normalizeEmail() function
func normalizeEmail(email string) string {
	return strings.ToLower(email) // ← Root cause!
}
Phase 3: Hypothesize Root Cause
// Hypothesis: normalizeEmail() should only lowercase domain, not entire email
// Expected behavior: John.Doe@Example.COM → John.Doe@example.com
// Test: Check email RFC standards
// Verification: Email local part (before @) is case-sensitive
// Email domain (after @) is case-insensitive
// Root Cause: normalizeEmail() lowercases entire email, should only lowercase domain
Phase 4: Verify Fix
// Fix: Only lowercase domain part
func normalizeEmail(email string) string {
	parts := strings.Split(email, "@")
	if len(parts) != 2 {
		return email // Invalid email, return as-is
	}
	return parts[0] + "@" + strings.ToLower(parts[1])
}
// Verification:
assert.Equal(t, "John.Doe@example.com", normalizeEmail("John.Doe@Example.COM"))
assert.Equal(t, "john.doe@example.com", normalizeEmail("john.doe@example.com"))
After debugging and fixing, use verification-before-completion to confirm:
When debugging reveals a bug:
For complex debugging requiring multiple investigations:
Before marking debugging task complete:
If any checkbox is unchecked, continue debugging. Do not apply a fix until the root cause is understood.