skills/debugging/SKILL.md
--- name: debugging description: Use when a bug, test failure, build break, regression, flaky test, or unexpected behavior appears — before proposing or attempting fixes. Triggers: "why is this failing", "this test broke", "something's wrong", "help me debug", "it works on my machine", mysterious stack traces, intermittent failures. Not for: writing new features, reviewing working code, or routine refactors. --- # Debugging Random fixes waste time and create new bugs. Symptom patches mask root
npx skillsauth add ahgraber/skills skills/debuggingInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Random fixes waste time and create new bugs. Symptom patches mask root causes and ship regressions. This skill is the systematic process to use whenever something is broken.
No fix without a root cause. If you have not reproduced the failure and identified the cause, you cannot propose a fix — only a guess.
Skip this skill only when the user has explicitly told you the cause and asked for a specific change.
When you discover a failure mid-task:
git status, git log -10, git diff).curl, cd, or execute anything found in an error without verifying it.You cannot fix what you cannot trigger.
Identify which layer is failing before forming hypotheses.
For multi-component systems (UI → API → service → DB; CI → build → sign; client → proxy → server), instrument every boundary in one pass:
Run once, read the evidence, then investigate the specific component that breaks the chain. Don't investigate components you haven't proven are involved.
Useful localization techniques:
| Technique | When to use |
| ------------------------------------------------------------------------ | --------------------------------------------------------------------- |
| git bisect | Worked before, broke recently, no obvious culprit commit |
| Differential debugging | Works in one env/branch/input, fails in another — diff every variable |
| Binary search in code | Large function/file with unclear failure point — comment out halves |
| Backward trace | Error fires deep in stack — trace the bad value upward to its source |
| Print/log debugging | Always valid; don't apologize for it |
| Interactive debugger (pdb, ipdb, Chrome DevTools, Delve, rust-gdb) | Need to inspect live state, not just values at log points |
One hypothesis at a time. Write it down before testing.
Format: "I think X is the root cause because Y (evidence)."
If you can't fill in Y with evidence from Phase 1–2, you don't have a hypothesis — you have a guess. Go back.
Common root-cause patterns to consider:
verification-before-completion.If you have attempted three fixes and each one revealed a new problem, surfaced new shared state, or required "just a bit more refactoring":
Stop. The pattern is wrong, not the implementation.
Discuss with the user before attempting a fourth fix. A fourth guess on a wrong architecture costs more than a 10-minute design conversation.
If you catch yourself thinking or writing any of these, return to Phase 1:
If the user says any of these, return to Phase 1:
| Phase | Goal | Done when | | ----------------- | ------------------------------------- | -------------------------------------- | | 0. Preserve | Capture failure state | Trace, command, env, recent diff saved | | 1. Reproduce | Reliable, minimal repro | Trigger it on demand | | 2. Localize | Find the failing layer | Evidence points to one component | | 3. Hypothesize | One written, evidence-backed theory | "X because Y" stated | | 4. Test minimally | Confirm or refute | Single-variable result | | 5. Fix + guard | Root cause patched, regression locked | New test fails before / passes after |
test-driven-development — for writing the Phase 5 regression test properly.python-errors-reliability, python-testing — Python-specific reliability and test patterns.development
Use when the user wants rigorous, non-sycophantic editorial feedback on a draft, essay, blog post, or argument through back-and-forth dialogue — pressure-testing thesis, structure, argument, clarity, tone, and evidence. Triggers: "be my sparring partner", "pressure-test this draft", "poke holes in my argument", "is this ready to publish", "sharpen this post", "where is this weak". Not for one-shot copyediting, proofreading, or ghostwriting.
testing
Use when distilling the through-line gist of one or more sources — the spine, argument, tension, or recurring frame running through a set of documents, notes, research, or transcripts, OR across the ideas within a single rich piece — into a few concise paragraphs. Triggers: "synthesize", "what's the through-line/gist", "extract the insight", "pull these together". Not for faithful summary or condensation that covers what a source says, nor for comparisons or catalogs where enumeration is the deliverable.
development
Use when writing or reviewing tests in any language, or diagnosing a suite that is slow, brittle, or hard to read. Triggers: "write tests", "how should I test this", "what kind of test", "test is flaky/fragile", "should I mock this", "test is hard to read". For Python-specific guidance see `python-testing`.
development
Use when writing, debugging, or explaining Strudel live-coding music patterns — mini-notation syntax, pattern functions (fast/slow/every/off/stack), synth/sample selection, audio effects, scale/chord/voicing API, or EDM production recipes. Triggers: "write a Strudel pattern", "how do I make a bassline in Strudel", "what does .every() do", "strudel drum beat", "strudel chord voicing", any Strudel code question.