claude/skills/tdd-assertions/SKILL.md
Detect and fix weak test assertions that AI generates across Rust, Python, TypeScript, Go, and Shell. Use this skill whenever you write or review tests, when the user says "strengthen assertions", "fix weak tests", or during /wreck, /fromage, and /simplify flows. Also use as a mental checklist before committing test code — AI assistants systematically produce assertions that pass when the code is broken, which is the cardinal sin of TDD. Trigger proactively on test generation and test review.
Fix weak test assertions that pass when the code is broken.
AI coding assistants optimize for test coverage metrics, not test utility. The result is assertions that look thorough but can't catch regressions: existence checks instead of value equality, catch-all error types, length checks without content inspection, and mock verification without arguments.
A test that can't fail when behavior breaks isn't a test — it's a liability.
#[test], testing, bats)
references/ (only the languages present)

These apply to every language. They're the most common AI assertion failures.
The #1 AI testing sin. Asserts something exists without verifying it's correct.
WEAK: assert result is not None
WEAK: expect(result).toBeDefined()
WEAK: assert!(result.is_some())
STRONG: assert result == expected_value
STRONG: expect(result).toEqual({ id: 1, name: "Alice" })
STRONG: assert_eq!(result.unwrap(), expected)
Fix: Always assert the specific value you expect, not just that a value exists.
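To make the failure mode concrete, here is a minimal sketch in plain Python. The `User` dataclass and `find_user_buggy` are hypothetical names invented for illustration; the function simulates a regression that ignores its argument. The existence-only check still passes, while the value check catches the bug.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str


def find_user_buggy(user_id: int) -> Optional[User]:
    # Hypothetical regression: always returns the same user, ignoring user_id.
    return User(id=1, name="Alice")


result = find_user_buggy(2)

# Existence-only check: true for any non-None value, so the bug goes unnoticed.
weak_passed = result is not None

# Value check: dataclasses compare field by field, so the wrong user is caught.
strong_passed = result == User(id=2, name="Bob")

print(weak_passed, strong_passed)
```

In a real test the two expressions would be written as `assert` statements; they are stored in variables here only so both outcomes can be shown side by side.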
Verifies the container has items but not what those items are.
WEAK: assert len(results) == 1
WEAK: expect(items).toHaveLength(3)
WEAK: assert_eq!(vec.len(), 2)
STRONG: assert results[0].name == "Alice"
STRONG: expect(items).toEqual([...expected])
STRONG: assert_eq!(vec, vec!["a", "b"])
Fix: Check content first. Length-only assertions are OK as a final confirmation after content checks, never as the sole assertion.
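As an illustration, the sketch below assumes a hypothetical `active_names_buggy` helper whose filter is inverted. The result still has exactly one element, so the length-only assertion passes even though the element is the wrong one.

```python
def active_names_buggy(users):
    # Hypothetical regression: the filter is inverted and returns inactive users.
    return [u["name"] for u in users if not u["active"]]


users = [
    {"name": "Alice", "active": True},
    {"name": "Bob", "active": False},
]

results = active_names_buggy(users)

# Length-only check: one item came back, so this passes on broken code.
length_only_passed = len(results) == 1

# Content check: comparing the whole list exposes the wrong item.
content_passed = results == ["Alice"]

print(length_only_passed, content_passed)
```

Comparing the full list also covers the length, which is why a separate length assertion is only useful as a final confirmation.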
Verifies an error occurred without checking it's the right error.
WEAK: with pytest.raises(Exception):
WEAK: expect(() => fn()).toThrow()
WEAK: assert!(result.is_err())
WEAK: assert.Error(t, err)
STRONG: with pytest.raises(ValueError, match=r"must be positive"):
STRONG: expect(() => fn()).toThrow(ValidationError)
STRONG: assert!(matches!(result, Err(MyError::NotFound(_))))
STRONG: require.ErrorAs(t, err, &notFoundErr)
Fix: Assert the specific error type AND message/content.
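The difference can be sketched with the standard library alone. Here `set_quantity_buggy` and the `raises` helper are hypothetical stand-ins (the helper loosely mimics what `pytest.raises(..., match=...)` checks). The buggy function fails for an unrelated reason, which a catch-all check happily accepts.

```python
import re


def set_quantity_buggy(value):
    # Hypothetical regression: validation was removed, and the call now fails
    # for an unrelated reason (int + str raises TypeError).
    return value + "1"


def raises(fn, exc_type, match=None):
    """Return True if fn() raises exc_type and its message matches `match`."""
    try:
        fn()
    except exc_type as exc:
        return match is None or re.search(match, str(exc)) is not None
    except Exception:
        # Some other exception type was raised: not the error we expected.
        return False
    return False


# Catch-all check: any exception satisfies it, including the accidental TypeError.
weak_passed = raises(lambda: set_quantity_buggy(-1), Exception)

# Specific check: requires ValueError with the expected message.
strong_passed = raises(
    lambda: set_quantity_buggy(-1), ValueError, r"must be positive"
)

print(weak_passed, strong_passed)
```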
Test only asserts the function didn't throw, with no behavioral check.
WEAK: run my_command; assert_success
WEAK: result, err := fn(); require.NoError(t, err) // nothing follows
WEAK: expect(() => fn()).not.toThrow() // sole assertion
STRONG: run my_command; [[ "$output" == "expected" ]]
STRONG: require.NoError(t, err); assert.Equal(t, expected, result)
STRONG: expect(fn()).toEqual(expected)
Fix: Every test needs a positive behavioral assertion. "Didn't crash" is necessary but never sufficient.
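A small sketch of why "didn't crash" is not enough, assuming a hypothetical `slugify_buggy` function that forgets to replace spaces. It runs without raising, so a no-crash test passes, but the output is wrong.

```python
def slugify_buggy(title: str) -> str:
    # Hypothetical regression: lowercases but forgets to replace spaces.
    return title.lower()


def runs_without_error(fn, arg) -> bool:
    # Equivalent of a sole "does not throw" assertion.
    try:
        fn(arg)
    except Exception:
        return False
    return True


# No-crash check: passes because nothing was raised.
no_crash_passed = runs_without_error(slugify_buggy, "Hello World")

# Behavioral check: fails because the actual output is "hello world".
behavior_passed = slugify_buggy("Hello World") == "hello-world"

print(no_crash_passed, behavior_passed)
```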
Verifies a mock was called but not how it was called.
WEAK: mock_fn.assert_called()
WEAK: expect(mockFn).toHaveBeenCalled()
WEAK: mock.AssertCalled(t, "Send")
STRONG: mock_fn.assert_called_once_with(user_id=42, role="admin")
STRONG: expect(mockFn).toHaveBeenCalledWith({ to: "[email protected]" })
STRONG: mock.AssertCalled(t, "Send", expected_msg)
Fix: Always verify mock call arguments. A mock called with wrong arguments is a test that passes while the code is broken.
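The following sketch uses `unittest.mock` from the standard library. `notify_admin_buggy` is a hypothetical function that passes the wrong role to its collaborator; `assert_called()` accepts that, while `assert_called_once_with(...)` does not.

```python
from unittest.mock import Mock


def notify_admin_buggy(send, user_id):
    # Hypothetical regression: role is hard-coded to "guest" instead of "admin".
    send(user_id=user_id, role="guest")


send = Mock()
notify_admin_buggy(send, 42)

# Weak: only checks that the mock was called at least once, so it passes.
send.assert_called()

# Strong: checks the call count and the exact arguments, so it fails here.
try:
    send.assert_called_once_with(user_id=42, role="admin")
    strong_passed = True
except AssertionError:
    strong_passed = False

print(strong_passed)
```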
Asserts that a mock returns what you told it to return — tautological.
WEAK:
mock = Mock(return_value=42)
assert mock() == 42 # You just tested Mock.__call__
WEAK:
const mock = jest.fn().mockReturnValue(42);
expect(mock()).toBe(42); // Tests jest.fn, not your code
Fix: Assert on the system under test that uses the mock, not on the mock itself.
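One possible shape for the stronger version, sketched with `unittest.mock`. The `greeting` function and the `get_name` method are hypothetical; the point is that the mock only supplies input, and the assertions target what the code under test does with it.

```python
from unittest.mock import Mock


def greeting(user_repo, user_id):
    # Hypothetical system under test: formats a name fetched from a repository.
    name = user_repo.get_name(user_id)
    return f"Hello, {name}!"


# The mock stands in for the repository and provides a canned name.
repo = Mock()
repo.get_name.return_value = "Alice"

message = greeting(repo, 7)

# Assert on the output of greeting(), not on the mock's return value.
assert message == "Hello, Alice!"

# Also verify how the collaborator was used.
repo.get_name.assert_called_once_with(7)
```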
Uses truthiness where a value check is possible and more precise.
WEAK: assert bool(result)
WEAK: expect(!!result).toBe(true)
WEAK: assert!(some_function())
STRONG: assert result == expected_value
STRONG: expect(result).toEqual(expected)
STRONG: assert_eq!(some_function(), expected)
Fix: If you know what the value should be, assert that. Use truthiness only when the contract genuinely is "any truthy value."
Assertions that literally cannot fail. A test that always passes tests nothing.
WEAK: assert True
WEAK: expect(1).toBe(1)
WEAK: assert_eq!(true, true)
WEAK: # Status catch-all
[[ $status -eq 0 || $status -eq 1 ]] # anything-passes guard
Fix: Delete tautological assertions. If the test needs a placeholder, mark it as @pytest.mark.skip / it.todo() / #[ignore] instead.
Uses fuzzy matching when the result is deterministic.
WEAK: assert abs(result - 100) < 1 # result IS exactly 100
WEAK: expect(result).toBeCloseTo(100) // when result is integer math
WEAK: assert!((result - 1.0).abs() < f64::EPSILON) // when input is exact
STRONG: assert result == 100
STRONG: expect(result).toBe(100)
STRONG: assert_eq!(result, 1.0)
Fix: Use approximate equality only for floating-point arithmetic that genuinely introduces rounding. Integer math, string operations, and deterministic calculations should use exact equality.
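A short sketch of both sides of this rule. `total_cents_buggy` is a hypothetical function in which a stray multiplier turns integer math into float math; the tolerance check hides the drift, while exact equality exposes it. The last lines show a case where approximate comparison is the right tool.

```python
import math


def total_cents_buggy(prices_cents):
    # Hypothetical regression: a stray 0.5% fee turns integer math into float math.
    return sum(prices_cents) * 1.005


result = total_cents_buggy([33, 33, 34])  # the correct total is exactly 100

# Fuzzy check: the drift is smaller than the tolerance, so the bug is hidden.
fuzzy_passed = abs(result - 100) < 1

# Exact check: deterministic integer math should equal 100 exactly.
exact_passed = result == 100

# Legitimate use of approximate equality: real floating-point rounding.
float_sum = 0.1 + 0.2
float_exact = float_sum == 0.3
float_close = math.isclose(float_sum, 0.3)

print(fuzzy_passed, exact_passed, float_exact, float_close)
```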
Read these only for test frameworks present in the code being reviewed:
| Language | Reference |
|----------|-----------|
| Rust | references/rust.md |
| Python (pytest) | references/python.md |
| TypeScript/JavaScript (jest/vitest) | references/typescript.md |
| Go (testing/testify) | references/go.md |
| Shell/Bash (bats) | references/shell.md |
When fixing assertions, explain each change concisely:
Strengthened 4 assertions:
- assert result is not None → assert result == User(id=1, name="Alice")
- pytest.raises(Exception) → pytest.raises(ValueError, match="must be positive")
- mock.assert_called() → mock.assert_called_once_with(user_id=42)
- Deleted tautological assert True
Don't over-explain. The stronger assertion speaks for itself.
- is not None is correct when the contract genuinely is "returns any value, not None"
- toBeCloseTo / assertAlmostEqual is correct for floating-point arithmetic, not weak
- except Exception is valid in top-level error boundaries; only flag it in inner code