plugins/sanctum/skills/test-updates/SKILL.md
Updates, generates, and validates tests using git-workspace context and TDD/BDD methodology. Use when code changes require new or updated test coverage.
npx skillsauth add athola/claude-night-market test-updates
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
A detailed test management system that applies TDD/BDD principles to maintain, generate, and enhance tests across codebases. This skill practices what it preaches: it uses TDD principles for its own development and serves as a living example of best practices.
A modular test management system that:
Requires pytest (pip install pytest)
Source code in src/ or similar directory
Creates the tests/ directory if it doesn't exist
Run Skill(sanctum:git-workspace-review) first to understand changes
Use Skill(test-updates) --target <specific-module> for focused updates
# Run full test update workflow
Skill(test-updates)
Verification: Run pytest -v to verify tests pass.
# Update tests for specific paths
Skill(test-updates) --target src/sanctum/agents
Skill(test-updates) --target tests/test_commit_messages.py
Verification: Run pytest -v to verify tests pass.
# Apply TDD to new code
Skill(test-updates) --tdd-only --target new_feature.py
Verification: Run pytest -v to verify tests pass.
Human-Readable Output:
# Analyze test coverage gaps
python plugins/sanctum/scripts/test_analyzer.py --scan src/
# Generate test scaffolding
python plugins/sanctum/scripts/test_generator.py \
--source src/my_module.py --style pytest_bdd
# Check test quality
python plugins/sanctum/scripts/quality_checker.py \
--validate tests/test_my_module.py
Verification: Run pytest -v to verify tests pass.
Programmatic Output (for Claude Code):
# Get JSON output for programmatic parsing - test_analyzer
python plugins/sanctum/scripts/test_analyzer.py \
--scan src/ --output-json
# Returns:
# {
# "success": true,
# "data": {
# "source_files": ["src/module.py", ...],
# "test_files": ["tests/test_module.py", ...],
# "uncovered_files": ["module_without_tests", ...],
# "coverage_gaps": [{"file": "...", "reason": "..."}]
# }
# }
# Get JSON output - test_generator
python plugins/sanctum/scripts/test_generator.py \
--source src/my_module.py --output-json
# Returns:
# {
# "success": true,
# "data": {
# "test_file": "path/to/test_my_module.py",
# "source_file": "src/my_module.py",
# "style": "pytest_bdd",
# "fixtures_included": true,
# "edge_cases_included": true,
# "error_cases_included": true
# }
# }
# Get JSON output - quality_checker
python plugins/sanctum/scripts/quality_checker.py \
--validate tests/test_my_module.py --output-json
# Returns:
# {
# "success": true,
# "data": {
# "static_analysis": {...},
# "dynamic_validation": {...},
# "metrics": {...},
# "quality_score": 85,
# "quality_level": "QualityLevel.GOOD",
# "recommendations": [...]
# }
# }
Verification: Run pytest -v to verify tests pass.
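The JSON envelope shown above lends itself to a small wrapper. The following is a minimal sketch, assuming the analyzer prints exactly one JSON document to stdout with the `success` and `data` keys documented above. The helper names `parse_analyzer_output` and `run_analyzer` are hypothetical and not part of the skill.

```python
import json
import subprocess
import sys


def parse_analyzer_output(raw: str) -> list[str]:
    """Return the uncovered files listed in a test_analyzer JSON document.

    Assumes the envelope documented above: {"success": bool, "data": {...}}.
    """
    payload = json.loads(raw)
    if not payload.get("success"):
        raise RuntimeError("test_analyzer reported failure")
    return list(payload.get("data", {}).get("uncovered_files", []))


def run_analyzer(scan_dir: str = "src/") -> list[str]:
    """Run the analyzer script and parse its stdout (hypothetical wrapper)."""
    result = subprocess.run(
        [
            sys.executable,
            "plugins/sanctum/scripts/test_analyzer.py",
            "--scan",
            scan_dir,
            "--output-json",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_analyzer_output(result.stdout)


# Offline example built from the documented shape, so the script itself
# does not need to be present to try the parser.
sample = json.dumps(
    {
        "success": True,
        "data": {
            "source_files": ["src/module.py"],
            "test_files": ["tests/test_module.py"],
            "uncovered_files": ["module_without_tests"],
            "coverage_gaps": [],
        },
    }
)
print(parse_analyzer_output(sample))
```

Keeping the parsing separate from the subprocess call means the parser can be exercised against a canned document, as in the last lines, without invoking the real script.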
Use this skill when you need to:
Perfect for:
See modules/test-discovery.md for detection patterns.
modules/bdd-patterns.md)
Before writing behavioral tests, identify the design invariants that the code relies on and write tests that would break if those invariants were violated.
What to encode:
Example:
def test_plugins_never_import_from_other_plugins():
    """Encode the invariant: plugins are independent modules.

    If this test breaks, someone is coupling plugins
    directly. Present the 3 options to a human:
    1. Preserve: revert the import, keep plugins independent
    2. Layer: add a shared interface in leyline instead
    3. Revise: merge the plugins (requires ADR)
    """
    # plugin_dirs and extract_imports are assumed to be provided by the
    # surrounding test module; their definitions are not shown here.
    for plugin_dir in plugin_dirs:
        imports = extract_imports(plugin_dir)
        for imp in imports:
            assert not imp.startswith("plugins."), (
                f"{plugin_dir} imports {imp}: "
                f"violates plugin independence invariant"
            )
Why this matters: Tests that encode invariants are load-bearing. When an agent later encounters a feature that clashes with the invariant, the test failure forces a conscious decision rather than a silent drift. Without these tests, bad invariant decisions compound until the codebase is unsalvageable.
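The invariant test above calls an `extract_imports` helper whose implementation is not shown. One possible version, offered as a sketch rather than the project's actual helper, walks each file's syntax tree with the standard `ast` module. It assumes that only absolute imports matter for the invariant, so relative imports are skipped.

```python
import ast
import tempfile
from pathlib import Path


def extract_imports(plugin_dir):
    """Collect absolute module names imported by .py files under plugin_dir.

    Hypothetical helper: the original example does not show how it is built.
    """
    found = []
    for path in sorted(Path(plugin_dir).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                # "import a, b" yields one alias per module name.
                found.extend(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                # level == 0 means an absolute "from x import y".
                found.append(node.module)
    return found


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "mod.py").write_text(
        "import os\nfrom plugins.other import thing\nfrom . import sibling\n",
        encoding="utf-8",
    )
    demo_imports = extract_imports(tmp)

print(demo_imports)  # ['os', 'plugins.other']
```

Parsing the source instead of importing it keeps the check free of side effects, which matters when a plugin runs code at import time.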
When updating existing tests:
If an invariant-encoding test needs to change, do NOT silently update the assertion. Flag it for human review with the three options: preserve the invariant, layer on top, or revise the invariant. This is a judgment call that requires human wisdom: models default to the "average" of training data and get these wrong far too often.
modules/tdd-workflow.md
See modules/test-generation.md for generation templates.
See modules/quality-validation.md for validation criteria.
The skill applies multiple quality checks:
See modules/bdd-patterns.md for additional patterns.
class TestGitWorkflow:
    """BDD-style tests for Git workflow operations."""

    def test_commit_workflow_with_staged_changes(self):
        """
        GIVEN a Git repository with staged changes
        WHEN the user runs the commit workflow
        THEN it should create a commit with proper message format
        AND all tests should pass
        """
        # Test implementation following TDD principles
        pass
Verification: Run pytest -v to verify tests pass.
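To show how a GIVEN/WHEN/THEN docstring can map onto a test body, here is a minimal sketch with the three phases marked as comments. The function `build_commit_message` and the "<type>: <summary>" format are hypothetical stand-ins chosen for illustration; the skill does not define either.

```python
import re


# Hypothetical stand-in for the code under test; not part of the skill.
def build_commit_message(change_type: str, summary: str) -> str:
    return f"{change_type}: {summary.strip()}"


# Assumed message format, for illustration only: "<type>: <summary>".
MESSAGE_PATTERN = re.compile(r"^(feat|fix|docs|test|refactor): \S.*$")


class TestGitWorkflow:
    """BDD-style tests for Git workflow operations."""

    def test_commit_message_uses_expected_format(self):
        """
        GIVEN a staged change described as a fix
        WHEN the commit message is built
        THEN it matches the "<type>: <summary>" format
        """
        # GIVEN: inputs describing the change, with stray whitespace
        change_type, summary = "fix", "  handle empty diff  "

        # WHEN: the behavior under test runs
        message = build_commit_message(change_type, summary)

        # THEN: assert on the observable result, not just that it exists
        assert MESSAGE_PATTERN.match(message)
        assert message == "fix: handle empty diff"
```

Each docstring clause has one matching section in the body, so a reader can check the test against its stated behavior line by line.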
See modules/test-enhancement.md for enhancement strategies.
Q: Tests are failing after generation
A: This is expected! The skill follows TDD principles: generated tests are designed to fail first. Follow the RED-GREEN-REFACTOR cycle:
Q: Quality score is low despite having tests
A: Check for these common issues:
assert result is not None
Q: Generated tests don't match my code structure
A: The scripts analyze AST patterns and may need guidance:
Use the --style flag to match your preferred BDD style
Q: Mutation testing takes too long
A: Mutation testing is resource-intensive:
Use the --quick-mutation flag for subset testing
Q: Can't find tests for my file
A: The analyzer uses naming conventions:
Source: my_module.py → Test: test_my_module.py
Use --target to focus on specific directories
Use the --verbose flag for more information
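The naming convention in the last answer can be expressed as a one-line mapping. This is a sketch under the assumption of a flat `tests/` directory; the analyzer's real lookup rules may differ, and `expected_test_path` is a hypothetical name.

```python
from pathlib import Path


def expected_test_path(source_file: str, tests_dir: str = "tests") -> Path:
    """Map src/my_module.py to tests/test_my_module.py (flat layout assumed)."""
    return Path(tests_dir) / f"test_{Path(source_file).name}"


print(expected_test_path("src/my_module.py").as_posix())  # tests/test_my_module.py
```

If a test file exists but is not detected, comparing its name against this mapping is a quick first check.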