plugins/sanctum/skills/validate-pr/SKILL.md
--- name: validate-pr description: Use when you need a diff-derived test plan for a PR: reads the diff, groups changes by area, runs targeted verifications, and proves revert-tests are genuine guards, not dead assertions. alwaysApply: false category: validation tags: - pr - validation - test-plan - diff - revert-test - evidence tools: [] usage_patterns: - diff-derived-test-plan - revert-test-quality-check - evidence-capture complexity: intermediate model_hint: standard estimated_tokens: 650
npx skillsauth add athola/claude-night-market plugins/sanctum/skills/validate-prInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Generate and self-execute a validation plan matched to what actually changed in a PR. Replaces generic "tests pass" with area-targeted evidence and revert-test quality checks that prove tests catch regressions.
/fix-pr Step 5 (Validate), before Step 6 (Complete)--scope minor with only formatting or doc changes (no logic changed)--skip-validate passed to /fix-prfetch diff -> group by area -> generate steps -> execute -> revert-test -> table
# Get changed file list from the PR
PR_NUMBER=<number from invocation or current branch>
CHANGED=$(gh pr diff "$PR_NUMBER" --name-only)
# Fallback when no PR number:
# CHANGED=$(git diff "origin/$(git rev-parse --abbrev-ref HEAD@{upstream})...HEAD" \
# --name-only 2>/dev/null)
Group changed files into areas using ripgrep (grep if rg unavailable):
RUST_FILES=$(echo "$CHANGED" | rg '\.rs$|Cargo\.(toml|lock)$' || true)
PY_FILES=$(echo "$CHANGED" | rg '\.py$|pyproject\.toml$|requirements.*\.txt$' || true)
SH_FILES=$(echo "$CHANGED" | rg '\.sh$|\.githooks' || true)
GRAMMAR_FILES=$(echo "$CHANGED" | rg '\.(lark|peg|g4)$' || true)
Area routing table:
| Area | File patterns | Verification type |
|------|---------------|-------------------|
| Rust | *.rs, Cargo.toml, Cargo.lock | cargo build + per-crate test |
| Python | *.py, pyproject.toml | pytest per changed module |
| Shell | *.sh, .githooks/* | shellcheck |
| Grammar | *.lark, *.peg, *.g4 | language-specific lint |
| Build/config | *.yaml, *.json, *.toml | parse check |
For each non-empty area, generate and run at least one verification step.
Assign [E1], [E2], ... labels to each captured output.
# Build with default features
cargo build --workspace 2>&1
# Evidence: [En] → "0 errors, 0 warnings"
# Build with --all-features
cargo build --workspace --all-features 2>&1
# Evidence: [En+1]
# Per-crate test for each changed crate
# Extract crate directory from changed path, e.g. crates/token-types/src/lib.rs
CHANGED_CRATES=$(echo "$RUST_FILES" \
| rg -o '(?:crates|src)/[^/]+' \
| sort -u \
| xargs -I{} basename {})
for CRATE in $CHANGED_CRATES; do
cargo test -p "$CRATE" 2>&1
done
# Targeted test per changed module
for PY_FILE in $PY_FILES; do
MODULE=$(basename "${PY_FILE%.py}")
TEST_FILE="tests/test_${MODULE}.py"
if [[ -f "$TEST_FILE" ]]; then
uv run pytest "$TEST_FILE" -v 2>&1
fi
done
# Or project-specific runner if Makefile target exists
make test 2>&1 || uv run pytest tests/ -v 2>&1
for SH_FILE in $SH_FILES; do
[[ -f "$SH_FILE" ]] && shellcheck "$SH_FILE" 2>&1
done
# YAML files
for YML in $(echo "$CHANGED" | rg '\.ya?ml$' || true); do
[[ -f "$YML" ]] && python3 -c "import yaml; yaml.safe_load(open('$YML'))" \
&& echo "PASS: $YML" || echo "FAIL: $YML"
done
# JSON files
for JSON_F in $(echo "$CHANGED" | rg '\.json$' || true); do
[[ -f "$JSON_F" ]] && python3 -m json.tool "$JSON_F" > /dev/null \
&& echo "PASS: $JSON_F" || echo "FAIL: $JSON_F"
done
Prove at least one test is a genuine guard, not a dead assertion.
Safety: abort if the working tree has uncommitted changes.
if ! git diff --exit-code > /dev/null 2>&1; then
echo "[RT] SKIP: working tree dirty: revert-test unsafe"
# Mark INCONCLUSIVE and continue
fi
Algorithm (one representative fix):
#[test] in the same crate that exercises a changed function.tests/test_<module>.py for a changed <module>.py.git checkout -- <file> (git-based restore, safe on interrupt).Revert-test output format:
[RT-1] Target: <file>:<line>: <description of fix>
[RT-2] Broke fix: <edit description>
[RT-3] Ran: <test command> → <test name> FAILED (expected)
[RT-4] Restored: git checkout -- <file>
[RT-5] Ran: <test command> → <test name> PASSED
Result: PASS: test is a genuine guard
When no covering test exists:
Revert-test: INCONCLUSIVE: no covering test for <changed area>
Recommendation: add a test for <changed function or behaviour>
After all area checks and the revert-test:
# Rust workspace
cargo test --workspace 2>&1
# Python project
uv run pytest tests/ -v 2>&1
# Mixed project: run both
cargo test --workspace 2>&1 && uv run pytest tests/ -v 2>&1
Capture full output as final evidence [En].
### validate-pr: <PR title or number>
| Area | Step | Evidence | Result |
|------|------|----------|--------|
| Rust: token-types | cargo build --workspace | [E1] 0 errors | PASS |
| Rust: token-types | cargo test -p token-types | [E2] 12 passed | PASS |
| Rust: token-types | cargo build --all-features | [E3] 0 errors | PASS |
| Shell: hooks/pre-commit | shellcheck | [E4] 0 issues | PASS |
| Revert-test: lib.rs:45 | break/fail/restore | [RT-1..5] genuine guard | PASS |
| Final: cargo test --workspace | full suite | [E5] 694 passed, 0 failed | PASS |
**Totals**: 6 steps: 6 PASS, 0 FAIL, 0 INCONCLUSIVE
When --post is given, post the summary table as a PR comment:
gh pr comment "$PR_NUMBER" --body "$(cat /tmp/validate-pr-summary.md)"
Skip posting when invoked from /fix-pr: results feed into the Gate 3
summary comment instead.
When any step produces FAIL:
/fix-pr: halt before Step 6 (Complete). The user must
fix the failures or pass --skip-validate to /fix-pr to bypass.INCONCLUSIVE results are reported but do not halt the workflow.
gh pr diff --name-only returned a non-empty file list (diff fetched)[E1], [E2], etc.) with the
actual command output, not fabricated/fix-pr before Step 6 when called from fix-prresearch
Generate diverse solution candidates with category-spanning ideation methods and rotation. Use when stuck on a design or fighting repetitive LLM output.
development
Contract for the project decision journal (tradeoffs and lessons-learned logs). Use when recording a decision, tradeoff, or lesson, or building a consumer hook.
development
Ramps implementation ambition a notch only after the prior increment is understood. Use when building a feature you must understand, not just ship.
testing
Verifies a package exists before install, defending against hallucination and slopsquatting. Use when adding, recommending, or installing a package.