plugins/sanctum/skills/validate-mr/SKILL.md
Use when you need a diff-derived test plan for an MR — reads the diff, groups changes by area, runs targeted verifications, and proves revert-tests are genuine guards, not dead assertions.
Generate and self-execute a validation plan matched to what actually changed in an MR. Replaces generic "tests pass" with area-targeted evidence and revert-test quality checks that prove tests catch regressions.
Runs at:
- /fix-pr Step 5 (Validate), before Step 6 (Complete)

Skipped when:
- --scope minor with only formatting or doc changes (no logic changed)
- --skip-validate passed to /fix-pr

Workflow: fetch diff -> group by area -> generate steps -> execute -> revert-test -> table
# Get changed file list from the MR
MR_NUMBER=<number from invocation or current branch>
CHANGED=$(gh pr diff "$MR_NUMBER" --name-only)
# Fallback when no MR number:
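# (HEAD@{upstream} already resolves to a remote-qualified ref such as
# origin/<branch>, so no extra "origin/" prefix is needed)
# CHANGED=$(git diff "$(git rev-parse --abbrev-ref 'HEAD@{upstream}')...HEAD" \
# --name-only 2>/dev/null)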
Group changed files into areas using ripgrep (grep if rg unavailable):
RUST_FILES=$(echo "$CHANGED" | rg '\.rs$|Cargo\.(toml|lock)$' || true)
PY_FILES=$(echo "$CHANGED" | rg '\.py$|pyproject\.toml$|requirements.*\.txt$' || true)
SH_FILES=$(echo "$CHANGED" | rg '\.sh$|\.githooks' || true)
GRAMMAR_FILES=$(echo "$CHANGED" | rg '\.(lark|peg|g4)$' || true)
Area routing table:
| Area | File patterns | Verification type |
|------|---------------|-------------------|
| Rust | *.rs, Cargo.toml, Cargo.lock | cargo build + per-crate test |
| Python | *.py, pyproject.toml | pytest per changed module |
| Shell | *.sh, .githooks/* | shellcheck |
| Grammar | *.lark, *.peg, *.g4 | language-specific lint |
| Build/config | *.yaml, *.json, *.toml | parse check |
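The routing table can also be expressed as a single lookup. The sketch below is one possible way to do it, not the skill's own implementation: `classify_area` is a hypothetical helper name, the `Other` bucket is an assumption for paths the table does not cover, and the patterns mirror the table plus the `requirements*.txt` and `*.yml` patterns used in the snippets in this document.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map one changed path to an area from the routing table.
# In a case pattern, "*" also matches "/", so "*.rs" covers nested paths.
classify_area() {
  case "$1" in
    *.rs|Cargo.toml|*/Cargo.toml|Cargo.lock|*/Cargo.lock)
      echo "Rust" ;;
    *.py|pyproject.toml|*/pyproject.toml|requirements*.txt|*/requirements*.txt)
      echo "Python" ;;
    *.sh|.githooks/*|*/.githooks/*)
      echo "Shell" ;;
    *.lark|*.peg|*.g4)
      echo "Grammar" ;;
    # Checked last so Cargo.toml and pyproject.toml keep their own areas
    *.yaml|*.yml|*.json|*.toml)
      echo "Build/config" ;;
    *)
      echo "Other" ;;
  esac
}
```

Order matters here: the Rust and Python branches come before the generic `*.toml` branch, so a changed `Cargo.toml` is routed to the Rust checks rather than to a bare parse check.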
For each non-empty area, generate and run at least one verification step.
Assign [E1], [E2], ... labels to each captured output.
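One way to keep evidence labels tied to real output is to route every verification command through a small wrapper. This is a minimal sketch under stated assumptions: `run_step`, `EVIDENCE_DIR`, and `STEP` are hypothetical names, and storing one log file per label is an illustrative choice rather than something the skill prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command, save its combined output under the
# next [En] label, and report PASS or FAIL from the exit status.
EVIDENCE_DIR=$(mktemp -d)
STEP=0

run_step() {
  local status
  STEP=$((STEP + 1))
  if "$@" > "$EVIDENCE_DIR/E$STEP.log" 2>&1; then
    status=PASS
  else
    status=FAIL
  fi
  # One line per step: label, result, and the command that ran
  printf '[E%d] %s: %s\n' "$STEP" "$status" "$*"
}

# Demo with stand-in commands; a real run would pass cargo, pytest, etc.
run_step echo "build ok"
run_step false
```

Call `run_step` directly rather than inside `$(...)`: a command substitution runs in a subshell, so the counter increment would be lost.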
# Build with default features
cargo build --workspace 2>&1
# Evidence: [En] → "0 errors, 0 warnings"
# Build with --all-features
cargo build --workspace --all-features 2>&1
# Evidence: [En+1]
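# Per-crate test for each changed crate
# Extract crate directory from changed path, e.g. crates/token-types/src/lib.rs
# -> token-types (assumes the directory name matches the package name).
# The pattern is anchored to crates/ so the inner src/lib.rs segment is not
# picked up as a second "crate".
CHANGED_CRATES=$(echo "$RUST_FILES" \
| rg -o '^crates/[^/]+' \
| sort -u \
| xargs -I{} basename {})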
for CRATE in $CHANGED_CRATES; do
cargo test -p "$CRATE" 2>&1
done
# Targeted test per changed module
for PY_FILE in $PY_FILES; do
MODULE=$(basename "${PY_FILE%.py}")
TEST_FILE="tests/test_${MODULE}.py"
if [[ -f "$TEST_FILE" ]]; then
uv run pytest "$TEST_FILE" -v 2>&1
fi
done
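# Or project-specific runner if Makefile target exists
# (make -n is a dry run, so a failing test run does not trigger the fallback)
if make -n test > /dev/null 2>&1; then make test 2>&1; else uv run pytest tests/ -v 2>&1; fi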
for SH_FILE in $SH_FILES; do
[[ -f "$SH_FILE" ]] && shellcheck "$SH_FILE" 2>&1
done
# YAML files (requires PyYAML; files deleted by the MR are skipped, not failed)
for YML in $(echo "$CHANGED" | rg '\.ya?ml$' || true); do
  [[ -f "$YML" ]] || continue
  python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1]))' "$YML" \
    && echo "PASS: $YML" || echo "FAIL: $YML"
done
# JSON files
for JSON_F in $(echo "$CHANGED" | rg '\.json$' || true); do
  [[ -f "$JSON_F" ]] || continue
  python3 -m json.tool "$JSON_F" > /dev/null \
    && echo "PASS: $JSON_F" || echo "FAIL: $JSON_F"
done
Prove at least one test is a genuine guard, not a dead assertion.
Safety: skip the revert-test if the working tree has uncommitted changes (staged or unstaged).
if ! git diff --quiet || ! git diff --cached --quiet; then
echo "[RT] SKIP: working tree dirty — revert-test unsafe"
# Mark INCONCLUSIVE and continue
fi
Algorithm (one representative fix):
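1. Pick one changed line that implements a fix and find a test that covers it:
   - Rust: a #[test] in the same crate that exercises a changed function.
   - Python: tests/test_<module>.py for a changed <module>.py.
2. Temporarily break the fix and run the covering test; it is expected to fail.
3. Restore the file with git checkout -- <file> (git-based restore, safe on interrupt).
4. Re-run the covering test; it is expected to pass.

Revert-test output format: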
[RT-1] Target: <file>:<line> — <description of fix>
[RT-2] Broke fix: <edit description>
[RT-3] Ran: <test command> → <test name> FAILED (expected)
[RT-4] Restored: git checkout -- <file>
[RT-5] Ran: <test command> → <test name> PASSED
Result: PASS — test is a genuine guard
When no covering test exists:
Revert-test: INCONCLUSIVE — no covering test for <changed area>
Recommendation: add a test for <changed function or behaviour>
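The break, fail, restore, pass cycle can be sketched as a single shell function. This is an illustration under assumptions, not the skill's actual code: `revert_test` is a hypothetical helper, the "break" is modelled as a `sed` expression supplied by the caller, and the output is reduced to the final result line rather than the full [RT-1] to [RT-5] log.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill itself.
# Usage: revert_test <file> <sed-expression> <test command...>
revert_test() {
  local file=$1 break_expr=$2 broken restored
  shift 2

  # Never touch a file with uncommitted edits: the restore would discard them
  if ! git diff --quiet -- "$file" || ! git diff --cached --quiet -- "$file"; then
    echo "Revert-test: INCONCLUSIVE (uncommitted changes in $file)"
    return 0
  fi

  # Break the fix in place (portable alternative to sed -i)
  sed "$break_expr" "$file" > "$file.rt-tmp" && cat "$file.rt-tmp" > "$file"
  rm -f "$file.rt-tmp"
  if git diff --quiet -- "$file"; then
    echo "Revert-test: INCONCLUSIVE (edit did not change $file)"
    return 0
  fi

  # The covering test should fail while the fix is broken
  if "$@" > /dev/null 2>&1; then broken=PASSED; else broken=FAILED; fi

  # Restore from git, then the test should pass again
  git checkout -q -- "$file"
  if "$@" > /dev/null 2>&1; then restored=PASSED; else restored=FAILED; fi

  if [ "$broken" = FAILED ] && [ "$restored" = PASSED ]; then
    echo "Result: PASS (test is a genuine guard)"
  else
    echo "Result: FAIL (broken: $broken, restored: $restored)"
    return 1
  fi
}
```

A call might look like `revert_test src/lib.rs 's/>= limit/> limit/' cargo test -p token-types`, where the path, expression, and crate name are placeholders. Because the restore goes through git rather than a backup copy, an interrupted run can always be recovered with the same `git checkout -- <file>` command.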
After all area checks and the revert-test:
# Rust workspace
cargo test --workspace 2>&1
# Python project
uv run pytest tests/ -v 2>&1
# Mixed project: run both
cargo test --workspace 2>&1 && uv run pytest tests/ -v 2>&1
Capture full output as final evidence [En].
### validate-mr: <MR title or number>
| Area | Step | Evidence | Result |
|------|------|----------|--------|
| Rust: token-types | cargo build --workspace | [E1] 0 errors | PASS |
| Rust: token-types | cargo test -p token-types | [E2] 12 passed | PASS |
| Rust: token-types | cargo build --all-features | [E3] 0 errors | PASS |
| Shell: hooks/pre-commit | shellcheck | [E4] 0 issues | PASS |
| Revert-test: lib.rs:45 | break/fail/restore | [RT-1..5] genuine guard | PASS |
| Final: cargo test --workspace | full suite | [E5] 694 passed, 0 failed | PASS |
**Totals**: 6 steps — 6 PASS, 0 FAIL, 0 INCONCLUSIVE
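A table like the one above can be generated from recorded results instead of being typed by hand. The following sketch assumes the results were collected as tab-separated rows (area, step, evidence, result); `render_summary` is a hypothetical helper and the row format is an assumption, not something the skill defines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read tab-separated "area, step, evidence, result" rows
# on stdin and print the summary table followed by a totals line.
render_summary() {
  awk -F '\t' '
    BEGIN {
      print "| Area | Step | Evidence | Result |"
      print "|------|------|----------|--------|"
    }
    {
      printf "| %s | %s | %s | %s |\n", $1, $2, $3, $4
      count[$4]++   # tally by result value
      total++
    }
    END {
      printf "**Totals**: %d steps, %d PASS, %d FAIL, %d INCONCLUSIVE\n",
        total, count["PASS"], count["FAIL"], count["INCONCLUSIVE"]
    }'
}
```

Deriving the totals from the same rows that produce the table keeps the two from drifting apart when a step is added or re-run.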
When --post is given, post the summary table as a PR comment:
gh pr comment "$MR_NUMBER" --body "$(cat /tmp/validate-mr-summary.md)"
Skip posting when invoked from /fix-pr — results feed into the Gate 3
summary comment instead.
When any step produces FAIL:
- /fix-pr: halt before Step 6 (Complete). The user must fix the failures or pass --skip-validate to /fix-pr to bypass.

INCONCLUSIVE results are reported but do not halt the workflow.
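The halting rule reduces to a small check over the recorded results. This is a sketch only: `gate` is a hypothetical helper name, and it assumes one result value (PASS, FAIL, or INCONCLUSIVE) per line on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one result per line and decide whether the
# calling workflow may continue. Only FAIL halts; INCONCLUSIVE is reported.
gate() {
  local results
  results=$(cat)
  if printf '%s\n' "$results" | grep -qx 'FAIL'; then
    echo "HALT: at least one step failed"
    return 1
  fi
  if printf '%s\n' "$results" | grep -qx 'INCONCLUSIVE'; then
    echo "CONTINUE: inconclusive results reported"
    return 0
  fi
  echo "CONTINUE"
}
```

A caller can branch on the exit status, for example `printf 'PASS\nFAIL\n' | gate || echo "stopping before Step 6"`.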
Exit criteria:
- gh pr diff --name-only returned a non-empty file list (diff fetched)
- Every step carries an evidence label ([E1], [E2], etc.) with the actual command output, not fabricated
- Results are returned to /fix-pr before Step 6 when called from fix-pr