athola/validate-mr

Name: validate-mr
Author: athola

plugins/sanctum/skills/validate-mr/SKILL.md

npx skillsauth add athola/claude-night-market validate-mr

validate-mr: Diff-Derived Test Plan

Generate and self-execute a validation plan matched to what actually changed in an MR. Replaces generic "tests pass" with area-targeted evidence and revert-test quality checks that prove tests catch regressions.

When To Use

End of /fix-pr Step 5 (Validate), before Step 6 (Complete)
Standalone after any MR fix, to generate targeted validation evidence
When you need proof that revert-tests are genuine guards

When NOT To Use

--scope minor with only formatting or doc changes (no logic changed)
No diff available (clean branch, nothing changed)
--skip-validate passed to /fix-pr

Algorithm

fetch diff -> group by area -> generate steps -> execute -> revert-test -> table

Step 1: Fetch Diff and Detect Areas

# Get changed file list from the MR
MR_NUMBER=<number from invocation or current branch>
CHANGED=$(gh pr diff "$MR_NUMBER" --name-only)
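# Fallback when no MR number. HEAD@{upstream} already resolves to a
# remote-qualified name such as origin/my-branch, so no "origin/" prefix is added:
# CHANGED=$(git diff "$(git rev-parse --abbrev-ref 'HEAD@{upstream}')...HEAD" \
#   --name-only 2>/dev/null)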

Group changed files into areas using ripgrep (grep if rg unavailable):

RUST_FILES=$(echo "$CHANGED"  | rg '\.rs$|Cargo\.(toml|lock)$' || true)
PY_FILES=$(echo "$CHANGED"    | rg '\.py$|pyproject\.toml$|requirements.*\.txt$' || true)
SH_FILES=$(echo "$CHANGED"    | rg '\.sh$|\.githooks' || true)
GRAMMAR_FILES=$(echo "$CHANGED" | rg '\.(lark|peg|g4)$' || true)

Area routing table:

| Area | File patterns | Verification type |
|------|---------------|-------------------|
| Rust | *.rs, Cargo.toml, Cargo.lock | cargo build + per-crate test |
| Python | *.py, pyproject.toml | pytest per changed module |
| Shell | *.sh, .githooks/* | shellcheck |
| Grammar | *.lark, *.peg, *.g4 | language-specific lint |
| Build/config | *.yaml, *.json, *.toml | parse check |
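
The routing table can also be read as a single classifier function. The bash sketch below is one way to write it, not the skill's own implementation: the name `classify_area` is hypothetical, the `Other` fallback is an assumption (the table does not say what happens to unmatched files), and `*.yml` is accepted alongside `*.yaml` because the parse-check loop in Step 2 matches both. Since `Cargo.toml` and `pyproject.toml` also match `*.toml`, the specific patterns are tested before the generic Build/config ones.

```shell
#!/usr/bin/env bash
# classify_area PATH -> prints the area name for one changed file.
# Hypothetical helper; specific patterns are checked before generic ones.
classify_area() {
  local path=$1
  local name=${path##*/}            # file name without its directories
  case "$name" in
    *.rs|Cargo.toml|Cargo.lock)     echo "Rust";    return ;;
    *.py|pyproject.toml)            echo "Python";  return ;;
    *.lark|*.peg|*.g4)              echo "Grammar"; return ;;
  esac
  case "$path" in
    *.sh|.githooks/*|*/.githooks/*) echo "Shell" ;;
    *.yaml|*.yml|*.json|*.toml)     echo "Build/config" ;;
    *)                              echo "Other" ;;   # assumption: not in the table
  esac
}
```

Calling `classify_area` once per line of `$CHANGED` gives an area for every file, which makes it easy to confirm later that each detected area has a row in the summary table.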

Step 2: Generate and Execute Steps per Area

For each non-empty area, generate and run at least one verification step. Assign [E1], [E2], ... labels to each captured output.
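
One possible way to keep the [E1], [E2], ... labels tied to real command output is a small wrapper that numbers each step as it runs. This is a sketch under assumptions: `run_step`, `EVIDENCE_DIR`, `RESULTS`, and the tab-separated results layout are all hypothetical names, not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical evidence recorder; names and file layout are assumptions.
EVIDENCE_DIR=$(mktemp -d)             # one log file per captured output
RESULTS="$EVIDENCE_DIR/results.tsv"   # area, command, label, result
STEP_N=0

# run_step AREA COMMAND [ARGS...]
# Runs the command, saves stdout+stderr as evidence [E<n>], records PASS/FAIL,
# and returns 0 only when the command succeeded.
run_step() {
  local area=$1; shift
  STEP_N=$((STEP_N + 1))
  local label="E$STEP_N" result=PASS
  "$@" > "$EVIDENCE_DIR/$label.log" 2>&1 || result=FAIL
  printf '%s\t%s\t[%s]\t%s\n' "$area" "$*" "$label" "$result" >> "$RESULTS"
  [[ $result == PASS ]]
}
```

A call such as `run_step "Rust" cargo build --workspace` would then record the build output under the next free label. The function has to be called directly rather than inside `$(...)`, because a subshell would not update the `STEP_N` counter.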

Rust

# Build with default features
cargo build --workspace 2>&1
# Evidence: [En] → "0 errors, 0 warnings"

# Build with --all-features
cargo build --workspace --all-features 2>&1
# Evidence: [En+1]

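# Per-crate test for each changed crate
# Extract the crate directory from the changed path, e.g.
# crates/token-types/src/lib.rs -> token-types. The pattern is anchored so the
# nested src/lib.rs segment is not picked up as a second "crate". Assumes a
# crates/<name>/ layout where the directory name is also the package name.
CHANGED_CRATES=$(echo "$RUST_FILES" \
  | rg -o '^crates/[^/]+' \
  | sort -u \
  | xargs -I{} basename {})
for CRATE in $CHANGED_CRATES; do
  cargo test -p "$CRATE" 2>&1
done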

Python

# Targeted test per changed module
for PY_FILE in $PY_FILES; do
  MODULE=$(basename "${PY_FILE%.py}")
  TEST_FILE="tests/test_${MODULE}.py"
  if [[ -f "$TEST_FILE" ]]; then
    uv run pytest "$TEST_FILE" -v 2>&1
  fi
done

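# Or the project-specific runner, if a Makefile "test" target exists.
# make -n only checks that the target can be resolved; it runs nothing.
if make -n test > /dev/null 2>&1; then
  make test 2>&1
else
  uv run pytest tests/ -v 2>&1
fi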

Shell

for SH_FILE in $SH_FILES; do
  [[ -f "$SH_FILE" ]] && shellcheck "$SH_FILE" 2>&1
done

Build/config parse check

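# YAML files (requires PyYAML). Files deleted by the MR are skipped rather
# than reported as FAIL, and the path is passed as an argument so that quotes
# in a file name cannot break the Python snippet.
for YML in $(echo "$CHANGED" | rg '\.ya?ml$' || true); do
  [[ -f "$YML" ]] || continue
  python3 -c 'import sys, yaml; yaml.safe_load(open(sys.argv[1]))' "$YML" \
    && echo "PASS: $YML" || echo "FAIL: $YML"
done

# JSON files
for JSON_F in $(echo "$CHANGED" | rg '\.json$' || true); do
  [[ -f "$JSON_F" ]] || continue
  python3 -m json.tool "$JSON_F" > /dev/null \
    && echo "PASS: $JSON_F" || echo "FAIL: $JSON_F"
done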
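
The routing table lists *.toml under Build/config, but the loops above cover only YAML and JSON. A TOML check could look like the sketch below. It assumes Python 3.11 or newer, where `tomllib` is part of the standard library; the function name `check_toml` is hypothetical.

```shell
#!/usr/bin/env bash
# check_toml FILE -> prints PASS, FAIL, or SKIP for one TOML file.
# Assumes python3 >= 3.11 (tomllib joined the standard library in 3.11).
check_toml() {
  local file=$1
  if [[ ! -f "$file" ]]; then
    echo "SKIP: $file (deleted or not a regular file)"
    return 0
  fi
  # tomllib.load needs a binary file handle; the path arrives via sys.argv.
  if python3 -c 'import sys, tomllib; tomllib.load(open(sys.argv[1], "rb"))' \
      "$file" 2>/dev/null; then
    echo "PASS: $file"
  else
    echo "FAIL: $file"
    return 1
  fi
}

# Usage, following the same loop shape as the YAML and JSON checks:
# for TOML_F in $(echo "$CHANGED" | rg '\.toml$' || true); do
#   check_toml "$TOML_F"
# done
```

On an older interpreter the import itself fails, so every file would be reported as FAIL; checking the Python version first avoids that false signal.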

Step 3: Revert-Test Quality Check

Prove at least one test is a genuine guard, not a dead assertion.

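Safety: skip the revert-test if the working tree has uncommitted changes, staged or unstaged.

# git diff alone misses staged changes, so the index is checked as well
if ! git diff --quiet || ! git diff --cached --quiet; then
  echo "[RT] SKIP: working tree dirty — revert-test unsafe"
  # Mark INCONCLUSIVE and continue
fi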

Algorithm (one representative fix):

From the changed source files, find one that has a corresponding test.
- Rust: a #[test] in the same crate that exercises a changed function.
- Python: tests/test_<module>.py for a changed <module>.py.
- Shell: a test harness that invokes the changed script.
Identify the specific changed line or block from the diff.
Edit that line to revert the fix to its broken state.
Run the targeted test: confirm it FAILS (expected).
Restore: git checkout -- <file> (git-based restore, safe on interrupt).
Run the targeted test again: confirm it PASSES.
If any step cannot complete, mark INCONCLUSIVE with the reason.
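
The break/fail/restore sequence above could be automated roughly as follows. This is a sketch, not the skill's implementation: `revert_test` and its three arguments are hypothetical, the break step is supplied by the caller as a shell snippet, and reporting FAIL when the test still passes with the fix reverted is an assumption, since the steps above only define PASS and INCONCLUSIVE.

```shell
#!/usr/bin/env bash
# revert_test FILE BREAK_CMD TEST_CMD   (hypothetical helper)
# BREAK_CMD edits FILE back to its broken state; TEST_CMD is the targeted test.
# Returns 0 (genuine guard), 1 (test missed the revert), 2 (inconclusive).
revert_test() {
  local file=$1 break_cmd=$2 test_cmd=$3 verdict
  # Safety: never mutate a tree with staged or unstaged changes.
  if ! git diff --quiet || ! git diff --cached --quiet; then
    echo "Revert-test: INCONCLUSIVE (working tree dirty)"; return 2
  fi
  echo "[RT-1] Target: $file"
  bash -c "$break_cmd"
  if git diff --quiet -- "$file"; then
    echo "Revert-test: INCONCLUSIVE (break command changed nothing)"; return 2
  fi
  echo "[RT-2] Broke fix: $break_cmd"
  if bash -c "$test_cmd" > /dev/null 2>&1; then
    verdict="still PASSED while broken"
  else
    verdict="FAILED (expected)"
  fi
  echo "[RT-3] Ran: $test_cmd -> $verdict"
  git checkout -- "$file"               # git-based restore
  echo "[RT-4] Restored: git checkout -- $file"
  if ! bash -c "$test_cmd" > /dev/null 2>&1; then
    echo "Revert-test: INCONCLUSIVE (test fails even with the fix in place)"
    return 2
  fi
  echo "[RT-5] Ran: $test_cmd -> PASSED"
  if [[ $verdict == FAILED* ]]; then
    echo "Result: PASS (test is a genuine guard)"
  else
    # Assumption: a test that never fails is reported as FAIL.
    echo "Result: FAIL (test did not catch the reverted fix)"; return 1
  fi
}
```

The caller still has to pick the file, the edit that reverts the fix, and the covering test; those choices come from reading the diff and are not something this helper can infer.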

Revert-test output format:

[RT-1] Target: <file>:<line> — <description of fix>
[RT-2] Broke fix: <edit description>
[RT-3] Ran: <test command> → <test name> FAILED (expected)
[RT-4] Restored: git checkout -- <file>
[RT-5] Ran: <test command> → <test name> PASSED
Result: PASS — test is a genuine guard

When no covering test exists:

Revert-test: INCONCLUSIVE — no covering test for <changed area>
Recommendation: add a test for <changed function or behaviour>

Step 4: Final Full-Suite Run

After all area checks and the revert-test:

# Rust workspace
cargo test --workspace 2>&1

# Python project
uv run pytest tests/ -v 2>&1

# Mixed project: run both
cargo test --workspace 2>&1 && uv run pytest tests/ -v 2>&1

Capture full output as final evidence [En].

Step 5: Produce Summary Table

### validate-mr: <MR title or number>

| Area | Step | Evidence | Result |
|------|------|----------|--------|
| Rust: token-types | cargo build --workspace | [E1] 0 errors | PASS |
| Rust: token-types | cargo test -p token-types | [E2] 12 passed | PASS |
| Rust: token-types | cargo build --all-features | [E3] 0 errors | PASS |
| Shell: hooks/pre-commit | shellcheck | [E4] 0 issues | PASS |
| Revert-test: lib.rs:45 | break/fail/restore | [RT-1..5] genuine guard | PASS |
| Final: cargo test --workspace | full suite | [E5] 694 passed, 0 failed | PASS |

**Totals**: 6 steps — 6 PASS, 0 FAIL, 0 INCONCLUSIVE
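
The table and its Totals line could be generated from recorded results instead of being typed by hand, which keeps the counts honest. The sketch below assumes a hypothetical tab-separated results file with four columns (area, step, evidence, result); `render_summary` is an illustrative name, and the Totals separator is simplified to a hyphen.

```shell
#!/usr/bin/env bash
# render_summary TITLE RESULTS_TSV   (hypothetical helper)
# RESULTS_TSV columns, tab-separated: area, step, evidence, result.
render_summary() {
  local title=$1 results=$2
  echo "### validate-mr: $title"
  echo
  echo "| Area | Step | Evidence | Result |"
  echo "|------|------|----------|--------|"
  # Print one table row per record and count each result value.
  awk -F'\t' '
    { printf "| %s | %s | %s | %s |\n", $1, $2, $3, $4; n[$4]++; total++ }
    END {
      printf "\n**Totals**: %d steps - %d PASS, %d FAIL, %d INCONCLUSIVE\n",
             total, n["PASS"], n["FAIL"], n["INCONCLUSIVE"]
    }' "$results"
}
```

Redirecting the output, for example `render_summary "$MR_NUMBER" results.tsv > /tmp/validate-mr-summary.md`, would produce the file that Step 6 posts.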

Step 6: Posting (--post flag only)

When --post is given, post the summary table as a PR comment:

gh pr comment "$MR_NUMBER" --body-file /tmp/validate-mr-summary.md

Skip posting when invoked from /fix-pr — results feed into the Gate 3 summary comment instead.

Failure Behaviour

When any step produces FAIL:

Surface the failures in the summary table with the evidence reference.
When called from /fix-pr: halt before Step 6 (Complete). The user must fix the failures or pass --skip-validate to /fix-pr to bypass.
When called standalone: report failures and exit with non-zero status.

INCONCLUSIVE results are reported but do not halt the workflow.
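
For the standalone case, the rule "any FAIL means non-zero exit, INCONCLUSIVE does not" can be reduced to one check over the recorded results. The sketch assumes the same hypothetical four-column, tab-separated results file (area, step, evidence, result); `overall_status` is an illustrative name.

```shell
#!/usr/bin/env bash
# overall_status RESULTS_TSV   (hypothetical helper)
# Returns 1 if any row has result FAIL, otherwise 0.
# INCONCLUSIVE rows are reported elsewhere and do not affect the status.
overall_status() {
  local results=$1
  # awk exits 0 when at least one FAIL row was seen, 1 otherwise.
  if awk -F'\t' '$4 == "FAIL" { found = 1 } END { exit !found }' "$results"; then
    echo "validate-mr: at least one step FAILED" >&2
    return 1
  fi
  return 0
}
```

A standalone run could end with `overall_status results.tsv || exit 1`, while a /fix-pr integration would use the same return value to decide whether to halt before Step 6.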

Exit Criteria

[ ] gh pr diff --name-only returned a non-empty file list (diff fetched)
[ ] Every detected area has at least one row in the summary table
[ ] Every row shows an Evidence reference ([E1], [E2], etc.) with the actual command output, not fabricated
[ ] Revert-test attempted for at least one area with a covering test, or documented as INCONCLUSIVE with reason
[ ] Final full-suite run appears in the summary table
[ ] Summary table is present with columns: Area, Step, Evidence, Result
[ ] Any FAIL result halts /fix-pr before Step 6 when called from /fix-pr
[ ] Working tree is clean after skill completes (git checkout restore confirmed successful for any revert-test mutation)

Use when you need a diff-derived test plan for an MR — reads the diff, groups changes by area, runs targeted verifications, and proves revert-tests are genuine guards, not dead assertions.

294 stars

testing

Updated May 30, 2026

npx skillsauth add athola/claude-night-market validate-mr

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Percentage |
|--------|---------|-------------|------------|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: May 30, 2026, 4:11 AM (147.8s, 1 file scanned)

SKILL.md

name: validate-mr
description: Use when you need a diff-derived test plan for an MR — reads the
alwaysApply: false
category: validation
tools: []
complexity: intermediate
model_hint: standard
estimated_tokens: 650
progressive_loading: false
- leyline: git-platform
- imbue: proof-of-work
role: entrypoint

Install manually

# Clone the repo
git clone https://github.com/athola/claude-night-market.git

# Copy into Claude Code skills folder (global)
cp -r claude-night-market/plugins/sanctum/skills/validate-mr ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: athola/claude-night-market (294 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT