skills/slopwatch/SKILL.md
Use Slopwatch to detect LLM reward hacking in .NET code changes. Run after every code modification to catch disabled tests, suppressed warnings, empty catch blocks, and other shortcuts that mask real problems.
Use this skill constantly. Every time an LLM (including Claude) changes .NET code, run slopwatch to validate that the changes don't introduce "slop."
"Slop" refers to shortcuts LLMs take that make tests pass or builds succeed without actually solving the underlying problem. These are reward hacking behaviors - the LLM optimizes for apparent success rather than real fixes.
| Pattern | Example | Why It's Bad |
|---------|---------|--------------|
| Disabled tests | [Fact(Skip="flaky")] | Hides failures instead of fixing them |
| Warning suppression | #pragma warning disable CS8618 | Silences compiler without fixing issue |
| Empty catch blocks | catch (Exception) { } | Swallows errors, hides bugs |
| Arbitrary delays | await Task.Delay(1000); | Masks race conditions, makes tests slow |
| Project-level suppression | <NoWarn>CS1591</NoWarn> | Disables warnings project-wide |
| CPM bypass | Version="1.0.0" inline | Undermines central package management |
Never accept these patterns. If an LLM introduces slop, reject the change and require a proper fix.
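To make the table concrete, here is a rough sketch that flags added lines in a unified diff when they resemble some of the single-line patterns above. It is only an illustration of what the patterns look like in a diff, not a description of how slopwatch detects them. The function name `scan_added_lines` and the regular expressions are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only: a crude text match, NOT slopwatch's logic.
# Reads a unified diff on stdin and prints added lines that look like slop.

scan_added_lines() {
  # Keep added lines ("+...") but skip the "+++ b/file" headers,
  # then match a few of the single-line patterns from the table.
  grep -E '^\+[^+]' \
    | grep -E 'Skip[[:space:]]*=|#pragma warning disable|catch[[:space:]]*([(][^)]*[)])?[[:space:]]*[{][[:space:]]*[}]|Task[.]Delay[(]|<NoWarn>'
}
```

A possible use is `git diff --cached | scan_added_lines`, which exits with status 0 when at least one added line matches. A text match like this misses multi-line constructs and cannot tell test code from production code, so it is no substitute for running the tool itself.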
Add to .config/dotnet-tools.json:
{
  "version": 1,
  "isRoot": true,
  "tools": {
    "slopwatch.cmd": {
      "version": "0.2.0",
      "commands": ["slopwatch"],
      "rollForward": false
    }
  }
}
Then restore:
dotnet tool restore
Or install it globally instead:
dotnet tool install --global Slopwatch.Cmd
Before using slopwatch on an existing project, create a baseline of current issues:
# Initialize baseline from existing code
slopwatch init
# This creates .slopwatch/baseline.json
git add .slopwatch/baseline.json
git commit -m "Add slopwatch baseline"
Why baseline? Legacy code may have existing issues. The baseline ensures slopwatch only catches new slop being introduced, not pre-existing technical debt.
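The one-time nature of the baseline step can be sketched as a small guard that runs `slopwatch init` only when no baseline file exists yet. The helper name `ensure_baseline` is hypothetical; the only slopwatch behavior it relies on is the one stated above, that `slopwatch init` creates `.slopwatch/baseline.json`.

```shell
#!/usr/bin/env bash
# Sketch: create the baseline once, and leave an existing one untouched.
# ensure_baseline is a hypothetical helper name, not part of slopwatch.

ensure_baseline() {
  local baseline=".slopwatch/baseline.json"
  if [ -f "$baseline" ]; then
    echo "Baseline already present: $baseline"
    return 0
  fi
  # Same command as above; it is expected to write the baseline file.
  slopwatch init || return 1
  echo "Baseline created. Commit $baseline so later runs compare against it."
}
```

Skipping `init` when the file exists avoids silently re-baselining, which would absorb any slop introduced since the first run.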
Run slopwatch after any LLM-generated code modification:
# Analyze for new issues (uses baseline)
slopwatch analyze
# Use strict mode - fail on warnings too
slopwatch analyze --fail-on warning
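For local use, the strict-mode command can be wrapped in a small gate, sketched below. It assumes that `slopwatch analyze` exits with a non-zero status when it reports issues at or above the `--fail-on` threshold, which is what a quality gate needs but is not confirmed here. The helper name `slop_gate` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a local gate around strict mode.
# Assumption: slopwatch exits non-zero when it finds issues at or above
# the --fail-on threshold. slop_gate is a hypothetical helper name.

slop_gate() {
  if slopwatch analyze --fail-on warning; then
    echo "slopwatch: clean"
  else
    local status=$?   # exit status of the slopwatch command above
    echo "slopwatch: new issues found; ask for a real fix before accepting the change" >&2
    return "$status"
  fi
}
```

Calling `slop_gate && git commit` would then block a commit whenever the analysis fails, under the exit-code assumption stated above.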
When slopwatch flags an issue, do not ignore it. Instead:
# Example: LLM disabled a test
❌ SW001 [Error]: Disabled test detected
File: tests/MyApp.Tests/OrderTests.cs:45
Pattern: [Fact(Skip="Test is flaky")]
# Correct response: Ask for actual fix
"This test was disabled instead of fixed. Please investigate why
it's flaky and fix the underlying timing/race condition issue."
Only update the baseline when slop is truly justified and documented:
# Add current detections to baseline (use sparingly!)
slopwatch analyze --update-baseline
Justification examples:
When you update the baseline, document the reason in a code comment.
Add slopwatch as a hook to automatically validate every edit. Create or update .claude/settings.json:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "slopwatch analyze -d . --hook",
            "timeout": 60000
          }
        ]
      }
    ]
  }
}
The --hook flag:
Add slopwatch to your CI pipeline as a quality gate:
In GitHub Actions:

jobs:
  slopwatch:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '9.0.x'
      - name: Install Slopwatch
        run: dotnet tool install --global Slopwatch.Cmd
      - name: Run Slopwatch
        run: slopwatch analyze -d . --fail-on warning

In Azure Pipelines:

- task: DotNetCoreCLI@2
  displayName: 'Install Slopwatch'
  inputs:
    command: 'custom'
    custom: 'tool'
    arguments: 'install --global Slopwatch.Cmd'
- script: slopwatch analyze -d . --fail-on warning
  displayName: 'Slopwatch Analysis'
| Rule | Severity | What It Catches |
|------|----------|-----------------|
| SW001 | Error | Disabled tests (Skip=, Ignore, #if false) |
| SW002 | Warning | Warning suppression (#pragma warning disable, SuppressMessage) |
| SW003 | Error | Empty catch blocks that swallow exceptions |
| SW004 | Warning | Arbitrary delays in tests (Task.Delay, Thread.Sleep) |
| SW005 | Warning | Project file slop (NoWarn, TreatWarningsAsErrors=false) |
| SW006 | Warning | CPM bypass (VersionOverride, inline Version attributes) |
Create .slopwatch/slopwatch.json to customize:
{
  "minSeverity": "warning",
  "rules": {
    "SW001": { "enabled": true, "severity": "error" },
    "SW002": { "enabled": true, "severity": "warning" },
    "SW003": { "enabled": true, "severity": "error" },
    "SW004": { "enabled": true, "severity": "warning" },
    "SW005": { "enabled": true, "severity": "warning" },
    "SW006": { "enabled": true, "severity": "warning" }
  },
  "exclude": [
    "**/Generated/**",
    "**/obj/**",
    "**/bin/**"
  ]
}
For maximum protection during LLM coding sessions, elevate all rules to errors:
{
  "minSeverity": "warning",
  "rules": {
    "SW001": { "enabled": true, "severity": "error" },
    "SW002": { "enabled": true, "severity": "error" },
    "SW003": { "enabled": true, "severity": "error" },
    "SW004": { "enabled": true, "severity": "error" },
    "SW005": { "enabled": true, "severity": "error" },
    "SW006": { "enabled": true, "severity": "error" }
  }
}
The goal is to prevent the gradual accumulation of technical debt that occurs when LLMs optimize for "make the test pass" rather than "fix the actual problem."
# First time setup
slopwatch init
git add .slopwatch/baseline.json
# After every LLM code change
slopwatch analyze
# Strict mode (recommended)
slopwatch analyze --fail-on warning
# With stats (performance debugging)
slopwatch analyze --stats
# Update baseline (rare, document why)
slopwatch analyze --update-baseline
# JSON output for tooling
slopwatch analyze --output json
The only valid reasons to update baseline or disable a rule:
| Scenario | Action | Required |
|----------|--------|----------|
| Third-party forces pattern | Update baseline | Code comment explaining why |
| Generated code (not editable) | Add to exclude list | Document in config |
| Intentional rate limiting delay | Update baseline | Code comment, not in test |
| Legacy code cleanup | One-time baseline update | PR description |
Invalid reasons: