skills/stop-slop/SKILL.md
Dual-mode slop detection for code and content. ADLC v2 spec.
npx skillsauth add bigeasyfreeman/adlc skills/stop-slop
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Dual-mode slop guidance for generated-output surfaces. Mode 1 names deterministic code checks that may run when project config enables them. Mode 2 defines rubric-based content and product-output evaluation. ADLC core gates the Build Brief contract with
bin/adlc slop-gate; project-specific linters or content checkers can plug in behind that contract.
Stop Slop is an output-side quality gate, not a prompt-improvement checklist. A better prompt can still produce bad output; ADLC only trusts output after it has been scored against a benchmark.
For any generated-output surface, the benchmark is:
{
"slop_quality_gate": {
"applicability": "required | not_applicable",
"reason": "string",
"mode": "code | content | product_output | agent_output | mixed",
"threshold": 0.7,
"metrics": ["rubric_score | exact_match | schema_validity | semantic_similarity | test_strength"],
"eval_cases": [
{
"id": "SLOP-001",
"source": "golden | human_edit | council_rejection | runtime_failure | production_sample | incident | support_ticket | analytics_drop | other",
"input": "string",
"expected_quality": "string",
"golden_output": "optional string",
"rubric": "optional string",
"metric": "string",
"threshold": 0.7
}
],
"baseline_score": 0.82,
"regression_tolerance": 0.03,
"failure_action": "block | revise | human_approval | monitor",
"case_promotion_sources": ["human_edit", "council_rejection", "production_sample"]
}
}
Run the loop in three places:
Default thresholds:
- test-strength (coverage >= 0.8, mutation_kill_rate >= 0.6, no material surviving mutants).
- score >= 0.70 unless the Build Brief sets a stricter threshold.
- regression_tolerance blocks unless failure_action = human_approval and the human approval is captured.
Every failure should produce a candidate eval case before it becomes a new style rule. Candidate sources include human edits, council rejections, runtime failures, production samples, incidents, support tickets, and analytics drops tied to generated-output quality.
See docs/specs/slop-eval-loop.md.
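The default thresholds can be sketched as one decision function. This is a minimal illustration, not the `bin/adlc slop-gate` implementation: the function name `gate_decision`, its return strings, and the `human_approved` flag are assumptions, while the field names come from the `slop_quality_gate` benchmark.

```python
def gate_decision(score, threshold=0.70, baseline_score=None,
                  regression_tolerance=0.03, failure_action="block",
                  human_approved=False):
    """Return "pass" or a failure action for one scored output (hypothetical helper)."""
    # Absolute floor: the score must reach the threshold.
    if score < threshold:
        return failure_action
    # Regression check: a drop beyond the tolerance blocks unless
    # failure_action is human_approval and the approval was captured.
    if baseline_score is not None:
        drop = baseline_score - score
        if drop > regression_tolerance:
            if failure_action == "human_approval" and human_approved:
                return "pass"
            return "block"
    return "pass"
```

With the benchmark's example values (`baseline_score` 0.82, `regression_tolerance` 0.03), a score of 0.75 clears the 0.70 floor but still blocks, because the drop of about 0.07 exceeds the tolerance.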
Detects structural code problems that indicate incomplete, lazy, or AI-generated code. This mode runs only when project configuration or the task's slop_quality_gate opts into code-output checks.
Regex rules are path-aware. A match is a hard failure only in shipped executable source where the pattern leaves incomplete behavior. Tests, fixtures, examples, and docs may use TODO/FIXME or placeholders when the surrounding verifier proves they are intentional.
pass\s*(#.*)?$ # bare pass statements
TODO\b # TODO comments in shipped executable source only
FIXME\b # FIXME comments in shipped executable source only
raise NotImplementedError
\.\.\.\s*$ # ellipsis as function body
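Path-aware matching could look like the sketch below. The regexes are copied from the rules above; the `NON_SHIPPED_PARTS` set, the `scan_file` name, and the `"review"` severity for exempt paths are assumptions made for illustration, since the source leaves the verifier for tests, fixtures, examples, and docs to the project.

```python
import re
from pathlib import PurePosixPath

# Patterns copied from the regex rules above.
STUB_PATTERNS = [
    re.compile(r"pass\s*(#.*)?$"),
    re.compile(r"TODO\b"),
    re.compile(r"FIXME\b"),
    re.compile(r"raise NotImplementedError"),
    re.compile(r"\.\.\.\s*$"),
]

# Hypothetical directory names treated as non-shipped paths.
NON_SHIPPED_PARTS = {"tests", "fixtures", "examples", "docs"}


def scan_file(path, text):
    """Return (line_number, severity, stripped_line) for each matching line."""
    parts = set(PurePosixPath(path).parts)
    # Matches are hard failures only in shipped source; elsewhere they are
    # handed to a reviewer or verifier instead of failing outright.
    severity = "review" if parts & NON_SHIPPED_PARTS else "hard_failure"
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in STUB_PATTERNS):
            findings.append((number, severity, line.strip()))
    return findings
```

Because the patterns are unanchored at the start, a line ending in an identifier such as `bypass` would also match the first rule; the AST rules are the more precise check for Python.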
AST rules (Python):
- pass or Expr(Constant(Ellipsis)) node
- pass node with no docstring
AST rules (TypeScript/JavaScript):
- {}
- new Error("not implemented")
Ceiling: 50 SLOC per function (default). Configurable via .stop-slop.yml:
code_slop:
max_function_sloc: 50
Measurement: count non-blank, non-comment, non-decorator lines inside the function body. Exceeding the ceiling triggers a warning. Exceeding 2x the ceiling is a hard failure.
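The measurement rule can be approximated with the standard `ast` module. This is a sketch under stated assumptions: `function_sloc` and `classify_sloc` are hypothetical names, comments are detected by a leading `#`, and a line inside a multi-line string that starts with `#` or `@` would be miscounted.

```python
import ast


def function_sloc(source):
    """Map each function name to its count of body source lines."""
    lines = source.splitlines()
    counts = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Slice from the first body statement to the function's last line,
            # which skips the signature and the function's own decorators.
            body_lines = lines[node.body[0].lineno - 1:node.end_lineno]
            counts[node.name] = sum(
                1 for line in body_lines
                # Skip blank lines, comments, and nested decorator lines.
                if line.strip() and not line.strip().startswith(("#", "@"))
            )
    return counts


def classify_sloc(sloc, ceiling=50):
    """Warning above the ceiling, hard failure above twice the ceiling."""
    if sloc > 2 * ceiling:
        return "hard_failure"
    if sloc > ceiling:
        return "warning"
    return "ok"
```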
Flag any two code blocks (3+ lines) with >80% token similarity within the same file or across files in the same commit. Uses normalized token comparison (strip whitespace, normalize variable names to positional placeholders).
Threshold: 80% similarity = warning. 95% similarity = hard failure (copy-paste detected).
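One way to realize normalized token comparison for Python blocks is shown below. The source does not name a similarity metric, so `difflib.SequenceMatcher.ratio()` is an assumption, as are the helper names and the choice to treat the 80% and 95% cut-offs as inclusive. Note that this sketch also replaces attribute and function names with placeholders, not only variables.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher

# Token types that carry layout or comments rather than code structure.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def normalized_tokens(source):
    """Tokenize a block, replacing identifiers with positional placeholders."""
    placeholders = {}
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            # First distinct identifier becomes v0, the second v1, and so on.
            name = placeholders.setdefault(tok.string, f"v{len(placeholders)}")
            tokens.append(name)
        else:
            tokens.append(tok.string)
    return tokens


def similarity(block_a, block_b):
    """Similarity in [0, 1] between two normalized token sequences."""
    matcher = SequenceMatcher(None, normalized_tokens(block_a),
                              normalized_tokens(block_b), autojunk=False)
    return matcher.ratio()


def classify_similarity(ratio, warn=0.80, fail=0.95):
    if ratio >= fail:
        return "hard_failure"
    if ratio >= warn:
        return "warning"
    return "ok"
```

Two loops that differ only in variable names normalize to the same token sequence, so they score 1.0 and land in the copy-paste band.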
Regex/AST detection for comprehensions that return the loop variable unchanged:
# Flagged:
[x for x in items]
[item for item in collection]
{k: v for k, v in d.items()}
# Not flagged (these transform):
[x.name for x in items]
[x for x in items if x.active]
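The AST side of this check could be sketched as follows. The function names are hypothetical, and the sketch covers only list and dict comprehensions with a single `for` clause and no filter, which matches the flagged and unflagged examples above.

```python
import ast


def _names(node):
    """Names in `x` or `k, v`; None for any other expression."""
    if isinstance(node, ast.Name):
        return [node.id]
    if isinstance(node, ast.Tuple) and all(isinstance(e, ast.Name) for e in node.elts):
        return [e.id for e in node.elts]
    return None


def is_identity_comprehension(node):
    """True when the comprehension yields its loop variable(s) unchanged."""
    if not isinstance(node, (ast.ListComp, ast.DictComp)):
        return False
    if len(node.generators) != 1 or node.generators[0].ifs:
        return False  # nested loops and filters are treated as transforms
    bound = _names(node.generators[0].target)
    if isinstance(node, ast.DictComp):
        key, value = _names(node.key), _names(node.value)
        produced = key + value if key and value else None
    else:
        produced = _names(node.elt)
    return bound is not None and produced == bound


def find_identity_comprehensions(source):
    """Sorted line numbers of flagged comprehensions in the source."""
    return sorted(node.lineno for node in ast.walk(ast.parse(source))
                  if is_identity_comprehension(node))
```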
Detect patterns where the comparison is redundant:
# Flagged:
if x == True:
if x == False:
if x == None:
if len(items) > 0:
if len(items) != 0:
if bool(x):
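# Fix (same order as the flagged lines):
if x:            # replaces: if x == True:
if not x:        # replaces: if x == False:
if x is None:    # replaces: if x == None:
if items:        # replaces: if len(items) > 0:
if items:        # replaces: if len(items) != 0:
if x:            # replaces: if bool(x):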
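A detector for these conditions can be sketched with `ast`. The helper names are assumptions, and the sketch inspects only `if` and `while` tests, so redundant comparisons in other positions are out of its scope.

```python
import ast


def is_redundant_test(test):
    """True for the flagged condition shapes listed above."""
    if isinstance(test, ast.Compare) and len(test.ops) == 1:
        op, right = test.ops[0], test.comparators[0]
        if isinstance(right, ast.Constant):
            # `x == True`, `x == False`, `x == None`
            if isinstance(op, ast.Eq) and (
                    right.value is True or right.value is False or right.value is None):
                return True
            # `len(x) > 0`, `len(x) != 0`
            left = test.left
            is_len = (isinstance(left, ast.Call)
                      and isinstance(left.func, ast.Name)
                      and left.func.id == "len")
            if (is_len and isinstance(op, (ast.Gt, ast.NotEq))
                    and type(right.value) is int and right.value == 0):
                return True
    # `bool(x)` used directly as the condition
    return (isinstance(test, ast.Call)
            and isinstance(test.func, ast.Name)
            and test.func.id == "bool")


def find_redundant_tests(source):
    """Sorted line numbers of `if`/`while` statements with a redundant test."""
    return sorted(node.lineno for node in ast.walk(ast.parse(source))
                  if isinstance(node, (ast.If, ast.While))
                  and is_redundant_test(node.test))
```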
Regex patterns for values that belong in configuration:
https?://[^\s"']+ # URLs (except in config/, .env, tests/)
:\d{4,5}[/"'\s] # port numbers
timeout\s*=\s*\d+ # hardcoded timeouts
sleep\(\s*\d+ # hardcoded sleep values
Exemptions: files matching **/config/**, **/.env*, **/test*/**, **/fixture*/**, **/mock*/**.
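The patterns and exemptions combine roughly as in this sketch. The regexes and globs are taken from the text above; the labels, the `find_hardcoded` name, and the use of `fnmatch` are assumptions. `fnmatch` lets `*` cross `/`, so it only approximates `**` glob semantics; prefixing the path with `/` lets `**/config/**` also match a top-level `config/` directory.

```python
import re
from fnmatch import fnmatchcase

HARDCODED_PATTERNS = {
    "url": re.compile(r"https?://[^\s\"']+"),
    "port": re.compile(r":\d{4,5}[/\"'\s]"),
    "timeout": re.compile(r"timeout\s*=\s*\d+"),
    "sleep": re.compile(r"sleep\(\s*\d+"),
}

EXEMPT_GLOBS = ["**/config/**", "**/.env*", "**/test*/**",
                "**/fixture*/**", "**/mock*/**"]


def is_exempt(path):
    """Approximate glob match against the exemption list."""
    return any(fnmatchcase("/" + path, glob) for glob in EXEMPT_GLOBS)


def find_hardcoded(path, text):
    """Return (line_number, label) pairs, or [] for exempt paths."""
    if is_exempt(path):
        return []
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in HARDCODED_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings
```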
Detect functions/classes that are defined but never imported or called elsewhere in the project. Applies to:
For any new function added in a commit, check that a corresponding test file or test function exists. Naming conventions checked:
- test_<function_name> in tests/ or test_*.py
- <function_name>.test.ts or <function_name>.spec.ts
- describe("<FunctionName>" block in test files
Missing test coverage for new functions = warning. Missing test file entirely = hard failure.
Hard failures block the commit/delivery. Warnings accumulate:
Detects AI writing patterns in prose, product output, and agent output when the
Build Brief marks a generated-output surface active. It does not run on every
.md or .txt file by default.
slop-judge to catch residual generic filler, passive evasion, and tautology.
{
"mode": "general | outreach | product_output | agent_output",
"regex_screen": "pass",
"content": "string",
"audience": "internal | external",
"rubric": ["specific criterion"],
"threshold": 0.7,
"baseline_score": 0.82,
"regression_tolerance": 0.03
}
{
"verdict": "pass | revise",
"score": 0.0,
"threshold": 0.7,
"criterion_scores": [
{
"criterion": "specificity",
"score": 0.0,
"reason": "string"
}
],
"regression_delta": 0.0,
"rationale": "string",
"signals": ["generic_filler", "passive_evasion", "tautology"],
"new_eval_case_candidate": {
"source": "council_rejection | human_edit | runtime_failure | production_sample | other",
"input": "optional string",
"bad_output": "string",
"expected_quality": "string",
"metric": "rubric_score",
"threshold": 0.7
}
}
These dimensions guide scoring. They are not blanket bans in ADLC core.
1. Active voice. Human subjects. Find the actor. Make them the subject. "The team shipped it" not "It was shipped." "You read the data and concluded" not "The data tells us."
2. Reduce filler adverbs. Flag repeated filler adverbs when they weaken specificity. Do not fail a piece because one legitimate adverb appears.
3. No throat-clearing openers. State the point. Not "Here's the thing:" or "Let me be clear:" or "The uncomfortable truth is." Start with the content.
4. Avoid empty binary contrasts. Flag "Not X. But Y." only when the contrast adds no information.
5. No rhetorical setups ("Here's what I mean:"). "Here's what I mean:" becomes the meaning itself. "Think about it:" becomes trust the reader. "What if I told you" becomes just tell them.
6. No vague declaratives. "The implications are significant" says nothing. Name the implication. "The stakes are high" says nothing. Name the stake.
7. No dramatic fragmentation (staccato lists). "Speed. Quality. Cost." becomes "Speed, quality, cost." Complete sentences. No staccato drama.
8. Specificity over abstraction. Put the reader in the room. Concrete scenes over narrator-from-a-distance. "You sit down on Monday and realize your pipeline is empty" beats "Teams often struggle with pipeline consistency."
Repeated filler adverbs can lower the rubric score when they replace concrete detail.
| Pattern | Fix |
|---------|-----|
| "Not X. But Y." | Say Y. |
| "X isn't the problem. Y is." | "Y is the problem." |
| "The answer isn't X. It's Y." | "Y." |
| "Not just X but also Y" | "X and Y" or just "Y" |
| "stops being X and starts being Y" | "becomes Y" |
| Pattern | Fix |
|---------|-----|
| "the data tells us" | "we read the data and concluded" |
| "the culture shifts" | "people changed how they work" |
| "the market rewards" | "buyers pay for" |
| "the decision emerges" | "X decided" |
| "the conversation moves toward" | "they steered toward" |
| Dimension | What it measures | Signs of failure |
|-----------|-----------------|-----------------|
| Directness | How much filtering/softening exists | Qualifiers, hedges, throat-clearing before the point |
| Rhythm | Sentence length variation | Three equal-length sentences in a row; staccato fragmentation |
| Trust | Authenticity; no false sincerity | "I promise," performative emphasis, manufactured drama |
| Authenticity | Specificity and concrete detail | Abstract claims, narrator-from-a-distance, vague declaratives |
| Density | Compression; no filler | Phrases that add words without adding meaning |
Score bands:
Thresholds:
Below threshold = revise internally. At or above threshold = proceed to human review.
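The threshold rule can be sketched as a sum over the five dimensions. The 35 and 38 thresholds come from the `.stop-slop.yml` example and the outreach note (38/50); scoring each dimension out of 10 is an assumption inferred from the 50-point total, and the function and verdict names are hypothetical.

```python
DIMENSIONS = ("directness", "rhythm", "trust", "authenticity", "density")

# Thresholds out of 50, as in the .stop-slop.yml example.
THRESHOLDS = {"general": 35, "outreach": 38}


def content_verdict(scores, mode="general"):
    """Sum per-dimension scores (assumed 0-10 each) and apply the mode threshold."""
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    total = sum(scores[name] for name in DIMENSIONS)
    # Below threshold: revise internally. At or above: proceed to human review.
    verdict = "proceed_to_human_review" if total >= THRESHOLDS[mode] else "revise"
    return total, verdict
```

A draft that scores 7 on every dimension totals 35, which clears the general threshold but falls short of the outreach one.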
Before: "Here's the thing: building products is hard. Not because the technology is complex. Because people are complex. Let that sink in." After: "Building products is hard. Technology is manageable. People aren't."
Before: "In today's fast-paced landscape, we need to lean into discomfort and navigate uncertainty with clarity. This matters because your competition isn't waiting." After: "Move faster. Your competition is."
Before: "Speed. Quality. Cost. You can only pick two. That's it. That's the tradeoff." After: "Speed, quality, cost — pick two."
Before: "What if I told you that the best teams don't optimize for productivity? Here's what I mean: they optimize for learning. Think about it." After: "The best teams optimize for learning, not productivity."
Before: "It turns out that most teams struggle with alignment. The uncomfortable truth is that nobody wants to admit they're confused. And that's okay." After: "Teams struggle with alignment. Nobody admits confusion."
After stop-slop scoring, a project may optionally verify against a local brand
foundation, for example charters/<project>-brand-foundation.md:
Stop-slop clears AI-output quality patterns. Brand checks are project-local overlays, not ADLC core requirements.
Optional .stop-slop.yml in project root:
code_slop:
max_function_sloc: 50
duplicate_threshold: 0.80
exclude:
- "**/migrations/**"
- "**/generated/**"
- "**/vendor/**"
content_slop:
threshold_general: 35
threshold_outreach: 38
exclude:
- "**/CHANGELOG.md"
- "**/LICENSE*"
# Build Brief contract check, implemented in this repo
bin/adlc slop-gate --build-brief ./build-brief.json --json
# Optional project-provided Mode 1: Code slop check
stop-slop code --input ./src/ --output ./code-report.json
# Mode 2: Content slop score
stop-slop content --input ./draft.md --output ./score.json
# Content slop score and revise (up to 2 passes)
stop-slop content --input ./draft.md --revise --passes 2 --output ./revised.md
# Outreach mode (higher threshold: 38/50)
stop-slop content --input ./outreach.md --mode outreach
# Optional project-provided full check
your-stop-slop-runner all --commit HEAD