plugins/gauntlet/skills/challenge/SKILL.md
Presents adaptive codebase challenge questions with multiple-choice and trace exercises. Use when testing contributor knowledge of the codebase.
Present challenges from the knowledge base and evaluate answers.
Before generating a challenge, register the in-loop variation provider so that the skill does not call out to the Anthropic API just to spawn a sibling Claude (issue #464). Outside Claude Code the registration is a no-op and the default Anthropic provider remains active.
    from gauntlet.providers.in_loop import (
        register_in_loop_provider_if_inside_claude_code,
    )

    register_in_loop_provider_if_inside_claude_code()
1. Load state: read .gauntlet/knowledge.json and developer progress
2. Check for pending challenge: if .gauntlet/state/pending_challenge.json exists, evaluate the developer's most recent message as an answer before generating a new one
3. Generate challenge: use adaptive weighting to select a knowledge entry and challenge type
4. Present challenge: show the question with context
5. Evaluate answer: score the response (pass/partial/fail)
6. Record result: update developer progress and streak

- On pass: write pass token if from pre-commit gate. Show next challenge if in session.
- On fail: show correct answer with explanation. Present a new challenge.
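The pending-challenge check in the workflow can be sketched as below. This is only an illustration, not the skill's actual implementation: the helper names `load_pending_challenge` and `next_action` are hypothetical, and the JSON layout of the pending file is not specified here, so the sketch treats it as an opaque dictionary.

```python
import json
from pathlib import Path
from typing import Optional

# Path taken from the workflow; the schema of the file is an assumption
# (treated here as an arbitrary JSON object).
PENDING_PATH = Path(".gauntlet/state/pending_challenge.json")


def load_pending_challenge(path: Path = PENDING_PATH) -> Optional[dict]:
    """Return the pending challenge as a dict, or None if no file exists."""
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def next_action(path: Path = PENDING_PATH) -> str:
    """Hypothetical helper: evaluate a pending answer first, else generate."""
    return "evaluate" if path.is_file() else "generate"
```

Checking for the file before generating anything keeps an unanswered challenge from being silently replaced by a new one.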
| Result | Score | Streak |
|--------|-------|--------|
| Pass | 1.0 | +1 |
| Partial | 0.5 | reset |
| Fail | 0.0 | reset |
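As a rough sketch of how the scoring table might translate into code, the example below maps each result to its score and updates a streak counter. The `Progress` class and `record_result` function are hypothetical names introduced for illustration; the skill's real progress format is not described here.

```python
from dataclasses import dataclass

# Scores copied from the table; result names are lowercased by assumption.
SCORES = {"pass": 1.0, "partial": 0.5, "fail": 0.0}


@dataclass
class Progress:
    """Hypothetical in-memory view of a developer's progress."""
    streak: int = 0
    total_score: float = 0.0
    attempts: int = 0


def record_result(progress: Progress, result: str) -> float:
    """Apply one result to the progress record and return its score."""
    if result not in SCORES:
        raise ValueError(f"unknown result: {result!r}")
    score = SCORES[result]
    progress.total_score += score
    progress.attempts += 1
    # Only a full pass extends the streak; partial and fail both reset it.
    progress.streak = progress.streak + 1 if result == "pass" else 0
    return score
```

Note that a partial answer still earns 0.5 toward the score while resetting the streak, which matches the table.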