claude/skills/self-eval/SKILL.md
Run the Self-Evaluation Checklist against your last response or recent changes. Use this skill when the user says "self-eval", "self-evaluate", "check my response", "quality check", "evaluate response", or when you want to proactively verify response quality before finishing. Also trigger when the user expresses doubt about your output ("did you actually test that?", "are you sure?", "that seems incomplete"). This skill cross-references with /de-slop and /tdd-assertions for items that have dedicated tooling.
Install this skill globally with one command: `npx skillsauth add paulnsorensen/dotfiles self-eval`. Works with Claude Code, Cursor, and Windsurf.
Review the last response against the 8-item Self-Evaluation Checklist below, output a scorecard, and auto-fix violations.
`<certain>`, `<speculative>`, or `<don't know>` (wrapped in backticks so the tag renders as literal text).
/de-slop on changed files.
/tdd-assertions on test code.
Determine what to evaluate:
Evaluate all 8 checklist items. Output a compact scorecard:
## Self-Evaluation
| # | Check | Result | Notes |
|---|-------------------|--------|-------|
| 1 | Sycophancy | PASS | |
| 2 | Premature complete | FAIL | Left TODO on line 42 |
| 3 | Dismissing failures | PASS | |
| 4 | Hedging | PASS | |
| 5 | Scope reduction | WARN | Dropped retry logic, acknowledged |
| 6 | False confidence | PASS | |
| 7 | AI slop | DEFER | Running /de-slop |
| 8 | Weak assertions | DEFER | Running /tdd-assertions |
Use PASS, FAIL, WARN (acknowledged deviation), or DEFER (delegating to specialized skill).
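As a rough illustration of the scorecard format above, the four result values and the table layout could be modeled as follows. This is only a sketch: the names `Result`, `Check`, and `render_scorecard` are hypothetical and are not part of the skill itself.

```python
from dataclasses import dataclass
from enum import Enum


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    WARN = "WARN"    # acknowledged deviation
    DEFER = "DEFER"  # delegated to a specialized skill


@dataclass
class Check:
    # Hypothetical container for one scorecard row.
    number: int
    name: str
    result: Result
    notes: str = ""


def render_scorecard(checks: list[Check]) -> str:
    """Render the checks as a markdown table under the scorecard heading."""
    lines = [
        "## Self-Evaluation",
        "| # | Check | Result | Notes |",
        "|---|-------|--------|-------|",
    ]
    for c in checks:
        lines.append(f"| {c.number} | {c.name} | {c.result.value} | {c.notes} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_scorecard([
        Check(1, "Sycophancy", Result.PASS),
        Check(2, "Premature complete", Result.FAIL, "Left TODO on line 42"),
        Check(5, "Scope reduction", Result.WARN, "Dropped retry logic, acknowledged"),
    ]))
```

Restricting results to an enum keeps a row from carrying any status other than the four listed.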
/de-slop on the changed files. Mark DEFER until results return, then update to PASS/FAIL.
/tdd-assertions on test files. Mark DEFER until results return, then update to PASS/FAIL.
/diff for smoke testing.
Only invoke these if the item is relevant: no code changes means items 7-8 are automatic PASS.
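The status rule for the delegated items (7 and 8) can be sketched as a small function. The function name and its parameters are assumptions made for illustration; the skill describes this behavior in prose only.

```python
from typing import Optional


def resolve_delegated(has_code_changes: bool, skill_passed: Optional[bool]) -> str:
    """Return the scorecard status for a delegated item (7 or 8).

    skill_passed is None while the delegated skill has not returned yet.
    """
    if not has_code_changes:
        return "PASS"   # item is not relevant, so it passes automatically
    if skill_passed is None:
        return "DEFER"  # waiting on the delegated skill's results
    return "PASS" if skill_passed else "FAIL"


if __name__ == "__main__":
    print(resolve_delegated(has_code_changes=True, skill_passed=None))
```

A DEFER is therefore always temporary: once the delegated skill reports back, the row becomes PASS or FAIL.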
For each FAIL:
After fixes, output the updated scorecard with a one-line summary: