lab/autoresearch/SKILL.md
Self-improving loop for plugin skills. Reads program.md, proposes one mutation per iteration, evaluates it against a deterministic scorer, keeps improvements via git, and reverts failures. Targets the weakest skill+dimension. Use with /loop for overnight runs.
npx skillsauth add oliver-kriska/claude-elixir-phoenix lab:autoresearch
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Iteratively improve plugin skills via the autoresearch pattern: propose one mutation -> eval -> keep/revert -> repeat.
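The pattern above can be sketched as a small loop. This is an illustration only, not the skill's implementation: `improve`, `score`, and `propose` are hypothetical names, and treating "improvement" as a strictly higher score is an assumption.

```python
from typing import Callable


def improve(
    score: Callable[[str], float],
    propose: Callable[[str], str],
    text: str,
    max_iterations: int = 10,
) -> tuple[str, list[str]]:
    """Run propose -> eval -> keep/revert for a fixed number of iterations.

    Hypothetical sketch: the real skill delegates scoring, git, and
    journaling to run-iteration.py instead of doing them in-process.
    """
    best = score(text)  # baseline before any mutation
    journal: list[str] = []
    for _ in range(max_iterations):
        candidate = propose(text)  # exactly one mutation per iteration
        new = score(candidate)     # scorer is assumed deterministic
        if new > best:             # assumption: strictly better means KEEP
            text, best = candidate, new
            journal.append("KEEP")
        else:                      # otherwise discard the candidate
            journal.append("REVERT")
    return text, journal
```

With a toy scorer such as `lambda t: min(len(t), 5)` and a mutation that appends one character, the loop keeps candidates until the score stops rising and reverts every attempt after that.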
/lab:autoresearch # Targeted: attack weakest skill+dimension
/lab:autoresearch --skill review # Focus on one skill
/lab:autoresearch --strategy sweep # Process all skills alphabetically
/lab:autoresearch --dry-run # Show what would change, don't commit
For overnight runs:
/loop 5m /lab:autoresearch --strategy sweep --max-iterations 200
keep or revert command (never skip)
All eval/git/journal operations go through ONE script. Do NOT run these manually.
# Find the weakest skill+dimension
python3 lab/autoresearch/scripts/run-iteration.py target --strategy targeted
# Score a skill (before mutation, to get baseline)
python3 lab/autoresearch/scripts/run-iteration.py score <skill-name>
# After mutation: score + checks + compare → verdict (KEEP or REVERT)
python3 lab/autoresearch/scripts/run-iteration.py eval <skill-name>
# Act on verdict:
python3 lab/autoresearch/scripts/run-iteration.py keep <skill> <dim> <old> <new> \
--desc "what changed" --asi '{"hypothesis": "why", "mechanism": "how"}'
python3 lab/autoresearch/scripts/run-iteration.py revert <skill> <dim> <old> <new> \
--desc "what was attempted" --asi '{"hypothesis": "why", "regression": "what broke", "avoid": "do not retry this"}'
# Check overall progress
python3 lab/autoresearch/scripts/run-iteration.py status
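When the keep and revert commands are assembled programmatically, the `--asi` JSON is easy to get wrong with hand-written shell quoting. A minimal sketch, assuming a hypothetical helper `build_verdict_command`; only the script path, subcommands, and flags come from the commands above, and `<old>` and `<new>` are passed through as whatever values the workflow supplies.

```python
import json

SCRIPT = "lab/autoresearch/scripts/run-iteration.py"


def build_verdict_command(action, skill, dim, old, new, desc, asi):
    """Build the argv list for a keep or revert call (hypothetical helper).

    Returning a list rather than a shell string means no shell quoting
    is needed for the JSON passed to --asi.
    """
    if action not in ("keep", "revert"):
        raise ValueError(f"action must be 'keep' or 'revert', got {action!r}")
    return [
        "python3", SCRIPT, action,
        skill, dim, str(old), str(new),
        "--desc", desc,
        "--asi", json.dumps(asi),  # serialize the dict into one argument
    ]
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`, which still routes every operation through the single script.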
lab/autoresearch/program.md (goals, mutable surface, rules)
lab/autoresearch/ideas.md if it exists (deferred optimizations)
python3 lab/autoresearch/scripts/run-iteration.py status
Run: python3 lab/autoresearch/scripts/run-iteration.py target --strategy targeted
Parse the JSON: skill, dimension, failing_checks. If all_perfect → STOP.
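The parsing step might look like the sketch below. The exact output shape of the `target` command is not shown here, so this assumes a flat JSON object with the keys named above and a truthy `all_perfect` flag; `parse_target` is a hypothetical name.

```python
import json


def parse_target(raw: str):
    """Return (skill, dimension, failing_checks), or None when the loop should stop.

    Assumes a flat JSON object; the real output of
    `run-iteration.py target` may be structured differently.
    """
    data = json.loads(raw)
    if data.get("all_perfect"):
        return None  # nothing left to improve: STOP
    return data["skill"], data["dimension"], data.get("failing_checks", [])
```

Returning `None` for the stop condition keeps the caller's check explicit: a target tuple means run an iteration, `None` means end the loop.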
lab/eval/evals/{skill}.json
ideas.md for deferred ideas about this skill
${CLAUDE_SKILL_DIR}/references/mutation-strategies.md
python3 lab/autoresearch/scripts/run-iteration.py eval <skill-name>
verdict field
If verdict is KEEP:
python3 lab/autoresearch/scripts/run-iteration.py keep <skill> <dim> <old> <new> \
--desc "..." --asi '{"hypothesis": "...", "mechanism": "..."}'
If verdict is REVERT:
python3 lab/autoresearch/scripts/run-iteration.py revert <skill> <dim> <old> <new> \
--desc "..." --asi '{"hypothesis": "...", "regression": "...", "avoid": "..."}'
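The verdict branch above can be made explicit so that an iteration never ends without a keep or revert call. A minimal sketch, assuming the `eval` command prints a JSON object with a `verdict` field holding `KEEP` or `REVERT`; `action_for` is a hypothetical name.

```python
import json


def action_for(eval_output: str) -> str:
    """Map the eval verdict to the subcommand that must run next.

    Assumes eval_output is a JSON object with a "verdict" key; anything
    other than KEEP or REVERT raises instead of being silently skipped.
    """
    verdict = json.loads(eval_output)["verdict"]
    actions = {"KEEP": "keep", "REVERT": "revert"}
    if verdict not in actions:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    return actions[verdict]
```

Raising on an unknown verdict is a deliberate choice here: it surfaces a scorer problem immediately rather than leaving a mutation neither kept nor reverted.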
If during analysis you discovered a promising optimization you can't act on now:
lab/autoresearch/ideas.md as a bullet
${CLAUDE_SKILL_DIR}/references/mutation-strategies.md: mutation type catalog
${CLAUDE_SKILL_DIR}/references/state-management.md: git protocol, journaling
lab/autoresearch/program.md: research agenda (read every iteration)