oliver-kriska/lab:autoresearch

Name: lab:autoresearch
Author: oliver-kriska

lab/autoresearch/SKILL.md

npx skillsauth add oliver-kriska/claude-elixir-phoenix lab:autoresearch

Autoresearch — Plugin Skill Self-Improvement

Iteratively improve plugin skills via the autoresearch pattern: propose one mutation -> eval -> keep/revert -> repeat.

Usage

/lab:autoresearch                           # Targeted: attack weakest skill+dimension
/lab:autoresearch --skill review            # Focus on one skill
/lab:autoresearch --strategy sweep          # Process all skills alphabetically
/lab:autoresearch --dry-run                 # Show what would change, don't commit

For overnight runs:

/loop 5m /lab:autoresearch --strategy sweep --max-iterations 200

Iron Laws

ONE mutation per iteration: if the description needs the word "and", split it into two mutations
NEVER mutate read-only files: check program.md before every write
EVAL is deterministic: always use the wrapper script, never an LLM judge
REVERT on regression OR checks failure: no exceptions
LOG every iteration: run the keep or revert command every time, never skip it
CHECK ideas.md before proposing: don't rediscover known optimizations

Wrapper Script Commands

All eval, git, and journal operations go through ONE script. Do NOT run eval, git, or journal commands manually.

# Find the weakest skill+dimension
python3 lab/autoresearch/scripts/run-iteration.py target --strategy targeted

# Score a skill (before mutation, to get baseline)
python3 lab/autoresearch/scripts/run-iteration.py score <skill-name>

# After mutation: score + checks + compare → verdict (KEEP or REVERT)
python3 lab/autoresearch/scripts/run-iteration.py eval <skill-name>

# Act on verdict:
python3 lab/autoresearch/scripts/run-iteration.py keep <skill> <dim> <old> <new> \
  --desc "what changed" --asi '{"hypothesis": "why", "mechanism": "how"}'

python3 lab/autoresearch/scripts/run-iteration.py revert <skill> <dim> <old> <new> \
  --desc "what was attempted" --asi '{"hypothesis": "why", "regression": "what broke", "avoid": "do not retry this"}'

# Check overall progress
python3 lab/autoresearch/scripts/run-iteration.py status

Core Loop (ONE iteration)

Step 1: Read State

Read lab/autoresearch/program.md (goals, mutable surface, rules)
Read lab/autoresearch/ideas.md if it exists (deferred optimizations)
Run: python3 lab/autoresearch/scripts/run-iteration.py status

Step 2: Select Target

Run: python3 lab/autoresearch/scripts/run-iteration.py target --strategy targeted

Parse the JSON output for skill, dimension, and failing_checks. If the output reports all_perfect, STOP.
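A minimal sketch of the Step 2 parsing. It assumes the target command prints a single JSON object to stdout and that all_perfect appears as a boolean key in that object. Neither detail is confirmed by this page, and parse_target is a hypothetical helper name.

```python
import json
from typing import List, Optional, Tuple


def parse_target(raw: str) -> Optional[Tuple[str, str, List[str]]]:
    """Return (skill, dimension, failing_checks), or None when nothing is left to improve."""
    data = json.loads(raw)
    # Assumption: all_perfect is a boolean key in the same JSON object.
    if data.get("all_perfect"):
        return None
    return data["skill"], data["dimension"], list(data.get("failing_checks", []))
```

A None result corresponds to the STOP condition; any other result names the skill and dimension to read in Step 3.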

Step 3: Read + Propose

Read target SKILL.md and its references/ listing
Read eval definition from lab/eval/evals/{skill}.json
Check ideas.md for deferred ideas about this skill
Check recent journal entries for prior failures on this skill (avoid repeats)
Consult ${CLAUDE_SKILL_DIR}/references/mutation-strategies.md
Propose exactly ONE change targeting the failing checks

Step 4: Apply + Evaluate

Apply the mutation via Edit tool
Run: python3 lab/autoresearch/scripts/run-iteration.py eval <skill-name>
Parse JSON → check verdict field

Step 5: Keep or Revert

If verdict is KEEP:

python3 lab/autoresearch/scripts/run-iteration.py keep <skill> <dim> <old> <new> \
  --desc "..." --asi '{"hypothesis": "...", "mechanism": "..."}'

If verdict is REVERT:

python3 lab/autoresearch/scripts/run-iteration.py revert <skill> <dim> <old> <new> \
  --desc "..." --asi '{"hypothesis": "...", "regression": "...", "avoid": "..."}'
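Steps 4 and 5 can be sketched as follows, assuming the eval output is a JSON object whose verdict field is the string KEEP or REVERT, and that the old and new arguments are the scores before and after the mutation. The helper names build_log_command and verdict_from_eval are hypothetical and not part of the wrapper script.

```python
import json
from typing import Dict, List

SCRIPT = "lab/autoresearch/scripts/run-iteration.py"


def verdict_from_eval(raw: str) -> str:
    """Extract the verdict field from the eval command's JSON output."""
    return json.loads(raw)["verdict"]


def build_log_command(verdict: str, skill: str, dim: str, old: float, new: float,
                      desc: str, asi: Dict[str, str]) -> List[str]:
    """Build the argv for the keep or revert subcommand that matches the verdict."""
    actions = {"KEEP": "keep", "REVERT": "revert"}
    if verdict not in actions:
        # Refuse to guess: an unknown verdict should never be logged as a keep.
        raise ValueError(f"unexpected verdict: {verdict!r}")
    return [
        "python3", SCRIPT, actions[verdict], skill, dim, str(old), str(new),
        "--desc", desc,
        # json.dumps produces valid JSON, so no hand-written quoting is needed.
        "--asi", json.dumps(asi),
    ]
```

Passing the returned list to subprocess.run(cmd, check=True) avoids shell quoting problems with the JSON payload.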

Step 6: Ideas Backlog

If during analysis you discovered a promising optimization you can't act on now:

Append it to lab/autoresearch/ideas.md as a bullet
On next resume: prune stale/tried ideas, experiment with the rest
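One way to append to the backlog without duplicating an idea, as a sketch only. The "- " bullet marker and the append_idea helper are assumptions; the page does not specify the format of ideas.md.

```python
from pathlib import Path


def append_idea(ideas_path: Path, idea: str) -> bool:
    """Append idea as a bullet unless an identical bullet exists. Returns True if written."""
    bullet = f"- {idea.strip()}"  # assumed bullet format
    existing = ideas_path.read_text(encoding="utf-8") if ideas_path.exists() else ""
    if bullet in existing.splitlines():
        return False
    # Make sure the new bullet starts on its own line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with ideas_path.open("a", encoding="utf-8") as fh:
        fh.write(f"{prefix}{bullet}\n")
    return True
```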

Step 7: Continue or Stop

All targets >= 0.95? Print "AUTORESEARCH_COMPLETE"
Max iterations reached? Print "AUTORESEARCH_COMPLETE"
50 consecutive discards? Print "AUTORESEARCH_STUCK"
Otherwise: immediately start Step 1 again
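The stop conditions above can be expressed as a single check. This is a sketch under assumptions: scores is taken to be a mapping from target name to its current score, and stop_signal is a hypothetical helper. The thresholds and printed markers come from the list above.

```python
from typing import Dict, Optional

TARGET_SCORE = 0.95
MAX_CONSECUTIVE_DISCARDS = 50


def stop_signal(scores: Dict[str, float], iteration: int,
                max_iterations: Optional[int], consecutive_discards: int) -> Optional[str]:
    """Return the marker to print, or None to start the next iteration."""
    if scores and all(s >= TARGET_SCORE for s in scores.values()):
        return "AUTORESEARCH_COMPLETE"
    if max_iterations is not None and iteration >= max_iterations:
        return "AUTORESEARCH_COMPLETE"
    if consecutive_discards >= MAX_CONSECUTIVE_DISCARDS:
        return "AUTORESEARCH_STUCK"
    return None
```

The checks run in the order listed, so a run that reaches the target on its last allowed iteration reports completion, not a stall.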

References

${CLAUDE_SKILL_DIR}/references/mutation-strategies.md — mutation type catalog
${CLAUDE_SKILL_DIR}/references/state-management.md — git protocol, journaling
lab/autoresearch/program.md — research agenda (read every iteration)

Self-improving loop for plugin skills. Reads program.md, proposes one mutation per iteration, evaluates against deterministic scorer, keeps improvements via git, reverts failures. Targets weakest skill+dimension. Use with /loop for overnight runs.

340 stars

tools

Updated May 28, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 28, 2026, 6:48 AM · 113.5s · 11 files scanned

SKILL.md

name: lab:autoresearch
description: >
effort: high
argument-hint: [--skill NAME] [--strategy targeted|sweep|random] [--dry-run] [--max-iterations N]
disable-model-invocation: true

Related Skills

oliver-kriska/assigns
tools
Verified, Trusted, Community
Compatibility alias for the Elixir/Phoenix plugin's LiveView assigns audit. Invoke explicitly with /lv:assigns.
505 · SKILL.md · Updated Jul 26, 2026

oliver-kriska/trace
development
Verified, Trusted, Community
Trace Elixir call trees from entry points via mix xref. Use when debugging data flow, planning signature changes, or understanding how a bug reaches code.
505 · SKILL.md · Updated Jul 26, 2026

oliver-kriska/n1-check
tools
Verified, Trusted, Community
Compatibility alias for the Elixir/Phoenix plugin's N+1 query checker. Invoke explicitly with /ecto:n1-check.
505 · SKILL.md · Updated Jul 26, 2026

oliver-kriska/constraint-debug
tools
Verified, Trusted, Community
Compatibility alias for the Elixir/Phoenix plugin's Ecto constraint debugger. Invoke explicitly with /ecto:constraint-debug.
505 · SKILL.md · Updated Jul 26, 2026

Install manually

# Clone the repo
git clone https://github.com/oliver-kriska/claude-elixir-phoenix.git

# Copy into the Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r claude-elixir-phoenix/lab/autoresearch ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository: oliver-kriska/claude-elixir-phoenix

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT