jeffvincent/autoresearcher-generator

Name: autoresearcher-generator
Author: jeffvincent

skills/autoresearcher-generator/SKILL.md

npx skillsauth add jeffvincent/claude-config autoresearcher-generator

autoresearcher-generator

Scaffolds a karpathy/autoresearch-style loop that iteratively improves a Claude skill's prompt by generating outputs, evaluating them against user-defined constraints, and having an "improver" subagent propose targeted edits to the skill prompt.

When to use

Use this skill when the user asks for a self-improving loop, an eval harness, or an autoresearcher for one of their skills. Typical triggers:

"Build an autoresearcher for my X skill"
"Make a self-improving loop for skill Y"
"I want X to get better automatically"

What you need from the user

Ask for exactly two things if not provided:

The skill to improve — path to its Skill.md (or SKILL.md).
3–5 constraints the generated outputs must satisfy. These are the eval rubric. Each constraint should be a single sentence describing a property the output must have.

If the user gives fewer than 3 constraints, ask for more. If more than 5, ask them to pick the top 5. Do NOT silently proceed with a bad input set — the constraint list is the soul of the loop.
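The count rule above can be sketched as a small gate that runs before anything is scaffolded. This is only an illustration: the function name `reviewConstraints` and the shape of its return value are hypothetical, and only the bounds of 3 and 5 come from the text.

```javascript
// Hypothetical helper: decide what to ask the user based on how many
// constraints they supplied. Only the 3 and 5 bounds come from the skill text.
const MIN_CONSTRAINTS = 3;
const MAX_CONSTRAINTS = 5;

function reviewConstraints(constraints) {
  // Drop blank entries so whitespace-only lines do not count.
  const cleaned = constraints.map((c) => c.trim()).filter((c) => c.length > 0);

  if (cleaned.length < MIN_CONSTRAINTS) {
    return { ok: false, action: "ask-for-more", missing: MIN_CONSTRAINTS - cleaned.length };
  }
  if (cleaned.length > MAX_CONSTRAINTS) {
    return { ok: false, action: "ask-to-pick-top", excess: cleaned.length - MAX_CONSTRAINTS };
  }
  return { ok: true, action: "proceed", constraints: cleaned };
}

console.log(reviewConstraints(["Output is valid JSON", "  "]).action); // prints "ask-for-more"
```

Returning an explicit action rather than throwing keeps the "never silently proceed" rule visible to whatever code calls the gate.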

After you have both, propose 1–2 additional constraints you think are worth adding (based on reading the skill). The user approves or rejects before you scaffold.

What you will build

A standalone project directory (default location: ~/autoresearch-<slug>/, but ask the user to confirm). The directory contains:

autoresearch-<slug>/
├── program.md          # loop driver instructions — rendered from template
├── Skill.md            # COPY of the target skill; mutable; edited by the loop
├── eval.mjs            # programmatic checks synthesized from user constraints
├── judge-prompt.md     # qualitative checks synthesized from user constraints
├── inputs/pool.tsv     # rotating set of inputs for the skill (user confirms)
├── results.tsv         # per-cycle log (gitignored)
├── scratch/            # generated outputs per cycle (gitignored)
├── .gitignore
└── README.md

It is its own git repo on a dedicated branch. The live skill at the original path is NEVER modified by the loop — only the copy in the project directory is.

Steps you follow

Step 1: Read the skill

Read the user's target Skill.md in full. You need to understand what the skill produces and what "good output" looks like so you can (a) suggest extra constraints and (b) classify each constraint as programmatic vs qualitative.

Step 2: Classify constraints

For each user constraint, decide:

Programmatic — can be checked by a deterministic script. Examples: "output is valid JSON", "contains section X", "line count ≤ N", "all shapes have bound text", "file is under 2MB".
Qualitative — needs an LLM judge. Examples: "covers the high-level concept", "tone matches voice", "flow is clear", "colors used meaningfully".

A constraint may require BOTH (e.g. "not too much text" → programmatic char count + judge for subjective overload). In that case, implement both and require both to pass.
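As a rough sketch of the "require both to pass" rule, a programmatic result and a judge verdict might be combined as follows. The threshold, the function names, and the `{ name, pass }` result shape are all assumptions made for illustration; the skill does not prescribe them.

```javascript
// Hypothetical sketch: a constraint that needs BOTH a deterministic check and
// an LLM judge verdict passes only when both halves pass.
const MAX_CHARS = 1200; // assumed threshold, for illustration only

// Programmatic half: a deterministic character-count check.
function checkTextLength(output) {
  return { name: "text-length", pass: output.length <= MAX_CHARS };
}

// Combine with a judge verdict obtained elsewhere (for example from a subagent).
// `judgeVerdict` is assumed to have the same { name, pass } shape.
function combineChecks(programmatic, judgeVerdict) {
  return {
    name: `${programmatic.name}+${judgeVerdict.name}`,
    pass: programmatic.pass && judgeVerdict.pass,
  };
}

const combined = combineChecks(
  checkTextLength("A short label"),
  { name: "not-overloaded", pass: true }
);
console.log(combined.pass); // prints true
```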

Step 3: Ask for inputs

The loop needs inputs to feed the skill. In the Excalidraw reference project, the inputs were notes to turn into diagrams. For other skills, they could be prompts, source files, data files, etc. Ask the user to point at a directory or hand-pick 3–5 inputs. Do NOT proceed without real inputs: synthetic ones lead to a skill that overfits.
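One way the rotating input pool could work is sketched below. The skill does not define the layout of inputs/pool.tsv or the rotation scheme, so both are assumptions here: one "id<TAB>path" row per input with no header, and each cycle picking up where the previous one stopped.

```javascript
// Assumed pool.tsv layout: one input per line, "id<TAB>path", no header row.
function parsePool(tsvText) {
  return tsvText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [id, path] = line.split("\t");
      return { id, path };
    });
}

// Assumed rotation: cycle k (zero-based) starts where cycle k-1 stopped and
// wraps around the pool, so every real input is exercised over time.
function pickInputs(pool, cycle, n) {
  if (pool.length === 0) throw new Error("input pool is empty");
  const picked = [];
  for (let i = 0; i < n; i++) {
    picked.push(pool[(cycle * n + i) % pool.length]);
  }
  return picked;
}

const pool = parsePool("a\tnotes/a.md\nb\tnotes/b.md\nc\tnotes/c.md\n");
console.log(pickInputs(pool, 1, 2).map((p) => p.id).join(",")); // prints "c,a"
```

Throwing on an empty pool mirrors the rule that the loop must not run without real inputs.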

Step 4: Confirm target directory and N (samples per cycle)

Default directory: ~/autoresearch-<slug>/ where slug is derived from the skill name.
Default N: 5 samples per cycle. Ask the user if they want to change it.
Default cycle cap: 30.
Default time cap per cycle: 5 minutes.
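The defaults above could be captured in a small config object that user answers override. The values 5, 30, and 5 minutes come from the text. The field names are hypothetical, and `slugify` is just one plausible derivation, since the skill does not say how the slug is computed.

```javascript
// Default values taken from the step above; the field names are hypothetical.
const LOOP_DEFAULTS = {
  samplesPerCycle: 5,
  cycleCap: 30,
  cycleTimeCapMinutes: 5,
};

// One plausible slug derivation (an assumption): lowercase, collapse runs of
// non-alphanumerics into "-", then trim leading and trailing dashes.
function slugify(skillName) {
  return skillName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function defaultProjectDir(skillName) {
  return `~/autoresearch-${slugify(skillName)}/`;
}

// Merge user overrides on top of the defaults.
function resolveConfig(skillName, overrides = {}) {
  return { dir: defaultProjectDir(skillName), ...LOOP_DEFAULTS, ...overrides };
}

console.log(resolveConfig("Excalidraw Creator", { samplesPerCycle: 3 }));
```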

Step 5: Scaffold

Create the directory and write files using the templates in templates/ as starting points. You synthesize eval.mjs and judge-prompt.md from the user's constraints — do not copy them verbatim from the Excalidraw example.
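To make "synthesize eval.mjs" concrete, here is a minimal sketch of three programmatic checks drawn from the examples in Step 2 (valid JSON, contains a section, line count within a limit). The function names and the `{ name, pass, detail }` result shape are assumptions; the real template's helpers may look different.

```javascript
// Hypothetical check functions; each returns { name, pass, detail }.
function checkValidJson(output) {
  try {
    JSON.parse(output);
    return { name: "valid-json", pass: true, detail: "" };
  } catch (err) {
    return { name: "valid-json", pass: false, detail: err.message };
  }
}

// Passes when some line, ignoring surrounding whitespace, equals the heading.
function checkContainsSection(output, heading) {
  const found = output.split("\n").some((line) => line.trim() === heading);
  return { name: `has-section:${heading}`, pass: found, detail: found ? "" : "heading not found" };
}

function checkMaxLines(output, maxLines) {
  const count = output.split("\n").length;
  return { name: `max-lines:${maxLines}`, pass: count <= maxLines, detail: `${count} lines` };
}

// Run every check against one output and report the pass count.
function runChecks(output, checks) {
  const results = checks.map((check) => check(output));
  return { results, passed: results.filter((r) => r.pass).length, total: results.length };
}

const report = runChecks("## Summary\nShort text", [
  (o) => checkContainsSection(o, "## Summary"),
  (o) => checkMaxLines(o, 10),
]);
console.log(`${report.passed}/${report.total}`); // prints "2/2"
```

Keeping each check a pure function of the output text makes the yardstick deterministic, which is what lets pass counts be compared across cycles.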

After writing files:

git init in the new directory.
git checkout -b autoresearch/<slug>-initial.
Stage everything and commit: baseline: scaffold for <skill-name>.
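The three git steps above can be expressed as argument lists, for example to pass to Node's `child_process.execFileSync("git", args, { cwd })`. This sketch only builds the lists and does not run git. The helper name is hypothetical, and `git add -A` is an assumed reading of "stage everything".

```javascript
// Hypothetical helper: build the git invocations for the baseline commit.
// Each entry is an argument list for the `git` executable; nothing is executed.
function baselineGitCommands(slug, skillName) {
  return [
    ["init"],
    ["checkout", "-b", `autoresearch/${slug}-initial`],
    ["add", "-A"], // assumption: "stage everything" means `git add -A`
    ["commit", "-m", `baseline: scaffold for ${skillName}`],
  ];
}

for (const args of baselineGitCommands("excalidraw", "Excalidraw Creator")) {
  console.log(["git", ...args].join(" "));
}
```

Passing arguments as an array avoids shell quoting problems when the skill name contains spaces.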

Step 6: Hand off

Print a short summary:

Path to the new project.
The list of checks that will be enforced (programmatic + judge); expect roughly 9, as in the Excalidraw reference project.
How to launch: open a fresh Claude Code session in that directory and say "Read program.md and run the autoresearch loop."
A reminder that the live skill is untouched; the user promotes the improved copy manually (diff, then cp) after the loop finishes.

Non-negotiables you must enforce

Never edit the live skill. Only the copy inside the scaffolded directory.
Never include eval.mjs, judge-prompt.md, or inputs/pool.tsv in the set of files the loop is allowed to edit. These are the fixed yardstick. Say so explicitly in program.md.
The loop must use subagents for generation and evaluation. Do not have the main loop agent generate outputs in its own context — that burns context and the loop stops after 2–3 cycles. Parallel Task subagents are load-bearing.
30-cycle hard cap. Even if results aren't perfect.
Revert on regression. If a cycle's edit makes the pass count worse on the same input, run git reset --hard HEAD~1 to discard that cycle's commit.
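The cap and the revert rule can be sketched as two small decision functions. The 30-cycle cap comes from the list above. Treating three consecutive perfect cycles as "consistently perfect" (the early-stop condition named in the skill description) is an assumption, as are the function names.

```javascript
const CYCLE_HARD_CAP = 30; // from the non-negotiables above
const PERFECT_STREAK_TO_STOP = 3; // assumption: what "consistently perfect" means

// Decide what to do after a cycle's edit has been evaluated.
// `before` and `after` are pass counts measured on the same inputs.
function decideEdit(before, after) {
  return after < before ? "revert" : "keep";
}

// Decide whether the loop should stop.
// `history` holds one { passed, total } entry per completed cycle.
function shouldStop(history) {
  if (history.length >= CYCLE_HARD_CAP) return { stop: true, reason: "cycle-cap" };
  const recent = history.slice(-PERFECT_STREAK_TO_STOP);
  const allPerfect =
    recent.length === PERFECT_STREAK_TO_STOP && recent.every((c) => c.passed === c.total);
  return allPerfect ? { stop: true, reason: "consistently-perfect" } : { stop: false, reason: "" };
}

console.log(decideEdit(7, 6)); // prints "revert"
```

A "revert" decision corresponds to the git reset in the rule above. An edit that leaves the pass count unchanged is kept in this sketch, which is a design choice rather than something the skill states.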

Templates

Templates live in templates/ next to this SKILL.md:

program.md.tmpl — loop driver, with {{PLACEHOLDERS}} for slug, N, constraints summary, input source description.
eval.mjs.tmpl — skeleton with helper functions; you fill in the actual checks based on the user's constraints.
judge-prompt.md.tmpl — skeleton rubric; you fill in the qualitative criteria.
README.md.tmpl — user-facing readme.

Read the templates, substitute the placeholders, and synthesize the per-skill eval logic. Do not just copy templates blindly — the checks are the whole point.
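Placeholder substitution for a file such as program.md.tmpl might look like the sketch below. The `{{NAME}}` token syntax follows the description above. The specific placeholder names (`N`, `SLUG`) and the choice to throw on an unknown token are assumptions.

```javascript
// Replace every {{NAME}} token with values[NAME]. Tokens with no value throw,
// so a half-rendered program.md is never written (a choice of this sketch).
function renderTemplate(template, values) {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`no value for placeholder ${match}`);
    }
    return String(values[key]);
  });
}

// Placeholder names here are hypothetical, not taken from the real templates.
console.log(renderTemplate("Run {{N}} samples per cycle for {{SLUG}}.", { N: 5, SLUG: "excalidraw" }));
// prints "Run 5 samples per cycle for excalidraw."
```

Substitution only fills in the skeleton; the per-skill checks in eval.mjs and judge-prompt.md still have to be written from the user's constraints.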

Reference implementation

A working example of a scaffolded autoresearch project lives at ~/Projects/Knowledge System/autoresearch-excalidraw/. When in doubt about structure or tone, look there. That project targets the Excalidraw Creator skill with 9 checks (5 programmatic, 4 qualitative). It is the canonical pattern.

Generate a self-improving "autoresearch" loop for any Claude skill. Given a skill to improve and 3-5 quality constraints, scaffolds a standalone project (modeled on karpathy/autoresearch) that iterates on the skill's prompt by generating N samples per cycle, evaluating them against the constraints, and editing the skill to fix recurring failures — up to 30 cycles or until results are consistently perfect. Use when the user says something like "make an autoresearcher for my X skill" or "build a self-improving loop for skill Y".

5 stars

development

Updated Apr 21, 2026

Install globally with skillsauth:

npx skillsauth add jeffvincent/claude-config autoresearcher-generator

This installs the skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy: Container and dependency vulnerability scanner. Status: Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Status: Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
Snyk (dep): Open source security scanning. Status: Skipped (50%)
Socket.dev: Supply chain security analysis. Status: Skipped (50%)
VirusTotal: Multi-engine malware detection. Status: Skipped (50%)
CrowdStrike: Advanced threat intelligence. Status: Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
OWASP Dep-Check. Status: Skipped (50%)

Last scanned: Apr 22, 2026, 1:59 AM (127.2s, 5 files scanned)

Install manually

# Clone the repo
git clone https://github.com/jeffvincent/claude-config.git

# Copy into Claude Code skills folder (global)
cp -r claude-config/skills/autoresearcher-generator ~/.claude/skills/

Repository: jeffvincent/claude-config (5 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT