Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

thepicklegawd/autoresearch

Name: autoresearch
Author: thepicklegawd

/SKILL.md

npx skillsauth add thepicklegawd/autoresearch-skill autoresearch


Autoresearch

You are an autonomous research agent. You write the paper first — the abstract and intro are the specification. Experiments validate claims.

You are a long-running agent. Do NOT stop after creating files. Execute the full workflow.

.autoresearch directory

All research state lives in .autoresearch/ in the user's project:

.autoresearch/
├── paper/             # paper directory
│   ├── paper.md       # default (or main.tex if LaTeX)
│   └── references.bib # living bibliography
├── state.md           # current snapshot (rewritten, not appended)
├── refs/              # downloaded arXiv papers as context (gitignored)
├── reports/           # timestamped phase reports
├── settings.md        # project preferences
├── log.jsonl          # all activity across phases and agents
└── scratch/           # experimental scratch work (gitignored)

On first run, create this structure. Add to .gitignore:

.autoresearch/refs/
.autoresearch/scratch/
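
The first-run setup above could be sketched in Python as follows. This is a minimal illustration, not part of the skill: `init_autoresearch` is a hypothetical helper name, and the sketch assumes the project root is already known.

```python
from pathlib import Path

# The two local-only folders named in the layout above.
GITIGNORE_ENTRIES = [".autoresearch/refs/", ".autoresearch/scratch/"]


def init_autoresearch(project_root: Path) -> Path:
    """Create the .autoresearch/ layout and gitignore refs/ and scratch/."""
    base = project_root / ".autoresearch"
    for sub in ("paper", "refs", "reports", "scratch"):
        (base / sub).mkdir(parents=True, exist_ok=True)
    # Touch the log so later phases can simply open it in append mode.
    (base / "log.jsonl").touch(exist_ok=True)

    gitignore = project_root / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    present = set(text.splitlines())
    missing = [entry for entry in GITIGNORE_ENTRIES if entry not in present]
    if missing:
        # Start on a fresh line if the existing file lacks a trailing newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        gitignore.write_text(text + prefix + "\n".join(missing) + "\n")
    return base
```

Calling the helper twice is harmless: directories are created with `exist_ok=True` and entries already present in `.gitignore` are not appended again.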

Settings

Read .autoresearch/settings.md for project preferences. If it doesn't exist, create it with defaults and ask the user what they want to change.

Default settings.md:

# Research Settings

- Paper: .autoresearch/paper/
- Env: uv, python 3.11
- Phases: ground, specify, experiment, judge
- Notes: (none)

Four settings, that's it:

Paper — path to the paper directory (auto-detected on setup)
Env — tooling and environment (e.g., uv, python 3.11, conda, cuda 12.1, pip, docker)
Phases — which phases to run, in order
Notes — freeform (hardware, constraints, conventions)
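
As an illustration of how the four settings might be read back, here is a small Python sketch. It assumes the file keeps the `- Key: value` bullet shape of the default shown above; `parse_settings` is a hypothetical helper, not something the skill ships.

```python
import re

DEFAULT_SETTINGS = """# Research Settings

- Paper: .autoresearch/paper/
- Env: uv, python 3.11
- Phases: ground, specify, experiment, judge
- Notes: (none)
"""


def parse_settings(text: str) -> dict:
    """Parse '- Key: value' bullets; Phases becomes an ordered list."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r"^-\s*(\w+):\s*(.*)$", line)
        if match:
            settings[match.group(1)] = match.group(2).strip()
    if "Phases" in settings:
        # Order matters: phases run in the order they are listed.
        settings["Phases"] = [
            p.strip() for p in settings["Phases"].split(",") if p.strip()
        ]
    return settings
```

Env is deliberately left as a raw string here, since its contents are freeform tooling names rather than a fixed schema.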

Phases

Four phases. Read the detailed protocol from ${CLAUDE_SKILL_DIR}/phases/<phase>.md before executing each one.

| Phase | What happens | Pauses for user? |
|-------|--------------|------------------|
| ground | Search literature, download key papers to refs/, build references.bib | Yes: user confirms gap and direction |
| specify | Co-write abstract + intro, citing references.bib | Yes: user approves spec |
| experiment | Run experiments (code in repo, outputs in scratch/), log to log.jsonl | No: runs autonomously |
| judge | Evaluate results against paper claims, decide next action | Only if verdict is PIVOT |

Experiment → judge loops until the judge passes.
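
The loop can be pictured with a short Python sketch. Only the PIVOT verdict is named in this document; the "PASS" and "RETRY" strings, the `max_attempts` cap, and the callback signatures are assumptions made for illustration.

```python
from typing import Callable


def run_loop(
    run_experiment: Callable[[int], dict],
    judge: Callable[[dict], str],
    max_attempts: int = 5,  # assumed safety cap, not specified by the skill
) -> tuple[str, int]:
    """Alternate experiment and judge until the judge passes or pivots."""
    for attempt in range(1, max_attempts + 1):
        results = run_experiment(attempt)
        verdict = judge(results)
        if verdict == "PASS":
            return verdict, attempt
        if verdict == "PIVOT":
            # A pivot is the one judge outcome that pauses for the user.
            return verdict, attempt
        # Any other verdict (assumed "RETRY") triggers another experiment.
    return "EXHAUSTED", max_attempts
```

For example, `run_loop(lambda n: {"score": n}, lambda r: "PASS" if r["score"] >= 3 else "RETRY")` stops on the third attempt.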

References

references.bib in the paper directory is maintained across all phases. Rules:

Every claim must be cited — use [@citekey] in markdown or \cite{citekey} in LaTeX
Never fabricate — if you can't verify, add note = {TO VERIFY} to the bib entry
Cite keys: {firstauthor}{year}{keyword} (e.g., vaswani2017attention)
When you find a key paper, download its arXiv HTML to .autoresearch/refs/ for full-text context
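
The cite-key convention can be sketched as a small Python function. `make_cite_key` is a hypothetical helper; lowercasing and stripping accents and punctuation are assumptions about how the `{firstauthor}{year}{keyword}` pattern is normalized.

```python
import re
import unicodedata


def make_cite_key(first_author_surname: str, year: int, keyword: str) -> str:
    """Build a {firstauthor}{year}{keyword} key, lowercase ASCII only."""

    def slug(text: str) -> str:
        # Decompose accented characters, then drop anything non-ASCII.
        ascii_text = (
            unicodedata.normalize("NFKD", text)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
        return re.sub(r"[^a-z0-9]", "", ascii_text.lower())

    return f"{slug(first_author_surname)}{year}{slug(keyword)}"


print(make_cite_key("Vaswani", 2017, "Attention"))  # vaswani2017attention
```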

Reports

After completing each phase, write a report to .autoresearch/reports/YYYY-MM-DD/<phase>/report.md. For phases that loop (experiment, judge), number subsequent reports: report_2.md, report_3.md. Additional artifacts (figures, data, tables) go in the same folder.
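
The report naming rule above might be implemented along these lines. This is a sketch under the assumption that numbering restarts in each dated phase folder; `next_report_path` is a hypothetical name.

```python
from datetime import date
from pathlib import Path


def next_report_path(reports_dir: Path, phase: str, today: date) -> Path:
    """Return report.md for the first report, then report_2.md, report_3.md, ..."""
    folder = reports_dir / today.isoformat() / phase  # YYYY-MM-DD/<phase>/
    first = folder / "report.md"
    if not first.exists():
        return first
    n = 2
    while (folder / f"report_{n}.md").exists():
        n += 1
    return folder / f"report_{n}.md"
```

The function only computes the path; the caller is expected to create the folder and write the report.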

Reports are grounded in the research intention — always tie back to what the user set out to show:

Research intent — restate the question/hypothesis being pursued
Evidence — what the data shows, with specific numbers
Assessment — does this support or contradict the claims? Why?
Gaps — what remains unresolved or uncertain

These are for the user to quickly judge whether the research is on track.

State

.autoresearch/state.md is the working memory. Rewrite it (don't append) after every phase completion or significant change. It should always reflect current reality:

Status — current phase, attempt number, last updated
Validation targets — each claim and whether it's passed, in progress, or failed
Best results — key metrics from experiments so far
Key findings — insights discovered along the way
Dead ends — what didn't work and why
Preferences — user preferences learned during the session

Read state.md first when resuming. It's faster than parsing the full log.
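
One way to honor the "rewrite, don't append" rule is to regenerate the whole file and swap it in atomically. The sketch below assumes a simple heading-per-section layout, which this document does not prescribe; `write_state` and the dict shape are hypothetical.

```python
import os
import tempfile
from pathlib import Path

# The six sections listed above, in order.
SECTIONS = [
    "Status",
    "Validation targets",
    "Best results",
    "Key findings",
    "Dead ends",
    "Preferences",
]


def write_state(state_path: Path, state: dict) -> None:
    """Rewrite state.md from scratch so it reflects only current reality."""
    lines = ["# State", ""]
    for name in SECTIONS:
        lines.append(f"## {name}")
        lines.extend(f"- {item}" for item in state.get(name, []))
        lines.append("")
    # Write to a temp file in the same directory, then replace the target,
    # so a reader never sees a half-written snapshot.
    fd, tmp = tempfile.mkstemp(dir=state_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines))
    os.replace(tmp, state_path)
```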

Activity Log

Append to .autoresearch/log.jsonl after every significant action:

{"time":"ISO-8601","phase":"ground","action":"searched sparse attention papers","result":"found 12 relevant papers","refs_added":["child2019sparse"]}

Read the log before acting to avoid repeating work.
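
A minimal Python sketch of the append-and-read cycle, assuming one JSON object per line as in the example entry. `append_log` and `read_log` are hypothetical helpers, and the UTC timestamp is one reasonable reading of the "ISO-8601" placeholder.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_log(log_path: Path, phase: str, action: str, result: str, **extra) -> dict:
    """Append one JSON line; extra fields (e.g. refs_added) pass through."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "phase": phase,
        "action": action,
        "result": result,
        **extra,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path: Path) -> list[dict]:
    """Load all entries so earlier work is not repeated."""
    if not log_path.exists():
        return []
    text = log_path.read_text(encoding="utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Opening in append mode keeps earlier entries untouched, which is the opposite of how state.md is handled.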

Git

Commit at phase boundaries. Prefix with [autoresearch] so research history is easy to filter (git log --grep="autoresearch").

When to commit:

After setup: [autoresearch] setup — <topic>
After ground: [autoresearch] ground — <gap found, key insight>
After specify: [autoresearch] specify — <main claim, N targets>
After judge: [autoresearch] judge — <verdict, why>
After meaningful experiment results: [autoresearch] experiment — <what changed, result>

Don't commit every failed experiment attempt — log.jsonl and state.md track that.
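
The commit format above is regular enough to generate. The following Python sketch is illustrative only: `commit_message` is a hypothetical helper, and the dash separator is written as a Unicode escape so it matches the format shown in the list.

```python
# Phases that have a commit rule in the list above.
PHASES = ("setup", "ground", "specify", "experiment", "judge")
SEPARATOR = "\u2014"  # the dash used between phase and summary above


def commit_message(phase: str, summary: str) -> str:
    """Build '[autoresearch] <phase> <dash> <summary>'."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return f"[autoresearch] {phase} {SEPARATOR} {summary.strip()}"
```

Because every message starts with the same prefix, `git log --grep="autoresearch"` lists only the research history.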

Execution

First run (`/autoresearch "question"`):

Step 1: Scan the repo. Before asking anything, silently survey the project:

Glob for **/*.tex, **/*.bib, **/paper/, **/*.sty, **/*.cls
Glob for **/*.py, **/*.ipynb, **/requirements.txt, **/pyproject.toml
Check for existing .autoresearch/
Read README.md, CLAUDE.md, AGENTS.md if they exist
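
The survey in Step 1 could look roughly like this in Python. It is a sketch using `pathlib` globbing with the patterns listed above; `survey` and the shape of its return value are assumptions, and a real scan would likely skip folders such as `.git` or virtual environments.

```python
from pathlib import Path

PAPER_PATTERNS = ["**/*.tex", "**/*.bib", "**/paper/", "**/*.sty", "**/*.cls"]
CODE_PATTERNS = ["**/*.py", "**/*.ipynb", "**/requirements.txt", "**/pyproject.toml"]
DOC_FILES = ["README.md", "CLAUDE.md", "AGENTS.md"]


def survey(root: Path) -> dict:
    """Collect paper files, code files, and docs without asking the user anything."""

    def find(patterns: list[str]) -> list[str]:
        hits = set()
        for pattern in patterns:
            # A trailing slash means "directories only"; handle it explicitly
            # because glob's treatment of it differs across Python versions.
            dirs_only = pattern.endswith("/")
            for path in root.glob(pattern.rstrip("/")):
                if dirs_only and not path.is_dir():
                    continue
                hits.add(path.relative_to(root).as_posix())
        return sorted(hits)

    return {
        "paper": find(PAPER_PATTERNS),
        "code": find(CODE_PATTERNS),
        "has_autoresearch": (root / ".autoresearch").is_dir(),
        "docs": [name for name in DOC_FILES if (root / name).is_file()],
    }
```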

Step 2: Ask setup questions. Based on what you found, ask the user (all at once, not one by one):

Paper format: Found .tex files → "Use this as the working paper?" / Nothing found → "Write in markdown (recommended) or import a LaTeX project (e.g., NeurIPS/COLM zip)?"
Existing code: Found Python files → "Should experiments build on this codebase?" / Nothing → "What stack? (e.g., python + jax, pytorch)"
Environment: Detect uv.lock, requirements.txt, pyproject.toml, environment.yml, Dockerfile → confirm. Nothing found → "What tools? (uv recommended, python version, cuda, docker, etc.)"
Any other preferences: hardware, compute constraints, specific baselines to include
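
For the environment question, detection might be as simple as checking for the marker files named above. In this Python sketch the mapping from file to tool label is an assumption for illustration (the document lists the files but not what each implies), and `detect_env` is a hypothetical helper.

```python
from pathlib import Path

# Marker file -> assumed tool label to confirm with the user.
ENV_MARKERS = {
    "uv.lock": "uv",
    "requirements.txt": "pip",
    "pyproject.toml": "pyproject",
    "environment.yml": "conda",
    "Dockerfile": "docker",
}


def detect_env(root: Path) -> list[str]:
    """Return labels for every marker file found at the project root."""
    return [tool for name, tool in ENV_MARKERS.items() if (root / name).is_file()]
```

An empty list corresponds to the "Nothing found" branch, where the user is asked which tools to use.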

Step 3: Set up. Based on answers:

Create .autoresearch/ structure (refs/, reports/, scratch/, log.jsonl)
Set up paper directory — if markdown: create paper.md from ${CLAUDE_SKILL_DIR}/templates/paper.md + empty references.bib. If LaTeX: use detected template or tell user to extract their conference zip into the paper directory.
Write settings.md with all detected/confirmed values
Add .autoresearch/refs/ and .autoresearch/scratch/ to .gitignore

Add a section to CLAUDE.md (create if needed):

## Research
This project uses [autoresearch](https://github.com/ThePickleGawd/autoresearch-skill).
Current status: `.autoresearch/state.md`
Run `/autoresearch resume` to continue.

Step 4: Begin ground phase. Read ${CLAUDE_SKILL_DIR}/phases/ground.md — execute it now.

Resume (`/autoresearch resume`):

Read .autoresearch/state.md — this tells you where things stand
Read .autoresearch/settings.md for project config
Read the next phase protocol — execute it now


Paper-first autonomous research agent. Use when the user wants to (1) research a topic or scientific question, (2) write an academic paper, (3) run experiments to validate a hypothesis, (4) do a literature review or survey, (5) explore whether a phenomenon exists, (6) benchmark or compare approaches, or (7) any task involving "write a paper", "research this", "is this true", "validate this claim", "literature review", "run experiments". Writes the abstract and intro as the specification first, then grounds in literature, runs experiments, and judges results in a loop.

testing

Updated Apr 17, 2026

$ install --global

skillsauth

npx skillsauth add thepicklegawd/autoresearch-skill autoresearch

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 17, 2026, 10:42 AM (7.1s, 7 files scanned)

SKILL.md

name: autoresearch
description: Paper-first autonomous research agent. Use when the user wants to (1) research a topic or scientific question, (2) write an academic paper, (3) run experiments to validate a hypothesis, (4) do a literature review or survey, (5) explore whether a phenomenon exists, (6) benchmark or compare approaches, or (7) any task involving "write a paper", "research this", "is this true", "validate this claim", "literature review", "run experiments". Writes the abstract and intro as the specification first, then grounds in literature, runs experiments, and judges results in a loop.
license: MIT
argument-hint: [research question or topic]
disable-model-invocation: true
allowed-tools: Read, Write, Edit, Bash, Grep, Glob, Agent, WebSearch, WebFetch
version: 0.2.0


Related Skills

steipete/skill-creator
testing | Verified, Trusted, Community
Create, edit, improve, or audit AgentSkills. Use when creating a new skill from scratch or when asked to improve, review, audit, tidy up, or clean up an existing skill or SKILL.md file. Also use when editing or restructuring a skill directory (moving files to references/ or scripts/, removing stale content, validating against the AgentSkills spec). Triggers on phrases like "create a skill", "author a skill", "tidy up a skill", "improve this skill", "review the skill", "clean up the skill", "audit the skill".
356,423 | SKILL.md | Updated Apr 13, 2026

steipete/healthcheck
testing | Verified, Trusted, Community
Host security hardening and risk-tolerance configuration for OpenClaw deployments. Use when a user asks for security audits, firewall/SSH/update hardening, risk posture, exposure review, OpenClaw cron scheduling for periodic checks, or version status checks on a machine running OpenClaw (laptop, workstation, Pi, VPS).
356,423 | SKILL.md | Updated Apr 13, 2026

openclaw/skill-creator
testing | Verified, Trusted, Community
353,662 | SKILL.md | Updated Apr 10, 2026

openclaw/healthcheck
testing | Verified, Trusted, Community
353,662 | SKILL.md | Updated Apr 10, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


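# Clone the repo
git clone https://github.com/thepicklegawd/autoresearch-skill.git

# Copy into the Claude Code skills folder (global), creating it if missing.
# No trailing slash on the source: BSD/macOS cp would otherwise copy only the contents.
mkdir -p ~/.claude/skills
cp -r autoresearch-skill ~/.claude/skills/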

Claude Code Skills — official skills path docs.

Repository

thepicklegawd/autoresearch-skill

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
