moralespanitz

learn

Use when user asks what something means, says "explain", "I don't understand", "teach me", "what is X", or asks about a term mid-session.

data-ai4

discover

Use when user wants to test multiple angles on an idea, run parallel hypothesis lanes, or explore a gap from different entry points.

testing4

loop

Use when user has a hypothesis and a repo and wants to run experiments. Triggered by "start experiments", "run the loop", "test this hypothesis".

testing4

skills/autoresearch

# autoresearch skill > Adapted from [karpathy/autoresearch](https://github.com/karpathy/autoresearch) program.md. > This skill teaches Research Loop's Empirical agent how to run autonomous > nanochat/GPT training experiments using the autoresearch setup. ## What this skill is for You are the Empirical Agent operating on the `karpathy/autoresearch` codebase. Your job is to autonomously experiment with `train.py` to minimize `val_bpb` (validation bits per byte; lower is better). ## Repository

development4

execution

Use when experiments are running or just completed, or user shares results and wants to decide what to do next.

testing4

internal/embed/claude/skills/getting-started

# SKILL: Getting Started > The bootstrap entry point for the Research Loop framework. > Every research session starts here. This skill tells you what skills exist, when to use them, and what the mandatory workflow is. --- ## You have skills. They give you superpowers. <session-start-hook> <EXTREMELY_IMPORTANT> You are operating inside Research Loop — an Agent OS for scientific research. RIGHT NOW, read this file fully before doing anything else. **Core rules:** 1. You have skills. They g

development4

discover

Use when user wants to test multiple angles on an idea, run parallel hypothesis lanes, or explore a gap from different entry points.

testing4

skills/getting-started

# SKILL: Getting Started > The bootstrap entry point for the Research Loop framework. > Every research session starts here. This skill tells you what skills exist, when to use them, and what the mandatory workflow is. --- ## You have skills. They give you superpowers. <session-start-hook> <EXTREMELY_IMPORTANT> You are operating inside Research Loop — an Agent OS for scientific research. RIGHT NOW, read this file fully before doing anything else. **Core rules:** 1. You have skills. They g

development4

idea-selection

Use when user wants to find gaps, evaluate ideas, or decide what's worth pursuing. Triggered by "what hasn't been tried", "is this a good idea", "find the gap".

development4

status

Use when user asks "where are we", "show me the status", "what do we have", "what's pending", "what should I do next", or wants to see the research decision tree.

testing4

idea-selection

Use when user wants to find gaps, evaluate ideas, or decide what's worth pursuing. Triggered by "what hasn't been tried", "is this a good idea", "find the gap".

development4

plan

Use when user has a selected hypothesis or route and needs a concrete execution plan broken into tasks. Triggered by "what are the next steps", "how do I start", "plan this out".

tools4

plan

Use when user has a selected hypothesis or route and needs a concrete execution plan broken into tasks. Triggered by "what are the next steps", "how do I start", "plan this out".

tools4

research-loop

Use when user mentions research, a topic, papers, experiments, gaps, or hypotheses. Entry point — load this first.

research4

skills/review-prep

# SKILL: Review Preparation > What to do after the paper is written: surviving rejection, positioning for awards, and persisting without losing the thread. > Based on Carlini (2026) — "How to win a best paper award." --- ## When to use this skill Use this skill when a paper is complete and being prepared for submission, when a rejection arrives, and when evaluating whether a rejected paper is worth resubmitting or should be killed. Invoke with: `/review-prep` or `@reviewer: prepare for subm

research4

status

Use when user asks "where are we", "show me the status", "what do we have", "what's pending", "what should I do next", or wants to see the research decision tree.

testing4

execution

Use when experiments are running or just completed, or user shares results and wants to decide what to do next.

testing4

research-loop

Use when user mentions research, a topic, papers, experiments, gaps, or hypotheses. Entry point — load this first.

research4

figure-agent

Publication-quality figure generation for research papers. Decision agent selects figure type (code plot vs architecture diagram). Generates Matplotlib/Seaborn code for quantitative figures with iterative improvement loop. Style-matches conference templates (NeurIPS, ICML, ICLR). Use when the paper-pipeline reaches the figure generation phase, or when a user requests figures for an existing draft.

development4

experiment-sandbox

Experiment sandbox execution for Research Loop. Supports four modes: local (venv), Docker (isolated containers), SSH remote (GPU compute on servers), and Colab (Google Drive bridge). Provides experiment harness templates, code validation, metric collection, deterministic seeding, and compute budget enforcement. Use before running experiments generated by the paper-pipeline.

development4

learn

Use when user asks what something means, says "explain", "I don't understand", "teach me", "what is X", or asks about a term mid-session.

data-ai4

internal/embed/claude/skills/writing-papers

# SKILL: Writing Papers > How to write a paper that people actually read — with singular focus, a story that lands, self-contained figures, and a conclusion that does not waste its moment. > Based on Carlini (2026) — "How to win a best paper award." --- ## When to use this skill Use this skill when transitioning from the execution phase to drafting. Also use it when reviewing a draft in progress. For full end-to-end pipeline runs (topic to export), use `paper-pipeline` instead — this skill i

devops4

skills/writing-papers

# SKILL: Writing Papers > How to write a paper that people actually read — with singular focus, a story that lands, self-contained figures, and a conclusion that does not waste its moment. > Based on Carlini (2026) — "How to win a best paper award." --- ## When to use this skill Use this skill when transitioning from the execution phase to drafting. Also use it when reviewing a draft in progress. For full end-to-end pipeline runs (topic to export), use `paper-pipeline` instead — this skill i

devops4

explore

Use when user wants to explore a topic, find papers, map a field, or understand the research landscape. Not for explaining concepts — use learn for that.

data-ai4

bootstrap

Mandatory activation layer — loads on any conversation start. Establishes skill-loading protocol, Red Flags, priority rules, and HARD-GATE enforcement for all research-loop skills.

research4

paper-pipeline

End-to-end paper generation pipeline ported from AutoResearchClaw (Aiming Lab). 14 phases covering topic initiation through export/publish, with human-in-the-loop gates and quality gating at each handoff. Use this when the user wants a full paper pipeline run, from topic to submission-ready manuscript. Delegates to researcher/reviewer/writer/verifier subagents for stage execution and to autonomous-iteration for experiment optimization loops.

testing4

loop

Use when user has a hypothesis and a repo and wants to run experiments. Triggered by "start experiments", "run the loop", "test this hypothesis".

testing4

autonomous-iteration

Use when user mentions autonomous iteration, metric-driven optimization, $research-loop plan, $research-loop debug, $research-loop fix, $research-loop security, $research-loop ship, $research-loop scenario, $research-loop predict, $research-loop learn, $research-loop reason, $research-loop probe, or mentions "research-loop" with a goal/metric. Autonomous Goal-directed Iteration — apply Karpathy's autoresearch principles: modify, verify, keep/discard, repeat. Supports bounded mode via Iterations: N inline config.

development4
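
The "modify, verify, keep/discard, repeat" cycle named in the autonomous-iteration description can be illustrated with a minimal sketch. This is not the skill's actual implementation: the `iterate` function, its parameters, and the toy metric are hypothetical names chosen for illustration, and the `iterations` argument only mimics the bounded mode mentioned above (`Iterations: N`).

```python
import random
from typing import Callable, Tuple


def iterate(initial: float,
            propose: Callable[[float], float],
            metric: Callable[[float], float],
            iterations: int) -> Tuple[float, float]:
    """Bounded modify -> verify -> keep/discard loop (lower metric is better).

    All names here are hypothetical; a real run would modify code and
    measure a training metric instead of a float and a toy function.
    """
    best = initial
    best_score = metric(best)
    for _ in range(iterations):
        candidate = propose(best)   # modify: derive a change from the current best
        score = metric(candidate)   # verify: measure the change
        if score < best_score:      # keep: only strict improvements survive
            best, best_score = candidate, score
        # otherwise discard the candidate and try again from the current best
    return best, best_score


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    best, score = iterate(
        initial=10.0,
        propose=lambda x: x + rng.uniform(-1.0, 1.0),
        metric=lambda x: (x - 3.0) ** 2,
        iterations=200,
    )
    print(f"best={best:.3f} score={score:.4f}")
```

Because a candidate is kept only when it strictly improves the metric, the best score never gets worse across iterations, which is what makes an unattended loop of this kind safe to leave running within its iteration budget.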

deep-research

Run a thorough, multi-phase deep research investigation on a topic with subagent dispatch, provenance tracking, and integrity verification.

data-ai4

deep-research

Run a thorough, multi-phase deep research investigation on a topic with subagent dispatch, provenance tracking, and integrity verification.

data-ai4

internal/embed/claude/skills/autoresearch

# autoresearch skill > Adapted from [karpathy/autoresearch](https://github.com/karpathy/autoresearch) program.md. > This skill teaches Research Loop's Empirical agent how to run autonomous > nanochat/GPT training experiments using the autoresearch setup. ## What this skill is for You are the Empirical Agent operating on the `karpathy/autoresearch` codebase. Your job is to autonomously experiment with `train.py` to minimize `val_bpb` (validation bits per byte — lower is better). ## Repository

development4
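
For readers unfamiliar with the `val_bpb` metric named in the autoresearch description, the sketch below shows the usual definition of bits per byte: total negative log-likelihood in nats, converted to bits and divided by the number of bytes evaluated. The function name and arguments are hypothetical, and the exact computation used in the autoresearch codebase is not shown in this listing.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte.

    Assumes the standard definition; lower values mean better compression
    of the validation text by the model.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


if __name__ == "__main__":
    # A model that spends exactly one bit on each of 8 bytes scores 1.0.
    print(bits_per_byte(8 * math.log(2), 8))
```

Normalizing by bytes rather than tokens keeps the metric comparable across runs that change the tokenizer.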

bootstrap

Mandatory activation layer — loads on any conversation start. Establishes skill-loading protocol, Red Flags, priority rules, and HARD-GATE enforcement for all research-loop skills.

research4

explore

Use when user wants to explore a topic, find papers, map a field, or understand the research landscape. Not for explaining concepts — use learn for that.

data-ai4

literature-review

Run a structured literature review on a topic using parallel search, evidence tables with quality scoring, and primary-source synthesis.

testing4

replication

Plan and execute a structured replication workflow for a paper, claim, or benchmark with environment selection and integrity checks.

testing4

experiment-sandbox

Experiment sandbox execution for Research Loop. Supports four modes: local (venv), Docker (isolated containers), SSH remote (GPU compute on servers), and Colab (Google Drive bridge). Provides experiment harness templates, code validation, metric collection, deterministic seeding, and compute budget enforcement. Use before running experiments generated by the paper-pipeline.

development4

internal/embed/claude/skills/review-prep

# SKILL: Review Preparation > What to do after the paper is written: surviving rejection, positioning for awards, and persisting without losing the thread. > Based on Carlini (2026) — "How to win a best paper award." --- ## When to use this skill Use this skill when a paper is complete and being prepared for submission, when a rejection arrives, and when evaluating whether a rejected paper is worth resubmitting or should be killed. Invoke with: `/review-prep` or `@reviewer: prepare for subm

research4

autonomous-iteration

Use when user mentions autonomous iteration, metric-driven optimization, $research-loop plan, $research-loop debug, $research-loop fix, $research-loop security, $research-loop ship, $research-loop scenario, $research-loop predict, $research-loop learn, $research-loop reason, $research-loop probe, or mentions "research-loop" with a goal/metric. Autonomous Goal-directed Iteration — apply Karpathy's autoresearch principles: modify, verify, keep/discard, repeat. Supports bounded mode via Iterations: N inline config.

development4

figure-agent

Publication-quality figure generation for research papers. Decision agent selects figure type (code plot vs architecture diagram). Generates Matplotlib/Seaborn code for quantitative figures with iterative improvement loop. Style-matches conference templates (NeurIPS, ICML, ICLR). Use when the paper-pipeline reaches the figure generation phase, or when a user requests figures for an existing draft.

development4

replication

Plan and execute a structured replication workflow for a paper, claim, or benchmark with environment selection and integrity checks.

testing4

paper-pipeline

End-to-end paper generation pipeline ported from AutoResearchClaw (Aiming Lab). 14 phases covering topic initiation through export/publish, with human-in-the-loop gates and quality gating at each handoff. Use this when the user wants a full paper pipeline run, from topic to submission-ready manuscript. Delegates to researcher/reviewer/writer/verifier subagents for stage execution and to autonomous-iteration for experiment optimization loops.

testing4

literature-review

Run a structured literature review on a topic using parallel search, evidence tables with quality scoring, and primary-source synthesis.

testing4

