skills/project-init/SKILL.md
Initialize an ML research project control root. Use for paper/code/slides repos, shared memory, GitHub Project alignment, agent guidance, worktree policy, and lifecycle handoffs.
npx skillsauth add a-green-hand-jack/ml-research-skills project-init
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Initialize a research project as a control root, not just as two sibling repos.
Use this skill when the user wants a new ML research project where agents should work from <ProjectName>/ while paper/, code/, and optional slides/ remain independent component repositories.
Pair this skill with:
- research-project-memory to bootstrap cross-component memory
- init-latex-project to create the paper repo
- init-python-project to create the code repo
- research-slide-deck-builder to create or maintain the optional slides repo using the external progress-slides template
- new-workspace to create code experiment worktrees or paper version worktrees
- remote-project-control when code runs on SSH/HPC servers
- remote-project-control or safe-git-ops when GitHub CLI, repo remotes, or GitHub Project API operations are involved
- safe-git-ops before non-trivial Git work

Default shape:
<ProjectName>/
├── PROJECT.md
├── AGENTS.md
├── CLAUDE.md
├── memory/
│ ├── project.yaml
│ ├── component-index.yaml
│ ├── current-status.md
│ ├── decision-log.md
│ ├── claim-board.md
│ ├── evidence-board.md
│ ├── provenance-board.md
│ ├── risk-board.md
│ ├── action-board.md
│ ├── handoff-board.md
│ ├── phase-dashboard.md
│ └── source-visibility-board.md
├── paper/ # independent LaTeX git repo
├── code/ # independent Python/ML git repo
├── code-worktrees/ # sibling worktree root for code repo branches
├── paper-worktrees/ # sibling worktree root for paper venue/arXiv/camera-ready versions
├── .uv-envs/ # ignored shared uv environments for code repo/worktrees
├── reference/ # project-local sources, cards, processing state, and project-use notes
├── slides/ # optional independent git repo
├── reviewer/ # reviewer simulation state
├── rebuttal/ # real review and response state
├── artifact/ # artifact-evaluation and release handoff state
└── docs/
    ├── overview.md
    ├── designs/
    ├── experiments/
    ├── updates/
    ├── audits/
    └── timelines/
This structure is the textual source of truth for the project anatomy visual maintained in the repository README and asset/project-anatomy.png. If the layout changes, update the skill text and visual documentation together.
Do not create a top-level experiments/ directory by default. Experiment execution, run summaries, result reports, and raw artifact pointers belong inside code/ or the relevant code worktree.
Root-level docs/ is still useful, but it is project-level documentation, not a replacement for code-side evidence. Use it for staged method designs, cross-component experiment plans, project overviews, audits, timelines, and handoffs that coordinate paper/, code/, slides/, review, rebuttal, and artifact work.
Recommended code-side evidence paths:
code/docs/results/ # stable result summaries, table notes, figure notes
code/docs/reports/ # experiment-report-writer outputs
code/docs/runs/ # run registry, job pointers, config and commit pointers
- <ProjectName>/ is the agent control root.
- paper/, code/, and slides/ are component repos, not mere folders.
- Do not create code worktrees inside code/ by default. Use the sibling root code-worktrees/ so Git, IDEs, search tools, and agents do not confuse worktrees with normal source files.
- The default shared uv environment is <ProjectName>/.uv-envs/code, invoked through an absolute UV_PROJECT_ENVIRONMENT and uv run from the active worktree. Create a separate stage environment only when dependencies, the Python/CUDA stack, destructive package testing, or real concurrent sync risk requires it.
- Do not create paper worktrees inside paper/ by default. Use the sibling root paper-worktrees/ for venue retargeting, arXiv releases, rebuttal paper edits, and camera-ready branches.
- If paper/main is linked to Overleaf or visible to coauthors, treat it as author-visible, not private; keep agent-private files out of that branch.
- Root docs/ stores project-level design and planning artifacts; code docs store code-side implementation, run, and result details.
- Durable state belongs in memory/, component .agent/, or repo-native evidence docs.

Ask for these fields in one message:
- Reference setup: reference/, connect an existing PDF/source folder, connect an initial source bundle, or skip
- Whether gh auth status is valid
- Code toolchain: uv, ruff, mypy, pytest, pre-commit
- Paper toolchain: tex-fmt, submit-paper, Overleaf/GitHub compile evidence
- Coordination: git, gh, GitHub Project/PR checks
- Optional hygiene: gitleaks, shellcheck, shfmt, actionlint, nbstripout, taplo, yamllint, lychee
- Code worktree root: <ProjectName>/code-worktrees/
- Paper worktree root: <ProjectName>/paper-worktrees/
- Shared uv environment: <ProjectName>/.uv-envs/code
- Whether paper/main is linked to Overleaf/GitHub
- Whether paper/main is author-visible or agent-private
- Where .agent/, AGENTS/CLAUDE guidance, raw CSVs, plotting scripts, and internal result docs should live

Wait for the user's answers before creating files.
Create:
<parent-dir>/<ProjectName>/
├── memory/
├── docs/overview.md
├── docs/designs/
├── docs/experiments/
├── docs/updates/
├── docs/audits/
├── docs/timelines/
├── reference/.agent/runs/
├── reference/sources/
├── reference/cards/
├── reference/project-use/
├── reference/notes/
├── reference/summaries/
├── reviewer/.agent/
├── rebuttal/.agent/
├── artifact/.agent/
├── .uv-envs/
├── code-worktrees/
└── paper-worktrees/
Create optional slides/ only when the user wants a slides component now.
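The scaffold above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `create_control_root`, the `with_slides` flag, and the placeholder text written to docs/overview.md are hypothetical, while the directory list is copied from the tree above. paper/ and code/ are left out on purpose, since they are created or connected as component repos in later steps.

```python
from pathlib import Path

# Directories copied from the layout above. docs/overview.md is a file,
# so it is handled separately below.
SCAFFOLD_DIRS = [
    "memory",
    "docs/designs",
    "docs/experiments",
    "docs/updates",
    "docs/audits",
    "docs/timelines",
    "reference/.agent/runs",
    "reference/sources",
    "reference/cards",
    "reference/project-use",
    "reference/notes",
    "reference/summaries",
    "reviewer/.agent",
    "rebuttal/.agent",
    "artifact/.agent",
    ".uv-envs",
    "code-worktrees",
    "paper-worktrees",
]


def create_control_root(parent_dir: Path, project_name: str, with_slides: bool = False) -> Path:
    """Create the control-root skeleton (hypothetical helper) and return its path."""
    root = parent_dir / project_name
    for rel in SCAFFOLD_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    overview = root / "docs" / "overview.md"
    if not overview.exists():
        # Placeholder content; the real overview is written during setup.
        overview.write_text(f"# {project_name} overview\n", encoding="utf-8")
    if with_slides:
        # slides/ is optional and only created when requested.
        (root / "slides").mkdir(exist_ok=True)
    return root
```

Calling the helper twice is harmless because every step checks for existing paths first, which matters when connecting an existing workspace rather than starting from an empty directory.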
If root Git is enabled, initialize it at <ProjectName>/ and add a root .gitignore that ignores component repos and worktrees unless the user explicitly wants submodules:
/paper/
/code/
/slides/
/.uv-envs/
/code-worktrees/
/paper-worktrees/
If the user wants submodules, use submodule commands deliberately rather than relying on accidental nested Git behavior.
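A minimal sketch of writing that root .gitignore, assuming the non-submodule default. The helper name `ensure_root_gitignore` is hypothetical; the entries are the ones listed above. It appends only missing entries, so an existing .gitignore is preserved.

```python
from pathlib import Path

# Entries copied from the ignore list above.
ROOT_IGNORES = [
    "/paper/",
    "/code/",
    "/slides/",
    "/.uv-envs/",
    "/code-worktrees/",
    "/paper-worktrees/",
]


def ensure_root_gitignore(root: Path) -> list[str]:
    """Append missing ignore entries to <root>/.gitignore and return the entries added."""
    gitignore = root / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8").splitlines() if gitignore.exists() else []
    present = {line.strip() for line in existing}
    missing = [entry for entry in ROOT_IGNORES if entry not in present]
    if missing:
        gitignore.write_text("\n".join(existing + missing) + "\n", encoding="utf-8")
    return missing
```

Skip this helper when the user chooses submodules, since ignoring /paper/ and /code/ would conflict with tracking them from the root repo.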
If the user wants GitHub/GitLab repositories created during setup, first check the hosting CLI authentication such as gh auth status. If authentication fails, finish the local workspace setup and record Git remote creation as a blocker; do not let repo creation failure obscure the project initialization result.
If the user wants a GitHub Project board:
- Before running gh project ... commands, check gh auth status; if the token lacks the project scope, ask the user to approve or run gh auth refresh -s project.
- Owner: @me / a-green-hand-jack, or an organization.

Useful CLI patterns:
gh project create --owner <owner> --title "<ProjectName>"
gh project view <number> --owner <owner>
gh project link <number> --owner <owner> --repo <owner>/<repo>
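For agents that script these calls, the patterns above can be assembled as argument lists and gated on authentication first. This is a sketch, not the skill's implementation: the function names are hypothetical, and only the subcommands and flags shown above are used. The auth check relies on `gh auth status` returning a non-zero exit code when login is missing; it does not verify that the token carries the project scope.

```python
import subprocess


def gh_project_commands(owner: str, title: str, number: int, repo: str) -> list[list[str]]:
    """Return argv lists for the three gh project patterns listed above.

    The project number is only known after `gh project create` has run,
    so in practice the view/link commands are built after creation.
    """
    return [
        ["gh", "project", "create", "--owner", owner, "--title", title],
        ["gh", "project", "view", str(number), "--owner", owner],
        ["gh", "project", "link", str(number), "--owner", owner, "--repo", f"{owner}/{repo}"],
    ]


def gh_is_authenticated() -> bool:
    """True when `gh auth status` exits 0; False if gh is missing or not logged in."""
    try:
        result = subprocess.run(["gh", "auth", "status"], capture_output=True, check=False)
    except FileNotFoundError:
        return False
    return result.returncode == 0
```

If `gh_is_authenticated()` returns False, the local setup can still finish, with board creation recorded as a blocker.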
Recommended fields:
- Component: root, code, paper, slides, reviewer, rebuttal, artifact
- Workstream: method, experiment, writing, review, release, infra
- Status: Backlog, Ready, In Progress, Blocked, Review, Done, Parked
- Priority: P0, P1, P2, P3
- Target: venue, milestone, deadline, arXiv, camera-ready
- Claim ID, Evidence ID, Worktree, Blocker

Recommended views: Roadmap, Board, Experiments, Paper, Risks, and Worktrees.
Use research-project-memory templates or equivalent files to create:
- memory/project.yaml
- memory/component-index.yaml
- memory/current-status.md
- memory/decision-log.md
- memory/claim-board.md
- memory/evidence-board.md
- memory/provenance-board.md
- memory/risk-board.md
- memory/action-board.md
- memory/handoff-board.md
- memory/phase-dashboard.md
- memory/source-visibility-board.md

The component index should record:
components:
  code:
    path: code
    worktree_root: code-worktrees
    worktree_index_path: code/.agent/worktree-index.md
    shared_uv_environment: .uv-envs/code
    owns:
      - algorithm implementation
      - experiment execution
      - code-side result reports
      - server execution state
  paper:
    path: paper
    worktree_root: paper-worktrees
    worktree_index_path: paper/.agent/worktree-index.md
    default_visibility_tier: author-visible
    source_visibility_board: memory/source-visibility-board.md
    owns:
      - paper claims and narrative
      - figures and tables selected for submission
      - venue/arXiv/camera-ready versions
      - source visibility and cleanup policy
  slides:
    path: slides
    status: optional
  reviewer:
    path: reviewer
  rebuttal:
    path: rebuttal
  artifact:
    path: artifact
Root memory should store pointers to code-side evidence, not duplicate detailed run logs.
Record default toolchain gates in memory/project.yaml:
toolchain_gates:
  policy: check-before-mutate
  code:
    environment_check: "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv sync"
    format_check: "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run ruff format --check src tests experiments scripts"
    lint_check: "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run ruff check src tests experiments scripts"
    type_check: "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run mypy src"
    test_check: "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run pytest tests -v"
    local_gate_runner: "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run pre-commit run --all-files"
    mutate_only_when_requested:
      - "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run ruff format src tests experiments scripts"
      - "UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code uv run ruff check --fix src tests experiments scripts"
  optional_hygiene:
    secret_scan: "gitleaks dir --no-banner --redact ."
    shell_lint: "shellcheck jobs/*.sh scripts/*.sh"
    shell_format_check: "shfmt -d jobs scripts"
    notebook_output_check: "nbstripout --dry-run notebooks/*.ipynb"
    github_actions_lint: "actionlint .github/workflows/*.yml"
    toml_format_check: "taplo fmt --check pyproject.toml"
    yaml_lint: "yamllint ."
    link_check: "lychee --no-progress README.md docs/**/*.md"
  paper:
    source_format_check: "tex-fmt --check --nowrap --recursive ."
    submission_check: "bash <submit-paper-skill-dir>/scripts/check.sh \"$PAPER_DIR\""
    compile_truth: "Overleaf/GitHub or local LaTeX compile log"
    mutate_only_when_requested:
      - "tex-fmt --nowrap --recursive ."
  coordination:
    git_status_check: "git status --short --branch"
    whitespace_check: "git diff --check"
    github_auth_check: "gh auth status"
    github_pr_check: "gh pr checks"
Preserve existing component-specific tools when connecting an existing repo. For example, if a code repo already uses black, isort, pyright, pre-commit, gitleaks, shellcheck, shfmt, actionlint, nbstripout, or CI-specific commands, record those actual commands instead of forcing the default scaffold policy. For non-ML repos, omit default paths such as experiments or scripts when they do not exist.
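The check-before-mutate policy can be illustrated with a short Python sketch. It assumes the default scaffold commands from the example above; the function name `gate_invocation` and the gate keys `format` and `lint_fix` are hypothetical. The point is that check-only gates are always available, while mutating commands need an explicit opt-in, and every command receives an absolute UV_PROJECT_ENVIRONMENT.

```python
import os
from pathlib import Path

SOURCE_PATHS = ["src", "tests", "experiments", "scripts"]

# Check-only gates, copied from the toolchain_gates example above.
CHECK_GATES = {
    "environment_check": ["uv", "sync"],
    "format_check": ["uv", "run", "ruff", "format", "--check", *SOURCE_PATHS],
    "lint_check": ["uv", "run", "ruff", "check", *SOURCE_PATHS],
    "type_check": ["uv", "run", "mypy", "src"],
    "test_check": ["uv", "run", "pytest", "tests", "-v"],
}

# Mutating commands; the key names here are illustrative only.
MUTATING_GATES = {
    "format": ["uv", "run", "ruff", "format", *SOURCE_PATHS],
    "lint_fix": ["uv", "run", "ruff", "check", "--fix", *SOURCE_PATHS],
}


def gate_invocation(
    project_root: Path, gate: str, allow_mutation: bool = False
) -> tuple[list[str], dict[str, str]]:
    """Return (argv, env) for a gate, refusing mutating gates unless explicitly allowed."""
    if gate in CHECK_GATES:
        argv = CHECK_GATES[gate]
    elif gate in MUTATING_GATES:
        if not allow_mutation:
            raise PermissionError(f"gate '{gate}' rewrites files; request it explicitly")
        argv = MUTATING_GATES[gate]
    else:
        raise KeyError(f"unknown gate: {gate}")
    env = dict(os.environ)
    # The shared environment path must be absolute so worktrees resolve the same env.
    env["UV_PROJECT_ENVIRONMENT"] = str(project_root.resolve() / ".uv-envs" / "code")
    return list(argv), env
```

The returned pair can be passed to `subprocess.run(argv, env=env, cwd=<active worktree>)`. For a connected repo with its own tools, the command tables would be replaced by the commands that repo actually documents.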
If a GitHub Project board exists, record it in memory/project.yaml:
github_project:
  enabled: true
  owner: <github-user-or-org>
  title: <ProjectName>
  number: <project-number>
  url: <project-url>
  sync_policy: issue-pr-links
  scope_required: project
Use sync_policy: none when the board exists but agents should not manage it. Do not mirror private research rationale or hidden paper-review risks into GitHub fields unless the user explicitly wants that material visible there.
Write both root agent entrypoints:
- <ProjectName>/AGENTS.md for Codex and universal agent guidance
- <ProjectName>/CLAUDE.md for Claude Code compatibility

They should stay semantically aligned. Prefer either mirrored content or a short CLAUDE.md that tells Claude Code to follow AGENTS.md as the canonical project-control-root policy.
The root guidance must state:
- Work from <ProjectName>/ unless a task is explicitly component-local
- Use git -C code ..., git -C paper ..., and git -C slides ... for component-repo inspection and repo-local commands
- Use project-push <repo> <remote> <branch> instead of raw git push, git -C <repo> push, cd <repo> && git push, or shell-wrapped push variants
- Code worktrees live under code-worktrees/ by default
- Paper worktrees live under paper-worktrees/ by default for venue, arXiv, and camera-ready versions
- Source visibility tiers are agent-private, author-visible, anonymous-submission, public-preprint, camera-ready-public, and publisher-artifact
- If paper/main syncs to Overleaf through GitHub, it is author-visible; do not put .agent/, AGENTS.md, CLAUDE.md, raw CSVs, internal result docs, plotting scripts, reviewer strategy, or private paths into that visible source
- If tex-fmt is installed, paper formatting gates use tex-fmt --check --nowrap --recursive .; run tex-fmt --nowrap --recursive . only when formatting is requested and review the diff before push/submission
- Code gates use uv sync, uv run ruff format --check, uv run ruff check, uv run mypy src, uv run pytest tests -v, and uv run pre-commit run --all-files with UV_PROJECT_ENVIRONMENT pointing at the shared project-code env unless the code repo documents an existing alternative
- Optional hygiene gates use gitleaks, shellcheck, shfmt, actionlint, nbstripout, taplo, yamllint, and lychee when the relevant files and tools exist
- ruff format, ruff check --fix, shfmt -w, nbstripout without --dry-run, and tex-fmt format mode require an explicit request or documented project policy, followed by diff review
- docs/ is for project-level overviews, staged method designs, cross-component experiment plans, audits, timelines, and handoffs
- code/docs/ is for code-side result summaries, run records, implementation reports, and server execution notes
- Code-side evidence lives in code/docs/results/, code/docs/reports/, code/docs/runs/, or the same paths inside a code worktree
- Worktree-local status lives in .agent/worktree-status.md and durable decisions live in root memory/
- Worktree indexes are code/.agent/worktree-index.md, paper/.agent/worktree-index.md, and root memory/component-index.yaml
- memory/ remains the durable research memory
- Use update-docs during code changes, not only at release time
- Use add-git-tag for stable code, paper, artifact, or root milestones

If creating a new paper repo, use init-latex-project at:
<ProjectName>/paper/
If connecting an existing paper repo, clone or record its path and remote. Then inspect whether it is linked to Overleaf or visible to coauthors before creating or tracking agent-private files.
Ensure the paper workspace has both paper/AGENTS.md and paper/CLAUDE.md when agents will edit it, but treat these as agent-private guidance by default. If the active branch is author-visible, anonymous-submission, public-preprint, or camera-ready-public, keep these files untracked, ignored, or in an agent-private worktree rather than pushing them to the visible source. Keep them aligned with the same paper-local compile, venue, source hygiene, figure, table, and memory rules.
When tex-fmt is available, record it in the paper-local guidance as the source-format checker. Formatting status belongs with paper worktree/source-visibility state; it is not a substitute for Overleaf compile evidence.
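One way to enforce the visibility rule before a push is a small pre-check over the tracked file list. The sketch below is a heuristic under stated assumptions: the function name is hypothetical, the tier set follows the visible tiers named above, and it only flags .agent/ paths, AGENTS.md, CLAUDE.md, and CSV files. Internal result docs, plotting scripts, and reviewer strategy need project-specific patterns that are not guessed here.

```python
from pathlib import PurePosixPath

# Tiers treated as visible, following the paper guidance above.
VISIBLE_TIERS = {
    "author-visible",
    "anonymous-submission",
    "public-preprint",
    "camera-ready-public",
}
PRIVATE_NAMES = {"AGENTS.md", "CLAUDE.md"}


def private_files_in_visible_source(tier: str, tracked_paths: list[str]) -> list[str]:
    """Return tracked paths that look agent-private when the branch tier is visible."""
    if tier not in VISIBLE_TIERS:
        return []
    flagged = []
    for raw in tracked_paths:
        path = PurePosixPath(raw)
        in_agent_dir = ".agent" in path.parts
        is_guidance = path.name in PRIVATE_NAMES
        is_csv = path.suffix.lower() == ".csv"
        if in_agent_dir or is_guidance or is_csv:
            flagged.append(raw)
    return flagged
```

The input list could come from `git -C paper ls-files`; a non-empty result means the branch should be cleaned up, or the files moved to an agent-private worktree, before anything is pushed.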
If creating a new code repo, use init-python-project at:
<ProjectName>/code/
For ML projects, ensure the code repo has:
code/AGENTS.md
code/CLAUDE.md
code/docs/results/
code/docs/reports/
code/docs/runs/
When initializing or connecting a code repo, record its toolchain gates in code/AGENTS.md, code/CLAUDE.md, and memory/project.yaml. In a project-control-root layout, default new-code gates should run with UV_PROJECT_ENVIRONMENT=<absolute-project-root>/.uv-envs/code so code/ and code-worktrees/* share the same dependency environment. Bare uv sync is fine only for standalone code repos or repos that explicitly choose per-worktree environments. If an existing repo already has CI, pre-commit, black, isort, pyright, gitleaks, shellcheck, shfmt, actionlint, nbstripout, or custom commands, preserve and document those commands.
If connecting an existing code repo, do not force a full scaffold. Add missing high-value memory/docs paths only after reporting gaps.
When the code repo is cloned from an upstream project, do not assume origin is writable. Record whether origin is upstream, a fork, or a newly created project repo, and keep it separate from the root control-plane repo remote.
If requested, create or connect:
<ProjectName>/slides/
Slides may be a separate git repo. Prefer using the external progress-slides template as the slides component instead of inventing a local scaffold:
git clone https://github.com/a-green-hand-jack/progress-slides.git <ProjectName>/slides
After cloning, inspect slides/README.md, slides/package.json, and the existing slide source files before editing. Use research-slide-deck-builder for deck structure, template-compatible source writing, preview/build commands, and slides/.agent/ story, audience, source-evidence, and stale-evidence notes.
Treat slides/ as a multi-deck workspace. Root and slides-local agent guidance should say that stable decks live under slides/decks/<YYYY-MM-DD>-<audience-or-purpose>-<slug>.md; slides/slides.md is only an active/default deck, staging file, or template sample. Record deck history in slides/.agent/deck-index.md, keep optional per-deck notes in slides/.agent/decks/<deck-id>.md, and run Slidev against the target deck file, for example npx slidev decks/2026-05-02-advisor-plan.md.
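The deck naming convention can be sketched as a helper that builds the stable deck path. The function name `deck_path` is hypothetical, and the normalization (lowercasing, replacing runs of other characters with hyphens) is an assumption rather than something the template prescribes.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def deck_path(day: date, purpose: str, slug: str) -> PurePosixPath:
    """Build slides/decks/<YYYY-MM-DD>-<audience-or-purpose>-<slug>.md."""

    def clean(text: str) -> str:
        # Lowercase, then collapse anything that is not a letter or digit into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    filename = f"{day.isoformat()}-{clean(purpose)}-{clean(slug)}.md"
    return PurePosixPath("slides", "decks", filename)
```

For example, `deck_path(date(2026, 5, 2), "advisor", "plan")` gives slides/decks/2026-05-02-advisor-plan.md, matching the Slidev invocation shown above when run from inside slides/.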
Default code policy:
main code repo: <ProjectName>/code/
code worktree root: <ProjectName>/code-worktrees/
worktree path: <ProjectName>/code-worktrees/<branch-type>-<branch-name>/
Default paper policy:
main paper repo: <ProjectName>/paper/
paper worktree root: <ProjectName>/paper-worktrees/
worktree path: <ProjectName>/paper-worktrees/<version-type>-<venue-or-name>/
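Both policies follow the same pattern, which a short sketch can make concrete. The helper name `worktree_path` is hypothetical, and flattening slashes in branch names is an assumption made so each worktree stays one directory below its root.

```python
from pathlib import Path

# Sibling worktree roots from the default code and paper policies above.
WORKTREE_ROOTS = {"code": "code-worktrees", "paper": "paper-worktrees"}


def worktree_path(project_root: Path, component: str, kind: str, name: str) -> Path:
    """Return <project_root>/<component>-worktrees/<kind>-<name>."""
    if component not in WORKTREE_ROOTS:
        raise ValueError(f"no worktree root defined for component: {component}")
    # Assumption: branch names like "exp/ablation" are flattened to "exp-ablation".
    flat_name = name.replace("/", "-")
    return project_root / WORKTREE_ROOTS[component] / f"{kind}-{flat_name}"
```

The resulting path is a sibling of code/ or paper/, never a child, so it can be handed to `git -C code worktree add <path> <branch>` without nesting a worktree inside the component repo.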
Use paper worktrees for venue retargeting, arXiv releases, rebuttal paper edits, and camera-ready branches.

Paper version hygiene:
- Agent-private sources may hold .agent/, AGENTS.md, CLAUDE.md, writing memory, provenance, internal result docs, CSVs, and plotting scripts; they should not sync to visible remotes.
- Visible sources should exclude .agent/, AGENTS.md, CLAUDE.md, raw CSVs, internal result docs, plotting scripts, reviewer strategy, private paths, and agent-only notes.

Record this in:
- memory/project.yaml
- memory/component-index.yaml
- memory/source-visibility-board.md
- <ProjectName>/AGENTS.md
- <ProjectName>/CLAUDE.md
- code/docs/ops/current-status.md when server execution is involved

If the execution server only has the code repo, record the server-specific worktree root in code/infra/remote-projects.yaml or code/docs/ops/current-status.md. Do not assume the server has paper/ or root project memory.
Create <ProjectName>/PROJECT.md from templates/PROJECT.md. Fill in <ProjectName> and the one-line description, and set the GitHub Project Board status to none, linked, or planned. All other sections are stable defaults; edit them only when the project's component layout, visibility policy, or memory structure differs from the defaults.
Report:
Project initialized: <ProjectName>
Control root: <path>
Components:
  paper: <created|connected|skipped>
  code: <created|connected|skipped>
  slides: <created|connected|skipped>
  reviewer/rebuttal/artifact state: <created|deferred>
GitHub Project: <created|linked|deferred|not requested>
Toolchain gates: <configured|deferred|inherited from existing repos>
Code worktree root:
  <ProjectName>/code-worktrees/
Paper worktree root:
  <ProjectName>/paper-worktrees/
Next skills:
  research-project-memory -> inspect or update project state
  new-workspace -> create a code branch/worktree or paper version worktree
  remote-project-control -> configure SSH/HPC execution for code
  experiment-design-planner -> plan first experiment matrix
  research-slide-deck-builder -> create progress/advisor/lab slides with progress-slides
Before finishing:
- root AGENTS.md and CLAUDE.md are written and aligned
- component AGENTS.md and CLAUDE.md are present where agents will edit
- paper/, code/, and slides/ Git boundaries are clear
- gh auth status or equivalent hosting CLI auth has been checked before attempting repo creation
- the GitHub Project board is recorded in memory/project.yaml and PROJECT.md, or explicitly deferred as an action/blocker
- toolchain gates are recorded in memory/project.yaml, or existing component-specific gates are documented as inherited
- the gh token has the project scope before gh project ... commands are attempted
- code-worktrees/ policy is recorded
- paper-worktrees/ policy is recorded
- source visibility is recorded when paper/main is author-visible through Overleaf/GitHub
- there is no top-level experiments/ directory unless the user explicitly requested it
- docs/ has a clear project-level role and is not confused with code/docs/
- code-side evidence paths are in place under code/docs/