skills/sidecar-task-runner/SKILL.md
Run artifact-driven sidecar agent tasks through one-shot Codex CLI sessions. Use when a main agent should delegate bounded scans, drafts, audits, pre-reviews, or mechanical repo tasks to a fast isolated sidecar model such as gpt-5.3-codex-spark while keeping final decisions with the main agent.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add a-green-hand-jack/ml-research-skills sidecar-task-runner
Use this skill to run bounded helper tasks from a main agent without sharing the main conversation as the task state. The sidecar produces artifacts; the main agent verifies, integrates, rejects, or escalates them.
<installed-skill-dir>/
├── SKILL.md
├── templates/
│   ├── precommit-classifier.md
│   └── personalization-scanner.md
└── scripts/
    └── prepare_sidecar_task.py
- Use gpt-5.3-codex-spark for fast local scans, first drafts, pre-reviews, consistency checks, and other low/medium-risk tasks.
- Default the sandbox to read-only. Use workspace-write only when the sidecar must write its own artifacts or make explicitly bounded edits.
- Do not let the sidecar perform external or irreversible actions such as git tag, git push, release upload, job submission, or public issue creation.

Use a repo-local directory:
.agent/sidecars/<task-id>/
├── prompt.md
├── input-manifest.md
├── output.md
├── findings.md
├── decision.md
├── model.md
└── model.json
Meanings:
- prompt.md: exact prompt given to the sidecar.
- input-manifest.md: files, commands, diffs, or project state the sidecar may inspect.
- output.md: raw sidecar final response, usually written with codex exec -o.
- findings.md: distilled findings if the sidecar writes structured results.
- decision.md: main-agent decision after reading the sidecar output.
- model.md: human-readable command, model, sandbox, and token notes.
- model.json: structured run metadata for token and lifecycle audits.

Create the artifact directory with:
python3 <installed-skill-dir>/scripts/prepare_sidecar_task.py \
  --repo . \
  --task-id <task-id> \
  --title "<short task title>" \
  --phase tooling \
  --task-type audit \
  --prompt-file <prompt-file>
If no prompt file exists, pass --prompt "<task instructions>".
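Before launching a run, the main agent may want to confirm which files of the artifact layout already exist. The following is a minimal sketch under stated assumptions: `missing_artifacts` is a hypothetical helper (it is not part of `prepare_sidecar_task.py`), and the file names are copied from the layout listed above. Which of those files the preparation script creates up front is not stated here, so the sketch only reports what is absent.

```python
from pathlib import Path

# File names copied from the artifact layout described above.
EXPECTED_ARTIFACTS = (
    "prompt.md",
    "input-manifest.md",
    "output.md",
    "findings.md",
    "decision.md",
    "model.md",
    "model.json",
)


def missing_artifacts(repo: Path, task_id: str) -> list:
    """Return the expected artifact names not yet present for a task.

    Hypothetical helper for illustration; not part of the skill's scripts.
    """
    task_dir = repo / ".agent" / "sidecars" / task_id
    return [name for name in EXPECTED_ARTIFACTS if not (task_dir / name).is_file()]


if __name__ == "__main__":
    # "example-task" is a placeholder task id.
    print(missing_artifacts(Path("."), "example-task"))
```

A file such as output.md or decision.md is expected to be missing until the sidecar has run and the main agent has reviewed the result, so an absent file is not necessarily an error.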
For a fast precommit classification sidecar, use the bundled preset:
python3 <installed-skill-dir>/scripts/prepare_sidecar_task.py \
  --repo . \
  --title "Precommit classifier" \
  --phase maintenance \
  --task-type audit \
  --preset precommit-classifier \
  --input "git status --short" \
  --input "git diff"
For a personalization scan that extracts reusable preferences from sanitized artifacts without asking the user:
python3 <installed-skill-dir>/scripts/prepare_sidecar_task.py \
  --repo . \
  --title "Personalization scan" \
  --phase maintenance \
  --task-type audit \
  --preset personalization-scanner \
  --input "memory/current-status.md" \
  --input ".agent/sidecars/*/decision.md"
For a read-only sidecar:
codex exec --ephemeral \
  -m gpt-5.3-codex-spark \
  -C . \
  -s read-only \
  -o .agent/sidecars/<task-id>/output.md \
  "$(cat .agent/sidecars/<task-id>/prompt.md)"
For a sidecar that must write its own findings.md or make tightly scoped artifact edits:
codex exec --ephemeral \
  -m gpt-5.3-codex-spark \
  -C . \
  -s workspace-write \
  -o .agent/sidecars/<task-id>/output.md \
  "$(cat .agent/sidecars/<task-id>/prompt.md)"
Avoid codex resume, codex fork, claude --continue, or claude --resume for first-pass sidecar work. Those are continuation mechanisms, not clean sidecar boundaries.
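The two codex exec invocations above differ only in the sandbox mode. As an illustration, a main agent driving runs from Python could assemble the same argument list as in the sketch below. `build_codex_command` and `ALLOWED_SANDBOXES` are hypothetical names, and the sketch uses only the flags that appear in the commands above. It builds the command without executing it.

```python
import shlex

# The two sandbox modes used in the commands above.
ALLOWED_SANDBOXES = ("read-only", "workspace-write")


def build_codex_command(task_id, prompt, sandbox="read-only",
                        model="gpt-5.3-codex-spark"):
    """Build the argv for a one-shot sidecar run (hypothetical helper)."""
    if sandbox not in ALLOWED_SANDBOXES:
        raise ValueError(f"unsupported sandbox mode: {sandbox!r}")
    # Path is relative to the repo root, matching `-C .` in the commands above.
    output_path = f".agent/sidecars/{task_id}/output.md"
    return [
        "codex", "exec", "--ephemeral",
        "-m", model,
        "-C", ".",
        "-s", sandbox,
        "-o", output_path,
        prompt,
    ]


if __name__ == "__main__":
    # Placeholder task id and prompt; prints a shell-quoted command line.
    argv = build_codex_command("example-task", "List TODO comments under src/.")
    print(shlex.join(argv))
```

Defaulting `sandbox` to read-only mirrors the guidance that workspace-write is the exception. The resulting list could be passed to `subprocess.run` from the repo root.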
Every sidecar prompt should state:
Keep prompts narrow. If the task needs several independent analyses, create separate sidecar runs rather than one broad prompt.
Good Spark sidecar tasks:
Use the main agent or a stronger fresh reviewer for:
Use the precommit-classifier preset when a commit/push closeout would otherwise be slowed down by deciding which gates to run. The sidecar inspects only read-only Git state and affected public repo files, then recommends:
The main agent must still stage files, commit, push, reinstall, and report the outcome. A sidecar must not perform external or irreversible actions.
Use the personalization-scanner preset when a main agent wants low-cost automatic memory writeback candidates from trajectories or repo artifacts. The sidecar may inspect only explicitly listed inputs and returns candidate preferences with scope, confidence, evidence, suggested target, and privacy notes.
The main agent must still decide what to write, keep private facts out of public repo memory, and perform any project-memory or skill-repo edits. The sidecar must not ask the user, quote raw logs, or promote public skill rules by itself. After accepting sidecar output, the main agent writes accepted findings to the relevant memory board (claim/risk/action/decision) and the gate decision to decision.md. Sidecar token metadata stays in .agent/sidecars/<task-id>/model.json, not in project memory.
Because --ephemeral may skip the usual persisted session logs, record run metadata in model.json. If Codex CLI reports token usage in the terminal, copy the numeric fields into model.json or model.md before running token-usage-auditor.
If exact usage is unavailable, leave usage values as null and record:
Token usage: unavailable from ephemeral run output.
Never store raw prompts from private chat logs just to recover token counts.
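The token-recording rule above can be sketched as a small update to model.json. This is an illustration under stated assumptions, not the skill's actual schema. The `usage` and `usage_note` keys, the three token field names, and the `record_usage` helper are all hypothetical. The real fields expected by token-usage-auditor may differ.

```python
import json
from pathlib import Path

# Hypothetical field names; the real model.json schema is not shown here.
USAGE_FIELDS = ("input_tokens", "output_tokens", "total_tokens")
UNAVAILABLE_NOTE = "Token usage: unavailable from ephemeral run output."


def record_usage(task_dir: Path, usage=None) -> dict:
    """Merge token usage into model.json, keeping unknown values as null."""
    path = task_dir / "model.json"
    # Preserve any metadata already written for this run.
    metadata = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    usage = usage or {}
    metadata["usage"] = {name: usage.get(name) for name in USAGE_FIELDS}
    if all(value is None for value in metadata["usage"].values()):
        # No numbers were reported, so record the note instead of guessing.
        metadata["usage_note"] = UNAVAILABLE_NOTE
    else:
        metadata.pop("usage_note", None)
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return metadata
```

Calling `record_usage(task_dir)` with no numbers writes null for every usage field plus the note, while `record_usage(task_dir, {"input_tokens": 1200})` stores only the values the terminal actually reported and leaves the rest null.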