.claude/skills/sandstorm/SKILL.md
Use this skill any time the user wants to work with Sandstorm agent stacks. This includes: creating, spinning up, or starting stacks; dispatching tasks or work to an inner Claude agent in a stack; checking stack status, task progress, or whether a task is done; viewing diffs, changes, or task output from a stack; pushing or publishing code changes from a stack to git; tearing down, cleaning up, or removing stacks; viewing container logs. Trigger whenever the user mentions 'stack' with a number or ID (like 'stack 1', 'stack 2', 'stack 3'), says 'sandstorm', refers to an 'isolated environment' for development, or asks to send work to an agent. Also trigger for multi-stack operations and any reference to agent workspaces. Do NOT trigger for general Docker, docker-compose authoring, CI/CD pipelines, or direct code editing unrelated to stacks.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add onomojo/sandstorm-desktop sandstorm
Sandstorm manages isolated Docker agent stacks. Each stack is a full clone of the project repo with its own containers, ports, and an inner Claude agent.
IMPORTANT: Always use the MCP tools (mcp__sandstorm-tools__*) to manage stacks. Never use CLI commands (sandstorm up, sandstorm task, etc.) directly — the MCP tools go through the Electron app's control plane, which tracks stacks in the database and keeps the UI in sync.
| Tool | Description |
|------|-------------|
| mcp__sandstorm-tools__create_stack | Create and start a new stack |
| mcp__sandstorm-tools__list_stacks | List all stacks with status and services |
| mcp__sandstorm-tools__dispatch_task | Send a task to inner Claude in a stack |
| mcp__sandstorm-tools__get_task_status | Check task state (running, completed, failed, idle) |
| mcp__sandstorm-tools__get_task_output | View latest task output |
| mcp__sandstorm-tools__get_diff | View uncommitted changes in a stack |
| mcp__sandstorm-tools__get_logs | View container logs for a stack |
| mcp__sandstorm-tools__push_stack | Commit and push changes from a stack |
| mcp__sandstorm-tools__teardown_stack | Tear down a stack (stops containers, cleans up) |
Never set branch to main. Omit the branch parameter: it defaults to the stack name, which becomes a feature branch. Setting branch: "main" pushes directly to main, bypassing code review. This has caused loss of work.
After pushing, use .sandstorm/scripts/create-pr.sh to open a pull request from the feature branch.
Before calling create_stack or dispatch_task for a ticket, fetch the ticket:
.sandstorm/scripts/fetch-ticket.sh <N>
If the script doesn't exist, warn the user: "No fetch-ticket script configured. Run sandstorm init to set up a ticket provider."
Check the ticket against .sandstorm/spec-quality-gate.md.
After the user approves, call create_stack with gateApproved: true.
If you skip this step, the backend will reject the call with GATE_CHECK_REQUIRED and you will need to run the gate check anyway. Always run it proactively.
If the user explicitly says to skip the gate (e.g., "just start it", "skip the gate"), use forceBypass: true instead.
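
The gate rules above can be summarized as a small decision function. This is a minimal Python sketch, not Sandstorm code: gate_flags and its argument names are hypothetical, and only the gateApproved and forceBypass parameters and the GATE_CHECK_REQUIRED error come from the text.

```python
def gate_flags(spec_check_done: bool, user_approved: bool, user_said_skip: bool) -> dict:
    """Return the gate-related arguments for a create_stack or dispatch_task call.

    Hypothetical helper that only encodes the rules described above.
    """
    if user_said_skip:
        # Explicit request such as "skip the gate": bypass instead of approving.
        return {"forceBypass": True}
    if spec_check_done and user_approved:
        return {"gateApproved": True}
    # Sending neither flag for a ticket would be rejected with GATE_CHECK_REQUIRED,
    # so stop here and run the gate check first.
    raise ValueError("run the spec quality gate and get user approval first")
```

The function raises instead of guessing, which mirrors the advice to run the gate proactively and not wait for the backend to reject the call.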
1. create_stack: name, projectDir, ticket, gateApproved: true (do NOT pass branch; it defaults to the stack name)
2. dispatch_task: pass the verbatim ticket body as the task prompt (see Verbatim Ticket Dispatch below)
3. get_task_status: check when done
4. get_diff: review changes
5. push_stack: commit and push to the feature branch
6. .sandstorm/scripts/create-pr.sh --title "Fix auth token expiry" --body "Fixes #28"
Do NOT tear down stacks — only the user decides when to tear down.
When the user says "start issue #N", "spin up a stack for issue #N", or any variation:
.sandstorm/scripts/fetch-ticket.sh <N>
Pass the fetched ticket body verbatim as the task parameter to create_stack, or as the prompt parameter to dispatch_task.
If the user gives verbal instructions that differ from the ticket, update the ticket first using the project's update-ticket script:
.sandstorm/scripts/update-ticket.sh <N> "<updated body>"
Then fetch and dispatch the updated body. The ticket is the single source of truth.
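
The ordering described above (update first if needed, then fetch, then dispatch verbatim) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: ticket_dispatch_steps is a hypothetical helper, and only the two script paths come from the text.

```python
def ticket_dispatch_steps(ticket: str, instructions_differ: bool) -> list[str]:
    """List the steps to run, in order, before dispatching work for a ticket.

    Hypothetical helper: it returns descriptions, it does not run anything.
    """
    steps = []
    if instructions_differ:
        # The ticket is the single source of truth, so update it first.
        steps.append(f'.sandstorm/scripts/update-ticket.sh {ticket} "<updated body>"')
    steps.append(f".sandstorm/scripts/fetch-ticket.sh {ticket}")
    steps.append("dispatch the fetched body verbatim")
    return steps


for step in ticket_dispatch_steps("28", instructions_differ=True):
    print(step)
```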
Why this matters:
# 1. Fetch the ticket body verbatim
TICKET_BODY=$(.sandstorm/scripts/fetch-ticket.sh 28)
# 2. Create stack with the unmodified ticket body as the task
mcp__sandstorm-tools__create_stack({
  name: "issue-28-fix-auth-bug",
  projectDir: "/path/to/project",
  ticket: "28",
  gateApproved: true,
  task: <TICKET_BODY: the full, unmodified ticket text>
})
NEVER pass branch: "main". Omit the branch parameter entirely.
The task parameter dispatches work immediately after creation — no need to call dispatch_task separately.
# 3. Check task status and output
mcp__sandstorm-tools__get_task_status({ stackId: "issue-28-fix-auth-bug" })
mcp__sandstorm-tools__get_task_output({ stackId: "issue-28-fix-auth-bug" })
# 4. Review the changes
mcp__sandstorm-tools__get_diff({ stackId: "issue-28-fix-auth-bug" })
# 5. Dispatch a follow-up task if needed, then review again
mcp__sandstorm-tools__dispatch_task({ stackId: "issue-28-fix-auth-bug", prompt: "Looks good but add limit/offset params" })
mcp__sandstorm-tools__get_diff({ stackId: "issue-28-fix-auth-bug" })
# 6. Commit and push to the feature branch
mcp__sandstorm-tools__push_stack({ stackId: "issue-28-fix-auth-bug", message: "Fix auth token expiry with user-facing error" })
# 7. Open a pull request
.sandstorm/scripts/create-pr.sh --title "Fix auth token expiry" --body "Fixes #28"
create_stack

| Parameter | Required | Description |
|-----------|----------|-------------|
| name | Yes | Stack name — becomes the stack ID (e.g., fix-auth-bug) |
| projectDir | Yes | Absolute path to the project directory |
| ticket | No | Associated ticket ID (e.g., PROJ-123) |
| branch | No | Git branch name (defaults to stack name). NEVER set to main. |
| description | No | Short description of the work |
| runtime | No | docker (default) or podman |
| task | No | Task to dispatch immediately after creation. When working on a ticket, this MUST be the verbatim ticket body — never a summary or rewrite. |
| gateApproved | No | Set to true after running /spec-check and getting user approval. Required when ticket is set or task references a ticket. |
| forceBypass | No | Set to true to skip the spec quality gate. Only when the user explicitly requests it. |
The stack builds in the background. Use get_task_status or list_stacks to check when it's ready.
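
A pre-flight check for the create_stack parameters might look like the sketch below. It is a hypothetical Python helper written for illustration: validate_create_stack is not part of Sandstorm, the real validation happens in the backend, and the absolute-path test assumes POSIX-style paths.

```python
import posixpath


def validate_create_stack(args: dict) -> dict:
    """Check create_stack arguments against the parameter table above.

    Hypothetical helper; it covers only the rules stated in the table.
    """
    for required in ("name", "projectDir"):
        if not args.get(required):
            raise ValueError(f"missing required parameter: {required}")
    # Assumes POSIX-style paths; projectDir must be absolute.
    if not posixpath.isabs(args["projectDir"]):
        raise ValueError("projectDir must be an absolute path")
    # Omitting branch lets it default to the stack name (a feature branch).
    if args.get("branch") == "main":
        raise ValueError("never set branch to main; omit the branch parameter")
    return args


args = validate_create_stack({
    "name": "fix-auth-bug",
    "projectDir": "/path/to/project",
})
```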
dispatch_task

| Parameter | Required | Description |
|-----------|----------|-------------|
| stackId | Yes | Target stack ID |
| prompt | Yes | Task description for inner Claude. When working on a ticket, this MUST be the verbatim ticket body. |
| gateApproved | No | Set to true after running /spec-check and getting user approval. Required when the stack has a ticket or the prompt references a ticket. |
| forceBypass | No | Set to true to skip the spec quality gate. Only when the user explicitly requests it. |
Write goal-oriented prompts. Describe WHAT to achieve, not HOW. The inner Claude has the full repo context.
When dispatching for a ticket: Pass the ticket body verbatim. Do not summarize, rewrite, or add implementation details.
For follow-up tasks (not initial ticket dispatch): goal-oriented prompts are fine.
Good follow-up: "Looks good but add limit/offset params to the list endpoint"
Bad initial dispatch: Rewriting "Fix auth bug" into "Open src/auth.ts, find line 42, change the catch block..."
get_task_status
Returns the current task state for a stack: running, completed, failed, or idle.
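
One way to act on the four states is a simple lookup, sketched below in Python. The state names come from the text; the mapping itself is an assumption made for illustration (for example, looking at get_task_output after a failure), and next_step is a hypothetical helper.

```python
# Suggested follow-up per task state. The states are from the text above;
# the follow-ups are illustrative assumptions, not Sandstorm behavior.
NEXT_STEP = {
    "running": "report that the task is still running; check again when the user asks",
    "completed": "get_diff",
    "failed": "get_task_output",
    "idle": "dispatch_task",
}


def next_step(state: str) -> str:
    """Return the suggested follow-up for a single, one-shot status check."""
    if state not in NEXT_STEP:
        raise ValueError(f"unknown task state: {state}")
    return NEXT_STEP[state]
```

The "running" entry deliberately does not loop or sleep: the status check is one-shot, and the next check happens only when the user asks again.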
get_task_output

| Parameter | Required | Description |
|-----------|----------|-------------|
| stackId | Yes | Target stack ID |
| lines | No | Number of lines to return (default: 50) |
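
The lines parameter can be pictured as a tail over the task output. The Python sketch below assumes that "latest" means the last lines of the output; that reading, and the latest_lines helper itself, are assumptions and not taken from the Sandstorm source.

```python
def latest_lines(output: str, lines: int = 50) -> str:
    """Return the last `lines` lines of `output` (assumed meaning of the parameter)."""
    if lines <= 0:
        raise ValueError("lines must be a positive integer")
    return "\n".join(output.splitlines()[-lines:])


sample = "\n".join(f"line {i}" for i in range(1, 101))
tail = latest_lines(sample)  # uses the default of 50
```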
get_diff
Returns the full git diff of uncommitted changes in the stack's workspace. Always review before pushing.
get_logs

| Parameter | Required | Description |
|-----------|----------|-------------|
| stackId | Yes | Target stack ID |
| service | No | Service name (e.g., claude, app). Omit for all services. |
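
Because service is optional, the argument object is easiest to build by leaving the key out when all services are wanted. A minimal Python sketch, with get_logs_args as a hypothetical helper name:

```python
from typing import Optional


def get_logs_args(stack_id: str, service: Optional[str] = None) -> dict:
    """Build get_logs arguments; omit `service` to request logs for all services."""
    args = {"stackId": stack_id}
    if service is not None:
        args["service"] = service
    return args


all_services = get_logs_args("issue-28-fix-auth-bug")
claude_only = get_logs_args("issue-28-fix-auth-bug", service="claude")
```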
push_stack

| Parameter | Required | Description |
|-----------|----------|-------------|
| stackId | Yes | Target stack ID |
| message | No | Commit message |
teardown_stack
Stops containers, removes workspace, archives stack to history. This is irreversible. Always check for unpushed changes first with get_diff.
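
The teardown precautions can be expressed as a guard that runs before any teardown_stack call. This Python sketch is illustrative only: may_teardown and its arguments are hypothetical, and it assumes that an empty get_diff result means there is no uncommitted work to lose.

```python
def may_teardown(user_asked_explicitly: bool, diff: str, user_accepts_loss: bool = False) -> bool:
    """Decide whether calling teardown_stack is acceptable.

    `diff` is the text returned by get_diff for the stack.
    """
    if not user_asked_explicitly:
        # Only the user decides when to tear down.
        return False
    has_uncommitted_changes = bool(diff.strip())
    # With pending changes, proceed only if the user has confirmed the loss.
    return (not has_uncommitted_changes) or user_accepts_loss
```

When the guard returns False because of pending changes, the safer path is to show the diff to the user and offer push_stack first.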
- Never use sandstorm CLI commands or docker commands.
- Use get_task_status as a one-shot check when the user asks. Never write sleep/poll loops.
- Always run get_diff before push_stack.
- Always run get_diff before teardown_stack to avoid losing work.
- Never use main as the branch. Omit the branch parameter so stacks always work on feature branches.
- After push_stack, use .sandstorm/scripts/create-pr.sh to open a pull request.
- Run /spec-check before dispatching work for a ticket. The backend enforces this: calls without gateApproved: true will be rejected with GATE_CHECK_REQUIRED.
Use this skill when the user reports a stack appears broken, stuck, looping, failed, or otherwise not working — and wants to understand WHY (not just restart it). Trigger phrases include: 'stack N doesn't seem to be working', 'stack N isn't working', 'stack N failed for some reason / why did stack N fail / what went wrong with stack N', 'stack N seems stuck / got stuck / stuck in an infinite loop / keeps looping / did N loops and failed', 'take a look at stack N, something went wrong / it's broken / something's clearly wrong', 'stack N hit NEEDS HUMAN INTERVENTION / keeps failing / errored out', 'figure out what's happening / going on with stack N', 'give me a summary of what happened with stack N / diagnose stack N'. The skill reads the stack's dual-loop artifacts (phase timings, review verdicts, execution summaries) from inside its container and returns one structured report — avoiding the 40+ Bash-exploration sub-turns the orchestrator would otherwise make. Make sure to use this skill for ANY request that involves diagnosing a stack's failure, loop behavior, or why it stopped working — even when the user phrases it gently (e.g., 'doesn't seem to be working', 'not sure what's going on', 'can you take a look') as long as there is a failure or malfunction signal present. Falling back to raw Bash exploration of the stack's internals costs 1M+ tokens. Do NOT trigger for: status-only 'is stack N done?' / 'what's the status of stack N' (that's check-and-resume-stack), diff/logs inspection on a working stack with no failure signal (stack-inspect), or creating a new stack.
Use this skill ONLY when the user has EXPLICITLY asked to tear down, destroy, remove, or dismantle a named Sandstorm stack. Trigger phrases include: 'tear down stack X', 'destroy stack X', 'remove stack X', 'dismantle stack X', 'clean up stack X and all its containers', 'I'm done with stack X, kill it'. This skill stops containers, removes the workspace, and archives the stack — it is IRREVERSIBLE and can lose unpushed work. Do NOT trigger on ambiguous phrases like 'clean up', 'reset', 'start over', 'remove the old one', 'stack is broken', or anything that might imply teardown without literal user words like tear down / destroy / delete. Do NOT trigger for: stopping containers (that's pause, not teardown), checking status, failure recovery, or as a precursor to creating a new stack. When in doubt, ASK the user before running.
Use this skill whenever the user wants to record/link/associate a pull request with an existing Sandstorm stack. Trigger phrases include: 'record PR #N for stack X', 'set PR for stack X to #N', 'link PR https://github.com/.../pull/N to stack X', 'save the PR info on stack X', 'stack X's PR is #N'. Use this after a PR has been opened externally (via gh CLI, the GitHub UI, or push_stack's downstream flow) and the user wants the Sandstorm registry to know about it — the stack status flips to pr_created and the URL/number are stored. Do NOT trigger for: creating the PR itself (that's a separate gh CLI / push flow), tearing down the stack, checking stack status, or unrelated PR operations like merging or closing.
Use this skill whenever the user wants to see DETAILED output, logs, or uncommitted changes from a specific Sandstorm stack. Trigger phrases include: 'show me the output of stack X', 'what did stack X log', 'show the task output for X', 'show container logs for stack X', 'what changed in stack X', 'show me the diff in stack X', 'what's happening inside stack X', 'dump stack X's output', 'get logs for stack X's claude container'. The skill covers three read-only probes — task output, container logs, and uncommitted diff — as subcommands. Do NOT trigger for: a quick status check (that's check-and-resume-stack), listing all stacks (that's list-stacks), or anything that modifies state. Prefer the narrower subcommand (output / logs / diff) over 'all' when the user is specific about what they want.