sandstorm-cli/skills/sandstorm-spec/SKILL.md
Use this skill whenever the user wants to run the Sandstorm spec quality gate on a ticket, check whether a ticket is ready for agent dispatch, or iterate on a ticket that failed the gate. Trigger phrases include: 'run spec check on ticket N', '/spec-check N', 'is ticket 123 ready', 'check the spec for 178', 'refine the spec for N with these answers', 'here are my answers: ...', 'add my answers and re-check ticket N', 'iterate on the gaps for 42'. This skill wraps the spec_check and spec_refine workflows with a single deterministic script so the orchestrator doesn't have to carry the evaluation logic in its own context. Prefer this skill over the generic sandstorm skill whenever the user's intent is spec-gate evaluation or refinement on a named ticket. Do NOT trigger for: dispatching tasks, creating stacks, reviewing code diffs, or any workflow that isn't specifically about running the spec quality gate.
npx skillsauth add onomojo/sandstorm-desktop sandstorm-specInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Wraps the Sandstorm spec quality gate (spec_check and spec_refine) behind a single script. Two subcommands.
Use check when the user is asking whether a ticket is ready, running a fresh spec check, or there's no prior question-and-answer context in this conversation.
Use refine when the user is providing answers to gaps that were surfaced by a previous check. Look for cues like "here are my answers," "the answers are," or a direct reply to a gap question you just surfaced.
For check:
bash "$SANDSTORM_SKILLS_DIR/sandstorm-spec/scripts/sandstorm-spec.sh" check <ticket-id>
For refine (user's answers piped verbatim on stdin):
echo "<user's answers verbatim>" | bash "$SANDSTORM_SKILLS_DIR/sandstorm-spec/scripts/sandstorm-spec.sh" refine <ticket-id>
The script prints a JSON payload from the underlying spec agent. Relay its key fields to the user:
passed: true → report "spec gate passed" and stop.passed: false with questions → present the questions to the user and wait for their reply. On the reply, invoke refine.ERROR, relay the error.spec_check or spec_refine MCP tools directly. This skill is the only path.development
Use this skill when the user reports a stack appears broken, stuck, looping, failed, or otherwise not working — and wants to understand WHY (not just restart it). Trigger phrases include: 'stack N doesn't seem to be working', 'stack N isn't working', 'stack N failed for some reason / why did stack N fail / what went wrong with stack N', 'stack N seems stuck / got stuck / stuck in an infinite loop / keeps looping / did N loops and failed', 'take a look at stack N, something went wrong / it's broken / something's clearly wrong', 'stack N hit NEEDS HUMAN INTERVENTION / keeps failing / errored out', 'figure out what's happening / going on with stack N', 'give me a summary of what happened with stack N / diagnose stack N'. The skill reads the stack's dual-loop artifacts (phase timings, review verdicts, execution summaries) from inside its container and returns one structured report — avoiding the 40+ Bash-exploration sub-turns the orchestrator would otherwise make. Make sure to use this skill for ANY request that involves diagnosing a stack's failure, loop behavior, or why it stopped working — even when the user phrases it gently (e.g., 'doesn't seem to be working', 'not sure what's going on', 'can you take a look') as long as there is a failure or malfunction signal present. Falling back to raw Bash exploration of the stack's internals costs 1M+ tokens. Do NOT trigger for: status-only 'is stack N done?' / 'what's the status of stack N' (that's check-and-resume-stack), diff/logs inspection on a working stack with no failure signal (stack-inspect), or creating a new stack.
testing
Use this skill ONLY when the user has EXPLICITLY asked to tear down, destroy, remove, or dismantle a named Sandstorm stack. Trigger phrases include: 'tear down stack X', 'destroy stack X', 'remove stack X', 'dismantle stack X', 'clean up stack X and all its containers', 'I'm done with stack X, kill it'. This skill stops containers, removes the workspace, and archives the stack — it is IRREVERSIBLE and can lose unpushed work. Do NOT trigger on ambiguous phrases like 'clean up', 'reset', 'start over', 'remove the old one', 'stack is broken', or anything that might imply teardown without literal user words like tear down / destroy / delete. Do NOT trigger for: stopping containers (that's pause, not teardown), checking status, failure recovery, or as a precursor to creating a new stack. When in doubt, ASK the user before running.
tools
Use this skill whenever the user wants to record/link/associate a pull request with an existing Sandstorm stack. Trigger phrases include: 'record PR #N for stack X', 'set PR for stack X to #N', 'link PR https://github.com/.../pull/N to stack X', 'save the PR info on stack X', 'stack X's PR is #N'. Use this after a PR has been opened externally (via gh CLI, the GitHub UI, or push_stack's downstream flow) and the user wants the Sandstorm registry to know about it — the stack status flips to pr_created and the URL/number are stored. Do NOT trigger for: creating the PR itself (that's a separate gh CLI / push flow), tearing down the stack, checking stack status, or unrelated PR operations like merging or closing.
testing
Use this skill whenever the user wants to see DETAILED output, logs, or uncommitted changes from a specific Sandstorm stack. Trigger phrases include: 'show me the output of stack X', 'what did stack X log', 'show the task output for X', 'show container logs for stack X', 'what changed in stack X', 'show me the diff in stack X', 'what's happening inside stack X', 'dump stack X's output', 'get logs for stack X's claude container'. The skill covers three read-only probes — task output, container logs, and uncommitted diff — as subcommands. Do NOT trigger for: a quick status check (that's check-and-resume-stack), listing all stacks (that's list-stacks), or anything that modifies state. Prefer the narrower subcommand (output / logs / diff) over 'all' when the user is specific about what they want.