toolkit/packages/skills/swarm-shutdown/SKILL.md
Gracefully shutdown a swarm team
npx skillsauth add stevengonsalvez/agents-in-a-box swarm-shutdownInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Gracefully shutdown a swarm team with optional force flag.
/swarm-shutdown <team-id> [--force]
| Argument | Required | Description |
|----------|----------|-------------|
| <team-id> | Yes | The swarm team ID to shutdown |
| --force | No | Skip graceful shutdown, kill immediately |
When you receive this command:
Broadcast Shutdown Message
source {{HOME_TOOL_DIR}}/utils/swarm-lib.sh
echo "Initiating graceful shutdown of $TEAM_ID..."
# Notify all agents
swarm_broadcast "$TEAM_ID" "system" "shutdown" \
'{"reason": "graceful shutdown requested", "checkpoint": true}'
echo "Shutdown message broadcast to all agents"
Wait for Checkpoint
echo "Waiting 10 seconds for agents to checkpoint..."
sleep 10
Verify Agent Status
# Check if agents have acknowledged
TEAM_DIR="{{HOME_TOOL_DIR}}/swarm/$TEAM_ID"
for inbox in "$TEAM_DIR"/inbox/*.jsonl; do
AGENT=$(basename "$inbox" .jsonl)
LAST_MSG=$(tail -1 "$inbox" 2>/dev/null | jq -r '.type' || echo "")
if [ "$LAST_MSG" = "checkpoint" ]; then
echo " + $AGENT checkpointed"
else
echo " o $AGENT (no checkpoint confirmation)"
fi
done
Kill tmux Sessions
TEAM_JSON=$(cat "$TEAM_DIR/team.json")
# Kill leader
LEADER_SESSION=$(echo "$TEAM_JSON" | jq -r '.leader.session')
if tmux has-session -t "$LEADER_SESSION" 2>/dev/null; then
tmux kill-session -t "$LEADER_SESSION"
echo " + Killed leader: $LEADER_SESSION"
fi
# Kill agents
for session in $(echo "$TEAM_JSON" | jq -r '.members[].session'); do
if tmux has-session -t "$session" 2>/dev/null; then
tmux kill-session -t "$session"
echo " + Killed agent: $session"
fi
done
Update Team State
swarm_update_team "$TEAM_ID" '{"status": "shutdown", "shutdown_at": "'"$(date -Iseconds)"'"}'
Report
echo ""
echo "========================================"
echo " Swarm Shutdown Complete: $TEAM_ID"
echo "========================================"
echo ""
echo " Team data preserved at:"
echo " $TEAM_DIR/"
echo ""
echo " To archive: /swarm-archive $TEAM_ID"
echo " To restart: /swarm-create with same epic"
echo ""
Skip the graceful steps and kill immediately:
echo "Force shutdown of $TEAM_ID..."
# Kill all sessions without waiting
swarm_shutdown "$TEAM_ID" --force
echo "Force shutdown complete"
================================================================
SWARM SHUTDOWN: swarm-1738585396
================================================================
Phase 1: Broadcasting shutdown...
-> Sent to leader
-> Sent to agent-1
-> Sent to agent-2
-> Sent to agent-3
Phase 2: Waiting for checkpoints (10s)...
==================== 100%
Phase 3: Verifying checkpoints...
+ leader: checkpointed
+ agent-1: checkpointed
+ agent-2: checkpointed
o agent-3: no response (will force kill)
Phase 4: Terminating sessions...
+ Killed: swarm-1738585396-leader
+ Killed: swarm-1738585396-agent-1
+ Killed: swarm-1738585396-agent-2
+ Killed: swarm-1738585396-agent-3
Phase 5: Updating team state...
+ Status set to: shutdown
================================================================
SHUTDOWN COMPLETE
================================================================
Team data preserved at: {{HOME_TOOL_DIR}}/swarm/swarm-1738585396/
Options:
Archive team: swarm-lib.sh archive swarm-1738585396
View final state: cat {{HOME_TOOL_DIR}}/swarm/swarm-1738585396/team.json
Restart: /swarm-create --epic bd-epic-123
================================================================
================================================================
FORCE SHUTDOWN: swarm-1738585396
================================================================
Killing sessions immediately...
+ Killed: swarm-1738585396-leader
+ Killed: swarm-1738585396-agent-1
+ Killed: swarm-1738585396-agent-2
Status updated to: shutdown
================================================================
When agents receive a shutdown message, they should:
Complete or Pause Current Work
Update Beads
# Mark task as paused/blocked
bd update $CURRENT_TASK --status blocked --reason "Swarm shutdown"
Send Checkpoint Confirmation
swarm_send_message "$TEAM_ID" "$AGENT_NAME" "system" "checkpoint" \
'{"task": "'"$CURRENT_TASK"'", "progress": 75, "state": "saved"}'
Clean Up
# Move team to .archive/ directory
swarm-lib.sh archive swarm-1738585396
# Permanently remove team data
rm -rf {{HOME_TOOL_DIR}}/swarm/swarm-1738585396
# Create new swarm from same epic
/swarm-create --epic bd-epic-platform-rebuild --agents 3
| Error | Resolution | |-------|------------| | Team not found | Verify team ID exists | | Sessions already dead | Proceeds with state update only | | Permission denied | Check tmux socket permissions | | Checkpoint timeout | Force flag used automatically after timeout |
documentation
Report reflect drain spend over a time window — tokens split by cached (cache_read), uncached writes (cache_creation), and io (input+output), with a $ estimate, grouped by day / outcome / model / transcript. Reads the drainer's cost log and surfaces outlier runs and cache-reuse health (the 41.5M-token failure mode = low cache reuse + high cache writes). Use to answer "what is reflection costing me" for the last day / week.
development
Show fleet status — every claude session running on the host, merged across ainb + claude-peers broker + background jobs. Use when you need to enumerate sessions before composing an action, see which sessions have a peer registered (broker-routable) vs tmux-only, check the `summary` of each session, or pipe the list into jq for filtering. Default output: text table. Pass --format json for LLM consumption.
testing
Ordered multi-step prompts to fleet targets, ack-gated between steps via JSONL assistant-turn-end detection. Use for cycles like disconnect→reconnect→verify, or any flow where step N+1 requires step N to have completed first. The skill BLOCKS until each target's transcript shows the next assistant turn finishing OR per-step timeout fires (default 300s).
development
Center control panel — enumerate every claude session that is blocked waiting on something: a user answer (AskUserQuestion fired), an API error retry, an idle assistant turn-end with no follow-up, or an explicit WAITING: marker. Returns rich JSON with signal kind + context per session. Use this when you've stepped away from the fleet and want one place to see everything that wants your attention and answer it.