.claude/skills/swarm-local-e2e/SKILL.md
Guide for running local E2E tests with API server, Docker lead/worker containers, task creation, log verification, UI dashboard, and cleanup
npx skillsauth add desplega-ai/agent-swarm swarm-local-e2e
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Run full end-to-end tests of the agent swarm locally with a real API server and Docker containers.
This skill should be invoked in two modes:
User-requested QA: The user asks you to run E2E tests, verify a feature, or QA a specific flow. Follow the steps below targeting what they asked for.
Automated change verification: After implementing changes that touch the API, runner, polling, task lifecycle, session logs, Docker entrypoint, or worker/lead behavior — use this skill proactively to verify the changes work end-to-end. Determine what's testable based on the diff:
You do not need to run every step — pick the subset relevant to the changes being tested.
Docker daemon running (open -a OrbStack if needed)
.env with API_KEY and PORT configured
.env.docker-lead with lead config (AGENT_ID, CLAUDE_CODE_OAUTH_TOKEN, MCP_BASE_URL)
.env.docker with worker config (AGENT_ID, CLAUDE_CODE_OAUTH_TOKEN or OPENROUTER_API_KEY, MCP_BASE_URL)
Check .env for the configured port; do not assume 3013:
grep ^PORT= .env
Use this value as $PORT throughout. In worktrees, each worktree may have a different port. Always verify and use the value from .env.
Also verify the Docker env files match:
grep MCP_BASE_URL .env.docker-lead .env.docker
# Both should point to http://host.docker.internal:$PORT
If they don't match, update them before starting containers.
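The two grep checks above can also be scripted. This is a minimal sketch, assuming the env files contain plain KEY=value lines; `readEnvValue` and `mcpUrlMatchesPort` are hypothetical helper names, not part of the repo.

```javascript
// Hypothetical helpers (not part of the repo): parse KEY=value env text
// and check that MCP_BASE_URL targets the API port from .env.

// Return the value for `key` from env-file text, or undefined if absent.
function readEnvValue(envText, key) {
  for (const rawLine of envText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    if (line.slice(0, eq).trim() === key) {
      // Strip optional surrounding quotes from the value.
      return line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, '$2');
    }
  }
  return undefined;
}

// True when MCP_BASE_URL in the Docker env text is exactly
// http://host.docker.internal:<port>.
function mcpUrlMatchesPort(dockerEnvText, port) {
  const url = readEnvValue(dockerEnvText, 'MCP_BASE_URL');
  return url === `http://host.docker.internal:${port}`;
}
```

Feed it the file contents, for example `fs.readFileSync('.env', 'utf8')`, and run `mcpUrlMatchesPort` once for .env.docker-lead and once for .env.docker.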
# Kill any existing API process on your port
lsof -ti :$PORT | xargs kill 2>/dev/null
# Clean DB for fresh state
rm -f agent-swarm-db.sqlite agent-swarm-db.sqlite-wal agent-swarm-db.sqlite-shm
# Start API server
bun run start:http &
# Wait ~3s for startup, confirm "MCP HTTP server running on http://localhost:$PORT/mcp"
bun run docker:build:worker
This builds agent-swarm-worker:latest from the current code. Rebuild after every code change.
Use a unique container name to avoid conflicts with other worktrees (e.g. include branch name or feature):
docker run --rm -d \
--name e2e-lead-$(git branch --show-current | tr '/' '-') \
--env-file .env.docker-lead \
-e AGENT_ROLE=lead \
-e MAX_CONCURRENT_TASKS=1 \
-p 3201:3000 \
agent-swarm-worker:latest
Wait ~15s, then verify:
docker logs e2e-lead-$(git branch --show-current | tr '/' '-') 2>&1 | tail -5
# Should see: "[lead] Polling for triggers (0/1 active)..."
If port 3201 is taken by another worktree, pick a different host port (e.g. -p 3211:3000).
docker run --rm -d \
--name e2e-worker-$(git branch --show-current | tr '/' '-') \
--env-file .env.docker \
-e MAX_CONCURRENT_TASKS=1 \
-p 3203:3000 \
agent-swarm-worker:latest
Wait ~15s, then verify:
docker logs e2e-worker-$(git branch --show-current | tr '/' '-') 2>&1 | tail -5
# Should see: "[worker] Polling for triggers (0/1 active)..."
Use context-mode execute (not curl directly due to hook restrictions):
const headers = { 'Authorization': 'Bearer $API_KEY', 'Content-Type': 'application/json' };
const agents = await (await fetch('http://localhost:$PORT/api/agents', { headers })).json();
for (const a of agents.agents) {
console.log(`${a.name} | isLead: ${a.isLead} | status: ${a.status} | id: ${a.id}`);
}
Should show both lead and worker registered as idle. Save the agent IDs for task creation.
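Saving the IDs can be done in the same script. A sketch under the assumption that the response has the shape printed above (`agents` array with `id`, `name`, `isLead`, `status`) and that the idle status string is exactly `'idle'`; `pickAgentIds` is a hypothetical helper.

```javascript
// Hypothetical helper: pull the lead and worker IDs out of the /api/agents
// response and fail loudly if either agent is missing or not idle.
function pickAgentIds(response) {
  const lead = response.agents.find((a) => a.isLead);
  const worker = response.agents.find((a) => !a.isLead);
  if (!lead || !worker) {
    throw new Error('Expected one lead and at least one worker to be registered');
  }
  // Assumption: an idle agent reports status === 'idle'.
  const notIdle = [lead, worker].filter((a) => a.status !== 'idle').map((a) => a.name);
  if (notIdle.length > 0) {
    throw new Error(`Agents not idle: ${notIdle.join(', ')}`);
  }
  return { LEAD_ID: lead.id, WORKER_ID: worker.id };
}
```

Usage: `const { LEAD_ID, WORKER_ID } = pickAgentIds(agents);` right after the fetch.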
const t = await (await fetch('http://localhost:$PORT/api/tasks', {
method: 'POST', headers,
body: JSON.stringify({ task: 'Say hello. Call store-progress with status completed.', agentId: LEAD_ID })
})).json();
console.log('Task:', t.id, '| status:', t.status);
Important: Use agentId (not assignedTo) to assign tasks. Wrong param silently creates an unassigned task.
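Because the wrong key fails silently, a small guard on the request body can catch it before the POST. A sketch only; `buildTaskBody` is a hypothetical helper and does not reflect any validation the API itself performs.

```javascript
// Hypothetical helper: build the POST /api/tasks body and reject the
// assignedTo key, which would silently produce an unassigned task.
function buildTaskBody({ task, agentId, ...rest }) {
  if ('assignedTo' in rest) {
    throw new Error('Use agentId, not assignedTo, to assign a task');
  }
  if (typeof task !== 'string' || task.trim() === '') {
    throw new Error('task must be a non-empty string');
  }
  // Omitting agentId creates an unassigned (pool) task.
  const body = agentId ? { task, agentId } : { task };
  return JSON.stringify(body);
}
```

Usage: pass `body: buildTaskBody({ task: 'Say hello. Call store-progress with status completed.', agentId: LEAD_ID })` in the fetch options.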
const t = await (await fetch('http://localhost:$PORT/api/tasks', {
method: 'POST', headers,
body: JSON.stringify({ task: 'Say hello. Call store-progress with status completed.' })
})).json();
console.log('Pool task:', t.id, '| status:', t.status);
Workers auto-claim unassigned tasks at poll time. Leads do not auto-claim pool tasks.
# Watch lead logs (use your container name)
# --tail limits the initial output; piping -f into tail would print nothing while the stream stays open
docker logs -f --tail 20 e2e-lead-$(git branch --show-current | tr '/' '-') 2>&1
# Watch worker logs
docker logs -f --tail 20 e2e-worker-$(git branch --show-current | tr '/' '-') 2>&1
Poll task status:
const t = await (await fetch('http://localhost:$PORT/api/tasks/<task-id>', { headers })).json();
console.log(t.status); // pending → in_progress → completed/failed
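Rather than re-running the status check by hand, the poll can loop until the task reaches a terminal state. A minimal sketch: `waitForTerminalStatus`, its timeout, and its interval are hypothetical choices, and the terminal statuses are taken from the transition comment above.

```javascript
// Hypothetical helper: poll until the task reaches a terminal status.
// `getStatus` is any async function returning the current status string.
async function waitForTerminalStatus(getStatus, { timeoutMs = 120000, intervalMs = 2000 } = {}) {
  const terminal = new Set(['completed', 'failed']);
  const seen = [];
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await getStatus();
    if (seen[seen.length - 1] !== status) seen.push(status); // record transitions only
    if (terminal.has(status)) return { status, seen };
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Task still "${status}" after ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Usage: `await waitForTerminalStatus(async () => (await (await fetch(taskUrl, { headers })).json()).status)`, where `taskUrl` is the task URL from the snippet above. The returned `seen` array shows which transitions were observed.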
const logs = await (await fetch('http://localhost:$PORT/api/tasks/<task-id>/session-logs', { headers })).json();
console.log('Log count:', logs.logs.length);
// Should be > 0 for completed tasks
For log isolation verification (multiple sequential tasks from same agent):
const [l1, l2] = await Promise.all([
fetch('http://localhost:$PORT/api/tasks/<task1>/session-logs', { headers }).then(r => r.json()),
fetch('http://localhost:$PORT/api/tasks/<task2>/session-logs', { headers }).then(r => r.json()),
]);
const s1 = [...new Set(l1.logs.map(l => l.sessionId))];
const s2 = [...new Set(l2.logs.map(l => l.sessionId))];
console.log('Unique sessionIds:', s1[0] !== s2[0]); // Should be true
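Comparing only the first sessionId of each task can miss an overlap when a task logged under more than one session. A stricter variant, as a sketch, checks that the two ID sets are disjoint; `sessionsAreIsolated` is a hypothetical helper and assumes each log entry carries a `sessionId` field as in the snippet above.

```javascript
// Hypothetical helper: true when two session-log arrays share no sessionId
// and each task logged under at least one session.
function sessionsAreIsolated(logsA, logsB) {
  const idsA = new Set(logsA.map((l) => l.sessionId));
  const idsB = new Set(logsB.map((l) => l.sessionId));
  if (idsA.size === 0 || idsB.size === 0) return false; // nothing to compare
  for (const id of idsA) {
    if (idsB.has(id)) return false; // shared session means logs leaked across tasks
  }
  return true;
}
```

Usage: `console.log('Isolated:', sessionsAreIsolated(l1.logs, l2.logs));`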
Start the dashboard to visually verify tasks, logs, and agent status:
cd new-ui && pnpm run dev &
# Defaults to port from APP_URL in .env (check with: grep APP_URL ../.env)
If the UI port is taken by another worktree, start on an alternate:
cd new-ui && pnpm run dev --port 5276
The UI connects to the API via VITE_API_URL (check new-ui/.env or defaults to http://localhost:$PORT).
Use agent-browser or qa-use to automate UI checks:
# Quick visual gut-check with agent-browser
agent-browser --url http://localhost:5175 snapshot
# Or use qa-use to verify specific flows
qa-use explore http://localhost:5175
Things to verify in the UI:
# Stop containers (use your branch-specific names)
docker stop e2e-lead-$(git branch --show-current | tr '/' '-') e2e-worker-$(git branch --show-current | tr '/' '-') 2>/dev/null
# Stop API server
lsof -ti :$PORT | xargs kill 2>/dev/null
# Stop UI dev server (if started)
lsof -ti :5175 | xargs kill 2>/dev/null
ERROR: Cannot connect to the Docker daemon
Fix: open -a OrbStack and wait ~5s.
docker: Error response from daemon: Conflict. The container name "..." is already in use
Another worktree has a container with the same name. Either stop it (docker stop <name>) or use a different name suffix.
agentId (not assignedTo): wrong param silently creates an unassigned task
in_progress (e.g. from a manual poll call that consumed the trigger)
docker restart <container-name>
docker logs <container> 2>&1 | grep "capacity"
/api/poll (not POST)
X-Agent-ID header with a valid agent UUID
lsof -i :3013 # Check what's using the port
If another worktree is running, set a different PORT in .env and update MCP_BASE_URL in .env.docker* to http://host.docker.internal:<new-port>.
completed or failed, not just in_progress)
claudeSessionId is set on the task: GET /api/tasks/<id> should show it
session_logs table directly
Direct API cancellation (POST /api/tasks/<id>/cancel) updates the DB but doesn't kill the Claude process inside Docker. Use docker restart <container> to force-stop.
Use simple tasks like "Say hello" for E2E tests. Complex tasks waste time and API credits.
The dashboard auto-polls every 5 seconds. If data looks stale, hard-refresh (Cmd+Shift+R) or check VITE_API_URL points to the correct API port.