hermes-backup/daily/2026-04-28_203212/skills/devops/fastmcp-container-restart-debug/SKILL.md
Debug and fix FastMCP container restart loops in Docker Compose — stdio vs SSE transport, healthcheck pitfalls
The WEALTH MCP container enters a restart loop under Docker Compose: the app exits within seconds of starting.
FastMCP's mcp.run() defaults to the stdio transport, which reads from stdin. When Docker starts the container without an attached stdin or TTY, stdin returns EOF immediately and the process exits cleanly. A restart policy such as always or unless-stopped restarts the container even after a clean exit, which produces the loop.
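The exit-on-EOF behaviour can be reproduced without Docker or FastMCP. The sketch below is only an illustration: the child process is a stand-in for a stdio server (it is not FastMCP code), and `subprocess.DEVNULL` is assumed to approximate the stdin a container gets when nothing is attached.

```python
import subprocess
import sys

# Stand-in for a stdio server: read stdin until EOF, report, then return normally.
CHILD = (
    "import sys; data = sys.stdin.read(); "
    "print(f'read {len(data)} bytes, tty={sys.stdin.isatty()}')"
)

def run_without_stdin() -> subprocess.CompletedProcess:
    """Run the stand-in child with stdin wired to the null device."""
    return subprocess.run(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.DEVNULL,   # assumption: approximates a container with no stdin attached
        capture_output=True,
        text=True,
        timeout=10,
    )

result = run_without_stdin()
print("exit code:", result.returncode)
print(result.stdout.strip())
```

The child reads nothing and returns with exit code 0, which is why the container logs show no error even though the service never stays up.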
Secondary issue: the Compose healthcheck uses pg_isready or ss, commands that do not exist inside the WEALTH image.
```bash
docker ps --filter name=<container> --format '{{.Names}} {{.Status}}'
# Look for: "Restarting (N) X seconds ago"

docker logs <container> --tail 50
# stdio loop: no output or "INFO: Application startup complete" then silent exit

# Show the command that PID 1 is running inside the container
docker exec <container> sh -c "cat /proc/1/cmdline" | tr '\0' ' '

# Check which socket tools the image actually ships
docker exec <container> sh -c "which ss 2>/dev/null || which netstat 2>/dev/null"

# Start the server with SSE transport on the host network and probe it
docker run --rm --network=host <image> python -c "from internal.monolith import mcp; mcp.run(transport='sse', show_banner=False)" &
sleep 3
curl --max-time 3 http://localhost:8000/sse # should return something (SSE stream or 405); --max-time stops curl from holding the stream open
kill %1 2>/dev/null
```
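The status check above can be scripted. The following is a minimal sketch that parses one line of the `docker ps --format '{{.Names}} {{.Status}}'` output. It assumes the number in `Restarting (N)` is the exit code of the last run, so a stdio loop would typically show `Restarting (0)`. The function name and the container name `wealth-mcp` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches a STATUS value such as "Restarting (0) 4 seconds ago".
_RESTARTING = re.compile(r"^Restarting \((-?\d+)\)")

def parse_ps_line(line: str) -> Tuple[str, Optional[int]]:
    """Split a '<name> <status>' line and return (name, exit code if restarting)."""
    name, _, status = line.strip().partition(" ")
    match = _RESTARTING.match(status)
    return name, (int(match.group(1)) if match else None)

# Hypothetical sample lines, in the shape produced by the docker ps command above.
for sample in ("wealth-mcp Restarting (0) 4 seconds ago", "wealth-mcp Up 3 minutes"):
    name, code = parse_ps_line(sample)
    if code is None:
        print(f"{name}: not restarting")
    else:
        print(f"{name}: restarting, last exit code {code}")
```

An exit code of 0 in a restart loop points at a process that finished normally, which matches the stdio diagnosis better than a crash would.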
Patching if __name__ == "__main__" in the monolith/server entrypoint:
```python
if __name__ == "__main__":
    import os

    if os.isatty(0):
        mcp.run()  # stdio for local dev
    else:
        mcp.run(transport="sse", show_banner=False)  # SSE for container
```
This replaces the original mcp.run() call. With transport="sse", FastMCP serves over HTTP instead of stdio, so the process no longer depends on stdin staying open. That is critical for containers.
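If the transport decision needs to be unit-tested or overridden at deploy time, it can be pulled into a small pure function. This is a sketch under assumptions, not part of the original patch: `choose_transport` and the `MCP_TRANSPORT` environment variable are hypothetical names introduced here for illustration.

```python
import os
from typing import Optional

def choose_transport(stdin_is_tty: bool, override: Optional[str] = None) -> str:
    """Pick a transport: an explicit override wins, otherwise fall back on the TTY check."""
    if override:
        return override
    return "stdio" if stdin_is_tty else "sse"

# At the entrypoint, where `mcp` is the server object from the patch above:
#   transport = choose_transport(os.isatty(0), os.environ.get("MCP_TRANSPORT"))  # hypothetical env var
#   mcp.run(transport=transport)

print(choose_transport(stdin_is_tty=False))                    # sse
print(choose_transport(stdin_is_tty=True))                     # stdio
print(choose_transport(stdin_is_tty=True, override="sse"))     # sse
```

Keeping the decision separate from mcp.run() means the container case can be checked without starting a server.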
After fixing the transport, the healthcheck should test an actual HTTP endpoint:
```yaml
healthcheck:
  test: ["CMD-SHELL", "curl -sf http://localhost:8000/sse -o /dev/null || exit 1"]
  interval: 10s
  timeout: 5s
  retries: 5
```
Note: curl must be available in the container. If it is not, remove the healthcheck entirely and rely on restart: unless-stopped. In that setup, the process staying alive is treated as sufficient evidence of health.
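Where curl is missing but Python is present (as it is in a FastMCP image), a probe can be written with the standard library alone. The sketch below is one possible shape, not the author's healthcheck: the function name `probe` is hypothetical, and host, port and path are taken from the curl command above. Because an SSE endpoint keeps the connection open, the probe reads only the status line and headers, then closes the connection instead of waiting for the body.

```python
import http.client

def probe(host: str = "localhost", port: int = 8000, path: str = "/sse",
          timeout: float = 3.0) -> bool:
    """Return True if the server answers the request with a non-5xx status."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()   # returns once headers arrive; the body is not read
        return response.status < 500    # 200 (stream) and 405 both count as "listening"
    except (OSError, http.client.HTTPException):
        return False                    # refused, timed out, or malformed response
    finally:
        conn.close()
```

Saved as a script that ends with `sys.exit(0 if probe() else 1)`, this could be called from the healthcheck test in place of curl; the script path inside the image would be deployment-specific.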
Approaches to avoid:
- `pg_isready`: not in the app container.
- `ss -ltn | grep`: ss/netstat are often not installed in slim images.
- `command:` override for the transport: escaping import statements in YAML is fragile.
- `network_mode: service:X`: too blunt, breaks container networking.

FastMCP defaults to stdio. Container environments have no TTY, so stdin closes, the process exits, and the restart loop begins. Always set transport="sse" explicitly when deploying FastMCP containers.