supervisord -- process manager

Candy Properties

| Property | Value | |----------|-------| | Dependencies | (none) — system Python comes from the supervisor RPM's own dep | | Install files | charly.yml, supervisord.header.conf (via the embedded init: vocabulary's header_file:) | | Init role | Default init system for container images (set in the embedded init: vocabulary) |

Packages

supervisor (RPM) — process control system. The RPM brings /usr/bin/python3 as its own dependency; supervisord's shebang is /usr/bin/python3. No pixi-python / no conda-forge Python env is needed.
supervisor (pac) — same package name on Arch (extra/ repo), same system-Python dependency.

No `require: python`

This layer declares no require: python. supervisord's runtime is pure system-Python (the supervisor RPM brings /usr/bin/python3 as its own dependency), so it needs no python charly-layer → pixi charly-layer → conda-forge Python env (~500 MB). See CLAUDE.md "Key Rules" → "Don't declare defensive deps" for the general rule.

Arch note: the pac: [supervisor] section is required so that Arch-based images with build: [pac] actually install supervisord. Without it, an Arch image gets a /etc/supervisord.conf with no supervisord binary, and the quadlet's supervisord -n -c /etc/supervisord.conf exits 127 (command not found) at container start. Every Arch-rooted auto-intermediate that composes supervisord needs it.

How `charly` Generates Supervisord Configs

Candies declare processes via the unified service: schema in charly.yml (see /charly-image:layer "Service Declaration"). Each entry is rendered through supervisord's service_schema.service_template in the embedded build vocabulary, which produces a [program:<name>] INI fragment. charly box generate collects all rendered fragments across the candy chain and writes them into /etc/supervisord.conf inside the image, prefixed by the header from templates/supervisord.header.conf (referenced from the embedded init: vocabulary).

# candy/chrome/charly.yml — node-form: the service: list lives in a <candy>-service child node
chrome-service:
  service:
    - name: chrome
      exec: /home/user/.local/bin/chrome-wrapper --force-renderer-accessibility --no-first-run --start-maximized
      restart: always
      user: user
      env:
        # ...
      priority: 50                  # supervisord-specific; ordering hint
      stdout: file:/tmp/supervisor-chrome.log
      scope: system
      enable: true

The render template maps the abstract spec to supervisord INI:

| service: field | Supervisord output | |---|---| | exec | command= | | env | environment=K="V",K2="V2" | | restart: always | autorestart=true (helper supervisordRestart) | | restart: on-failure | autorestart=unexpected | | restart: no / unset | autorestart=false | | working_directory | directory= | | user | user= | | priority | priority= (supervisord-specific) | | stop_timeout | stopwaitsecs= | | stdout: file:/path | stdout_logfile=/path (helper supervisordLog) | | stdout: journal / unset | stdout_logfile=/dev/stdout |

Supervisord does NOT support use_packaged: — it doesn't consume systemd units. Entries with use_packaged: are skipped with a warning when rendering into a supervisord-init image. Use a custom entry (with explicit exec:) on supervisord-init images, or target a systemd init (bootc / target: local) where use_packaged: is honored.

The rendered INI lands at /etc/supervisord.d/<layer>-<name>.conf and is assembled into /etc/supervisord.conf at container-build time via the init system's fragment pipeline (the embedded init.supervisord.fragment_template + stage_fragment_copy).

At runtime, PID 1 is supervisord. The container's lifecycle is the supervisord process's lifecycle — when supervisord exits, the container exits (and the systemd quadlet's Restart=always rebuilds it).

`service:` declaration

Layers declare services under a single service: key (singular; value is a list of ServiceEntry). Two entry shapes: use_packaged: reuses a distro-shipped systemd unit; custom exec (with exec:, env:, restart:, …) renders via the init-system's service_template:.

Supervisord lifecycle directives on ServiceEntry (all optional — render only when set):

| Field | Purpose | |---|---| | kind: eventlistener | Emits [eventlistener:...] instead of [program:...]; use with events:. | | events: | Supervisord eventlistener trigger list (e.g. PROCESS_STATE_FATAL). | | auto_start: | Tri-state bool. false for services hand-started by another program. | | start_retries: | Max restart attempts before FATAL. The selkies [program:chrome] uses 3. | | start_secs: | Seconds the process must stay up to count as "started." | | stop_signal: | TERM (default), INT, HUP, … | | exit_code: | Success codes for restart: no / on-failure. | | priority: | Startup order; lower = earlier. |

See /charly-selkies:selkies-core for the canonical consumer (the supervised [program:chrome] service: restart: always + start_secs/start_retries).

Polymorphism: a candy that runs on BOTH supervisord and systemd

A candy that needs the SAME service to run under supervisord (container/pod targets) AND systemd (host installs / bootc / VMs) must NOT spin up a <name>-host sibling candy. The supported pattern is mixed entries in one service: list: same name:, two entries — one with use_packaged: <unit>.service (or .socket) for the systemd render, the other with custom exec: for the supervisord render. Init system at deploy time picks the matching form; the other entry is silently skipped. See /charly-image:layer "Service Declaration" → "Mixed entries in one candy" for the schema, CLAUDE.md "Init-system polymorphism via mixed service: entries" for the project-wide rule, and /charly-infrastructure:virtualization for the canonical worked example.

It is tempting to copy-paste-and-rename a candy with a -host suffix when the schema already supports polymorphism via mixed entries. If you find yourself reaching for -host, reach for a second service: entry instead.

Service Priority & Ordering

Supervisord starts programs in priority order (ascending). Candies set priorities explicitly so that dependency chains come up in the right order:

| Priority | Typical services | |----------|------------------| | 1-9 | System-level: dbus, pipewire | | 10 | Compositors: sway, labwc | | 15-20 | Desktop extras: waybar, swaync | | 30-50 | Applications: chrome, openclaw, hermes | | 100 | Auxiliary: cdp-proxy, chrome-devtools-mcp | | 200+ | Last-start daemons: event listeners |

Programs with no priority= default to 999 and start last.

autostart / autorestart / startretries

The knobs that shape crash-recovery behavior:

autostart=true (default) — Start at container boot. Use autostart=false only for services that must be hand-started by another program. (Services that just need a Wayland compositor up first don't need this — e.g. the selkies [program:chrome] service in selkies-core stays autostart=true and self-synchronizes by polling for the compositor's wayland-0 socket inside chrome-wrapper.)
autorestart=true (default) — Restart the program when it exits non-zero. Combined with startretries, this creates a crash-retry loop.
startretries=N (default 3) — How many consecutive restarts supervisord will attempt before declaring the program FATAL. Once FATAL, supervisord stops trying.
startsecs=N (default 1) — How long the program must stay up before supervisord considers the start "successful." Crashes within startsecs count against startretries; crashes after reset the counter.

A program that can't stay up for startsecs seconds, startretries times in a row, is marked FATAL and a PROCESS_STATE_FATAL event is emitted — which is what the event listener pattern below catches.

Event Listeners

Supervisord supports a plugin model for reacting to process state changes: event listeners are separate programs that subscribe to events and can take arbitrary action. They run as supervisord programs themselves (using [eventlistener:<name>] instead of [program:<name>]) and communicate with supervisord over stdin/stdout using a simple protocol.

Crash-escalation eventlistener (the pattern)

A PROCESS_STATE_FATAL listener is the supervisord pattern for escalating a program that has exhausted its startretries budget. The classic action is to terminate supervisord (PID 1) so the systemd quadlet's Restart=always rebuilds the whole container — the only way to flush cgroup-level orphan memfd shmem (from e.g. a Chrome crash loop) that a per-program restart can't release:

[eventlistener:crash-escalate]
command=/path/to/crash-escalate-listener
events=PROCESS_STATE_FATAL
autostart=true
autorestart=true
priority=200
user=user

No candy ships such a listener today. The selkies [program:chrome] service (in selkies-core, restart: always, start_secs: 5/start_retries: 3) relies on supervisord's ordinary relaunch for ordinary exits; a genuinely wedged crash loop is cleared by restarting the container (charly update / charly restart), which tears down the cgroup. The chrome candy's security.memory_max/memory_high/memory_swap_max caps bound the blast radius. See /charly-selkies:chrome (Chrome supervision) and /charly-image:layer (Security Declaration → resource caps).

Event listener protocol (for authoring your own)

An event listener is a shell script or Python program that:

Writes READY\n to stdout to tell supervisord it's ready for an event.
Reads a header line from stdin (format: ver:3.0 server:supervisor serial:N pool:name poolserial:M eventname:NAME len:L).
Reads L bytes of payload from stdin.
Takes action (log, email, kill PID 1, restart a service, etc.).
Writes RESULT 2\nOK to stdout.
Loops back to step 1.

Full protocol details: supervisord event listener docs. supervisorctl avail lists both regular programs and event listeners.

Diagnostics

# Inside container
supervisorctl status           # All programs + states
supervisorctl avail            # All defined programs (including stopped)
supervisorctl pid              # Supervisord's own PID — best liveness probe
supervisorctl tail -f chrome stderr  # Live stderr log for one program
supervisorctl restart chrome   # Restart a single program (does not reset startretries counter)

# From host
charly service status <image>      # Wraps supervisorctl status via charly shell
charly service start/stop/restart <image> <name>   # Per-program control
charly logs <image>                # Container-level stdout/stderr (supervisord output)

Declarative-testing liveness check

supervisorctl status returns exit 3 when ANY program is non-RUNNING (FATAL, STOPPED, or EXITED). Candies with autostart=false programs (e.g. hermes ships hermes-whatsapp as autostart=false) will always fail a naive command: supervisorctl status; exit_status: 0 test.

The robust liveness probe is supervisorctl pid, which asks supervisord for its own PID and exits 0 iff the socket responds:

# node-form: each step is its own child node (no plan: list key)
supervisord-step-6:
  check: supervisord's control socket responds
  command: supervisorctl pid
  exit_status: 0
  in_container: true
  context:
    - runtime

This is the deterministic check: step node the current supervisord candy ships (supervisord-step-6). See /charly-check:check Authoring Gotcha #4.

Also note: pgrep is NOT installed by default in minimal images (needs procps-ng). The process: <name>; running: true test verb silently fails when pgrep is absent. Prefer service: (which uses supervisorctl internally) for program-liveness checks. See /charly-check:check Authoring Gotcha #3.

supervisorctl avail | grep -q '^<name>\b' is the idiomatic way to check whether a program is defined (as opposed to whether it's currently running). This is what labwc's autostart uses to decide whether to hand off Chrome to supervisord or fall back to a direct launch — see /charly-selkies:labwc (autostart Chrome-duplication race) for the canonical pattern.

Header File

templates/supervisord.header.conf (at the repo root) defines global [supervisord], [unix_http_server], [supervisorctl], and [rpcinterface:supervisor] sections. It's referenced from the embedded init: vocabulary as the init system's header_file: and prepended to every generated supervisord.conf. Common fields:

[unix_http_server] file=/tmp/supervisor.sock — the socket supervisorctl uses
[supervisord] nodaemon=true logfile=/tmp/supervisord.log pidfile=/tmp/supervisord.pid
[supervisord] user=user — supervisord runs as the container user, not root

Cross-distro coverage

RPM: supervisor (Fedora) · PAC: supervisor (Arch community) · DEB: supervisor (Debian/Ubuntu — package is named supervisor, not supervisord). Full parity; the process manager itself plus the header template is identical across distros. The [supervisord] user=user line in templates/supervisord.header.conf is a fallback only — when user_policy: adopt fires on an ubuntu base image, supervisord inherits the adopted ubuntu identity at runtime via the USER directive, so user=user in the header is harmless (supervisord's user= field is advisory — uid passthrough from the Dockerfile USER directive takes precedence).

Usage

# charly.yml — a box composes candies through a <box>-candy child node
my-image:
  candy:
    base: fedora
  my-image-candy:
    candy:
      - supervisord
      - my-service  # layers with service: entries need supervisord

Adding a service: block to a candy automatically pulls in supervisord via the embedded init: vocabulary's depends_candy. You rarely add supervisord to a box's candy: list manually.

Used In Boxes

Transitive dependency for all boxes with managed services, including: openclaw, jupyter, jupyter-ml, jupyter-ml-notebook, ollama, comfyui, immich, immich-ml, selkies-desktop, selkies-labwc-nvidia, hermes, openwebui, filebrowser.

Running supervisord under systemd (bootc mode)

On non-bootc images, supervisord is container PID 1 (ENTRYPOINT=supervisord emitted by the init system definition in the embedded build vocabulary). On bootc images, systemd is PID 1, which means supervisord needs a systemd unit wrapper — otherwise the whole desktop tier never starts. The wrapper is a systemd user unit at /etc/systemd/user/supervisord.service, systemctl --global enabled so tty1 autologin brings it up (a linger sentinel file keeps it alive across logouts).

Two gotchas specific to running under a systemd user service

Both involve opening /dev/stdout or /dev/fd/1, which resolve to the journal pipe under a systemd user service — and open() on a pipe returns ENXIO.

Main supervisord logfile. The header template (templates/supervisord.header.conf) was changed from logfile=/dev/stdout to logfile=/tmp/supervisord.log. /dev/stdout works when supervisord is PID 1 in a container (fd 1 is real stdio), but fails with OSError: [Errno 6] No such device or address: '/dev/stdout' under a systemd user service where fd 1 is a pipe. Writing to a regular file works everywhere.
Per-program stdout_logfile=/dev/fd/1. Every candy's service: fragment redirects program stdout to /dev/fd/1 so container logs (charly logs <image>) show per-program output. Under a systemd user service this fails with unknown error making dispatchers for <name>: ENXIO for every program. The fix lives in the systemd user unit itself — set StandardOutput=file:/tmp/supervisord-stdout.log so supervisord's fd 1 backs a real file, not a pipe. Existing per-program /dev/fd/1 lines then resolve correctly.

Container-mode logs are unaffected — supervisord is still PID 1 there.

Related Candies

/charly-languages:python -- Optional pixi-python env (NOT a dep of this candy; supervisord uses system python3 from RPM)
/charly-selkies:selkies-core -- owns the supervised [program:chrome] service (restart: always, start_secs/start_retries) for both selkies flavors
/charly-selkies:chrome -- the chrome candy's cgroup resource caps (bound a Chrome crash loop's blast radius)
/charly-infrastructure:traefik -- Reverse proxy (depends on supervisord)
/charly-infrastructure:dbus-layer -- D-Bus session bus (depends on supervisord)
/charly-ollama:ollama, /charly-openclaw:openclaw, /charly-infrastructure:postgresql, /charly-infrastructure:redis, /charly-selkies:sway -- All ship service: blocks

Related Commands

/charly-core:service — Start/stop/restart/status for individual services inside the container
/charly-core:logs — Container-level log access (shows supervisord-aggregated stdout/stderr)
/charly-core:charly-status — Container status including service probe results
/charly-build:generate — Containerfile generation: where service: blocks get written to /etc/supervisord.conf
/charly-image:layer — service: field authoring + security: cgroup resource caps

When to Use This Skill

Use when the user asks about:

Supervisord configuration or process management
Running multiple services in a container
Service startup order (priority ordering)
autostart / autorestart / startretries behavior and crash-loop recovery
Event listeners (the PROCESS_STATE_FATAL crash-escalation pattern)
supervisorctl commands (status, avail, start, tail, restart)
The charly service commands (start/stop/restart/status)
The supervisor RPM package or /tmp/supervisor.sock

supervisord -- process manager

Candy Properties

Packages

supervisor (RPM) — process control system. The RPM brings /usr/bin/python3 as its own dependency; supervisord's shebang is /usr/bin/python3. No pixi-python / no conda-forge Python env is needed.
supervisor (pac) — same package name on Arch (extra/ repo), same system-Python dependency.

No `require: python`

How `charly` Generates Supervisord Configs

# candy/chrome/charly.yml — node-form: the service: list lives in a <candy>-service child node
chrome-service:
  service:
    - name: chrome
      exec: /home/user/.local/bin/chrome-wrapper --force-renderer-accessibility --no-first-run --start-maximized
      restart: always
      user: user
      env:
        # ...
      priority: 50                  # supervisord-specific; ordering hint
      stdout: file:/tmp/supervisor-chrome.log
      scope: system
      enable: true

The render template maps the abstract spec to supervisord INI:

`service:` declaration

Supervisord lifecycle directives on ServiceEntry (all optional — render only when set):

See /charly-selkies:selkies-core for the canonical consumer (the supervised [program:chrome] service: restart: always + start_secs/start_retries).

Polymorphism: a candy that runs on BOTH supervisord and systemd

Service Priority & Ordering

Supervisord starts programs in priority order (ascending). Candies set priorities explicitly so that dependency chains come up in the right order:

Programs with no priority= default to 999 and start last.

autostart / autorestart / startretries

The knobs that shape crash-recovery behavior:

autostart=true (default) — Start at container boot. Use autostart=false only for services that must be hand-started by another program. (Services that just need a Wayland compositor up first don't need this — e.g. the selkies [program:chrome] service in selkies-core stays autostart=true and self-synchronizes by polling for the compositor's wayland-0 socket inside chrome-wrapper.)
autorestart=true (default) — Restart the program when it exits non-zero. Combined with startretries, this creates a crash-retry loop.
startretries=N (default 3) — How many consecutive restarts supervisord will attempt before declaring the program FATAL. Once FATAL, supervisord stops trying.
startsecs=N (default 1) — How long the program must stay up before supervisord considers the start "successful." Crashes within startsecs count against startretries; crashes after reset the counter.

Event Listeners

Crash-escalation eventlistener (the pattern)

[eventlistener:crash-escalate]
command=/path/to/crash-escalate-listener
events=PROCESS_STATE_FATAL
autostart=true
autorestart=true
priority=200
user=user

Event listener protocol (for authoring your own)

An event listener is a shell script or Python program that:

Writes READY\n to stdout to tell supervisord it's ready for an event.
Reads a header line from stdin (format: ver:3.0 server:supervisor serial:N pool:name poolserial:M eventname:NAME len:L).
Reads L bytes of payload from stdin.
Takes action (log, email, kill PID 1, restart a service, etc.).
Writes RESULT 2\nOK to stdout.
Loops back to step 1.

Full protocol details: supervisord event listener docs. supervisorctl avail lists both regular programs and event listeners.

Diagnostics

# Inside container
supervisorctl status           # All programs + states
supervisorctl avail            # All defined programs (including stopped)
supervisorctl pid              # Supervisord's own PID — best liveness probe
supervisorctl tail -f chrome stderr  # Live stderr log for one program
supervisorctl restart chrome   # Restart a single program (does not reset startretries counter)

# From host
charly service status <image>      # Wraps supervisorctl status via charly shell
charly service start/stop/restart <image> <name>   # Per-program control
charly logs <image>                # Container-level stdout/stderr (supervisord output)

Declarative-testing liveness check

The robust liveness probe is supervisorctl pid, which asks supervisord for its own PID and exits 0 iff the socket responds:

# node-form: each step is its own child node (no plan: list key)
supervisord-step-6:
  check: supervisord's control socket responds
  command: supervisorctl pid
  exit_status: 0
  in_container: true
  context:
    - runtime

This is the deterministic check: step node the current supervisord candy ships (supervisord-step-6). See /charly-check:check Authoring Gotcha #4.

Header File

[unix_http_server] file=/tmp/supervisor.sock — the socket supervisorctl uses
[supervisord] nodaemon=true logfile=/tmp/supervisord.log pidfile=/tmp/supervisord.pid
[supervisord] user=user — supervisord runs as the container user, not root

Cross-distro coverage

Usage

# charly.yml — a box composes candies through a <box>-candy child node
my-image:
  candy:
    base: fedora
  my-image-candy:
    candy:
      - supervisord
      - my-service  # layers with service: entries need supervisord

Adding a service: block to a candy automatically pulls in supervisord via the embedded init: vocabulary's depends_candy. You rarely add supervisord to a box's candy: list manually.

Used In Boxes

Running supervisord under systemd (bootc mode)

Two gotchas specific to running under a systemd user service

Both involve opening /dev/stdout or /dev/fd/1, which resolve to the journal pipe under a systemd user service — and open() on a pipe returns ENXIO.

Main supervisord logfile. The header template (templates/supervisord.header.conf) was changed from logfile=/dev/stdout to logfile=/tmp/supervisord.log. /dev/stdout works when supervisord is PID 1 in a container (fd 1 is real stdio), but fails with OSError: [Errno 6] No such device or address: '/dev/stdout' under a systemd user service where fd 1 is a pipe. Writing to a regular file works everywhere.
Per-program stdout_logfile=/dev/fd/1. Every candy's service: fragment redirects program stdout to /dev/fd/1 so container logs (charly logs <image>) show per-program output. Under a systemd user service this fails with unknown error making dispatchers for <name>: ENXIO for every program. The fix lives in the systemd user unit itself — set StandardOutput=file:/tmp/supervisord-stdout.log so supervisord's fd 1 backs a real file, not a pipe. Existing per-program /dev/fd/1 lines then resolve correctly.

Container-mode logs are unaffected — supervisord is still PID 1 there.

Related Candies

/charly-languages:python -- Optional pixi-python env (NOT a dep of this candy; supervisord uses system python3 from RPM)
/charly-selkies:selkies-core -- owns the supervised [program:chrome] service (restart: always, start_secs/start_retries) for both selkies flavors
/charly-selkies:chrome -- the chrome candy's cgroup resource caps (bound a Chrome crash loop's blast radius)
/charly-infrastructure:traefik -- Reverse proxy (depends on supervisord)
/charly-infrastructure:dbus-layer -- D-Bus session bus (depends on supervisord)
/charly-ollama:ollama, /charly-openclaw:openclaw, /charly-infrastructure:postgresql, /charly-infrastructure:redis, /charly-selkies:sway -- All ship service: blocks

Related Commands

/charly-core:service — Start/stop/restart/status for individual services inside the container
/charly-core:logs — Container-level log access (shows supervisord-aggregated stdout/stderr)
/charly-core:charly-status — Container status including service probe results
/charly-build:generate — Containerfile generation: where service: blocks get written to /etc/supervisord.conf
/charly-image:layer — service: field authoring + security: cgroup resource caps

When to Use This Skill

Use when the user asks about:

Supervisord configuration or process management
Running multiple services in a container
Service startup order (priority ordering)
autostart / autorestart / startretries behavior and crash-loop recovery
Event listeners (the PROCESS_STATE_FATAL crash-escalation pattern)
supervisorctl commands (status, avail, start, tail, restart)
The charly service commands (start/stop/restart/status)
The supervisor RPM package or /tmp/supervisor.sock

Adoption

overthinkos/supervisord

$ install --global

Security Scan Results

SKILL.md

supervisord -- process manager

Candy Properties

Packages

No require: python

How charly Generates Supervisord Configs

service: declaration

Polymorphism: a candy that runs on BOTH supervisord and systemd

Service Priority & Ordering

autostart / autorestart / startretries

Event Listeners

Crash-escalation eventlistener (the pattern)

Event listener protocol (for authoring your own)

Diagnostics

Declarative-testing liveness check

Header File

Cross-distro coverage

Usage

Used In Boxes

Running supervisord under systemd (bootc mode)

Two gotchas specific to running under a systemd user service

Related Candies

Related Commands

When to Use This Skill

Related Skills

overthinkos/plugin

overthinkos/cue

overthinkos/egress

overthinkos/check-k8s

overthinkos/supervisord

$ install --global

Security Scan Results

SKILL.md

supervisord -- process manager

Candy Properties

Packages

No require: python

How charly Generates Supervisord Configs

service: declaration

Polymorphism: a candy that runs on BOTH supervisord and systemd

Service Priority & Ordering

autostart / autorestart / startretries

Event Listeners

Crash-escalation eventlistener (the pattern)

Event listener protocol (for authoring your own)

Diagnostics

Declarative-testing liveness check

Header File

Cross-distro coverage

Usage

Used In Boxes

Running supervisord under systemd (bootc mode)

Two gotchas specific to running under a systemd user service

Related Candies

Related Commands

When to Use This Skill

Related Skills

overthinkos/plugin

overthinkos/cue

overthinkos/egress

overthinkos/check-k8s

No `require: python`

How `charly` Generates Supervisord Configs

`service:` declaration

No `require: python`

How `charly` Generates Supervisord Configs

`service:` declaration