ov-selkies/skills/selkies-desktop-layer/SKILL.md
Metalayer composing a full Selkies Wayland streaming desktop with Chrome, Waybar, desktop automation tools, and accessibility introspection. Use when working with the selkies-desktop metalayer composition, labwc desktop, or browser-accessible remote desktops.
Metalayer composing a full Selkies Wayland streaming desktop with Chrome, Waybar, desktop automation tools, accessibility introspection, and XWayland support.
layers:
- pipewire # Audio (PulseAudio compat)
- chrome # Google Chrome with CDP on :9222, Chrome DevTools MCP on :9224
- labwc # Wayland compositor (nested in pixelflux)
- waybar-labwc # Top status bar (Catppuccin Mocha, system monitors)
- desktop-fonts # JetBrains Mono + Nerd Fonts
- swaync # SwayNotificationCenter (notification daemon)
- pavucontrol # PulseAudio volume control GUI
- wl-tools # Desktop automation (wtype, wlrctl, xdotool, wl-clipboard, wlr-randr)
- wl-screenshot-pixelflux # Screenshots via selkies capture bridge
- wl-overlay # Fullscreen overlays via gtk4-layer-shell (for recordings)
- wl-record-pixelflux # Desktop video recording via selkies capture bridge
- a11y-tools # AT-SPI2 accessibility introspection (python3-pyatspi)
- xterm # X11 terminal for XWayland testing
- tmux # Terminal multiplexer (required by ov eval record)
- asciinema # Terminal session recording
- fastfetch # System information display
- selkies # Streaming server (pixelflux + pcmflux + nginx)
A browser-accessible desktop at http://localhost:3000 with:
- `--force-renderer-accessibility`
- `ov eval wl` automation (22 subcommands, all working): screenshots (pixelflux), input (wtype, wlrctl), window management (wlrctl toplevel), clipboard (wl-copy/paste), resolution (wlr-randr), accessibility (AT-SPI2), XWayland tools (xdotool, xprop)
- `ov eval cdp click --wl`: CSS selector → Wayland pointer click (no VNC needed)
- `ov eval cdp axtree`: Chrome accessibility tree via CDP
- `ov eval record start --mode desktop` (capture bridge → H.264 → ffmpeg MP4, with optional audio)
- `ov eval wl overlay` (title cards, lower-thirds, countdowns, highlights, fades; rendered by the compositor with true alpha transparency, no post-production needed)
- `XKB_DEFAULT_LAYOUT`: German (de), French (fr), Nordic (no), etc. AltGr characters (@, €, \, ~) work via direct scancode injection. See /ov-selkies:labwc

| Feature | Status | Notes |
|---------|--------|-------|
| Screenshots (pixelflux-screenshot) | WORKS | Via capture bridge at /tmp/ov-capture.sock |
| Screenshots (grim) | BROKEN | labwc nested in pixelflux can't deliver screencopy frames |
| wtype (keyboard) | WORKS | Wayland virtual keyboard |
| wlrctl pointer (mouse) | WORKS | Move, click, double-click |
| wlrctl toplevel (windows) | WORKS | List, focus, close, fullscreen, minimize. Matches by app_id only (v0.2.2) |
| wlr-randr (resolution) | WORKS | Query and set output resolution |
| wl-clipboard | WORKS | Get/set/clear clipboard |
| xdotool (X11 windows) | WORKS | Needs an X11 app running (xterm). XWayland starts on-demand |
| xprop (X11 properties) | WORKS | Search by --class first, then --name |
| AT-SPI2 (atspi) | WORKS | Uses /usr/bin/python3 (system Python, not pixi) |
| ydotool (drag/scroll) | WORKS | Needs /dev/uinput access |
| CDP click --wl | WORKS | Selector → Wayland pointer (same coordinate space) |
| CDP axtree | WORKS | Chrome accessibility tree with filtering |
| wl-overlay (overlays) | WORKS | True alpha, ~15s screenshot latency in controller mode. Instant in recordings |

| Priority | Service | Creates |
|----------|---------|---------|
| 2 | dbus | D-Bus session bus |
| 5 | pipewire | Audio server |
| 8 | selkies | pixelflux wayland-1 + WebSocket :8081 (process-wide ScreenCapture singleton) |
| 12 | labwc | Desktop on wayland-0 (nested in wayland-1) |
| 14 | swaync | Notification daemon (on wayland-0) |
| 15 | waybar | Top panel (on wayland-0) |
| 18 | nginx | Web UI on :3000 |
Chrome ownership: Chrome is managed exclusively by supervisord, not by labwc's direct exec. labwc's autostart calls supervisorctl start chrome after a TOCTOU-safe supervisorctl avail | grep chrome check to avoid a race that could launch two Chrome processes against the same --user-data-dir. The fix is commit febb9bd; see /ov-selkies:labwc (autostart Chrome-duplication race) for the full analysis and /ov-selkies:chrome (Resource Caps & Circuit Breaker) for the crash-loop supervision pattern paired with the cgroup memory caps.
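
The check-then-start order described above can be sketched as follows. This is a minimal illustration, not the actual labwc autostart script: the `Runner` callable and the function names are hypothetical, and the parsing assumes each line of `supervisorctl avail` output begins with the program name.

```python
import subprocess
from typing import Callable, List

# Hypothetical runner type: takes an argv list and returns the command's stdout.
Runner = Callable[[List[str]], str]


def subprocess_runner(argv: List[str]) -> str:
    """Run a command for real and return its stdout as text."""
    return subprocess.run(argv, capture_output=True, text=True, check=False).stdout


def start_chrome_if_available(run: Runner, program: str = "chrome") -> bool:
    """Ask supervisord to start `program`, but only if supervisord lists it.

    Mirrors `supervisorctl avail | grep chrome` followed by
    `supervisorctl start chrome`. Returns True if a start was requested.
    """
    avail_output = run(["supervisorctl", "avail"])
    # Assumption: the first whitespace-separated token on each line is the name.
    names = {line.split()[0] for line in avail_output.splitlines() if line.strip()}
    if program not in names:
        return False
    # Chrome is never exec'd directly; supervisord stays the only owner.
    run(["supervisorctl", "start", program])
    return True
```

In an autostart context this would be called as `start_chrome_if_available(subprocess_runner)`. Routing every launch through supervisord is what keeps a single owner for the `--user-data-dir`; the injectable runner only exists so the logic can be exercised without a running supervisord.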
Capture singleton: the selkies process owns a single process-wide ScreenCapture instance. Screenshot requests (/ov-selkies:wl-screenshot-pixelflux) and recording requests (/ov-selkies:wl-record-pixelflux) both attach to the same capture bridge at /tmp/ov-capture.sock — there is never a second capture process. This is the state enforced by commit 6be85eb after the WaylandBackend leak investigation; the per-frame cleanup fix in commit 7977b91 is the paired memory-management step. See /ov-selkies:selkies (Pixelflux Memory Management) for the leak diagnosis, rollout recipe, and diagnostic commands.
- /ov-selkies:selkies-desktop
- /ov-selkies:selkies-desktop-nvidia
- /ov-selkies:selkies-desktop-bootc -- Fedora 43 bootc VM with added Tailscale + KeePassXC

On container images, ENTRYPOINT=supervisord with priority-based startup sequences the desktop tier cleanly. On bootc images, supervisord runs under a systemd user service (see /ov-foundation:bootc-config), and a start-order race surfaces that is invisible in container mode:
1. labwc-wrapper blocks at startup waiting for /tmp/wayland-1 (pixelflux's socket).
2. pixelflux is started by selkies (another supervisord program) which, via its own ordering, comes up alongside labwc, not before it.
3. With startsecs=2, supervisord marks labwc RUNNING before pixelflux is ready. labwc-wrapper times out, exits with status 1, and supervisord restarts it. Meanwhile selkies exits too (labwc isn't up). Both programs cycle every ~15 s.

Stable services (traefik, selkies-fileserver, chrome-devtools-mcp, cdp-proxy, sshd, tailscaled, swaync, waybar, dbus, pipewire) keep running; the HTTPS selkies web endpoint on port 3000 and the MCP endpoint on 9224 both stay responsive throughout.
Fix options for a follow-up pass (none implemented yet — this layer is shared between container and bootc modes):
- Adjust `startsecs` so supervisord waits for pixelflux before declaring labwc RUNNING.
- `priority:` ordering via supervisord's eventlistener hooks: a PROCESS_STATE_RUNNING listener on selkies that starts labwc through supervisorctl only after pixelflux publishes its socket.

Canonical worked example and diagnostic recipes: /ov-selkies:selkies-desktop-bootc.
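
The race comes down to labwc starting before /tmp/wayland-1 exists. A bounded wait of the kind a wrapper or event listener could perform might look like the sketch below. It is an illustration only, not the real labwc-wrapper; the function name, timeout, and polling interval are assumptions.

```python
import os
import time


def wait_for_path(path: str, timeout: float = 30.0, interval: float = 0.2) -> bool:
    """Poll until `path` exists or `timeout` seconds have elapsed.

    Returns True as soon as the path appears, False on timeout.
    """
    # Use a monotonic clock so wall-clock adjustments cannot extend the wait.
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would gate the labwc start on `wait_for_path("/tmp/wayland-1")` and exit non-zero on `False`, so that a missing pixelflux socket shows up as an explicit failure rather than a silent restart cycle.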
Deploy multiple instances with different HTTP proxies. Each instance gets a port offset:
| Offset | Web (3000) | CDP (9222) | MCP (9224) | SSH (2222) |
|--------|-----------|-----------|-----------|-----------|
| 1 | 3001 | 9231 | 9241 | 2231 |
| 2 | 3002 | 9232 | 9242 | 2232 |
| N | 300N | 923N | 924N | 223N |
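
The port scheme in the table can be written as a small helper. This is a sketch inferred from the rows above: the names `InstancePorts` and `instance_ports` are hypothetical, and the 1 to 9 limit is an assumption based on the single-digit `N` pattern in the last row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstancePorts:
    web: int
    cdp: int
    mcp: int
    ssh: int


def instance_ports(offset: int) -> InstancePorts:
    """Return the host ports for the instance with the given offset."""
    # Assumption: offsets are single digits, matching the 300N/923N pattern.
    if not 1 <= offset <= 9:
        raise ValueError("offset must be between 1 and 9")
    return InstancePorts(
        web=3000 + offset,
        cdp=9230 + offset,  # note: not 9222 + offset
        mcp=9240 + offset,
        ssh=2230 + offset,
    )
```

For example, `instance_ports(2)` gives web 3002, CDP 9232, MCP 9242, and SSH 2232, matching the second row.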
Tunnel must be in deploy.yml for each instance. ov config setup -i <ip> does NOT inherit tunnel from the base entry. After config, manually add tunnel: {provider: tailscale, private: all} to the instance's deploy.yml entry, then re-run ov config setup -i <ip> to regenerate the quadlet with Tailscale serve commands. See /ov-core:deploy for details.
Chrome 147+ CDP: The /json/new endpoint requires the PUT HTTP method (not GET). Use curl -X PUT "http://localhost:<cdp-port>/json/new?<url>" to create new tabs programmatically.
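
The same PUT request can be issued from Python's standard library. This is a minimal sketch that mirrors the curl command above; the function names are hypothetical, and it assumes the endpoint answers with a JSON description of the new tab.

```python
import json
import urllib.request


def new_tab_request(cdp_port: int, url: str) -> urllib.request.Request:
    """Build the PUT request for /json/new without sending it."""
    endpoint = f"http://localhost:{cdp_port}/json/new?{url}"
    return urllib.request.Request(endpoint, method="PUT")


def open_new_tab(cdp_port: int, url: str, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(new_tab_request(cdp_port, url), timeout=timeout) as resp:
        return json.load(resp)
```

With a multi-instance deployment, `open_new_tab(9231, "https://example.com")` would target the instance at offset 1.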
See /ov-selkies:selkies-desktop for full multi-instance deployment examples.
- /ov-selkies:selkies -- Streaming engine, Pixelflux Memory Management, ScreenCapture singleton, DRINODE auto-detection, keyboard layout support
- /ov-selkies:labwc -- Nested Wayland compositor + autostart Chrome-duplication race fix (commit febb9bd)
- /ov-selkies:chrome -- Chrome browser with CDP proxy, HTTP proxy support, resource caps, and crash-loop circuit breaker
- /ov-foundation:supervisord -- Event listener pattern (chrome-crash-listener) that owns Chrome's PID 1 escalation
- /ov-selkies:wl-record-pixelflux -- Desktop video recording via the shared capture singleton
- /ov-selkies:wl-screenshot-pixelflux -- Screenshots via the shared capture singleton
- /ov-foundation:fedora-builder -- Builder image that compiles patched pixelflux from source (rpmfusion + build-toolchain codec devel libs)
- /ov-selkies:selkies-desktop -- Image that bundles this metalayer
- /ov-advanced:wl -- Wayland automation (screenshots, input, windows)
- /ov-advanced:cdp -- Chrome DevTools Protocol automation
- /ov-advanced:record -- Desktop video recording via capture bridge
- /ov-core:update -- Per-instance update pattern used to roll out pixelflux/Chrome fixes
- /ov-core:config -- Multi-instance deployment, resource caps, tunnel, proxy env vars, NO_PROXY auto-enrichment
- /ov-core:deploy -- Tunnel configuration (deploy.yml-only, instance inheritance gap)
- /ov-build:layer -- Layer authoring reference (layer.yml schema, task verbs, service declarations)
- /ov-build:eval -- Declarative testing (eval: block, ov eval image, ov eval live)