.claude/skills/interface-implementation/SKILL.md
Checklist and architectural rules for implementing new assistant interfaces (CLI, Slack, web, etc.). Every interface MUST use the Orchestrator to ensure consistent behaviour: system prompt, tools, skills, memory, and the full ReAct loop. Use this skill when adding, reviewing, or fixing an interface.
All assistant interfaces MUST route user messages through the
Orchestrator (assistant-runtime). Direct LlmProvider::chat_streaming()
calls bypass the system prompt, tools, skills, memory, and ReAct loop —
producing a "dumb chatbot" instead of the full assistant.
User message
  -> Interface (transport layer)
    -> Orchestrator.submit_turn()        <-- REQUIRED
      -> Worker dispatches to one of:
           run_turn()                    (default)
           run_turn_streaming()          (with token sink)
           run_turn_with_tools()         (with extension tools)
        -> System prompt (MemoryLoader)
        -> Tool specs (ToolExecutor + extensions)
        -> Skills XML (SkillRegistry)
        -> ReAct loop (multi-iteration tool calling)
      <- TurnResult { answer, attachments }
    <- Interface renders / delivers response
An interface that calls llm.chat_streaming() directly violates this
invariant and MUST be fixed.
Interface variant

In crates/core/src/types.rs, add a variant to the Interface enum:
pub enum Interface {
    Cli,
    Signal,
    Mcp,
    Slack,
    Mattermost,
    Web, // <-- new
    Scheduler,
}
Every interface needs these components, in order:
AssistantConfig (load from ~/.assistant/config.toml)
-> StorageLayer (SQLite, migrations)
-> SkillRegistry (load embedded + dir-scanned skills)
-> LlmProvider (Ollama / Anthropic / OpenAI)
-> ToolExecutor (storage, llm, registry, config)
-> MessageBus (if config.bus.kind == "nats": NatsMessageBus::connect(&config.bus), else storage.message_bus())
-> Orchestrator (llm, storage, executor, registry, bus, config)
-> executor.set_subagent_runner(orchestrator) // break init cycle
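The last step exists because the executor is built before the orchestrator, yet needs to call back into it for subagents. A minimal sketch of that late-binding pattern follows. The types here are hypothetical stand-ins that share only their names with the real crates; the use of `OnceLock` and `Weak` is an assumption about one reasonable way to break such a cycle, not a description of the actual implementation.

```rust
use std::sync::{Arc, OnceLock, Weak};

/// Hypothetical executor: constructed before the orchestrator exists.
pub struct ToolExecutor {
    // Filled in later; `Weak` avoids an `Arc` reference cycle.
    subagent_runner: OnceLock<Weak<Orchestrator>>,
}

/// Hypothetical orchestrator: needs the executor at construction time.
pub struct Orchestrator {
    pub executor: Arc<ToolExecutor>,
}

impl ToolExecutor {
    pub fn new() -> Self {
        Self { subagent_runner: OnceLock::new() }
    }

    /// Late-binds the runner once the orchestrator has been built.
    /// A second call is ignored because the slot can only be set once.
    pub fn set_subagent_runner(&self, orchestrator: &Arc<Orchestrator>) {
        let _ = self.subagent_runner.set(Arc::downgrade(orchestrator));
    }

    /// Returns an error when the runner was never wired up (or was dropped).
    pub fn spawn_subagent(&self, task: &str) -> Result<String, String> {
        let runner = self
            .subagent_runner
            .get()
            .and_then(|weak| weak.upgrade())
            .ok_or_else(|| "no subagent runner registered".to_string())?;
        Ok(runner.run(task))
    }
}

impl Orchestrator {
    /// Placeholder for running a nested turn on behalf of a subagent.
    pub fn run(&self, task: &str) -> String {
        format!("subagent finished: {task}")
    }
}
```

In this sketch, calling `spawn_subagent` before `set_subagent_runner` returns an error, which mirrors the runtime failure the skill warns about when the wiring step is skipped.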
The assistant_runtime::bootstrap module provides shared helpers:
- load_config(path): loads the config TOML
- skill_dirs(config, project_root): returns the skill search directories
- AutoDenyConfirmation: for non-interactive interfaces

The orchestrator's bus-based processing requires a background worker:
let worker_orch = orchestrator.clone();
tokio::spawn(async move {
    worker_orch.run_worker("web-worker").await;
});
Without this, submit_turn() will publish to the bus but nothing will
claim and process the request.
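The failure mode is easy to reproduce with a toy bus. The sketch below uses std threads and channels as a stand-in for the real message bus and tokio tasks. `TurnRequest`, this `submit_turn`, and `spawn_worker` are hypothetical and model only the publish-then-wait shape described above.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

/// A request carries the user text plus a channel for the reply.
pub struct TurnRequest {
    pub text: String,
    pub reply_to: Sender<String>,
}

/// Publishes a request, then waits for a result or gives up after `timeout`.
pub fn submit_turn(
    bus: &Sender<TurnRequest>,
    text: &str,
    timeout: Duration,
) -> Result<String, RecvTimeoutError> {
    let (reply_to, reply_rx) = channel();
    // Publishing succeeds even when no worker is draining the queue.
    let _ = bus.send(TurnRequest { text: text.to_string(), reply_to });
    reply_rx.recv_timeout(timeout)
}

/// Claims requests from the bus and answers each one until the bus closes.
pub fn spawn_worker(requests: Receiver<TurnRequest>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for req in requests {
            // The submitter may have timed out already; ignore send errors.
            let _ = req.reply_to.send(format!("echo: {}", req.text));
        }
    })
}
```

With the queue's receiving end alive but no worker draining it, `submit_turn` returns `Err(RecvTimeoutError::Timeout)`. Once `spawn_worker` is running, the same call returns the worker's reply.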
Each logical conversation needs a stable Uuid:
| Interface | Key | Strategy |
| ---------- | ------------------------------- | --------------------------- |
| CLI | session | Uuid::new_v4() at startup |
| Slack | (channel_id, thread_ts) | HashMap |
| Mattermost | (channel_id, root_post_id) | LRU cache (10k) |
| Signal | sender phone number | HashMap |
| Web UI | conversation UUID from database | ConversationStore rows |
The orchestrator creates the conversation record lazily inside
prepare_history() via create_conversation_with_id() (upsert).
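For the HashMap strategy in the table, the lookup might look like the sketch below. It is illustrative only: `ConversationId` is a hypothetical stand-in for `Uuid` (so the example needs no external crate), and `ConversationMap` and `id_for` are invented names, not part of the codebase.

```rust
use std::collections::HashMap;

/// Stand-in for `uuid::Uuid` so the sketch compiles with std alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ConversationId(pub u128);

/// Maps a Slack-style thread key to a stable conversation id.
#[derive(Default)]
pub struct ConversationMap {
    ids: HashMap<(String, String), ConversationId>,
    next: u128,
}

impl ConversationMap {
    /// Returns the existing id for this thread, or allocates a new one.
    pub fn id_for(&mut self, channel_id: &str, thread_ts: &str) -> ConversationId {
        let key = (channel_id.to_string(), thread_ts.to_string());
        if let Some(id) = self.ids.get(&key) {
            return *id;
        }
        // Assumption: a real interface would generate a random Uuid here.
        self.next += 1;
        let id = ConversationId(self.next);
        self.ids.insert(key, id);
        id
    }
}
```

Repeated messages in the same thread resolve to the same id, so the orchestrator's lazy upsert sees one conversation per thread.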
| Mode | Registration call | Worker method | Use case |
| ------------------- | ----------------------------- | ----------------------- | ------------------------- |
| Streaming | register_token_sink(id, tx) | run_turn_streaming() | CLI, Signal, Web UI (SSE) |
| Extension tools | register_extensions(id, ..) | run_turn_with_tools() | Slack, Mattermost |
| Fire-and-forget | (none) | run_turn() | Scheduler, MCP |
For web interfaces, streaming via register_token_sink is the natural
fit — pipe tokens to SSE events.
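The table implies that the worker chooses a method based on what was registered for the conversation. A rough sketch of that selection follows. The `Registrations` type, the use of `u128` ids, and the rule that extensions take priority when both are registered are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Which worker method handles the turn (names mirror the table).
#[derive(Debug, PartialEq, Eq)]
pub enum WorkerMethod {
    RunTurn,
    RunTurnStreaming,
    RunTurnWithTools,
}

/// Hypothetical per-conversation registration state.
#[derive(Default)]
pub struct Registrations {
    token_sinks: HashSet<u128>,
    extensions: HashSet<u128>,
}

impl Registrations {
    pub fn register_token_sink(&mut self, conversation_id: u128) {
        self.token_sinks.insert(conversation_id);
    }

    pub fn register_extensions(&mut self, conversation_id: u128) {
        self.extensions.insert(conversation_id);
    }

    /// Picks a worker method; with nothing registered, the default runs.
    pub fn dispatch(&self, conversation_id: u128) -> WorkerMethod {
        if self.extensions.contains(&conversation_id) {
            // Assumed priority: extension tools win over a token sink.
            WorkerMethod::RunTurnWithTools
        } else if self.token_sinks.contains(&conversation_id) {
            WorkerMethod::RunTurnStreaming
        } else {
            WorkerMethod::RunTurn
        }
    }
}
```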
// For streaming:
orchestrator.register_token_sink(conversation_id, token_tx).await;
// For extension tools:
orchestrator.register_extensions(conversation_id, tools, attachments).await;
// Then submit (always):
orchestrator.submit_turn(&user_text, conversation_id, Interface::Web).await?;
submit_turn() publishes to the message bus and blocks (with a 10-minute
timeout) until the worker publishes a TurnResult.
The TurnResult contains:
- answer: String, the final text to show the user
- attachments: Vec<Attachment>, any files produced

For streaming interfaces, tokens arrive via the mpsc::Receiver<String>
in real time. The TurnResult.answer is the authoritative final text
(use it for persistence, not the concatenated tokens).
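For a web interface, each token received on the channel can be turned into a Server-Sent Events frame. The sketch below uses `std::sync::mpsc` rather than the tokio channel the real code presumably uses, and `sse_frame` and `drain_to_sse` are hypothetical helper names. The framing itself follows the SSE wire format: one `data:` field per payload line, and a blank line to end the event.

```rust
use std::sync::mpsc::Receiver;

/// Formats one token as a Server-Sent Events frame.
/// Each payload line gets its own `data:` field; a blank line ends the event.
pub fn sse_frame(token: &str) -> String {
    let mut frame = String::new();
    for line in token.split('\n') {
        frame.push_str("data: ");
        frame.push_str(line);
        frame.push('\n');
    }
    frame.push('\n');
    frame
}

/// Drains the token channel into SSE frames until every sender is dropped.
pub fn drain_to_sse(tokens: Receiver<String>) -> Vec<String> {
    tokens.iter().map(|token| sse_frame(&token)).collect()
}
```

Splitting on newlines matters because a token that contains a line break would otherwise terminate the event early on the client side.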
orchestrator.run_boot(conversation_id, Interface::Web).await?;
This reads ~/.assistant/BOOT.md and submits it as a silent turn if
non-empty. Useful for per-session initialization tasks.
| Feature | Orchestrator | Raw chat_streaming() |
| ---------------------------- | :----------: | :--------------------: |
| System prompt (MemoryLoader) | Yes | No |
| AGENTS.md, SOUL.md, etc. | Yes | No |
| BOOTSTRAP.md (first-run) | Yes | No |
| Skills XML catalog | Yes | No |
| Tool specs + execution | Yes | No |
| ReAct loop (multi-turn) | Yes | No |
| History sanitization | Yes | No |
| OTel tracing spans | Yes | No |
| Confirmation gate | Yes | No |
| Error recovery in history | Yes | No |
- Direct LLM call: llm.chat_streaming(SYSTEM_PROMPT, &history, &[], ...)
  bypasses everything. The assistant won't know its identity, won't have
  tools, and can't execute skills.
- Hardcoded system prompt: "You are a helpful assistant." instead
  of the MemoryLoader's composed prompt. The assistant loses its
  personality, workspace context, and memory.
- Empty tool list: &[] as the tool spec. The assistant can't read
  files, run commands, search the web, or use any skill.
- Missing worker: calling submit_turn() without a spawned
  run_worker() task causes a 10-minute timeout.
- Skipping set_subagent_runner(): subagent spawning (the
  agent-spawn tool) will fail at runtime.
- Double-persisting user messages: if the interface saves the user
  message to the database AND the orchestrator also saves it (via
  prepare_history), the conversation ends up with duplicate user
  entries. Let the orchestrator own persistence; the interface should
  only render the message for display.
- StorageLayer is not Clone: wrap it in Arc<StorageLayer>
  early and share via Arc::clone(). The pool: SqlitePool inside
  it is Clone, so you can still extract it for direct DB access
  (conversation listing, titling, etc.).
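The StorageLayer point can be illustrated with a small std-only sketch. The `StorageLayer` struct here is a hypothetical stand-in with a single field, and `share_storage` is an invented helper; only the `Arc::new` / `Arc::clone` pattern reflects the advice above.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for the real `StorageLayer`; deliberately not `Clone`.
pub struct StorageLayer {
    pub db_path: String,
}

/// Shares one storage instance with several consumers via `Arc`.
pub fn share_storage(storage: StorageLayer, consumers: usize) -> Vec<String> {
    // Wrap once, early; every consumer then gets a cheap pointer clone.
    let storage = Arc::new(storage);
    let handles: Vec<_> = (0..consumers)
        .map(|i| {
            let storage = Arc::clone(&storage);
            thread::spawn(move || format!("consumer {i} uses {}", storage.db_path))
        })
        .collect();
    // Join in spawn order so the output order is deterministic.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```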
The orchestrator's prepare_history() saves the user message and the
ReAct loop saves all assistant / tool messages. Interfaces MUST NOT
duplicate this by saving user or assistant messages themselves.
For the Flutter web frontend, the client optimistically renders the user's message immediately in the UI (before the SSE response arrives). The server does not need to echo the user message back — the SSE stream delivers only assistant tokens.
parse_interface() update

When adding a new Interface variant, also update parse_interface()
in crates/runtime/src/orchestrator.rs. This function deserialises
the variant from the message bus; a missing entry silently falls back
to Interface::Cli.
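One way to guard against the silent fallback is a round-trip check over every variant. The sketch below is self-contained and hypothetical: the `ALL` constant, the `as_str` method, the wire spellings, and this `parse_interface` signature are assumptions for illustration, not the code in crates/runtime/src/orchestrator.rs.

```rust
/// Stand-in for the `Interface` enum in crates/core/src/types.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Interface {
    Cli,
    Signal,
    Mcp,
    Slack,
    Mattermost,
    Web,
    Scheduler,
}

impl Interface {
    /// Every variant, so a check can iterate over them (hypothetical helper).
    pub const ALL: [Interface; 7] = [
        Interface::Cli,
        Interface::Signal,
        Interface::Mcp,
        Interface::Slack,
        Interface::Mattermost,
        Interface::Web,
        Interface::Scheduler,
    ];

    /// Name used on the bus (the spellings are assumed).
    pub fn as_str(self) -> &'static str {
        match self {
            Interface::Cli => "cli",
            Interface::Signal => "signal",
            Interface::Mcp => "mcp",
            Interface::Slack => "slack",
            Interface::Mattermost => "mattermost",
            Interface::Web => "web",
            Interface::Scheduler => "scheduler",
        }
    }
}

/// Mirrors the documented behaviour: unknown names fall back to `Cli`.
pub fn parse_interface(name: &str) -> Interface {
    match name {
        "signal" => Interface::Signal,
        "mcp" => Interface::Mcp,
        "slack" => Interface::Slack,
        "mattermost" => Interface::Mattermost,
        "web" => Interface::Web,
        "scheduler" => Interface::Scheduler,
        _ => Interface::Cli,
    }
}

/// Returns the variants that do not survive a serialise/parse round trip.
pub fn variants_lost_in_round_trip() -> Vec<Interface> {
    Interface::ALL
        .iter()
        .copied()
        .filter(|variant| parse_interface(variant.as_str()) != *variant)
        .collect()
}
```

If a new variant is added to the enum and to `ALL` but not to `parse_interface`, `variants_lost_in_round_trip()` returns it instead of letting it degrade to `Cli` unnoticed.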
- crates/interface-cli/src/main.rs (streaming mode)
- crates/interface-slack/src/runner.rs (extension tools mode)
- crates/interface-mattermost/src/runner.rs (extension tools)
- crates/interface-signal/src/runner.rs (streaming mode)
- crates/web-ui/src/api/chat.rs (SSE streaming endpoint consumed by the Flutter app at app/lib/features/chat/)

All five go through orchestrator.submit_turn().