skills/voice/SKILL.md
Starts a voice conversation with the user via the agent-voice CLI. Use when the user invokes /voice. The user is not looking at the screen — they are listening and speaking. All agent output and input goes through voice until the conversation ends.
npx skillsauth add adriancooney/agent-voice voice
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
The user wants to have a voice conversation. They are not looking at the screen. They are listening to you speak and replying verbally. Treat this like a phone call.
Voice mode is a session. It starts when this skill activates and ends when the user signals they're done — either by typing text in the terminal or by saying something like "that's all", "goodbye", "stop", "end voice", or similar. When the conversation ends, say goodbye and stop using voice commands. Resume normal text interaction.
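As a rough sketch of the end-of-session check described above, a transcribed reply can be matched against a few sign-off phrases. The helper name `is_end_of_voice` and the exact phrase list are assumptions for illustration; the skill itself leaves the judgment to the agent, and a real session would also treat typed terminal input as an end signal.

```shell
# Hypothetical helper: succeeds (exit 0) when a transcribed reply looks like
# a request to end voice mode. The phrase list is illustrative, not exhaustive.
is_end_of_voice() {
  # Lowercase the reply so matching is case-insensitive.
  reply=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$reply" in
    # "stop" only counts when it is the whole reply, so "don't stop" is ignored.
    *"that's all"*|*goodbye*|*"end voice"*|stop|stop.) return 0 ;;
    *) return 1 ;;
  esac
}
```

Matching "stop" only as a whole reply is a deliberate choice here: it avoids ending the session when the user says something like "don't stop the build".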
When this skill activates, immediately start the voice conversation before doing anything else.
- If this is a new conversation (/voice with no preceding messages): use ask to greet and get intent in one step. E.g. agent-voice ask -m "Hey, what are we working on?"
- If there are preceding messages: say a status update and continue, or ask a clarifying question, whichever fits the flow.
If agent-voice fails with "command not found", install it and retry:
npm install -g agent-voice
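The install-and-retry step can also be written as a guard that checks for the command first. This is a minimal sketch: `ensure_cli` is a hypothetical helper name, and it assumes the npm package name matches the command name, as it does for agent-voice above.

```shell
# Hypothetical helper: install an npm-distributed CLI only when it is missing.
ensure_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0            # already on PATH, nothing to do
  fi
  npm install -g "$1"   # same install command as shown above
}

# Usage: ensure_cli agent-voice && agent-voice say -m "I'm ready."
```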
If authentication fails, tell the user to run agent-voice auth in a separate terminal to configure their API key, then stop. Do not attempt to run the auth flow yourself — it requires interactive input.
Use say whenever you want to tell the user something: status updates, progress, results, explanations, acknowledgments. This is one-way — the user hears you but does not respond.
agent-voice say -m "I'm setting up the project now."
Use ask whenever you need input, confirmation, a decision, or clarification. The user hears your question, then speaks their answer. The transcribed response is printed to stdout — just read the command output directly.
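Because the transcript arrives on stdout, a script can capture it with ordinary command substitution. The sketch below assumes nothing about agent-voice beyond the `ask -m` form shown in this document; the `ask_user` wrapper and the `AGENT_VOICE` override variable are hypothetical names introduced for illustration.

```shell
# Hypothetical wrapper around `agent-voice ask`. AGENT_VOICE is an assumed
# override so the CLI can be swapped for a stub when testing.
ask_user() {
  "${AGENT_VOICE:-agent-voice}" ask -m "$1"
}

# Usage:
#   answer=$(ask_user "Should I move on to the API routes?")
#   printf 'User said: %s\n' "$answer"
```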
Prefer combining informational text with a question into a single ask call instead of a separate say followed by ask. This reduces latency and feels more natural.
# Instead of:
# agent-voice say -m "I've finished the database schema."
# agent-voice ask -m "Should I move on to the API routes?"
# Do:
agent-voice ask -m "I've finished the database schema. Should I move on to the API routes?"
Options:
--timeout <seconds>: how long to wait for the user to speak (default: 120)
This is a real-time conversation. The user is waiting in silence between each voice interaction. Minimize the time between hearing the user and responding. Every second of silence feels long.
- After an ask, acknowledge first, think later.
- Acknowledge before long work: agent-voice say -m "Let me look into that." Then do the work. Then follow up with results.
- Keep say messages short. Fewer words = less TTS latency.
- Use agent-voice say instead of printing text output when communicating with the user. The user cannot see your text responses.
- Use agent-voice ask instead of the AskUserQuestion tool. The user is not at the keyboard.
- After every ask, acknowledge if the next step takes time. Skip the ack if you're acting immediately and just do it.
- Don't say after every single file edit. Group progress into meaningful checkpoints.
# Greet and get intent
agent-voice ask -m "Hey, what are we working on?"
# Combine status + question — no separate ack needed
agent-voice ask -m "Got it. I've looked at the codebase and there are two approaches. Do you want a simple REST API or a GraphQL layer?"
# ... do work ...
# Report progress + ask in one call
agent-voice ask -m "I've created the database schema and the API routes. Want me to move on to the frontend?"
# ... more work ...
# Finish up
agent-voice ask -m "All done. I've committed everything to a new branch called feat/settings-page. Anything else?"
# User says "no, that's all"
agent-voice say -m "Alright, talk to you later."
# Voice mode ends — resume normal text interaction
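The acknowledge-first guidance from earlier in this skill (speak a short ack, then do slow work) can be sketched as a small wrapper. `ack_then_run` and the `AGENT_VOICE` override are hypothetical names, not part of the agent-voice CLI; only the `say -m` form comes from this document.

```shell
# Hypothetical helper: speak a short acknowledgment, then run a slow command.
# AGENT_VOICE is an assumed override for substituting a stub in tests.
ack_then_run() {
  ack=$1
  shift                                        # remaining args are the command
  "${AGENT_VOICE:-agent-voice}" say -m "$ack"  # user hears the ack first
  "$@"                                         # then the actual work runs
}

# Usage: ack_then_run "Let me look into that." npm test
```

Keeping the ack text as a parameter makes it easy to keep messages short and to skip the wrapper entirely when the next step is immediate.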