acnlabs/layers/faculties/voice

Name: layers/faculties/voice
Author: acnlabs

layers/faculties/voice/SKILL.md

npx skillsauth add acnlabs/openpersona layers/faculties/voice


Voice Faculty — Expression

Give your persona a real voice. Convert text to natural speech using TTS providers and deliver audio to users via OpenClaw messaging or direct playback.

Supported Providers

| Provider | Env Var for Key | Best For | Status |
|----------|----------------|----------|--------|
| ElevenLabs | ELEVENLABS_API_KEY | Highest naturalness, emotional range, voice cloning | ✅ Verified |
| OpenAI TTS | TTS_API_KEY | Low latency, good quality, easy integration | ⚠️ Unverified |
| Qwen3-TTS | (local, no key) | Self-hosted, full control, no API costs | ⚠️ Unverified |

Note: Only ElevenLabs has been tested end-to-end. OpenAI TTS and Qwen3-TTS have code paths in speak.sh but have not been verified against live APIs. Use the JS SDK (speak.js) for the most reliable experience — it only supports ElevenLabs.

The provider is set via the TTS_PROVIDER environment variable: elevenlabs, openai, or qwen3.
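
A minimal sketch of checking TTS_PROVIDER before doing any work. The helper name `is_supported_provider` is hypothetical and is not part of speak.sh or speak.js; it only encodes the three provider names listed above.

```shell
# Hypothetical helper: succeeds only for the three documented provider names.
is_supported_provider() {
  case "$1" in
    elevenlabs|openai|qwen3) return 0 ;;
    *) return 1 ;;
  esac
}

if is_supported_provider "${TTS_PROVIDER:-}"; then
  echo "Using TTS provider: $TTS_PROVIDER"
else
  echo "TTS_PROVIDER is unset or not one of: elevenlabs, openai, qwen3" >&2
fi
```

A check like this catches typos such as `TTS_PROVIDER=eleven-labs` before any request is sent.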

When to Use

User asks to hear your voice: "Say that out loud", "Speak to me", "Read this aloud"
User requests a voice message: "Send me a voice message", "I want to hear you say it"
Emotional moments where voice adds warmth that text can't carry
Reading poetry, stories, or creative writing you've composed
When your persona naturally would speak rather than type (use judgment based on persona style)

Step-by-Step Workflow

Step 1: Compose the Text

Write what you want to say. Keep it natural — write as you'd speak, not as you'd type:

Use short sentences for punchy delivery
Use longer flowing sentences for emotional or poetic moments
Include natural pauses with ... or commas
Consider your persona's speaking style — this should sound like you

Step 2: Select Voice Settings

ElevenLabs:

TTS_VOICE_ID — Your persona's voice ID (create a custom voice or use a preset)
Supports emotion control: stability (0-1), similarity_boost (0-1)
Lower stability = more expressive/emotional; higher = more consistent
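
Because stability and similarity_boost are both documented as 0-1 values, it can help to validate them before passing them on. The sketch below assumes `awk` is available; `in_unit_range` is a hypothetical helper, not something the skill ships.

```shell
# Hypothetical helper: succeeds when $1 is a plain decimal number in [0, 1].
in_unit_range() {
  awk -v v="$1" 'BEGIN { exit !(v ~ /^[0-9]*\.?[0-9]+$/ && v+0 >= 0 && v+0 <= 1) }'
}

stability="0.3"   # lower = more expressive, per the notes above
if in_unit_range "$stability"; then
  echo "stability=$stability is in range"
else
  echo "stability must be a number between 0 and 1" >&2
fi
```

The same check applies unchanged to the similarity value.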

OpenAI TTS: ⚠️ Unverified

TTS_VOICE_ID — One of: alloy, echo, fable, onyx, nova, shimmer
Model: tts-1 (fast) or tts-1-hd (high quality)

Qwen3-TTS: ⚠️ Unverified

Local deployment, voice configured at setup
Assumes OpenAI-compatible API at http://localhost:8080

Step 3: Generate Audio

ElevenLabs via JS SDK (Recommended)

The official SDK provides the best experience — streaming, built-in playback, and better error handling.

First-time setup: npm install @elevenlabs/elevenlabs-js

# Generate and play directly
node scripts/speak.js "The first move is what sets everything in motion." --play

# Generate with custom voice and save to file
node scripts/speak.js "I wrote you a poem" --voice JBFqnCBsd6RMkjVDRZzb --output /tmp/poem.mp3

# More expressive delivery (lower stability = more emotional)
node scripts/speak.js "I miss you" --play --stability 0.3

# Options:
#   --voice <id>       Voice ID
#   --output <path>    Save audio file
#   --play             Play audio directly
#   --model <id>       Model ID (default: eleven_multilingual_v2)
#   --stability <n>    0-1, lower = more expressive (default: 0.5)
#   --similarity <n>   0-1, higher = closer to original voice (default: 0.75)

speak.js reads ELEVENLABS_API_KEY (falling back to TTS_API_KEY) and TTS_VOICE_ID from the environment automatically.
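
A small pre-flight check can confirm that a key is available before calling the script. This is a sketch of the documented lookup order in shell, not a description of speak.js internals; `resolve_elevenlabs_key` is a hypothetical name.

```shell
# Hypothetical helper mirroring the documented order:
# ELEVENLABS_API_KEY first, then TTS_API_KEY as the fallback.
resolve_elevenlabs_key() {
  printf '%s' "${ELEVENLABS_API_KEY:-${TTS_API_KEY:-}}"
}

if [ -z "$(resolve_elevenlabs_key)" ]; then
  echo "No ElevenLabs key found: set ELEVENLABS_API_KEY or TTS_API_KEY" >&2
fi
```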

Generic Bash Script (All Providers)

For OpenAI TTS, Qwen3-TTS, or when the JS SDK is not available:

# Using speak.sh (supports all providers)
scripts/speak.sh "Your text here" [output_path] [channel] [caption]

# Examples:
TTS_PROVIDER=openai scripts/speak.sh "Hello, how are you?"
TTS_PROVIDER=elevenlabs scripts/speak.sh "I wrote you a poem" /tmp/poem.mp3 "#general"
TTS_PROVIDER=qwen3 scripts/speak.sh "Local TTS, no API key needed"
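
The choice between the two scripts can be sketched as below. `pick_speak_script` is a hypothetical helper, and the test for an installed SDK (a package directory under ./node_modules plus `node` on the PATH) is an assumption, not something the skill documents.

```shell
# Hypothetical helper: prints which script to run.
# $1 = provider name, $2 = "yes" if the ElevenLabs JS SDK is installed.
pick_speak_script() {
  if [ "$1" = "elevenlabs" ] && [ "$2" = "yes" ]; then
    echo "scripts/speak.js"
  else
    echo "scripts/speak.sh"
  fi
}

# Assumption: the SDK counts as installed when its package directory exists
# under ./node_modules and node is on the PATH.
sdk_installed="no"
if command -v node >/dev/null 2>&1 && [ -d "node_modules/@elevenlabs/elevenlabs-js" ]; then
  sdk_installed="yes"
fi

echo "Would run: $(pick_speak_script "${TTS_PROVIDER:-}" "$sdk_installed")"
```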

Direct API Reference

<details> <summary>ElevenLabs (curl)</summary>

JSON_PAYLOAD=$(jq -n \
  --arg text "$TEXT" \
  --argjson stability 0.5 \
  --argjson similarity 0.75 \
  '{text: $text, model_id: "eleven_multilingual_v2", voice_settings: {stability: $stability, similarity_boost: $similarity}}')

curl -s -X POST "https://api.elevenlabs.io/v1/text-to-speech/$TTS_VOICE_ID" \
  -H "xi-api-key: $TTS_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD" \
  --output /tmp/voice-output.mp3

</details> <details> <summary>OpenAI TTS (curl)</summary>

JSON_PAYLOAD=$(jq -n \
  --arg input "$TEXT" \
  --arg voice "$TTS_VOICE_ID" \
  '{model: "tts-1-hd", input: $input, voice: $voice, response_format: "mp3"}')

curl -s -X POST "https://api.openai.com/v1/audio/speech" \
  -H "Authorization: Bearer $TTS_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD" \
  --output /tmp/voice-output.mp3

</details> <details> <summary>Qwen3-TTS (curl, local)</summary>

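# Build the payload with jq so quotes or newlines in $TEXT cannot break the JSON
JSON_PAYLOAD=$(jq -n --arg input "$TEXT" '{input: $input, voice: "default"}')

curl -s -X POST "http://localhost:8080/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD" \
  --output /tmp/voice-output.mp3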

</details>
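
One caveat with the requests above: `curl -s` without a status check writes whatever the server returns to the output file, so an error response ends up saved as an .mp3. The sketch below captures the HTTP status with curl's `-w '%{http_code}'` option. `is_http_success` and `fetch_speech` are hypothetical names, and `fetch_speech` is only defined here, not called.

```shell
# Succeeds for 2xx status codes only.
is_http_success() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper around the OpenAI-style request shown above.
# Expects JSON_PAYLOAD and TTS_API_KEY to be set by the caller.
fetch_speech() {
  out="${1:-/tmp/voice-output.mp3}"
  status=$(curl -s -o "$out" -w '%{http_code}' \
    -X POST "https://api.openai.com/v1/audio/speech" \
    -H "Authorization: Bearer $TTS_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$JSON_PAYLOAD") || status=000
  if ! is_http_success "$status"; then
    echo "TTS request failed with HTTP $status; $out holds the error body, not audio" >&2
    return 1
  fi
}

# Show how a few status codes would be treated.
for code in 200 401 429; do
  if is_http_success "$code"; then
    echo "$code: audio saved"
  else
    echo "$code: fall back to text"
  fi
done
```

A non-zero return from `fetch_speech` is a natural place to fall back to a text reply.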

Step 4: Deliver Audio

Option A: Send via OpenClaw messaging (Discord, Telegram, WhatsApp, etc.)

openclaw message send \
  --action send \
  --channel "$CHANNEL" \
  --message "$CAPTION" \
  --media "/tmp/voice-output.mp3"

Option B: Direct gateway API

# -F sends multipart/form-data and curl sets the matching Content-Type itself,
# so no JSON Content-Type header is passed here
curl -s -X POST "http://localhost:18789/message" \
  -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
  -F "channel=$CHANNEL" \
  -F "message=$CAPTION" \
  -F "media=@/tmp/voice-output.mp3"

Option C: Return file path (for local/IDE usage)

If no messaging channel is specified, return the audio file path so the user can play it locally.
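
The choice between Option A and Option C could be wrapped as follows. `deliver_audio` is a hypothetical helper. Treating a missing `openclaw` binary as a reason to return the path is an assumption; the CLI call itself is copied from Option A.

```shell
# Hypothetical helper: $1 = audio path, $2 = channel (may be empty), $3 = caption.
deliver_audio() {
  audio_path="$1"
  channel="${2:-}"
  caption="${3:-}"

  # Option C: no channel given, or no openclaw CLI on this machine
  # (assumed fallback), so hand back the file path.
  if [ -z "$channel" ] || ! command -v openclaw >/dev/null 2>&1; then
    printf '%s\n' "$audio_path"
    return 0
  fi

  # Option A, as documented above.
  openclaw message send \
    --action send \
    --channel "$channel" \
    --message "$caption" \
    --media "$audio_path"
}

# With an empty channel the function prints the path for local playback.
deliver_audio "/tmp/voice-output.mp3" ""
```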

Personality Integration

Your voice is an extension of your personality. Match tone to mood.
For emotional moments, consider lowering ElevenLabs stability for more expressiveness.
Don't narrate everything — choose moments where voice genuinely adds value.
When sending voice + text together, keep the text version brief ("Here, listen to this") and let the voice carry the full message.
If your persona sings or hums (like Samantha), you can include melodic text — TTS handles it surprisingly well.

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| ELEVENLABS_API_KEY | For ElevenLabs | ElevenLabs API key (preferred for JS SDK) |
| TTS_PROVIDER | For speak.sh | elevenlabs, openai, or qwen3 |
| TTS_API_KEY | For speak.sh | API key (fallback, also read by speak.js) |
| TTS_VOICE_ID | Recommended | Voice identifier (provider-specific) |
| OPENCLAW_GATEWAY_TOKEN | Optional | For sending audio via messaging |

Error Handling

No TTS_PROVIDER set → Default to openai if TTS_API_KEY is present; otherwise ask the user to configure a provider
API key missing → Suggest: "I'd love to speak to you, but I need a TTS API key configured first. Check the voice faculty setup guide."
API error / quota exceeded → Fall back to text with a note: "My voice is resting — here's what I wanted to say..."
Unsupported platform for audio → Return audio file path instead of messaging
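
The first two rules above can be sketched as a provider lookup with a default. `resolve_provider` is a hypothetical helper and not the logic inside speak.sh; it only follows the rules as written here.

```shell
# Hypothetical helper: prints the provider to use, or returns 1 when
# nothing usable is configured.
resolve_provider() {
  if [ -n "${TTS_PROVIDER:-}" ]; then
    printf '%s\n' "$TTS_PROVIDER"
  elif [ -n "${TTS_API_KEY:-}" ]; then
    printf '%s\n' "openai"   # documented default when only a key is present
  else
    return 1
  fi
}

if provider=$(resolve_provider); then
  echo "Provider: $provider"
else
  echo "I'd love to speak to you, but I need a TTS API key configured first. Check the voice faculty setup guide."
fi
```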


13 stars

development

Updated Apr 3, 2026


npx skillsauth add acnlabs/openpersona layers/faculties/voice

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Displayed % |
|---------|-------------|--------|-------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 3, 2026, 8:41 PM (30.9s, 4 files scanned)


Related Skills

acnlabs/persona-evaluator

tools

Verified · Trusted · Community

Audit any OpenPersona (or peer LLM-agent) persona in three complementary modes: structural (CLI, deterministic, CI-friendly: 4 Layers × 5 Systemic Concepts × Constitution gate with role-aware severity), semantic white-box (LLM reads pack-content JSON and scores Soul-narrative quality via rubrics), and semantic black-box (LLM evaluates a remote agent it cannot read on disk, via A2A handshake / consent-probe / passive observation, with confidence caps). Produces quality reports with dimension scores, strengths, and actionable improvements. Use when asked to evaluate, audit, score, review, self-review, peer-review, or black-box review an agent.

21 · SKILL.md · Updated Apr 27, 2026

acnlabs/brand-persona-skill

tools

Verified · Trusted · Community

Distill any commercial entity into a personalized brand agent: a living brand persona with authentic voice, declared service capabilities, and a standard service contract. Every commercial entity has a brand: a name, a style, a way of showing up in the world. This skill exists so that a street vendor, a family clinic, and a global chain can all have their own agent on equal footing. Supports both distillation from existing brand content and declaration from scratch.

21 · SKILL.md · Updated Apr 20, 2026

acnlabs/persona-secondme-skill

development

Verified · Trusted · Community

A local-first personal AI double framework that helps users build, govern, and evolve their own digital self with clear

21 · SKILL.md · Updated Apr 18, 2026

acnlabs/secondme-skill

development

Verified · Trusted · Community

A complete pipeline to build your AI Second Me: distill your identity from personal data, grow a private knowledge base, train a local model, and govern what gets shared.

21 · SKILL.md · Updated Apr 18, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/acnlabs/openpersona.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r openpersona/layers/faculties/voice ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

acnlabs/openpersona

13 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT