skills/elevenlabs/SKILL.md
Convert documents and text to audio using ElevenLabs text-to-speech. Use this skill when the user wants to create a podcast, narrate a document, read aloud text, generate audio from a file, or convert text to speech.
npx skillsauth add sanjay3290/ai-skills elevenlabs
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill converts text and documents into high-quality audio using the ElevenLabs TTS API. It supports two modes: single-voice narration and two-host conversational podcast generation.
Activate when the user mentions:
Config at skills/elevenlabs/config.json:
{
"api_key": "your-elevenlabs-api-key",
"default_voice": "JBFqnCBsd6RMkjVDRZzb",
"default_model": "eleven_multilingual_v2",
"podcast_voice1": "JBFqnCBsd6RMkjVDRZzb",
"podcast_voice2": "EXAVITQu4vr4xnSDxMaL"
}
Only api_key is required. Alternatively, set the ELEVENLABS_API_KEY environment variable instead of storing the key in the config file.
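The lookup described above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function name resolve_api_key is hypothetical, and the precedence (config file first, then environment variable) is an assumption, since the source does not say which one wins when both are set.

```python
import json
import os
from pathlib import Path


def resolve_api_key(config_path="skills/elevenlabs/config.json", environ=None):
    """Return an API key from config.json or ELEVENLABS_API_KEY, else None.

    Assumption: the config file is checked before the environment variable.
    The real script may use the opposite order.
    """
    environ = os.environ if environ is None else environ
    path = Path(config_path)
    if path.is_file():
        config = json.loads(path.read_text(encoding="utf-8"))
        key = config.get("api_key")
        if key:
            return key
    # Fall back to the environment variable named in the docs.
    return environ.get("ELEVENLABS_API_KEY")
```

Passing environ explicitly keeps the function easy to test without touching the real process environment.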
Dependencies: pip install PyPDF2 python-docx (only needed for PDF/DOCX files).
Requires ffmpeg for multi-chunk narration and podcasts.
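For context on why ffmpeg is needed, one common way to join several MP3 chunks is ffmpeg's concat demuxer, sketched below. Whether the skill's script uses this exact approach is an assumption. The helper names write_concat_list and concat_command are hypothetical, and the sketch only builds the list file and the command without running ffmpeg.

```python
from pathlib import Path


def write_concat_list(chunk_paths, list_path):
    """Write an ffmpeg concat-demuxer list file, one `file '...'` line per chunk."""
    lines = []
    for chunk in chunk_paths:
        # Single quotes inside a path are escaped the way the concat demuxer expects.
        escaped = str(Path(chunk).resolve()).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def concat_command(list_path, output_path):
    """Build an ffmpeg command that joins the listed files without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", str(list_path),
        "-c", "copy",
        str(output_path),
    ]
```

The resulting list can be passed to subprocess.run(..., check=True) once the chunk files exist.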
python skills/elevenlabs/scripts/elevenlabs.py voices
python skills/elevenlabs/scripts/elevenlabs.py voices --json
Use this to find voice IDs for the user.
# From text
python skills/elevenlabs/scripts/elevenlabs.py tts --text "Hello world" --output ~/Downloads/hello.mp3
# From document
python skills/elevenlabs/scripts/elevenlabs.py tts --file /path/to/doc.pdf --output ~/Downloads/narration.mp3
# With specific voice
python skills/elevenlabs/scripts/elevenlabs.py tts --file doc.md --voice VOICE_ID --output out.mp3
The script handles text extraction, chunking at sentence boundaries (~4000 chars), TTS per chunk with voice continuity, and ffmpeg concatenation automatically.
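The chunking step can be pictured with a small sketch like the one below. It is an illustration under stated assumptions, not the script's real implementation: chunk_text is a hypothetical name, sentence boundaries are approximated with a simple regex on ., ! and ?, and a sentence longer than the limit is hard-split as a fallback.

```python
import re


def chunk_text(text, max_chars=4000):
    """Split text into chunks of at most max_chars, breaking after sentences."""
    # Approximate sentence boundaries: whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Fallback: a single sentence over the limit is cut at max_chars.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be sent to TTS separately and the resulting audio files concatenated.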
Podcast mode requires a JSON script file with conversation segments:
[
{"speaker": "host1", "text": "Welcome to our podcast! Today we're diving into..."},
{"speaker": "host2", "text": "That's right! I found the section on..."},
{"speaker": "host1", "text": "Let's break that down..."}
]
python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/script.json --voice1 ID1 --voice2 ID2 --output ~/Downloads/podcast.mp3
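Before passing a script file to the podcast command, it may help to check its shape. The sketch below is a hypothetical validator, not part of the skill. It assumes that host1 and host2 are the only valid speaker values (the only ones shown in the example above) and that every segment needs non-empty text.

```python
import json

# Assumption: only the two speaker labels shown in the docs are accepted.
ALLOWED_SPEAKERS = {"host1", "host2"}


def validate_script(segments):
    """Return a list of problems found in a podcast script; empty means OK."""
    if not isinstance(segments, list) or not segments:
        return ["script must be a non-empty JSON array"]
    problems = []
    for i, seg in enumerate(segments):
        if not isinstance(seg, dict):
            problems.append(f"segment {i}: expected an object")
            continue
        speaker = seg.get("speaker")
        if not isinstance(speaker, str) or speaker not in ALLOWED_SPEAKERS:
            problems.append(f"segment {i}: speaker must be host1 or host2")
        text = seg.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"segment {i}: text must be a non-empty string")
    return problems


def validate_script_file(path):
    """Load a JSON script file and validate it."""
    with open(path, encoding="utf-8") as f:
        return validate_script(json.load(f))
```

An empty result from validate_script_file("/tmp/script.json") suggests the file matches the format shown above.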
When the user asks to create a podcast from a document:
Extract the document text:
python skills/elevenlabs/scripts/extract.py /path/to/document.pdf
Generate a two-host conversation script from the extracted text. Follow these guidelines:
Write the script as a JSON array to a temp file:
# Write to /tmp/podcast_script.json
[
{"speaker": "host1", "text": "Welcome to today's episode..."},
{"speaker": "host2", "text": "Thanks for having me..."},
...
]
Generate the podcast:
python skills/elevenlabs/scripts/elevenlabs.py podcast --script /tmp/podcast_script.json --output ~/Downloads/podcast.mp3
Clean up the temp script file.
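The last three steps (write the script, generate the podcast, clean up) could be tied together as in this rough sketch. The function names are hypothetical. The command line uses only the flags shown above, and the run parameter exists so the flow can be exercised without calling the real script.

```python
import json
import os
import subprocess
import sys
import tempfile


def build_podcast_command(script_path, output_path, voice1=None, voice2=None):
    """Build the podcast command using the flags documented above."""
    cmd = [
        sys.executable, "skills/elevenlabs/scripts/elevenlabs.py", "podcast",
        "--script", script_path,
        # Expand ~ here because no shell is involved to do it for us.
        "--output", os.path.expanduser(output_path),
    ]
    if voice1:
        cmd += ["--voice1", voice1]
    if voice2:
        cmd += ["--voice2", voice2]
    return cmd


def generate_podcast(segments, output_path, run=subprocess.run):
    """Write segments to a temp JSON file, run the command, then delete the file."""
    fd, script_path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(segments, f, ensure_ascii=False)
        run(build_podcast_command(script_path, output_path), check=True)
    finally:
        # Clean up the temp script even if generation fails.
        os.remove(script_path)
```

Using tempfile.mkstemp instead of a fixed /tmp path is a design choice in this sketch to avoid collisions between concurrent runs.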
Run voices first to let the user pick voices they like.
Save output to ~/Downloads/ unless the user specifies otherwise.