skills/tubescribe/SKILL.md
YouTube video summarizer with speaker detection, formatted documents, and audio output.
npx skillsauth add alekseiul/sprut-agent-kit tubescribe
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Turn any YouTube video into a polished document + audio summary.
Drop a YouTube link → get a beautiful transcript with speaker labels, key quotes, timestamps that link back to the video, and an audio summary you can listen to on the go.
When user sends a YouTube URL:
DO NOT BLOCK — spawn and move on instantly.
Run setup to check dependencies and configure defaults:
python skills/tubescribe/scripts/setup.py
This checks: summarize CLI, pandoc, ffmpeg, Kokoro TTS
Spawn ONE sub-agent that does the entire pipeline:
sessions_spawn(
task=f"""
## TubeScribe: Process {youtube_url}
⚠️ CRITICAL: Do NOT install any software.
No pip, brew, curl, venv, or binary downloads.
If a tool is missing, STOP and report what's needed.
Run the COMPLETE pipeline — do not stop until all steps are done.
### Step 1: Extract
```bash
python3 skills/tubescribe/scripts/tubescribe.py "{youtube_url}"
```
Note the Source and Output paths printed by the script. Use those exact paths in subsequent steps.
Read the Source path from Step 1 output and note:
Write to the Output path from Step 1:
# **<title>**
## **Participants** — table with bold headers:
| **Name** | **Role** | **Description** |
|----------|----------|-----------------|
## **Summary** — 3-5 paragraphs of prose
## **Key Quotes** — 5 best with clickable YouTube timestamps. Format each as:
"Quote text here." - [12:34](https://www.youtube.com/watch?v=ID&t=754s)
"Another quote." - [25:10](https://www.youtube.com/watch?v=ID&t=1510s)
Use regular dash -, NOT em dash —. Do NOT use blockquotes >. Plain paragraphs only.
## **Viewer Sentiment** (if comments exist)
## **Best Comments** (if comments exist) — Top 5, NO lines between them:
Comment text here.
*- ▲ 123 @AuthorName*
Next comment text here.
*- ▲ 45 @AnotherAuthor*
Attribution line: dash + italic. Just a blank line between comments, NO --- separators.
## **Full Transcript** — merge segments, speaker labels, clickable timestamps
Clean the title for filename (remove special chars), then:
pandoc <output_path> -o ~/Documents/TubeScribe/<safe_title>.docx
Write the summary text to a temp file, then use TubeScribe's built-in audio generation:
# Write summary to temp file (use python3 to write, avoids shell escaping issues)
python3 -c "
text = '''YOUR SUMMARY TEXT HERE'''
with open('<temp_dir>/tubescribe_<video_id>_summary.txt', 'w') as f:
    f.write(text)
"
# Generate audio (auto-detects engine, voice, format from config)
python3 skills/tubescribe/scripts/tubescribe.py \
--generate-audio <temp_dir>/tubescribe_<video_id>_summary.txt \
--audio-output ~/Documents/TubeScribe/<safe_title>_summary
This reads ~/.tubescribe/config.json and uses the configured TTS engine (mlx/kokoro/builtin), voice blend, and speed automatically. Output format (mp3/wav) comes from config.
python3 skills/tubescribe/scripts/tubescribe.py --cleanup <video_id>
open ~/Documents/TubeScribe/
Tell what was created: DOCX name, MP3 name + duration, video stats.
""",
label="tubescribe",
runTimeoutSeconds=900,
cleanup="delete"
)
**After spawning, reply immediately:**
> 🎬 TubeScribe is processing - I'll let you know when it's ready!
Then continue the conversation. The sub-agent notification announces completion.
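The Key Quotes format in the task above pairs an `MM:SS` label with a `t=<seconds>s` URL parameter, and the DOCX step needs a filename-safe title. Here is a minimal Python sketch of both rules, for illustration only. `timestamp_link` and `safe_title` are hypothetical helper names, not functions from the TubeScribe scripts, and the exact set of characters TubeScribe strips from titles is an assumption.

```python
import re


def timestamp_link(video_id: str, seconds: int) -> str:
    """Format a clickable timestamp such as [12:34](...&t=754s)."""
    minutes, secs = divmod(seconds, 60)
    label = f"{minutes}:{secs:02d}"
    return f"[{label}](https://www.youtube.com/watch?v={video_id}&t={seconds}s)"


def safe_title(title: str) -> str:
    """Drop characters that are awkward in filenames (assumed rule)."""
    # Keep word characters, whitespace, dots, and hyphens; remove the rest.
    cleaned = re.sub(r"[^\w\s.-]", "", title)
    # Collapse runs of whitespace left behind by the removal.
    return re.sub(r"\s+", " ", cleaned).strip()


print(timestamp_link("ID", 754))
print(safe_title('Why "AI" Matters: Part 1/2?'))
```

The label here stays in minutes and seconds even past the one-hour mark (for example `75:00`); whether TubeScribe switches to `H:MM:SS` for long videos is not stated in this document.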
## Configuration
Config file: `~/.tubescribe/config.json`
```json
{
  "output": {
    "folder": "~/Documents/TubeScribe",
    "open_folder_after": true,
    "open_document_after": false,
    "open_audio_after": false
  },
  "document": {
    "format": "docx",
    "engine": "pandoc"
  },
  "audio": {
    "enabled": true,
    "format": "mp3",
    "tts_engine": "mlx"
  },
  "mlx_audio": {
    "path": "~/.claudeclaw/tools/mlx-audio",
    "model": "mlx-community/Kokoro-82M-bf16",
    "voice": "af_heart",
    "lang_code": "a",
    "speed": 1.05
  },
  "kokoro": {
    "path": "~/.claudeclaw/tools/kokoro",
    "voice_blend": { "af_heart": 0.6, "af_sky": 0.4 },
    "speed": 1.05
  },
  "processing": {
    "subagent_timeout": 600,
    "cleanup_temp_files": true
  }
}
```
| Option | Default | Description |
|--------|---------|-------------|
| output.folder | ~/Documents/TubeScribe | Where to save files |
| output.open_folder_after | true | Open output folder when done |
| output.open_document_after | false | Auto-open generated document |
| output.open_audio_after | false | Auto-open generated audio summary |

| Option | Default | Values | Description |
|--------|---------|--------|-------------|
| document.format | docx | docx, html, md | Output format |
| document.engine | pandoc | pandoc | Converter for DOCX (falls back to HTML) |

| Option | Default | Values | Description |
|--------|---------|--------|-------------|
| audio.enabled | true | true, false | Generate audio summary |
| audio.format | mp3 | mp3, wav | Audio format (mp3 needs ffmpeg) |
| audio.tts_engine | mlx | mlx, kokoro, builtin | TTS engine (mlx = fastest on Apple Silicon) |

| Option | Default | Description |
|--------|---------|-------------|
| mlx_audio.path | ~/.claudeclaw/tools/mlx-audio | mlx-audio venv location |
| mlx_audio.model | mlx-community/Kokoro-82M-bf16 | MLX model to use |
| mlx_audio.voice | af_heart | Voice preset (used if no voice_blend) |
| mlx_audio.voice_blend | {af_heart: 0.6, af_sky: 0.4} | Custom voice mix (weighted blend) |
| mlx_audio.lang_code | a | Language code (a=US English) |
| mlx_audio.speed | 1.05 | Playback speed (1.0 = normal, 1.05 = 5% faster) |

| Option | Default | Description |
|--------|---------|-------------|
| kokoro.path | ~/.claudeclaw/tools/kokoro | Kokoro repo location |
| kokoro.voice_blend | {af_heart: 0.6, af_sky: 0.4} | Custom voice mix |
| kokoro.speed | 1.05 | Playback speed (1.0 = normal, 1.05 = 5% faster) |

| Option | Default | Description |
|--------|---------|-------------|
| processing.subagent_timeout | 600 | Seconds for sub-agent (increase for long videos) |
| processing.cleanup_temp_files | true | Remove /tmp files after completion |

| Option | Default | Description |
|--------|---------|-------------|
| comments.max_count | 50 | Number of comments to fetch |
| comments.timeout | 90 | Timeout for comment fetching (seconds) |

| Option | Default | Description |
|--------|---------|-------------|
| queue.stale_minutes | 30 | Consider a processing job stale after this many minutes |
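
The option tables use dotted names such as `audio.format` for keys nested in `~/.tubescribe/config.json`. The sketch below shows one way a script could layer a user config over defaults and resolve those dotted names. It is an illustration under assumptions, not TubeScribe's actual loader: `DEFAULTS` copies only a subset of the documented defaults, and `deep_merge`, `load_config`, and `get_option` are hypothetical names.

```python
import json
from pathlib import Path

# Subset of the documented defaults, copied from the option tables.
DEFAULTS = {
    "output": {"folder": "~/Documents/TubeScribe", "open_folder_after": True},
    "document": {"format": "docx", "engine": "pandoc"},
    "audio": {"enabled": True, "format": "mp3", "tts_engine": "mlx"},
    "processing": {"subagent_timeout": 600, "cleanup_temp_files": True},
}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: Path = Path("~/.tubescribe/config.json")) -> dict:
    """Read the user config if it exists and fill gaps from DEFAULTS."""
    path = path.expanduser()
    user = json.loads(path.read_text()) if path.is_file() else {}
    return deep_merge(DEFAULTS, user)


def get_option(config: dict, dotted: str):
    """Resolve a dotted option name such as 'audio.format'."""
    node = config
    for part in dotted.split("."):
        node = node[part]
    return node


# A user file that only overrides the audio format keeps every other default.
config = deep_merge(DEFAULTS, {"audio": {"format": "wav"}})
print(get_option(config, "audio.format"))      # wav
print(get_option(config, "audio.tts_engine"))  # mlx
```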
~/Documents/TubeScribe/
├── {Video Title}.html # Formatted document (or .docx / .md)
└── {Video Title}_summary.mp3 # Audio summary (or .wav)
After generation, TubeScribe opens the output folder (not individual files) so you can access everything.
Required:
- summarize CLI — brew install steipete/tap/summarize

Optional (better quality):
- pandoc — DOCX output: brew install pandoc
- ffmpeg — MP3 audio: brew install ffmpeg
- yt-dlp — YouTube comments: brew install yt-dlp
- pip install mlx-audio (uses MLX backend for Kokoro)

TubeScribe checks these locations for yt-dlp (in order):
| Priority | Path | Source |
|----------|------|--------|
| 1 | which yt-dlp | System PATH |
| 2 | /opt/homebrew/bin/yt-dlp | Homebrew (Apple Silicon) |
| 3 | /usr/local/bin/yt-dlp | Homebrew (Intel) / Linux |
| 4 | ~/.local/bin/yt-dlp | pip install --user |
| 5 | ~/.local/pipx/venvs/yt-dlp/bin/yt-dlp | pipx |
| 6 | ~/.claudeclaw/tools/yt-dlp/yt-dlp | TubeScribe auto-install |
If not found, setup downloads a standalone binary to the tools directory. The tools directory version doesn't conflict with system installations.
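
The lookup order in the table can be sketched as a PATH search followed by a list of fixed fallback locations. This is a minimal illustration, assuming the first executable match wins; `find_executable` is a hypothetical name and not a function from the TubeScribe scripts.

```python
import os
import shutil
from pathlib import Path
from typing import Optional, Sequence

# Fallback locations from the table, in priority order (rows 2 to 6).
YT_DLP_CANDIDATES = [
    "/opt/homebrew/bin/yt-dlp",               # Homebrew (Apple Silicon)
    "/usr/local/bin/yt-dlp",                  # Homebrew (Intel) / Linux
    "~/.local/bin/yt-dlp",                    # pip install --user
    "~/.local/pipx/venvs/yt-dlp/bin/yt-dlp",  # pipx
    "~/.claudeclaw/tools/yt-dlp/yt-dlp",      # tools directory
]


def find_executable(name: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the first usable path for name, or None if nothing matches."""
    # Row 1: whatever the system PATH resolves to.
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Remaining rows: fixed locations, checked in the listed order.
    for raw in candidates:
        path = Path(raw).expanduser()
        if path.is_file() and os.access(path, os.X_OK):
            return str(path)
    return None


print(find_executable("yt-dlp", YT_DLP_CANDIDATES))
```

A `None` result corresponds to the "not found" case described above.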
When user sends multiple YouTube URLs while one is processing:
python skills/tubescribe/scripts/tubescribe.py --queue-status
# Add to queue instead of starting parallel processing
python skills/tubescribe/scripts/tubescribe.py --queue-add "NEW_URL"
# → Replies: "📋 Added to queue (position 2)"
# Check if more in queue
python skills/tubescribe/scripts/tubescribe.py --queue-next
# → Automatically pops and processes next URL
| Command | Description |
|---------|-------------|
| --queue-status | Show what's processing + queued items |
| --queue-add URL | Add URL to queue |
| --queue-next | Process next item from queue |
| --queue-clear | Clear entire queue |
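
The queue commands behave like a first-in, first-out list that survives between invocations. The sketch below models that behavior with a small JSON file. It is a hypothetical illustration, not TubeScribe's implementation: the `UrlQueue` class, the file layout, and the way positions are counted (waiting items only, starting at 1) are all assumptions.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional


class UrlQueue:
    """File-backed FIFO queue stored as a JSON list (assumed layout)."""

    def __init__(self, path: Path):
        self.path = path

    def _load(self) -> List[str]:
        return json.loads(self.path.read_text()) if self.path.is_file() else []

    def _save(self, items: List[str]) -> None:
        self.path.write_text(json.dumps(items))

    def add(self, url: str) -> int:
        """Like --queue-add: append a URL and return its 1-based position."""
        items = self._load()
        items.append(url)
        self._save(items)
        return len(items)

    def next(self) -> Optional[str]:
        """Like --queue-next: pop the oldest URL, or None if the queue is empty."""
        items = self._load()
        if not items:
            return None
        url = items.pop(0)
        self._save(items)
        return url

    def status(self) -> List[str]:
        """Like --queue-status: list the waiting URLs."""
        return self._load()

    def clear(self) -> None:
        """Like --queue-clear: drop every waiting URL."""
        self._save([])


# Demo in a temporary directory so no real queue file is touched.
with tempfile.TemporaryDirectory() as tmp:
    queue = UrlQueue(Path(tmp) / "queue.json")
    queue.add("url1")
    queue.add("url2")
    print(queue.next())    # url1
    print(queue.status())  # ['url2']
```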
python skills/tubescribe/scripts/tubescribe.py url1 url2 url3
Processes all URLs sequentially with a summary at the end.
The script detects and reports these errors with clear messages:
| Error | Message |
|-------|---------|
| Invalid URL | ❌ Not a valid YouTube URL |
| Private video | ❌ Video is private — can't access |
| Video removed | ❌ Video not found or removed |
| No captions | ❌ No captions available for this video |
| Age-restricted | ❌ Age-restricted video — can't access without login |
| Region-blocked | ❌ Video blocked in your region |
| Live stream | ❌ Live streams not supported — wait until it ends |
| Network error | ❌ Network error — check your connection |
| Timeout | ❌ Request timed out — try again later |
When an error occurs, report it to the user and don't proceed with that video.
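
Combined with sequential batch processing, this rule means a failing video is reported and skipped while the remaining URLs still run. A minimal sketch of that control flow follows. `VideoError`, `process_batch`, and `fake_process` are hypothetical names used only for illustration, and the stand-in processor replaces the real script.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class VideoError(Exception):
    """Hypothetical exception carrying one of the messages from the table."""


def process_batch(
    urls: Iterable[str], process_one: Callable[[str], str]
) -> Dict[str, List[Tuple[str, str]]]:
    """Process URLs in order; record failures and keep going."""
    done: List[Tuple[str, str]] = []
    failed: List[Tuple[str, str]] = []
    for url in urls:
        try:
            done.append((url, process_one(url)))
        except VideoError as err:
            # Report the error and do not proceed with this video.
            failed.append((url, str(err)))
    return {"done": done, "failed": failed}


def fake_process(url: str) -> str:
    """Stand-in for the real pipeline: fails for one marker URL."""
    if url == "no-captions":
        raise VideoError("❌ No captions available for this video")
    return f"{url}.docx"


summary = process_batch(["url1", "no-captions", "url3"], fake_process)
print(len(summary["done"]), "processed,", len(summary["failed"]), "skipped")
for url, message in summary["failed"]:
    print(url, "->", message)
```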
tubescribe url1 url2 url3