Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

riccardogrin/transcribing-youtube

Name: transcribing-youtube
Author: riccardogrin

skills/transcribing-youtube/SKILL.md

npx skillsauth add riccardogrin/skills transcribing-youtube

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Transcribing YouTube

Download audio from YouTube videos using yt-dlp, transcribe with OpenAI Whisper API, and summarize the content. Transcriptions and summaries are cached in a transcriptions/ folder to avoid redundant API calls.

Reference Files

| File | Read When | |------|-----------| | references/installation.md | Phase 0 reports a missing dependency (yt-dlp, ffmpeg, or API key) | | references/troubleshooting.md | Download fails, transcription errors, codec issues, or rate limiting |

Prerequisites

OPENAI_API_KEY, yt-dlp, ffmpeg. Phase 0 checks all three and references/installation.md covers install steps if anything is missing.

All scripts/ paths are relative to the skill directory — resolve to absolute paths before running.

Workflow Checklist

- [ ] Phase 0: Verify prerequisites
- [ ] Phase 1: Check cache for existing transcription
- [ ] Phase 2: Download audio
- [ ] Phase 3: Transcribe audio
- [ ] Phase 4: Summarize transcription
- [ ] Phase 5: Save and report

Phase 0: Verify Prerequisites

Run these checks. If anything is MISSING, read references/installation.md and install it before proceeding.

python -c "
import os, shutil
from pathlib import Path
key = os.environ.get('OPENAI_API_KEY', '')
if not key:
    env = Path('.env')
    if env.exists():
        for line in env.read_text().splitlines():
            if line.strip().startswith('OPENAI_API_KEY'):
                key = line.partition('=')[2].strip().strip('\"').strip(\"'\")
print('OPENAI_API_KEY:', 'OK' if key else 'MISSING')
print('yt-dlp:', 'OK' if shutil.which('yt-dlp') else 'MISSING')
print('ffmpeg:', 'OK' if shutil.which('ffmpeg') else 'MISSING')
"

Then ensure dependencies and output directory are ready:

pip install -r <skill-dir>/scripts/requirements.txt
mkdir -p transcriptions

Phase 1: Check Cache

Before downloading anything, check if this video has already been transcribed.

The download script extracts the video ID from the URL. Check for existing files:

ls transcriptions/<VIDEO_ID>_transcript.txt 2>/dev/null && echo "CACHED" || echo "NEW"

Extract the video ID from common URL formats:

https://www.youtube.com/watch?v=VIDEO_ID — the v parameter
https://youtu.be/VIDEO_ID — the path segment
https://www.youtube.com/shorts/VIDEO_ID — the path after shorts/

If cached:

Read the existing transcript from transcriptions/<VIDEO_ID>_transcript.txt
Read the existing summary from transcriptions/<VIDEO_ID>_summary.txt (if it exists)
Skip to Phase 4 if only the transcript exists (to regenerate summary), or Phase 5 if both exist
Ask the user if they want to re-transcribe (e.g., if the previous result was poor)

Phase 2: Download Audio

Download audio only (no video) to minimize bandwidth and storage.

python scripts/download_audio.py --url "YOUTUBE_URL" --output-dir transcriptions

The script:

Downloads the best available audio stream
Converts to mp3 (Whisper accepts it and it's ~10x smaller than WAV)
Splits files larger than 25MB into chunks (Whisper API limit)
Outputs metadata: title, duration, video ID, file path(s)
Names files as <VIDEO_ID>.mp3 (or <VIDEO_ID>_chunk_001.mp3 etc. if split)

Output format:

VIDEO_ID: dQw4w9WgXcQ
TITLE: Rick Astley - Never Gonna Give You Up
DURATION: 213
FILES: transcriptions/dQw4w9WgXcQ.mp3
STATUS: OK

If the video is longer than 3 hours, warn the user — transcription will be expensive and slow. Ask for confirmation before proceeding.

Phase 3: Transcribe Audio

Transcribe the downloaded audio using OpenAI's Whisper API.

python scripts/transcribe_audio.py --input transcriptions/<VIDEO_ID>.mp3 --output transcriptions/<VIDEO_ID>_transcript.txt

The script:

Sends audio to OpenAI Whisper API (whisper-1 model)
Handles chunked files automatically (concatenates results)
Saves raw transcript text to the output file
Includes timestamps if --timestamps flag is passed

For chunked audio (files that were split in Phase 2):

python scripts/transcribe_audio.py --input-dir transcriptions --video-id <VIDEO_ID> --output transcriptions/<VIDEO_ID>_transcript.txt

This mode auto-discovers all <VIDEO_ID>_chunk_*.mp3 files and transcribes them in order.

Output:

WORDS: 3847
DURATION_SECONDS: 213
OUTPUT: transcriptions/dQw4w9WgXcQ_transcript.txt
STATUS: OK

After transcription completes, clean up audio files to save disk space:

rm transcriptions/<VIDEO_ID>.mp3 transcriptions/<VIDEO_ID>_chunk_*.mp3 2>/dev/null

Phase 4: Summarize Transcription

Read the transcript file and produce a summary. This is done by you (the agent), not a script.

Audience: Write for a smart, knowledgeable reader. Don't pad with filler or restate the obvious. Focus on genuinely valuable insights, novel arguments, and actionable information. Omit trivial details, pleasantries, sponsor reads, advertisements, and anything a thoughtful person could infer from context. If the video references specific tools, websites, repos, or resources that viewers would find useful, mention them. For tutorials and how-to content, preserve specific values, thresholds, and step sequences — these are the primary value.

Read transcriptions/<VIDEO_ID>_transcript.txt
Produce a summary that includes:
- Title of the video
- Key points — bulleted list of the most substantive ideas, arguments, or takeaways. Scale with content length: a 10-minute video might warrant 3-5 bullets, a 2-hour podcast might need 15-20. Each bullet should convey real information, not vague topic labels. Prefer "X works because Y" over "Discusses X"
- Detailed summary — length should scale with the content. A short video gets a paragraph or two; a long podcast gets as many paragraphs as needed to do justice to the material. Lead with the core thesis or most important insight. Prioritize novel or non-obvious information over background context the reader likely already knows
- Notable quotes — direct quotes that are especially insightful, surprising, or well-stated. Include as many as are genuinely worth preserving — could be 1 for a short video, could be 10+ for a rich long-form conversation. Skip generic motivational filler
- Metadata — duration, word count, date transcribed
Save to transcriptions/<VIDEO_ID>_summary.txt

Summary format:

# Video Summary: <TITLE>

**Source:** <YOUTUBE_URL>
**Duration:** <DURATION>
**Transcribed:** <DATE>
**Words:** <WORD_COUNT>

## Key Points

- Point 1
- Point 2
- ...

## Summary

<paragraphs scaled to content length>

## Notable Quotes

> "Quote 1"

> "Quote 2"

Phase 5: Save and Report

Confirm all files are saved:
- transcriptions/<VIDEO_ID>_transcript.txt — full transcript
- transcriptions/<VIDEO_ID>_summary.txt — summary
Clean up any remaining temporary audio files
Report to the user:
- Video title and duration
- Key points from the summary
- File paths for transcript and summary
- Note that cached transcriptions will be reused for this video ID

Anti-Patterns

| Avoid | Do Instead | |-------|------------| | Downloading full video | Download audio only (-x flag) — much smaller | | Sending files over 25MB to Whisper | Split into chunks before transcribing | | Re-downloading already transcribed videos | Check cache by video ID first | | Keeping audio files after transcription | Delete WAV files to save disk space | | Transcribing 3+ hour videos without asking | Warn user about cost and get confirmation | | Using youtube-dl (unmaintained) | Use yt-dlp (actively maintained fork) | | Hardcoding API keys in scripts | Load from env var or .env file | | Summarizing without reading the full transcript | Always read the complete transcript before summarizing | | Generating summary in a script | Use the agent (Claude) for summarization — better quality |

Dependency Note

This skill requires yt-dlp, openai, and ffmpeg. Phase 0 checks availability; references/installation.md handles the rest. The only manual step for the user is providing OPENAI_API_KEY.

riccardogrin/transcribing-youtube

skills/transcribing-youtube/SKILL.md

Downloads YouTube videos, transcribes audio via OpenAI Whisper, and produces summaries stored locally. Covers yt-dlp download, audio extraction, transcription, caching, and summarization. Use when a YouTube link is shared and the user wants a transcript or summary

4 stars

data-ai

Updated Apr 11, 2026

$ install --global

skillsauth

npx skillsauth add riccardogrin/skills transcribing-youtube

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 11, 2026, 10:55 PM6.7s6 files scanned

SKILL.md

name:: transcribing-youtube
description:: Downloads YouTube videos, transcribes audio via OpenAI Whisper, and produces summaries stored locally. Covers yt-dlp download, audio extraction, transcription, caching, and summarization. Use when a YouTube link is shared and the user wants a transcript or summary

Transcribing YouTube

Reference Files

Prerequisites

OPENAI_API_KEY, yt-dlp, ffmpeg. Phase 0 checks all three and references/installation.md covers install steps if anything is missing.

All scripts/ paths are relative to the skill directory — resolve to absolute paths before running.

Workflow Checklist

- [ ] Phase 0: Verify prerequisites
- [ ] Phase 1: Check cache for existing transcription
- [ ] Phase 2: Download audio
- [ ] Phase 3: Transcribe audio
- [ ] Phase 4: Summarize transcription
- [ ] Phase 5: Save and report

Phase 0: Verify Prerequisites

Run these checks. If anything is MISSING, read references/installation.md and install it before proceeding.

python -c "
import os, shutil
from pathlib import Path
key = os.environ.get('OPENAI_API_KEY', '')
if not key:
    env = Path('.env')
    if env.exists():
        for line in env.read_text().splitlines():
            if line.strip().startswith('OPENAI_API_KEY'):
                key = line.partition('=')[2].strip().strip('\"').strip(\"'\")
print('OPENAI_API_KEY:', 'OK' if key else 'MISSING')
print('yt-dlp:', 'OK' if shutil.which('yt-dlp') else 'MISSING')
print('ffmpeg:', 'OK' if shutil.which('ffmpeg') else 'MISSING')
"

Then ensure dependencies and output directory are ready:

pip install -r <skill-dir>/scripts/requirements.txt
mkdir -p transcriptions

Phase 1: Check Cache

Before downloading anything, check if this video has already been transcribed.

The download script extracts the video ID from the URL. Check for existing files:

ls transcriptions/<VIDEO_ID>_transcript.txt 2>/dev/null && echo "CACHED" || echo "NEW"

Extract the video ID from common URL formats:

https://www.youtube.com/watch?v=VIDEO_ID — the v parameter
https://youtu.be/VIDEO_ID — the path segment
https://www.youtube.com/shorts/VIDEO_ID — the path after shorts/

If cached:

Read the existing transcript from transcriptions/<VIDEO_ID>_transcript.txt
Read the existing summary from transcriptions/<VIDEO_ID>_summary.txt (if it exists)
Skip to Phase 4 if only the transcript exists (to regenerate summary), or Phase 5 if both exist
Ask the user if they want to re-transcribe (e.g., if the previous result was poor)

Phase 2: Download Audio

Download audio only (no video) to minimize bandwidth and storage.

python scripts/download_audio.py --url "YOUTUBE_URL" --output-dir transcriptions

The script:

Downloads the best available audio stream
Converts to mp3 (Whisper accepts it and it's ~10x smaller than WAV)
Splits files larger than 25MB into chunks (Whisper API limit)
Outputs metadata: title, duration, video ID, file path(s)
Names files as <VIDEO_ID>.mp3 (or <VIDEO_ID>_chunk_001.mp3 etc. if split)

Output format:

VIDEO_ID: dQw4w9WgXcQ
TITLE: Rick Astley - Never Gonna Give You Up
DURATION: 213
FILES: transcriptions/dQw4w9WgXcQ.mp3
STATUS: OK

If the video is longer than 3 hours, warn the user — transcription will be expensive and slow. Ask for confirmation before proceeding.

Phase 3: Transcribe Audio

Transcribe the downloaded audio using OpenAI's Whisper API.

python scripts/transcribe_audio.py --input transcriptions/<VIDEO_ID>.mp3 --output transcriptions/<VIDEO_ID>_transcript.txt

The script:

Sends audio to OpenAI Whisper API (whisper-1 model)
Handles chunked files automatically (concatenates results)
Saves raw transcript text to the output file
Includes timestamps if --timestamps flag is passed

For chunked audio (files that were split in Phase 2):

python scripts/transcribe_audio.py --input-dir transcriptions --video-id <VIDEO_ID> --output transcriptions/<VIDEO_ID>_transcript.txt

This mode auto-discovers all <VIDEO_ID>_chunk_*.mp3 files and transcribes them in order.

Output:

WORDS: 3847
DURATION_SECONDS: 213
OUTPUT: transcriptions/dQw4w9WgXcQ_transcript.txt
STATUS: OK

After transcription completes, clean up audio files to save disk space:

rm transcriptions/<VIDEO_ID>.mp3 transcriptions/<VIDEO_ID>_chunk_*.mp3 2>/dev/null

Phase 4: Summarize Transcription

Read the transcript file and produce a summary. This is done by you (the agent), not a script.

Read transcriptions/<VIDEO_ID>_transcript.txt
Produce a summary that includes:
- Title of the video
- Key points — bulleted list of the most substantive ideas, arguments, or takeaways. Scale with content length: a 10-minute video might warrant 3-5 bullets, a 2-hour podcast might need 15-20. Each bullet should convey real information, not vague topic labels. Prefer "X works because Y" over "Discusses X"
- Detailed summary — length should scale with the content. A short video gets a paragraph or two; a long podcast gets as many paragraphs as needed to do justice to the material. Lead with the core thesis or most important insight. Prioritize novel or non-obvious information over background context the reader likely already knows
- Notable quotes — direct quotes that are especially insightful, surprising, or well-stated. Include as many as are genuinely worth preserving — could be 1 for a short video, could be 10+ for a rich long-form conversation. Skip generic motivational filler
- Metadata — duration, word count, date transcribed
Save to transcriptions/<VIDEO_ID>_summary.txt

Summary format:

# Video Summary: <TITLE>

**Source:** <YOUTUBE_URL>
**Duration:** <DURATION>
**Transcribed:** <DATE>
**Words:** <WORD_COUNT>

## Key Points

- Point 1
- Point 2
- ...

## Summary

<paragraphs scaled to content length>

## Notable Quotes

> "Quote 1"

> "Quote 2"

Phase 5: Save and Report

Confirm all files are saved:
- transcriptions/<VIDEO_ID>_transcript.txt — full transcript
- transcriptions/<VIDEO_ID>_summary.txt — summary
Clean up any remaining temporary audio files
Report to the user:
- Video title and duration
- Key points from the summary
- File paths for transcript and summary
- Note that cached transcriptions will be reused for this video ID

Anti-Patterns

Dependency Note

This skill requires yt-dlp, openai, and ffmpeg. Phase 0 checks availability; references/installation.md handles the rest. The only manual step for the user is providing OPENAI_API_KEY.

Related Skills

riccardogrin/reviewing-code

development

VerifiedTrustedCommunity

Runs an adversarial code review loop that spawns independent reviewer and fixer subagents, iterating until only nitpicks remain. Scores findings by confidence, fixes real issues, and re-reviews with fresh eyes — all internal, no GitHub comments. Use when asked to review code, self-review, adversarial review, or polish code before pushing

4SKILL.mdUpdated Apr 11, 2026

riccardogrin/reviewing-code

riccardogrin/planning

development

VerifiedTrustedCommunity

Creates implementation-ready plans through discovery interviews, external research, and codebase analysis. Covers requirements, competitor research, architecture decisions, and change sequencing. Use when planning features, roadmaps, specs, or any work that needs discovery before coding

4SKILL.mdUpdated Apr 11, 2026

riccardogrin/planning

riccardogrin/planning-loop

development

VerifiedTrustedCommunity

Generates an autonomous game design loop that iteratively expands a game concept into a comprehensive vision and implementation plan across multiple sessions. Covers mechanic exploration, system design, competitor research, and plan generation. Use when developing a game idea from seed concept to full implementation plan

4SKILL.mdUpdated Apr 11, 2026

riccardogrin/planning-loop

riccardogrin/looping-tasks

testing

VerifiedTrustedCommunity

Generates an autonomous implementation loop that executes tasks from a plan across Claude sessions, with periodic audit passes that inject follow-up tasks. Covers loop script, prompt design, and audit cadence. Use when setting up autonomous task execution or Ralph-style iterative workflows

4SKILL.mdUpdated Apr 11, 2026

riccardogrin/looping-tasks

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/riccardogrin/skills.git

# Copy into Claude Code skills folder (global)
cp -r skills/skills/transcribing-youtube ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

riccardogrin/skills

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT