skills/capabilities/youtube-apify-transcript/SKILL.md
Fetch YouTube transcripts via the APIFY API. Works from cloud IPs (Hetzner, AWS, etc.) by bypassing YouTube's bot detection. The free tier includes $5 of credits per month, enough for roughly 714 videos at about $0.007 each. No credit card required.
npx skillsauth add athina-ai/goose-skills youtube-apify-transcript
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Fetch YouTube transcripts via the APIFY API (works from cloud IPs and bypasses YouTube bot detection).
YouTube blocks transcript requests from cloud IPs (AWS, GCP, etc.). APIFY runs the request through residential proxies, bypassing bot detection reliably.
# Add to ~/.bashrc or ~/.zshrc
export APIFY_API_TOKEN="apify_api_YOUR_TOKEN_HERE"
# Or use .env file (never commit this!)
echo 'APIFY_API_TOKEN=apify_api_YOUR_TOKEN_HERE' >> .env
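The two setup options above can be read back in Python roughly as follows. This is a minimal sketch, not the script's actual loader: the function name `load_apify_token` and the precedence (environment first, then `.env`) are assumptions made for illustration.

```python
import os
from pathlib import Path


def load_apify_token(env_file=".env"):
    """Return the APIFY token from the environment, falling back to a .env file.

    Hypothetical helper: the real script may resolve the token differently.
    """
    token = os.environ.get("APIFY_API_TOKEN")
    if token:
        return token
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "APIFY_API_TOKEN":
                # Tolerate optional quotes around the value.
                return value.strip().strip("'\"")
    raise RuntimeError("APIFY_API_TOKEN is not set; export it or add it to .env")
```

Failing early with a clear message avoids sending an unauthenticated request and getting a less obvious error back from the API.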
# Get transcript as text (uses cache by default)
python3 scripts/fetch_transcript.py "https://www.youtube.com/watch?v=VIDEO_ID"
# Short URL also works
python3 scripts/fetch_transcript.py "https://youtu.be/VIDEO_ID"
# Output to file
python3 scripts/fetch_transcript.py "URL" --output transcript.txt
# JSON format (includes timestamps)
python3 scripts/fetch_transcript.py "URL" --json
# Both: JSON to file
python3 scripts/fetch_transcript.py "URL" --json --output transcript.json
# Specify language preference
python3 scripts/fetch_transcript.py "URL" --lang de
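Both URL forms shown above carry the same video ID, which is also what the cache message reports. One way to normalize them is sketched below. The helper `extract_video_id` is hypothetical, and the set of accepted hosts is an assumption; the script itself may accept more URL shapes.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a watch or youtu.be URL, or None if not found."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "m.youtube.com"):
        # Long form: the ID is the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

For example, `extract_video_id("https://youtu.be/dQw4w9WgXcQ")` and `extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")` both return `"dQw4w9WgXcQ"`.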
Transcripts are cached locally by default. Repeat requests for the same video cost $0.
# First request: fetches from APIFY ($0.007)
python3 scripts/fetch_transcript.py "URL"
# Second request: uses cache (FREE!)
python3 scripts/fetch_transcript.py "URL"
# Output: [cached] Transcript for: VIDEO_ID
# Bypass cache (force fresh fetch)
python3 scripts/fetch_transcript.py "URL" --no-cache
# View cache stats
python3 scripts/fetch_transcript.py --cache-stats
# Clear all cached transcripts
python3 scripts/fetch_transcript.py --clear-cache
Cache location: `.cache/` in the skill directory (override with the `YT_TRANSCRIPT_CACHE_DIR` environment variable).
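The caching behavior described above amounts to a read-through file cache keyed by video ID. The sketch below illustrates the idea under stated assumptions: the one-JSON-file-per-video layout, the function names, and the relative `.cache` default are hypothetical and may not match the script's on-disk format.

```python
import json
import os
from pathlib import Path


def cache_dir():
    """Resolve the cache directory, honoring YT_TRANSCRIPT_CACHE_DIR if set."""
    # Assumption: falls back to a relative ".cache" directory.
    return Path(os.environ.get("YT_TRANSCRIPT_CACHE_DIR", ".cache"))


def get_transcript(video_id, fetch, use_cache=True):
    """Return (transcript, was_cached); call fetch(video_id) only on a miss.

    `fetch` stands in for the paid APIFY call. `use_cache=False` mirrors the
    documented --no-cache flag: skip the read but still refresh the stored copy.
    """
    path = cache_dir() / f"{video_id}.json"  # hypothetical file layout
    if use_cache and path.is_file():
        return json.loads(path.read_text(encoding="utf-8")), True
    transcript = fetch(video_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(transcript), encoding="utf-8")
    return transcript, False
```

Because the paid call sits behind `fetch`, a second request for the same ID never reaches APIFY, which is why repeat requests cost nothing.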
Process multiple videos at once:
# Create a file with URLs (one per line)
cat > urls.txt << EOF
https://youtube.com/watch?v=VIDEO1
https://youtu.be/VIDEO2
https://youtube.com/watch?v=VIDEO3
EOF
# Process all URLs
python3 scripts/fetch_transcript.py --batch urls.txt
# Batch with JSON output to file
python3 scripts/fetch_transcript.py --batch urls.txt --json --output all_transcripts.json
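Reading a batch file like `urls.txt` can be sketched as below. The helper `read_url_list` is hypothetical; skipping `#` comment lines and dropping duplicate URLs are assumptions added for illustration, not documented behavior of `--batch`.

```python
def read_url_list(path):
    """Return the URLs in a batch file, one per line, in their original order."""
    urls = []
    seen = set()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            url = line.strip()
            # Assumption: blank lines, "#" comments, and repeats are ignored.
            if not url or url.startswith("#") or url in seen:
                continue
            seen.add(url)
            urls.append(url)
    return urls
```

Dropping duplicates before fetching keeps a batch from paying twice for the same video when caching is bypassed.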
The script sends the following input to pintostudio/youtube-transcript-scraper:
{
  "videoUrl": "https://www.youtube.com/watch?v=VIDEO_ID"
}
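For illustration, the same input could be posted to the actor with the standard library alone. This sketch assumes the APIFY REST endpoint `run-sync-get-dataset-items` and Bearer-token authentication; whether `fetch_transcript.py` uses this endpoint or a client library is not stated here, and the function names are hypothetical.

```python
import json
import urllib.request

# In API paths the actor name uses "~" in place of "/".
ACTOR_ID = "pintostudio~youtube-transcript-scraper"
API_URL = f"https://api.apify.com/v2/acts/{ACTOR_ID}/run-sync-get-dataset-items"


def build_actor_request(video_url, token):
    """Build (but do not send) the POST request that runs the actor synchronously."""
    body = json.dumps({"videoUrl": video_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def run_actor(video_url, token, timeout=120):
    """Send the request and return the dataset items as parsed JSON."""
    request = build_actor_request(video_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending makes the payload easy to inspect without spending credits.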
Output fields:
Each result contains a data array of transcript segments:
| Field | Type | Description |
|---------|--------|------------------------------------|
| start | number | Segment start time (seconds) |
| dur | number | Segment duration (seconds) |
| text | string | Transcript text for this segment |
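
Since each segment carries `start` and `dur`, its end time is simply `start + dur`. The sketch below turns segments into timestamped lines. The helper names and the `[MM:SS - MM:SS]` layout are illustrative choices, not part of the script's output.

```python
def format_timestamp(seconds):
    """Format a second count as MM:SS, or H:MM:SS past one hour."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def timestamped_lines(segments):
    """Yield one '[start - end] text' line per segment, where end = start + dur."""
    for seg in segments:
        # float() also tolerates numeric strings, in case the actor returns them.
        start = float(seg["start"])
        end = start + float(seg["dur"])
        yield f"[{format_timestamp(start)} - {format_timestamp(end)}] {seg['text']}"
```

A segment `{"start": 2.5, "dur": 3.0, "text": "to this video"}` becomes `[00:02 - 00:05] to this video`.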
Text (default):
Hello and welcome to this video.
Today we're going to talk about...
JSON (--json):
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "dur": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "dur": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
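A document of this shape can be assembled from raw segments as sketched below. The function `build_output` is hypothetical, and joining segment texts with single spaces to form `full_text` is an assumption about how the script concatenates them.

```python
import json


def build_output(video_id, title, segments):
    """Assemble a JSON-ready dict with per-segment data and the joined text."""
    transcript = [
        {"start": float(s["start"]), "dur": float(s["dur"]), "text": s["text"].strip()}
        for s in segments
    ]
    # Assumption: full_text is the segment texts joined by single spaces.
    full_text = " ".join(seg["text"] for seg in transcript if seg["text"])
    return {
        "video_id": video_id,
        "title": title,
        "transcript": transcript,
        "full_text": full_text,
    }


if __name__ == "__main__":
    demo = build_output(
        "dQw4w9WgXcQ",
        "Video Title",
        [
            {"start": 0.0, "dur": 2.5, "text": "Hello and welcome"},
            {"start": 2.5, "dur": 3.0, "text": "to this video"},
        ],
    )
    print(json.dumps(demo, indent=2))
```

Keeping both `transcript` and `full_text` lets consumers choose between timestamp-aware processing and plain-text use without re-joining segments.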
The script handles common errors.