skills/skill-transcript/SKILL.md
Transcribe audio and video files using ElevenLabs Scribe, OpenAI Whisper, or Google Gemini. Supports automatic chunking for large files, speaker diarization, timestamps, and multiple output formats (text, SRT, VTT, JSON).
npx skillsauth add hasna/skills transcript

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill provides high-quality speech-to-text transcription using multiple AI providers. It automatically handles large files through compression and chunking.
This CLI is API-backed. Set SKILL_API_KEY when routing through the hosted skills/connectors runtime; provider-specific keys are managed by that runtime.
bun run src/index.ts transcribe \
  --provider openai \
  --input ./recording.mp3

bun run src/index.ts transcribe \
  --provider elevenlabs \
  --input ./meeting.mp3 \
  --diarize \
  --timestamps \
  --format srt

bun run src/index.ts transcribe \
  --provider gemini \
  --input ./video.mp4 \
  --format vtt \
  --output ./captions.vtt

bun run src/index.ts providers
| Format | Extension | Description |
|--------|-----------|-------------|
| text | .txt | Plain text transcript |
| srt | .srt | SubRip subtitle format |
| vtt | .vtt | WebVTT subtitle format |
| json | .json | Full structured data with metadata |
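The main difference between the two subtitle formats is small, and a sketch may make it concrete: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period. The `Segment` shape and the function names below are assumptions for illustration; the skill's actual JSON schema and internal formatters are not shown in this document.

```typescript
// Hypothetical segment shape; the skill's real JSON output may differ.
interface Segment {
  start: number; // seconds from the beginning of the file
  end: number; // seconds from the beginning of the file
  text: string;
  speaker?: string; // present only when diarization is enabled
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses ",", WebVTT uses ".".
function formatTimestamp(seconds: number, sep: "," | "."): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)}${sep}${pad(ms, 3)}`;
}

// SRT: numbered cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const line = seg.speaker ? `${seg.speaker}: ${seg.text}` : seg.text;
      const range = `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}`;
      return `${i + 1}\n${range}\n${line}\n`;
    })
    .join("\n");
}

// WebVTT: a "WEBVTT" header line, then unnumbered cues.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) => {
    const range = `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}`;
    return `${range}\n${seg.text}\n`;
  });
  return `WEBVTT\n\n${cues.join("\n")}`;
}
```

Prefixing the speaker label to the cue text is one possible convention for diarized SRT output, not necessarily the one this skill uses.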
The skill automatically handles files larger than provider limits.
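How the skill splits files is not documented here, so the following is only a sketch of one plausible approach: estimate how many chunks are needed from the file size, divide the duration evenly, and shift each chunk's timestamps back onto the original timeline after transcription. `planChunks`, `offsetSegments`, and the constant-bitrate assumption are all hypothetical.

```typescript
// A chunk is a time window in seconds on the original recording.
interface Chunk {
  index: number;
  start: number;
  end: number;
}

// Hypothetical timed segment returned by a provider for one chunk.
interface TimedSegment {
  start: number;
  end: number;
  text: string;
}

// Split a recording into equal windows so each stays under maxBytes,
// assuming a roughly constant bitrate across the file.
function planChunks(durationSec: number, fileBytes: number, maxBytes: number): Chunk[] {
  if (durationSec <= 0 || fileBytes <= 0 || maxBytes <= 0) {
    throw new Error("duration, file size, and limit must be positive");
  }
  const count = Math.ceil(fileBytes / maxBytes);
  const chunkSec = durationSec / count;
  const chunks: Chunk[] = [];
  for (let i = 0; i < count; i++) {
    // Pin the last chunk to the exact duration to avoid rounding drift.
    const end = i === count - 1 ? durationSec : (i + 1) * chunkSec;
    chunks.push({ index: i, start: i * chunkSec, end });
  }
  return chunks;
}

// Move a chunk's segment timestamps back onto the original timeline.
function offsetSegments(segments: TimedSegment[], offsetSec: number): TimedSegment[] {
  return segments.map((s) => ({ ...s, start: s.start + offsetSec, end: s.end + offsetSec }));
}
```

For example, `planChunks(600, 60_000_000, 25_000_000)` yields three 200-second windows; the 25 MB limit there is only an example value. Splitting at fixed boundaries can cut a word in half, so a real implementation might prefer to cut at silences or overlap adjacent chunks slightly.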
# ElevenLabs
export ELEVENLABS_API_KEY=your_key
# OpenAI
export OPENAI_API_KEY=your_key
# Google Gemini
export GOOGLE_API_KEY=your_key
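As an illustration of how these variables might be consumed, the sketch below maps each provider name accepted by `--provider` to its environment variable and fails early with a clear message when the key is missing. The mapping follows the exports above; the function names and error text are assumptions, not the skill's actual code.

```typescript
type Provider = "elevenlabs" | "openai" | "gemini";

// Environment variable expected for each provider (from the exports above).
const PROVIDER_ENV: Record<Provider, string> = {
  elevenlabs: "ELEVENLABS_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GOOGLE_API_KEY",
};

type Env = Record<string, string | undefined>;

// List the providers whose key is set to a non-empty value.
function configuredProviders(env: Env): Provider[] {
  return (Object.keys(PROVIDER_ENV) as Provider[]).filter(
    (p) => (env[PROVIDER_ENV[p]] ?? "").trim() !== "",
  );
}

// Return the key for a provider, or throw a message naming the missing variable.
function requireKey(provider: Provider, env: Env): string {
  const name = PROVIDER_ENV[provider];
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`${name} is not set; export it before using --provider ${provider}`);
  }
  return value;
}
```

In a CLI entry point, both functions would typically be called with `process.env`. Passing the environment in as a parameter keeps them easy to test.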
For chunking support (OpenAI with large files):
- ffmpeg - Audio processing
- ffprobe - Duration detection

Install on macOS:
brew install ffmpeg