skills/voice-memo-transcriber/SKILL.md
Transcribe voice memos to text using Whisper. Use when user provides audio/video files (.m4a, .mp3, .mov, etc.) and asks to transcribe them into text and SRT format with timestamps.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add jeffvincent/claude-config voice-memo-transcriber
This skill automatically transcribes voice memo files into plain text using OpenAI's Whisper (open-source, local processing). It handles multiple audio/video formats, generates both plain text and timestamped SRT files, and provides a summary of the content.
All processing happens locally - no data is sent to cloud services. Voice memos may contain sensitive information, so privacy is maintained throughout.
Use this skill when:
Do NOT use this skill for:
Required:
file_path: Absolute path to voice memo file
Optional:
output_dir: Directory for output files
whisper_model: Whisper model size (affects accuracy and speed)
Generated files (in output directory):
[filename].txt - Plain text transcript with no timestamps
[filename].srt - Subtitle file with timestamps (HH:MM:SS,mmm format)
[filename].mp3 - Audio file (if conversion from another format was needed)
Displayed to user:
Run these checks in sequence:
Check ffmpeg:
which ffmpeg
If not found, provide installation instructions:
macOS (Homebrew): brew install ffmpeg
Linux: sudo apt-get install ffmpeg or sudo yum install ffmpeg
Check pipx:
which pipx
If not found, install:
brew install pipx
pipx ensurepath
Check openai-whisper:
pipx list | grep openai-whisper
If not found, install:
pipx install openai-whisper
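The checks above could be folded into one helper that reports everything missing in a single pass. This is a minimal sketch, not part of the skill itself: `missing_deps` is a hypothetical name, and it assumes the pipx-installed `whisper` command has been exposed on PATH (for example after `pipx ensurepath` and a new shell session).

```shell
# Print the name of each required command that is not on PATH, one per line.
# Prints nothing when every command is available.
missing_deps() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Example (hypothetical usage): list whatever still needs installing.
# missing_deps ffmpeg pipx whisper
```

An empty result means all dependencies were found; any printed name maps to one of the install commands above.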
Determine output directory:
Check if file is already MP3. If it is not, convert it with ffmpeg:
ffmpeg -i "[input_file]" -vn -ar 16000 -ac 1 -b:a 96k "[output_dir]/[basename].mp3"
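The MP3 check and the conversion can be sketched together as below. This is an illustration under stated assumptions: the helper names (`needs_conversion`, `mp3_path`, `to_mp3`) are hypothetical, and the "already MP3" test looks only at the file extension rather than inspecting the audio stream.

```shell
# Succeed (exit 0) when the path does NOT end in .mp3 (case-insensitive).
# Assumption: the extension is a good enough signal of the format.
needs_conversion() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *.mp3) return 1 ;;
    *) return 0 ;;
  esac
}

# Build "[output_dir]/[basename].mp3" from an input path and an output directory.
mp3_path() {
  name=$(basename "$1")
  printf '%s/%s.mp3\n' "$2" "${name%.*}"
}

# Convert only when needed, using the same ffmpeg flags as the command above.
to_mp3() {
  if needs_conversion "$1"; then
    ffmpeg -i "$1" -vn -ar 16000 -ac 1 -b:a 96k "$(mp3_path "$1" "$2")"
  fi
}
```

When `needs_conversion` fails, the original file can be passed to Whisper unchanged and no extra .mp3 is written.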
Flags explained:
-vn: No video (audio only)
-ar 16000: 16kHz sample rate (Whisper's native rate)
-ac 1: Mono audio
-b:a 96k: 96kbps bitrate (good quality, small size)
Run Whisper transcription:
whisper "[audio_file]" \
  --model [whisper_model] \
  --output_dir "[output_dir]" \
  --output_format all \
  --language English
The --output_format option takes a single value, so all is used here: it writes the .txt and .srt files in one run, along with .vtt, .tsv, and .json versions.
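Once Whisper finishes, it can be worth confirming that both transcript files were actually written before reporting success. The following is a minimal sketch: `missing_outputs` is a hypothetical helper, and it assumes Whisper names each output after the audio file's basename inside the output directory.

```shell
# Print each expected transcript file that is missing or empty.
# Arguments: output directory, audio basename without extension.
# Returns non-zero when at least one file is missing.
missing_outputs() {
  status=0
  for ext in txt srt; do
    f="$1/$2.$ext"
    if [ ! -s "$f" ]; then
      printf '%s\n' "$f"
      status=1
    fi
  done
  return "$status"
}

# Example (hypothetical usage):
# missing_outputs "[output_dir]" "[basename]" || echo "Transcription incomplete"
```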
Important notes:
Display to user:
✓ Transcription complete!
Files generated:
- Transcript: [path]/[filename].txt
- Subtitles: [path]/[filename].srt
- Audio: [path]/[filename].mp3 (if converted)
Summary:
[Your 1-3 sentence summary here]
Processing time: [X] seconds
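The processing time in the message above could be measured by wrapping the transcription command in a small timer. This is only a sketch: `timed` is a hypothetical helper, and it reports whole seconds because it relies on `date +%s`.

```shell
# Run a command, then print how long it took in whole seconds.
# The wrapped command's exit status is passed through unchanged.
timed() {
  start=$(date +%s)
  rc=0
  "$@" || rc=$?
  end=$(date +%s)
  printf 'Processing time: %d seconds\n' $((end - start))
  return "$rc"
}

# Example (hypothetical usage):
# timed whisper "[audio_file]" --model [whisper_model] --output_dir "[output_dir]"
```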
File not found:
Unsupported format:
Missing dependencies:
Corrupted audio:
Insufficient disk space:
See resources/EXAMPLES.md for complete examples.
See resources/CHECKLIST.md for validation steps.
Local Processing:
Sensitive Content:
File Handling:
Dependencies: