platforms/claude/skills/video-transcribe/SKILL.md
Video/audio transcription, visual frame analysis, Groq Whisper long-form transcripts, timestamped Obsidian notes, and keyframe-based visual summaries. Use for video links, audio links, 字幕/转录/视频总结/画面分析/图文笔记, especially when the result must replace watching the video. Keywords: video, transcribe, 转录, 视频, 音频, audio, subtitle, 字幕, summary, 总结, 图文笔记, 视频内容, 画面分析, visual analysis, keyframe, whisper, groq, yt-dlp
Install this skill globally with one command: `npx skillsauth add codingsamss/ai-dotfiles video-transcribe`. Works with Claude Code, Cursor, and Windsurf.
Use this skill when the user asks to understand, transcribe, summarize, or visually analyze a video/audio source.
When intent is unclear, default to transcript + keyframes for short videos and ask before spending API quota on long videos.
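The default above can be sketched as a small decision helper. This is only an illustration: `choose_plan` is a hypothetical function, and the 600-second cutoff between "short" and "long" is an assumed placeholder, since the skill does not state a threshold.

```shell
# Hypothetical helper: pick a default plan from the media duration in seconds.
# The 600-second cutoff is an assumption, not a value defined by this skill.
choose_plan() {
  local duration_s="$1"
  local short_limit_s="${2:-600}"
  if [ "$duration_s" -le "$short_limit_s" ]; then
    # Short video: go ahead with transcript + keyframes.
    echo "transcript+keyframes"
  else
    # Long video: confirm with the user before spending API quota.
    echo "ask-user"
  fi
}

choose_plan 240
choose_plan 3600
```

The duration must be a whole number of seconds. If it comes from a tool that reports fractions, truncate it first, for example with `${duration%.*}`.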
- Use `whisper-large-v3` with `response_format=verbose_json`; do not use plain text when timestamps or completeness checks matter.
- Do not force `language=zh` for non-Chinese videos. Use the detected language or set `--language en` for English technical videos.
- Pass technical terms through `--prompt`, for example `Codex, Remotion, Supabase, Typefully, TestFlight, Vercel, Claude Code`.
- Do not mark the note as `已读` (read); that belongs to the user, not the agent.

- Work under `/tmp/video-transcribe/<slug>`.
- Download the media with `scripts/download_media.sh`.
- Run `scripts/transcribe_groq.py` when speech content is needed.
- Run `scripts/extract_frames.sh` when visual context or screenshots are needed.
- Read `references/obsidian-video-note.md` before writing.
- Run `scripts/verify_obsidian_note.sh` and run `touch <note>` after editing an Obsidian file externally.

Example:
```bash
# URL must already hold the video or audio link to process.
WORK=/tmp/video-transcribe/codex-super-app
mkdir -p "$WORK"

SKILL_DIR="$HOME/.codex/skills/video-transcribe"
# In this repo, use: SKILL_DIR=platforms/codex/skills/video-transcribe
# In Claude runtime, use: SKILL_DIR="$HOME/.claude/skills/video-transcribe"

# Download the full media file and capture the path the script prints.
VIDEO=$("$SKILL_DIR/scripts/download_media.sh" "$URL" "$WORK" full)

# Transcribe in English, passing technical terms as a prompt.
"$SKILL_DIR/scripts/transcribe_groq.py" \
  "$VIDEO" \
  --work-dir "$WORK" \
  --language en \
  --prompt "Technical terms: Codex, Remotion, Supabase, Typefully, TestFlight, Vercel, Claude Code."

# Extract 16 keyframes into $WORK/frames.
"$SKILL_DIR/scripts/extract_frames.sh" \
  "$VIDEO" "$WORK/frames" --count 16
```
For a note intended to replace watching a video, use two layers:
Avoid a single flat list of dozens of timestamps. It is technically complete but hard to read.
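A note can only replace watching the video if the transcript reaches the end of the media, which is one reason timestamps from `verbose_json` matter. The sketch below shows one possible completeness check. `transcript_covers_media` is a hypothetical helper, the 5-second tolerance is an assumed default, and it assumes the caller has already read the final segment's end time from the transcript and the media duration from a probe tool.

```shell
# Hypothetical check: does the transcript reach (close to) the end of the media?
#   last_end_s   end time of the final transcript segment, in seconds
#   duration_s   media duration, in seconds
#   tolerance_s  allowed untranscribed tail; 5 is an assumed default
transcript_covers_media() {
  local last_end_s="$1"
  local duration_s="$2"
  local tolerance_s="${3:-5}"
  # awk handles the fractional seconds; exit status 0 means "covered".
  awk -v e="$last_end_s" -v d="$duration_s" -v t="$tolerance_s" \
    'BEGIN { exit !((d - e) <= t) }'
}

if transcript_covers_media 598.4 600; then
  echo "transcript covers the media"
else
  echo "transcript stops early; check segmentation or re-run"
fi
```

If the transcript JSON has a `segments` array with `end` fields, as OpenAI-compatible `verbose_json` output usually does, the first argument could be read with `jq '.segments[-1].end'`. The script's actual output layout is not documented here, so treat that field path as an assumption.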
- `scripts/download_media.sh`: yt-dlp wrapper with cookie retry and `uvx --from yt-dlp` fallback.
- `scripts/transcribe_groq.py`: media-to-audio extraction, size-based segmentation, Groq transcription, and timestamp merge.
- `scripts/extract_frames.sh`: uniform or timestamp-based keyframe extraction.
- `scripts/verify_obsidian_note.sh`: Markdown image/timestamp/frontmatter checks.
- `references/obsidian-video-note.md`: long-form Obsidian note structure and coverage standard.
- `references/troubleshooting.md`: common yt-dlp, Groq, ffmpeg, and note-validation failures.

Load reference files only when the current request needs that detail.
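To illustrate what size-based segmentation involves, here is a minimal sketch of the arithmetic. It is not the logic of `scripts/transcribe_groq.py`. `segment_count` is a hypothetical helper, and the 25 MiB per-request cap is an assumed default that should be checked against the current Groq limits.

```shell
# Hypothetical helper: number of segments needed so each stays within a size cap.
#   size_bytes   size of the extracted audio file
#   limit_bytes  per-request cap; 25 MiB is an assumption, not a documented value
segment_count() {
  local size_bytes="$1"
  local limit_bytes="${2:-$((25 * 1024 * 1024))}"
  # Integer ceiling division: round up so no segment exceeds the cap.
  echo $(( (size_bytes + limit_bytes - 1) / limit_bytes ))
}

segment_count 26214400
segment_count 26214401
```

With the assumed cap, a file of exactly 25 MiB fits in one request, and a single extra byte pushes it to two. The merge step then has to offset each segment's timestamps by the start time of that segment.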