skills/media-understand/SKILL.md
Use AI to understand and analyze multimedia content (images, video, audio). Use when the user wants to 理解图片, 分析视频, 音频转文字, 视频问答, understand media, analyze video, transcribe audio, describe image, what is in this video/image/audio.
npx skillsauth add infquest/vibe-ops-plugin media-understand

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Analyze and understand multimedia content using Gemini 2.5 Flash.
| Type | Formats | Max Size |
|------|---------|----------|
| Image | jpg, jpeg, png, gif, webp | 20MB |
| Video | mp4, mpeg, mov, webm, YouTube URL | 100MB |
| Audio | wav, mp3, aiff, aac, ogg, flac, m4a | 100MB |
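The table above can be read as a lookup from file extension to media type and size limit. The following is a minimal sketch of that lookup, not code taken from `media-understand.js`: the names `MEDIA_TYPES`, `isYouTubeUrl`, `detectMediaType`, and `withinSizeLimit` are hypothetical, the list of YouTube hostnames is an assumption, and 1 MB is assumed to mean 1024 × 1024 bytes.

```javascript
// Hypothetical helpers illustrating the format table; the real script may differ.
const path = require("node:path");

const MB = 1024 * 1024; // assumption: the table's "MB" means 1024 * 1024 bytes

const MEDIA_TYPES = {
  image: { extensions: ["jpg", "jpeg", "png", "gif", "webp"], maxBytes: 20 * MB },
  video: { extensions: ["mp4", "mpeg", "mov", "webm"], maxBytes: 100 * MB },
  audio: { extensions: ["wav", "mp3", "aiff", "aac", "ogg", "flac", "m4a"], maxBytes: 100 * MB },
};

// Assumed hostnames; other YouTube domains are not covered by this sketch.
const YOUTUBE_HOSTS = ["youtube.com", "www.youtube.com", "youtu.be"];

function isYouTubeUrl(input) {
  try {
    return YOUTUBE_HOSTS.includes(new URL(input).hostname);
  } catch {
    return false; // not a parseable URL, so treat it as a file path
  }
}

// Returns "image", "video", "audio", or null for an unsupported input.
function detectMediaType(input) {
  if (isYouTubeUrl(input)) return "video";
  const ext = path.extname(input).slice(1).toLowerCase();
  for (const [type, { extensions }] of Object.entries(MEDIA_TYPES)) {
    if (extensions.includes(ext)) return type;
  }
  return null;
}

// True when a file of the given size fits the limit for its media type.
function withinSizeLimit(type, sizeBytes) {
  return sizeBytes <= MEDIA_TYPES[type].maxBytes;
}
```

Matching on the lowercased extension keeps the check cheap, but it does not inspect file contents, so a mislabeled file would pass this step.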
Requires the `MAX_API_KEY` environment variable (injected automatically by Max).

bun skills/media-understand/media-understand.js <media_path_or_url> [prompt] [language]
Arguments:

- `media_path_or_url`: File path or YouTube URL
- `prompt`: Question or analysis request (default: "Please describe this content")
- `language`: Output language, `chinese` or `english` (default: `chinese`)

# Describe image
bun skills/media-understand/media-understand.js ./photo.jpg "请描述这张图片" chinese
# OCR - Extract text
bun skills/media-understand/media-understand.js ./screenshot.png "识别图片中的所有文字" chinese
# Answer question about image
bun skills/media-understand/media-understand.js ./chart.png "这个图表显示了什么趋势?" chinese
# YouTube video summary
bun skills/media-understand/media-understand.js "https://youtube.com/watch?v=xxx" "总结这个视频的主要内容" chinese
# Local video analysis
bun skills/media-understand/media-understand.js ./video.mp4 "视频中发生了什么?" chinese
# Timestamp-based question
bun skills/media-understand/media-understand.js "https://youtu.be/xxx" "视频 2:30 处讲了什么?" chinese
# Transcribe audio
bun skills/media-understand/media-understand.js ./recording.mp3 "请转录这段音频" chinese
# Summarize podcast
bun skills/media-understand/media-understand.js ./podcast.m4a "总结这段播客的要点" chinese
# Detect speakers
bun skills/media-understand/media-understand.js ./meeting.wav "识别不同的说话人并整理他们说的内容" chinese
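Each command above passes up to three positional arguments. As a rough illustration of how those arguments and their documented defaults might be handled, here is a minimal sketch. The function name `parseArgs` is hypothetical, and rejecting languages other than `chinese` or `english` is an assumption; the actual script may behave differently.

```javascript
// Hypothetical sketch of positional-argument handling; not the script's actual code.
const DEFAULT_PROMPT = "Please describe this content";
const DEFAULT_LANGUAGE = "chinese";
const LANGUAGES = ["chinese", "english"];

function parseArgs(argv) {
  // argv holds only the user-supplied arguments, e.g. process.argv.slice(2)
  const [media, prompt = DEFAULT_PROMPT, language = DEFAULT_LANGUAGE] = argv;
  if (!media) {
    throw new Error("Usage: media-understand.js <media_path_or_url> [prompt] [language]");
  }
  const normalized = language.toLowerCase();
  if (!LANGUAGES.includes(normalized)) {
    // Assumption: unknown languages are rejected rather than silently defaulted.
    throw new Error(`Unsupported language: ${language} (expected chinese or english)`);
  }
  return { media, prompt, language: normalized };
}
```

In a script this would typically be called as `parseArgs(process.argv.slice(2))`, so that the runtime path and script path are skipped.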
Image:
Video:
Audio:
File not found: Check the file path is correct
Unsupported format: Use supported formats listed above
File too large: Compress or trim the media file
API error: Check that the Max API Key is configured correctly in Max settings
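The four error cases listed here could be surfaced by a few pre-flight checks before any request is sent. The sketch below is one possible arrangement under stated assumptions: the error codes, the `MediaError` class, and the functions `checkLocalFile` and `checkApiKey` are hypothetical names, and the allowed extensions and size limit are passed in by the caller rather than read from the skill.

```javascript
// Hypothetical pre-flight checks; error codes and names are illustrative only.
const fs = require("node:fs");
const path = require("node:path");

const HINTS = {
  FILE_NOT_FOUND: "Check the file path is correct",
  UNSUPPORTED_FORMAT: "Use supported formats listed above",
  FILE_TOO_LARGE: "Compress or trim the media file",
  API_ERROR: "Check that the Max API Key is configured correctly in Max settings",
};

class MediaError extends Error {
  constructor(code, detail) {
    super(`${detail}. ${HINTS[code]}`);
    this.code = code;
  }
}

// Returns the file size in bytes, or throws a MediaError describing the problem.
function checkLocalFile(filePath, allowedExtensions, maxBytes) {
  let stats;
  try {
    stats = fs.statSync(filePath);
  } catch (err) {
    if (err.code === "ENOENT") {
      throw new MediaError("FILE_NOT_FOUND", `File not found: ${filePath}`);
    }
    throw err; // permission errors and the like are passed through unchanged
  }
  const ext = path.extname(filePath).slice(1).toLowerCase();
  if (!allowedExtensions.includes(ext)) {
    throw new MediaError("UNSUPPORTED_FORMAT", `Unsupported format: .${ext}`);
  }
  if (stats.size > maxBytes) {
    throw new MediaError("FILE_TOO_LARGE", `File too large: ${stats.size} bytes`);
  }
  return stats.size;
}

// Only checks that the variable is present; it cannot tell whether the key is valid.
function checkApiKey(env = process.env) {
  if (!env.MAX_API_KEY) {
    throw new MediaError("API_ERROR", "API error: MAX_API_KEY is not set");
  }
}
```

Attaching a `code` to each error lets a caller branch on the failure type while still printing the human-readable hint.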