claude/skills/gemini-video-understanding/SKILL.md
Analyze videos using Google's Gemini API - describe content, answer questions, transcribe audio with visual descriptions, reference timestamps, clip videos, and process YouTube URLs. Supports 9 video formats, multiple models (Gemini 2.5/2.0), and context windows up to 2M tokens (6 hours of video).
npx skillsauth add einverne/dotfiles gemini-video-understanding
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill enables comprehensive video analysis using Google's Gemini API, including video summarization, question answering, transcription, timestamp references, and more.
Gemini 2.5 Series:
- gemini-2.5-pro - Best quality, 1M context
- gemini-2.5-flash - Balanced quality/speed, 1M context
- gemini-2.5-flash-preview-09-2025 - Preview features, 1M context
Gemini 2.0 Series:
- gemini-2.0-flash - Fast processing
- gemini-2.0-flash-lite - Lightweight option
Context Windows:
The skill checks for GEMINI_API_KEY in this order:
1. process.env.GEMINI_API_KEY or $GEMINI_API_KEY
2. .claude/skills/gemini-video-understanding/.env
3. .env file in project root
To set up:
# Option 1: Environment variable (recommended)
export GEMINI_API_KEY="your-api-key-here"
# Option 2: Skill directory .env file
echo "GEMINI_API_KEY=your-api-key-here" > .claude/skills/gemini-video-understanding/.env
# Option 3: Project root .env file (>> appends, so existing entries are kept)
echo "GEMINI_API_KEY=your-api-key-here" >> .env
Get your API key at: https://aistudio.google.com/apikey
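The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the function names `read_key_from_env_file` and `find_api_key` are hypothetical, and the `.env` parsing is deliberately simple (plain `KEY=value` lines only).

```python
import os
from pathlib import Path
from typing import Optional

# Paths taken from the lookup order above, relative to the project root.
SKILL_ENV = Path(".claude/skills/gemini-video-understanding/.env")
PROJECT_ENV = Path(".env")


def read_key_from_env_file(path: Path, name: str = "GEMINI_API_KEY") -> Optional[str]:
    """Return the value of `name` from a KEY=value file, or None if absent."""
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip('"').strip("'") or None
    return None


def find_api_key(root: Path = Path(".")) -> Optional[str]:
    """Check the environment first, then the skill .env, then the project .env."""
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        return key
    for candidate in (root / SKILL_ENV, root / PROJECT_ENV):
        key = read_key_from_env_file(candidate)
        if key:
            return key
    return None
```

Calling `find_api_key()` from the project root returns the first key found, or `None` when none of the three locations provides one.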
Use this skill when the user asks to:
For video files:
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-path "/path/to/video.mp4" \
--prompt "Summarize this video in 3 key points"
For YouTube URLs:
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--youtube-url "https://www.youtube.com/watch?v=VIDEO_ID" \
--prompt "What are the main topics discussed?"
Video Clipping (specific time range):
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-path "/path/to/video.mp4" \
--prompt "Summarize this segment" \
--start-offset "40s" \
--end-offset "80s"
Custom Frame Rate:
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-path "/path/to/video.mp4" \
--prompt "Analyze the rapid movements" \
--fps 5
Transcription with Timestamps:
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-path "/path/to/video.mp4" \
--prompt "Transcribe the audio with timestamps and visual descriptions"
Multiple Videos (Gemini 2.5+ only):
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-paths "/path/video1.mp4" "/path/video2.mp4" \
--prompt "Compare these two videos and highlight the differences"
Model Selection:
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-path "/path/to/video.mp4" \
--prompt "Detailed analysis" \
--model "gemini-2.5-pro"
Required (one of):
--video-path PATH Path to local video file
--youtube-url URL YouTube video URL
--video-paths PATH [PATH..] Multiple video paths (Gemini 2.5+)
Required:
--prompt TEXT Analysis prompt/question
Optional:
--model NAME Model to use (default: gemini-2.5-flash)
--start-offset TIME Video clip start (e.g., "40s", "1m30s")
--end-offset TIME Video clip end (e.g., "80s", "2m")
--fps NUMBER Frame sampling rate (default: 1)
--output-file PATH Save response to file
--verbose Show detailed processing info
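The offset examples above ("40s", "1m30s", "2m") suggest a minutes-and-seconds notation. As a rough sketch of how such values could be converted to seconds and checked before a request is made, assuming only the `m` and `s` units shown in those examples (the helper names `parse_offset` and `validate_clip` are hypothetical, and the script's real parser may accept other forms):

```python
import re

# Optional minutes part followed by an optional seconds part, e.g. "1m30s".
_OFFSET_RE = re.compile(r"^(?:(\d+)m)?(?:(\d+)s)?$")


def parse_offset(text: str) -> int:
    """Convert an offset such as "40s", "2m" or "1m30s" to whole seconds."""
    match = _OFFSET_RE.match(text.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized offset: {text!r}")
    minutes, seconds = (int(group) if group else 0 for group in match.groups())
    return minutes * 60 + seconds


def validate_clip(start: str, end: str) -> tuple:
    """Return (start_seconds, end_seconds), rejecting empty or reversed clips."""
    start_s, end_s = parse_offset(start), parse_offset(end)
    if end_s <= start_s:
        raise ValueError("end offset must be later than start offset")
    return start_s, end_s
```

For instance, `validate_clip("40s", "80s")` yields the 40-second window used in the clipping example earlier.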
Prompt: "Summarize this video in 3 key points with timestamps"
Prompt: "Create a quiz with 5 questions and answer key based on this video"
Prompt: "What happens at 01:15 and how does it relate to the topic at 02:30?"
Prompt: "Transcribe the audio from this video with timestamps for salient events and visual descriptions"
Prompt: "Compare these two product demo videos. Which one explains the features more clearly?"
Prompt: "List all the actions performed in this tutorial video with timestamps"
Free Tier (per model):
YouTube Limitations:
Storage (Files API):
Video tokens depend on resolution:
Example: A 10-minute video = 600 seconds × 300 tokens = ~180,000 tokens
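The arithmetic in the example above can be wrapped in a small helper for budgeting. The 300 tokens-per-second rate comes from that example; the function names are hypothetical, and the default 1M context window is simply the figure listed for the Gemini 2.5 models. The prompt and response also consume tokens, so treat the result as a rough lower bound.

```python
TOKENS_PER_SECOND = 300  # rate used in the 10-minute example above


def estimate_video_tokens(duration_seconds: float,
                          tokens_per_second: int = TOKENS_PER_SECOND) -> int:
    """Estimate the token cost of a video from its duration."""
    return round(duration_seconds * tokens_per_second)


def fits_in_context(duration_seconds: float,
                    context_window: int = 1_000_000,
                    tokens_per_second: int = TOKENS_PER_SECOND) -> bool:
    """Check whether the video alone stays within the given context window."""
    return estimate_video_tokens(duration_seconds, tokens_per_second) <= context_window


print(estimate_video_tokens(600))   # 10-minute video
print(fits_in_context(3600))        # 1-hour video at 300 tokens/second
```

At this rate a one-hour video comes to 1,080,000 tokens, which already exceeds a 1M window, so longer videos need a lower per-second rate or clipping.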
Common errors and solutions:
| Error | Cause | Solution |
|-------|-------|----------|
| 400 Bad Request | Invalid video format or corrupt file | Check file format and integrity |
| 403 Forbidden | Invalid/missing API key | Verify GEMINI_API_KEY configuration |
| 404 Not Found | File URI not found | Ensure file is uploaded and active |
| 429 Too Many Requests | Rate limit exceeded | Implement backoff, upgrade to paid tier |
| 500 Internal Error | Server-side issue | Retry with exponential backoff |
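The backoff advice for the 429 and 500 rows in the table above could look roughly like this. It is a generic sketch, not the script's implementation: `RetryableError` is a hypothetical stand-in for whatever exception the API client raises on those status codes, and the delay values are arbitrary defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical stand-in for a 429 or 500 response from the API."""


def call_with_backoff(fn: Callable[[], T],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RetryableError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be exercised in tests without waiting.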
When a user requests video analysis:
For videos >20MB or reusable content:
For videos <20MB:
# User: "Analyze this YouTube tutorial video"
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--youtube-url "https://www.youtube.com/watch?v=abc123" \
--prompt "Create a structured summary with: 1) Main topics, 2) Key takeaways, 3) Recommended audience"
# User: "Transcribe this interview with timestamps"
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-path "interview.mp4" \
--prompt "Transcribe this interview with speaker labels, timestamps, and visual descriptions of gestures or slides shown"
# User: "Compare these two product demo videos"
python .claude/skills/gemini-video-understanding/scripts/analyze_video.py \
--video-paths "demo1.mp4" "demo2.mp4" \
--model "gemini-2.5-pro" \
--prompt "Compare these product demos on: features shown, presentation quality, clarity of explanation, and overall effectiveness"
API Key Not Found:
# Check API key detection
python .claude/skills/gemini-video-understanding/scripts/check_api_key.py
Video Too Large:
Error: Request size exceeds 20MB
Solution: Script automatically uses Files API for large videos
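The choice between sending a video inline and going through the Files API can be pictured as a simple size check. The sketch below is an assumption about how such a decision might be made, not the script's code: `choose_upload_method` is a hypothetical name, 20MB is interpreted here as 20 × 1024 × 1024 bytes, and the real limit applies to the whole request rather than the video file alone.

```python
import os

INLINE_LIMIT_BYTES = 20 * 1024 * 1024  # assumed reading of the 20MB request limit


def choose_upload_method(size_bytes: int, reuse: bool = False) -> str:
    """Return "files_api" for large or reusable videos, otherwise "inline"."""
    if reuse or size_bytes > INLINE_LIMIT_BYTES:
        return "files_api"
    return "inline"


def choose_for_path(video_path: str, reuse: bool = False) -> str:
    """Apply the same rule to a file on disk."""
    return choose_upload_method(os.path.getsize(video_path), reuse)
```

A conservative caller could lower `INLINE_LIMIT_BYTES` to leave headroom for the prompt and request overhead.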
Processing Timeout:
Error: File not reaching ACTIVE state
Solution: Check video integrity, try smaller file, or different format
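Waiting for an uploaded file to become ACTIVE is usually a polling loop with a deadline. The sketch below shows the general shape under stated assumptions: `get_state` is a hypothetical callable that returns the file's current state as a string (for example by querying the Files API), and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_active(get_state: Callable[[], str],
                      timeout: float = 300.0,
                      interval: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> None:
    """Poll get_state until it reports ACTIVE, failing on FAILED or timeout."""
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError("file processing failed")
        if clock() >= deadline:
            raise TimeoutError(f"file still {state} after {timeout} seconds")
        sleep(interval)
```

Distinguishing a FAILED state from a timeout helps decide whether to re-encode the video or simply wait longer.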
Rate Limit Errors:
Error: 429 Too Many Requests
Solution: Wait before retry, or upgrade to paid tier