plugins/claude-content/skills/extract-frames/SKILL.md
Extracts first and/or last frames of every shot from a video using adaptive scene detection. Use this skill when the user says "extract frames", "get shot frames", "pull frames", "shot breakdown", "scene detect", "first frame of each shot", "last frame of each shot", "extract shots from video", or wants to extract key frames at shot cut points from a video file.
Parse the user's request for:
Default: extract first frame of every shot.
ffprobe -v error -show_entries format=duration -show_entries stream=r_frame_rate,width,height,codec_name -of default "$INPUT"
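ffprobe reports r_frame_rate as a rational such as 30000/1001 rather than a decimal, while the last-frame arithmetic later in this skill needs fps as a plain number. A minimal sketch of that conversion follows; fps_from_rate is a hypothetical helper name, not part of the original skill.

```shell
# Hypothetical helper (an assumption, not from the original skill):
# convert ffprobe's rational r_frame_rate (e.g. "30000/1001") to decimal fps.
fps_from_rate() {
  echo "$1" | awk -F/ '{
    if (NF == 2 && $2 > 0) printf "%.3f\n", $1 / $2   # "num/den" form
    else                   printf "%.3f\n", $1        # already a plain number
  }'
}

fps_from_rate "30000/1001"   # prints 29.970
fps_from_rate "25/1"         # prints 25.000
```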
Output directory: {video_basename}_frames/. If it already exists, check for scores.txt; if present, skip Phase 1.
Dump per-frame scene scores for the entire video:
ffmpeg -i "$INPUT" -vf "select='gte(scene,0)',metadata=print:file=$OUTPUT_DIR/scores.txt" -fps_mode vfr -f null - 2>&1
This is the expensive step. The output file scores.txt contains blocks like:
frame:0 pts:0 pts_time:0.000000
lavfi.scene_score=0.000000
Parse scores.txt with this awk one-liner to get distribution + all candidate frames:
awk '
  /pts_time/    { split($0, a, "pts_time:"); ts = a[2] + 0 }
  /scene_score/ {
    split($0, a, "="); score = a[2] + 0;
    # Count the frame in its score bucket
    if (score < 0.01) b1++;
    else if (score < 0.05) b2++;
    else if (score < 0.10) b3++;
    else if (score < 0.20) b4++;
    else b5++;
    total++;
    if (score > max) max = score;
    # List every candidate frame above the preliminary 0.05 cutoff
    if (score > 0.05) printf " ts=%.3fs score=%.6f\n", ts, score;
  }
  END {
    # Guard against an empty score dump (avoids division by zero below)
    if (total == 0) { print "No scene scores found"; exit 1 }
    print "\n--- Distribution ---";
    printf "< 0.01: %d (%.1f%%)\n", b1, b1/total*100;
    printf "0.01 - 0.05: %d (%.1f%%)\n", b2, b2/total*100;
    printf "0.05 - 0.10: %d (%.1f%%)\n", b3, b3/total*100;
    printf "0.10 - 0.20: %d (%.1f%%)\n", b4, b4/total*100;
    printf "0.20+: %d (%.1f%%)\n", b5, b5/total*100;
    printf "Max score: %.6f\n", max;
  }' "$OUTPUT_DIR/scores.txt"
Step 2a — Startup artifact filter: Discard any frames in the first 0.5s where score > 0.05 (a fixed preliminary value — the final user-confirmed threshold is not known yet). Fade-ins from black or codec initialization commonly produce score=1.0 spikes at t=0.03-0.08s that are not real cuts. After discarding these, recompute max score from the remaining frames.
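The startup filter can be sketched as a small awk pass over scores.txt. This is an illustration under assumptions, not the skill's own implementation: filtered_max is a hypothetical helper name, and the sample data is made up to show a spike at t=0.04s being ignored.

```shell
# Hypothetical helper: print the max scene score after dropping frames in the
# first 0.5s whose score exceeds the preliminary 0.05 cutoff (Step 2a).
filtered_max() {
  awk '
    /pts_time/    { split($0, a, "pts_time:"); ts = a[2] + 0 }
    /scene_score/ { split($0, a, "="); score = a[2] + 0
                    if (ts < 0.5 && score > 0.05) next   # startup artifact
                    if (score > max) max = score }
    END { printf "%.6f\n", max }
  ' "$1"
}

# Made-up sample in the scores.txt format
sample=$(mktemp)
cat > "$sample" <<'EOF'
frame:0 pts:0 pts_time:0.000000
lavfi.scene_score=0.000000
frame:1 pts:1 pts_time:0.040000
lavfi.scene_score=1.000000
frame:2 pts:2 pts_time:1.200000
lavfi.scene_score=0.310000
EOF
filtered_max "$sample"   # prints 0.310000
rm -f "$sample"
```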
Step 2b — Branch on max score. If max score (after startup filter) < 0.05, the video has no cuts:
Step 2c — Gap analysis and threshold. If max score >= 0.05, use the raw candidates (all frames with score > 0.05, after startup filter) to find the noise ceiling and the lowest-scoring candidate. Place the proposed threshold at the midpoint of the gap. Present to user via AskUserQuestion:
If --threshold was provided, skip confirmation and set $THRESHOLD_VALUE directly. Otherwise set $THRESHOLD_VALUE to the user-confirmed value before proceeding.
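One way to locate the gap in Step 2c is to sort the candidate scores and take the widest jump between neighbours. The sketch below assumes that heuristic, which the skill text does not spell out; propose_threshold is a hypothetical helper name, and using the 0.05 preliminary cutoff as the lowest value is also an assumption.

```shell
# Hypothetical helper: reads candidate scores on stdin (one per line, all
# above 0.05, startup artifacts already removed).
# Assumption: the widest gap between adjacent sorted scores separates noise
# from real cuts; 0.05 is prepended so a single candidate still yields a gap.
propose_threshold() {
  { echo 0.05; cat; } | LC_ALL=C sort -n | awk '
    NR > 1 && $1 - prev > gap { gap = $1 - prev; lo = prev; hi = $1 }
    { prev = $1 }
    END {
      if (gap > 0)
        printf "noise_ceiling=%.3f lowest_cut=%.3f threshold=%.3f\n", lo, hi, (lo + hi) / 2
    }'
}

printf '%s\n' 0.06 0.07 0.08 0.31 0.42 0.55 | propose_threshold
# prints noise_ceiling=0.080 lowest_cut=0.310 threshold=0.195
```

The printed threshold is only a proposal; it still goes to the user for confirmation unless --threshold was supplied.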
Step 2d — Run-based dedup. Using $THRESHOLD_VALUE from Step 2c, identify cut points. A "run" is a sequence of consecutive frames where every frame scores above threshold. The principle: an aftershock immediately follows its parent cut (consecutive frames both above threshold), while a real cut always rises from the noise floor (preceded by at least one below-threshold frame).
awk -v THRESHOLD="$THRESHOLD_VALUE" '
  /pts_time/    { split($0, a, "pts_time:"); ts = a[2] + 0 }
  /scene_score/ {
    split($0, a, "="); score = a[2] + 0
    if (score > THRESHOLD) {
      # Start a new run, or update the peak of the current run
      if (!in_run) { run_best_ts = ts; run_best_score = score; in_run = 1 }
      else if (score > run_best_score) { run_best_ts = ts; run_best_score = score }
    } else {
      # Run ended: emit its peak frame
      if (in_run) { printf "%.3f %.6f\n", run_best_ts, run_best_score; in_run = 0 }
    }
  }
  END { if (in_run) printf "%.3f %.6f\n", run_best_ts, run_best_score }
' "$OUTPUT_DIR/scores.txt"
This keeps the peak frame of each run and discards aftershocks within the same run. It correctly handles:
For each detected cut point (plus t=0 for shot 1):
First frame (default):
ffmpeg -y -ss $TIMESTAMP -i "$INPUT" -frames:v 1 -update 1 "$OUTPUT_DIR/shot_${NUM}_${TIME}s.png"
Last frame (--last):
For every shot except the final one: last_time = next_shot_timestamp - (1/fps)
For the final shot: last_time = duration - (2/fps). Use 2 frames back, not 1; seeking to duration - 1/fps can produce empty files near the end of some videos.
ffmpeg -y -ss $LAST_TIME -i "$INPUT" -frames:v 1 -update 1 "$OUTPUT_DIR/shot_${NUM}_last_${TIME}s.png"
If a last-frame extraction produces an empty file (0 bytes), back off by another frame and retry.
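The last-frame arithmetic and the empty-file retry can be sketched as two small shell functions. Both names (frames_before, extract_last_frame) are hypothetical, and the cap of 5 frames on the back-off is an arbitrary assumption rather than something the skill specifies.

```shell
# Hypothetical helper: timestamp lying BACK frames before STOP seconds.
frames_before() {   # usage: frames_before STOP_SECONDS FPS BACK
  awk -v stop="$1" -v fps="$2" -v back="$3" 'BEGIN {
    t = stop - back / fps
    if (t < 0) t = 0          # never seek before the start of the video
    printf "%.3f\n", t
  }'
}

# Sketch of the retry: begin BACK frames before STOP and step one more frame
# back whenever ffmpeg leaves a missing or 0-byte output file.
extract_last_frame() {   # usage: extract_last_frame INPUT STOP FPS BACK OUT
  elf_back=$4
  while [ "$elf_back" -le 5 ]; do   # cap of 5 is an assumption
    ffmpeg -nostdin -y -ss "$(frames_before "$2" "$3" "$elf_back")" -i "$1" \
      -frames:v 1 -update 1 "$5" 2>/dev/null
    [ -s "$5" ] && return 0
    elf_back=$((elf_back + 1))
  done
  return 1
}

frames_before 3.800 25 1    # prints 3.760 (next shot starts at 3.800s)
frames_before 10.000 25 2   # prints 9.920 (final shot, two frames back)
```

For a final shot in a 10-second, 25 fps video, a call might look like extract_last_frame "$INPUT" 10.000 25 2 "$OUTPUT_DIR/shot_05_last_9.92s.png", where the shot number and file name are made up for illustration.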
Both (--first --last or --all): extract both per shot.
Filtered (--shots 3,5,7): only extract for the specified shot numbers.
Directory: {video_basename}_frames/ next to the input video
First frames: shot_01_0.00s.png, shot_02_1.63s.png, ...
Last frames: shot_01_last_1.60s.png, shot_02_last_3.77s.png, ...
scores.txt is always retained for re-runs at different thresholds
Report to user: total shots detected, shot list with timestamps, output directory path.
If scores.txt already exists in the output directory, skip Phase 1 entirely and go straight to Phase 2 analysis. This makes threshold iteration instant — the user can re-run with --threshold 0.15 without waiting for the score dump again.
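A minimal sketch of that cache check, assuming an empty scores.txt (for example, from an interrupted run) should be treated the same as a missing one; need_score_dump is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) when Phase 1 must run, i.e. when
# scores.txt is missing or empty in the given output directory.
need_score_dump() {
  [ ! -s "$1/scores.txt" ]
}

OUTPUT_DIR=$(mktemp -d)   # stand-in for {video_basename}_frames/
need_score_dump "$OUTPUT_DIR" && echo "run Phase 1"          # no scores.txt yet
echo "lavfi.scene_score=0.000000" > "$OUTPUT_DIR/scores.txt"
need_score_dump "$OUTPUT_DIR" || echo "reuse cached scores"  # dump already present
rm -rf "$OUTPUT_DIR"
```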
Use pipe-based loops for frame extraction instead of array syntax (zsh handles for over arrays differently than bash):
echo "0.000 2.480 4.280" | tr ' ' '\n' | awk '{printf "%02d %s\n", NR, $1}' | while read -r LABEL TS; do
  TS_FMT=$(printf "%.2f" "$TS")
  # -nostdin keeps ffmpeg from consuming the rest of the piped timestamp list
  ffmpeg -nostdin -y -ss "$TS" -i "$INPUT" -frames:v 1 -update 1 "$OUTPUT_DIR/shot_${LABEL}_${TS_FMT}s.png" 2>/dev/null
done