# ehAye Multimedia
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

`npx skillsauth add neekware/ehayeskills artifacts/bundle/skills/ehaye/multimedia`
Use this skill for **video, audio, images, media conversion, previews, transcription, thumbnails, frame extraction, Spotter visual search, or FFmpeg-backed processing**.

Core rule: use ehAye native media tools first. Do not reach first for shell `ffmpeg`, `ffprobe`, Python, or `mediainfo` when a native media tool can do the job. Native tools use bundled engines, show proper tool UI, respect cancellation/timeouts, integrate with Preview/Spotter, and avoid cross-platform shell quoting problems.
## media_preflight

Use before local transcription.
Typical call:
MediaPreflight(type=transcribe, path=<audio-or-video>)
It probes duration and estimates transcription speed. Use its recommended `within` value when calling transcription.
## media_process

Use for FFmpeg-engine work and local transcription.

Common parameters:

- `type=engine`: raw media-engine operation.
- `type=transcribe`: local English-mode speech-to-text.
- `args=[...]`: FFmpeg-style args for `type=engine`.
- `path=<file>`: media path for transcription.
- `format=markdown|summary`: transcription output shape.
- `outputPath=<path>`: optional transcription output path.
- `within=<ms>`: time budget; use the preflight recommendation for transcription.

Probe media:
MediaProcess(type=engine, args=["-i", "<path>"])
FFmpeg may exit with "At least one output file must be specified" after printing metadata. Treat that as a successful probe: duration, streams, codecs, subtitles, and container info are in the output.
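The probe rule above can be sketched in Python. This is a minimal sketch, not ehAye's own parser: `parse_probe_output` is a hypothetical helper, the `SAMPLE` text is hand-written in the general shape of FFmpeg's input banner rather than captured from a real run, and how the tool output reaches your code is left out.

```python
import re

# Message FFmpeg prints when it receives an input but no output file.
NO_OUTPUT_MESSAGE = "At least one output file must be specified"


def parse_probe_output(text: str) -> dict:
    """Pull the duration and stream lines out of FFmpeg's `-i` banner text.

    Hypothetical helper: the probe counts as successful when metadata was
    printed, even if FFmpeg then complained about the missing output file.
    """
    duration_seconds = None
    match = re.search(r"Duration:\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)", text)
    if match:
        hours, minutes, seconds = match.groups()
        duration_seconds = int(hours) * 3600 + int(minutes) * 60 + float(seconds)

    streams = [
        line.strip()
        for line in text.splitlines()
        if line.strip().startswith("Stream #")
    ]
    return {
        "duration_seconds": duration_seconds,
        "streams": streams,
        "probe_ok": bool(streams) or duration_seconds is not None,
        "saw_no_output_message": NO_OUTPUT_MESSAGE in text,
    }


# Hand-written sample in the shape of an FFmpeg banner (illustrative only).
SAMPLE = """\
Input #0, matroska,webm, from 'show.mkv':
  Duration: 00:42:10.50, start: 0.000000, bitrate: 2500 kb/s
  Stream #0:0: Video: h264 (High), yuv420p, 1920x1080, 23.98 fps
  Stream #0:1(eng): Audio: aac (LC), 48000 Hz, stereo, fltp
  Stream #0:2(eng): Subtitle: subrip
At least one output file must be specified
"""

info = parse_probe_output(SAMPLE)
```

Here `info["probe_ok"]` is true even though the text ends with the "no output file" message, which matches the rule of treating that exit as a successful probe.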
## preview

Use to open the in-app Preview lightbox for images, video, audio, markdown, text, and files.

Common parameters:

- `path=<file>`: artifact or local file to open.
- `seek` / `time`: video timestamp in seconds or `HH:MM:SS`/`MM:SS` style.
- `muted=true` / `sound=off`: start silent.
- `autoplay=true`: start playback automatically when possible.
- `zoom=<number>`: initial image zoom, e.g. `0.90`.
- `label=<text>`: friendly title.

Useful visible form:
Preview(type=image, zoom=0.90)
Preview(type=video, time=154s, sound=off)
Preview(type=md)
Preview(type=txt)
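The timestamp styles accepted by `seek` / `time` (plain seconds, `MM:SS`, `HH:MM:SS`, and the `154s` form shown above) all reduce to a number of seconds. A minimal sketch of that normalization, assuming those are the only forms in play; `to_seconds` is a hypothetical helper, not part of the tool:

```python
def to_seconds(value: str) -> float:
    """Convert '154', '154s', 'MM:SS', or 'HH:MM:SS' into seconds.

    Hypothetical helper; the accepted forms are an assumption based on the
    parameter description, not a statement of what Preview parses.
    """
    text = value.strip()
    if text.endswith("s"):
        text = text[:-1]  # drop the trailing unit, as in "154s"

    parts = text.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported timestamp: {value!r}")

    seconds = 0.0
    for part in parts:
        # Each field to the left is worth 60x the one to its right.
        seconds = seconds * 60 + float(part)
    return seconds
```

For example, `to_seconds("154s")`, `to_seconds("02:34")`, and `to_seconds("00:02:34")` all describe the same position in the video.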
Default to safe audio. When opening video/audio automatically, start muted or low volume unless the user asked for sound. The volume slider should match the actual mute state: volume zero means muted; moving above zero unmutes.
Preview can seek an already-open video without reloading if the same media path is used. During Spotter/visual search, keep the preview moving: call preview repeatedly with the same video path and new timestamps so the user can watch candidate scrubbing live.
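The scrubbing pattern can be sketched as a list of Preview argument sets that share one path and differ only in timestamp. This is a sketch under assumptions: `scrub_calls` is a hypothetical helper, the dictionaries only mirror the documented `type`, `path`, `time`, and `muted` parameters, and how each call is actually dispatched is left to the agent runtime.

```python
def scrub_calls(video_path: str, timestamps: list[str]) -> list[dict]:
    """Build one Preview argument set per candidate timestamp.

    Every entry reuses the same path so an already-open video can seek
    instead of reloading, and every entry starts muted (safe audio default).
    """
    return [
        {"type": "video", "path": video_path, "time": stamp, "muted": True}
        for stamp in timestamps
    ]


calls = scrub_calls("/path/to/show.mp4", ["00:02:13", "00:05:40", "00:11:02"])
```

Keeping the path identical across calls is the part that matters; changing it between candidates would give up the seek-without-reload behavior described above.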
The Preview lightbox uses native WebView playback. It is fast for compatible MP4/WebM, but containers like `.mkv`, `.avi`, `.flv`, `.wmv`, `.ts`, `.m2ts`, or `.mov` with unsupported codecs may show a black frame, 0:00, no sound, or no seeking.
Do not fight the browser. Create a previewable MP4 copy.
Prefer fast remux first when streams are MP4-compatible, such as H.264 video and AAC audio:
MediaProcess(type=engine, args=["-i", "<input>", "-c", "copy", "-movflags", "+faststart", "<output>.mp4"])
If carrying SRT subtitles into MP4, convert subtitles to mov_text:
MediaProcess(type=engine, args=["-i", "<input>", "-map", "0", "-c", "copy", "-c:s", "mov_text", "-movflags", "+faststart", "<output>.mp4"])
Only transcode when stream-copy fails because codecs are not MP4-compatible:
MediaProcess(type=engine, args=["-i", "<input>", "-c:v", "libx264", "-c:a", "aac", "-movflags", "+faststart", "<output>.mp4"])
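The three argument lists above can be chosen programmatically. The sketch below decides up front from probed codec names, which is a simplification of the stated policy (try stream-copy, transcode only if it fails). `previewable_mp4_args` is a hypothetical helper, and the copy-safe codec sets contain only the two codecs the text names.

```python
# Codecs the text names as MP4-compatible for stream copy (assumed minimal set).
COPY_SAFE_VIDEO = {"h264"}
COPY_SAFE_AUDIO = {"aac"}


def previewable_mp4_args(
    src: str,
    dst: str,
    video_codec: str,
    audio_codec: str,
    has_srt_subtitles: bool = False,
) -> list[str]:
    """Return FFmpeg-style args for MediaProcess(type=engine).

    Hypothetical helper: remux when both codecs are copy-safe, otherwise
    fall back to an H.264/AAC transcode. Subtitles are only handled on the
    remux path, mirroring the examples in the text.
    """
    base = ["-i", src]
    tail = ["-movflags", "+faststart", dst]

    can_copy = (
        video_codec.lower() in COPY_SAFE_VIDEO
        and audio_codec.lower() in COPY_SAFE_AUDIO
    )
    if can_copy and has_srt_subtitles:
        # Copy audio/video, convert SRT subtitles to mov_text for MP4.
        return base + ["-map", "0", "-c", "copy", "-c:s", "mov_text"] + tail
    if can_copy:
        return base + ["-c", "copy"] + tail
    return base + ["-c:v", "libx264", "-c:a", "aac"] + tail
```

A caller that prefers the literal policy can still attempt the copy variant first and request the transcode variant only after the engine reports a failure.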
Converted-copy policy:
- … `.mp4`.

Example:
/path/to/show.mkv
/path/to/show.mp4
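The example pair above (same directory, same base name, `.mp4` extension) can be derived with `pathlib`. A minimal sketch; `converted_copy_path` is a hypothetical helper, and it encodes only what the example shows, not any other part of the policy.

```python
from pathlib import PurePosixPath


def converted_copy_path(source: str) -> str:
    """Return the sibling path with the extension swapped to .mp4."""
    return str(PurePosixPath(source).with_suffix(".mp4"))
```

If the source is already an `.mp4`, the result equals the input, so a caller should check for that case before writing a converted copy over the original.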
MediaProcess(type=transcribe) is local and currently English-mode only.
Workflow:
1. `MediaPreflight(type=transcribe, path=<file>)`.
2. … `lang=en`).
3. `MediaProcess(type=transcribe, path=<file>, format=markdown, within=<recommended>)`.
4. `Preview(type=md)`.

Markdown transcript previews should preserve timestamps and timeline structure.
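The hand-off between preflight and transcription can be sketched as two argument sets. This is illustrative only: `preflight_call` and `transcribe_call` are hypothetical helpers that mirror the documented parameter names, and the way the recommended `within` value is read out of the preflight result is not shown because the source does not describe it.

```python
from typing import Optional


def preflight_call(path: str) -> dict:
    """Arguments for MediaPreflight before a local transcription."""
    return {"type": "transcribe", "path": path}


def transcribe_call(
    path: str,
    within_ms: int,
    output_path: Optional[str] = None,
) -> dict:
    """Arguments for MediaProcess(type=transcribe).

    `within_ms` is expected to be the value recommended by preflight.
    """
    if within_ms <= 0:
        raise ValueError("within must be a positive time budget in ms")

    call = {
        "type": "transcribe",
        "path": path,
        "format": "markdown",
        "within": within_ms,
    }
    if output_path is not None:
        call["outputPath"] = output_path  # optional, per the parameter list
    return call
```

Requiring `within_ms` as an argument keeps the ordering honest: the transcription arguments cannot be built until a preflight recommendation exists.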
Use media_spotter when the user asks to find a person, object, logo, scene, or reference image inside video.
Spotter is LLM-powered. Do not describe it as local face recognition. Local media processing prepares candidates; the active vision model decides semantic matches such as "Rachel," "this woman," or "the first time this object appears."
Good Spotter behavior:
When visual matching is expensive or ambiguous, use a human-assisted loop:
- … `A1 00:02:13`.
- … `Preview(type=image, zoom=...)`.

This is ideal when many frames look similar, model confidence is low, or the user can identify the target faster than more model calls.
For thumbnails, stills, or evidence frames, use MediaProcess(type=engine) with FFmpeg args. Prefer concise JPEG/PNG outputs in an artifact directory or next to the source when the user needs a persistent result.
Examples:
MediaProcess(type=engine, args=["-ss", "00:02:13", "-i", "<video>", "-frames:v", "1", "<frame>.jpg"])
MediaProcess(type=engine, args=["-i", "<video>", "-vf", "fps=1", "<frames-dir>/frame-%06d.jpg"])
Preview extracted frames with Preview(type=image).
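The two extraction examples can be wrapped as small argument builders. A minimal sketch: `single_frame_args` and `sampled_frames_args` are hypothetical helpers that reproduce the argument lists shown above, with the output paths supplied by the caller.

```python
def single_frame_args(video: str, timestamp: str, out_image: str) -> list[str]:
    """Args for grabbing one still at `timestamp`.

    Placing -ss before -i asks FFmpeg to seek on the input.
    """
    return ["-ss", timestamp, "-i", video, "-frames:v", "1", out_image]


def sampled_frames_args(video: str, frames_dir: str, fps: int = 1) -> list[str]:
    """Args for writing `fps` frames per second into numbered JPEG files."""
    pattern = f"{frames_dir}/frame-%06d.jpg"
    return ["-i", video, "-vf", f"fps={fps}", pattern]


# The %06d placeholder is a zero-padded counter; this shows its shape.
first_name = "frame-%06d.jpg" % 1
```

The zero-padded counter keeps extracted frames in sorted order on disk, which makes it easier to map a frame file back to an approximate timestamp when sampling at a fixed rate.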
Always tell the user:
Prefer showing evidence through Preview over dumping long paths or raw media-engine output.
- Creator: Ehaye
- License: MIT
- Source Repo: neekware/ehaye-skills
- Source Bucket: ehaye
- Original Path: ehaye/multimedia