hyperframes-best-practices/SKILL.md
Best practices for programmatic video creation using HyperFrames: plain HTML compositions with GSAP animations rendered to MP4, with full Hebrew and RTL support. Covers composition authoring, data-* timing attributes, GSAP timeline contract, layout-before-animation methodology, visual identity gate, Hebrew fonts via Google Fonts (Heebo, Rubik, Assistant), RTL text rendering with dir="rtl", Hebrew TikTok/Reels-style captions via Whisper, audio-reactive visuals, scene transitions, and bidirectional Hebrew+English text. Use when building HTML-based video content or Hebrew social/marketing videos without React. Do NOT use for Remotion or general React video work; use remotion-best-practices for that.
npx skillsauth add skills-il/developer-tools hyperframes-best-practices
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Adapted from the upstream HyperFrames v1.0.x skill at heygen-com/hyperframes (Apache-2.0). Hebrew and RTL adaptations by skills-il.
HTML is the source of truth for video. A composition is an HTML file with data-* attributes for timing, a GSAP timeline for animation, and CSS for appearance. The framework handles clip visibility, media playback, and timeline sync.
Building HTML-based videos with Hebrew text requires the compiler to fetch Hebrew Google Fonts on demand, explicit dir="rtl" on Hebrew containers, mirrored GSAP entrance directions, and Hebrew caption sync via Whisper, none of which HyperFrames documents out of the box. Hebrew voiceover is a separate gap: the built-in Kokoro TTS does not support Hebrew (it covers 8 languages across 9 locales: en-us, en-gb, es, fr-fr, hi, it, pt-br, ja, zh), so Hebrew narration must be generated by an external TTS service and imported as an <audio> element.
For Hebrew and RTL compositions, load references/hebrew-rtl.md. It covers Hebrew font loading (the compiler auto-fetches from Google Fonts), dir="rtl" scoping, GSAP x-axis mirroring, Hebrew caption sync via hyperframes transcribe --language he, Hebrew voiceover via external TTS, and bidirectional text with <bdi>.
Before writing HTML, think at a high level:
For small edits (fix a color, adjust timing, add one element), skip straight to the rules.
Check in this order:
- […] style_prompt_full and structured fields. (Note: visual-style.md is a project-specific file; visual-styles.md is the style library with 8 named presets. They are different files.)
- […] ## Style Prompt (one paragraph), ## Colors (3-5 hex values with roles), ## Typography (1-2 font families), ## What NOT to Do (3-5 anti-patterns).

Every composition must trace its palette and typography back to a DESIGN.md, visual-style.md, or explicit user direction. If you're reaching for #333, #3b82f6, or Roboto, you skipped this step.
</HARD-GATE>
For motion defaults, sizing, entrance patterns, and easing, follow house-style.md. The house style handles HOW things move. The DESIGN.md handles WHAT things look like.
Position every element where it should be at its most visible moment, the frame where it's fully entered, correctly placed, and not yet exiting. Write this as static HTML+CSS first. No GSAP yet.
Why this matters: If you position elements at their animated start state (offscreen, scaled to 0, opacity 0) and tween them to where you think they should land, you're guessing the final layout. Overlaps are invisible until the video renders. By building the end state first, you can see and fix layout problems before adding any motion.
- The .scene-content container MUST fill the full scene using width: 100%; height: 100%; padding: Npx; with display: flex; flex-direction: column; gap: Npx; box-sizing: border-box. Use padding to push content inward, NEVER position: absolute; top: Npx on a content container. Absolute-positioned content containers overflow when content is taller than the remaining space. Reserve position: absolute for decoratives only.
- Entrances use gsap.from(): animate FROM offscreen/invisible TO the CSS position. The CSS position is the ground truth; the tween describes the journey to get there.
- Exits use gsap.to(): animate TO offscreen/invisible FROM the CSS position.

/* scene-content fills the scene, padding positions content */
.scene-content {
display: flex;
flex-direction: column;
justify-content: center;
width: 100%;
height: 100%;
padding: 120px 160px;
gap: 24px;
box-sizing: border-box;
}
.title {
font-size: 120px;
}
.subtitle {
font-size: 42px;
}
/* Container fills any scene size (1920x1080, 1080x1920, etc).
Padding positions content. Flex + gap handles spacing. */
WRONG: hardcoded dimensions and absolute positioning:
.scene-content {
position: absolute;
top: 200px;
left: 160px;
width: 1920px;
height: 1080px;
display: flex; /* ... */
}
// Step 3: Animate INTO those positions
tl.from(".title", { y: 60, opacity: 0, duration: 0.6, ease: "power3.out" }, 0);
tl.from(".subtitle", { y: 40, opacity: 0, duration: 0.5, ease: "power3.out" }, 0.2);
tl.from(".logo", { scale: 0.8, opacity: 0, duration: 0.4, ease: "power2.out" }, 0.3);
// Step 4: Animate OUT from those positions
tl.to(".title", { y: -40, opacity: 0, duration: 0.4, ease: "power2.in" }, 3);
tl.to(".subtitle", { y: -30, opacity: 0, duration: 0.3, ease: "power2.in" }, 3.1);
tl.to(".logo", { scale: 0.9, opacity: 0, duration: 0.3, ease: "power2.in" }, 3.2);
If element A exits before element B enters in the same area, both should have correct CSS positions for their respective hero frames. The timeline ordering guarantees they never visually coexist, but if you skip the layout step, you won't catch the case where they accidentally overlap due to a timing error.
Layered effects (glow behind text, shadow elements, background patterns) and z-stacked designs (card stacks, depth layers) are intentional. The layout step is about catching unintentional overlap, two headlines landing on top of each other, a stat covering a label, content bleeding off-frame.
| Attribute | Required | Values |
| ------------------ | --------------------------------- | ------------------------------------------------------ |
| id | Yes | Unique identifier |
| data-start | Yes | Seconds or clip ID reference ("el-1", "intro + 2") |
| data-duration | Required for img/div/compositions | Seconds. Video/audio defaults to media duration. |
| data-track-index | Yes | Integer. Same-track clips cannot overlap. |
| data-media-start | No | Trim offset into source (seconds) |
| data-volume | No | 0-1 (default 1) |
data-track-index does not affect visual layering; use CSS z-index for that.
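The "same-track clips cannot overlap" rule from the table above can be checked by hand with a few lines of JavaScript. This is only a sketch under stated assumptions: findTrackOverlaps and the clip object shape ({ id, start, duration, track }) are hypothetical, not part of the HyperFrames API, and it assumes data-start values are already numeric seconds rather than clip ID references such as "intro + 2".

```javascript
// Hypothetical check (not a HyperFrames API): flags every clip that starts
// before an earlier clip on the same track has ended.
// Assumes numeric start/duration in seconds; clips that merely touch
// (one ends exactly when the next starts) are not treated as overlapping.
function findTrackOverlaps(clips) {
  const byTrack = new Map();
  for (const clip of clips) {
    if (!byTrack.has(clip.track)) byTrack.set(clip.track, []);
    byTrack.get(clip.track).push(clip);
  }

  const overlaps = [];
  for (const trackClips of byTrack.values()) {
    const sorted = [...trackClips].sort((a, b) => a.start - b.start);
    let latest = sorted[0]; // clip with the latest end time seen so far
    for (let i = 1; i < sorted.length; i++) {
      const current = sorted[i];
      if (current.start < latest.start + latest.duration) {
        overlaps.push([latest.id, current.id]);
      }
      if (current.start + current.duration > latest.start + latest.duration) {
        latest = current;
      }
    }
  }
  return overlaps;
}

// Example values are made up for illustration.
const exampleClips = [
  { id: "el-1", start: 0, duration: 5, track: 1 },
  { id: "el-2", start: 4, duration: 2, track: 1 }, // starts before el-1 ends
  { id: "el-3", start: 4, duration: 2, track: 2 }, // different track, no conflict
];
console.log(findTrackOverlaps(exampleClips));
```

Moving el-2 to another data-track-index, or delaying its data-start to 5, would clear the reported pair.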
| Attribute | Required | Values |
| ---------------------------- | -------- | -------------------------------------------- |
| data-composition-id | Yes | Unique composition ID |
| data-start | Yes | Start time (root composition: use "0") |
| data-duration | Yes | Takes precedence over GSAP timeline duration |
| data-width / data-height | Yes | Pixel dimensions (1920x1080 or 1080x1920) |
| data-composition-src | No | Path to external HTML file |
Sub-compositions loaded via data-composition-src use a <template> wrapper. Standalone compositions (the main index.html) do NOT use <template>; they put the data-composition-id div directly in <body>. Using <template> on a standalone file hides all content from the browser and breaks rendering.
Sub-composition structure:
<template id="my-comp-template">
<div data-composition-id="my-comp" data-width="1920" data-height="1080">
<!-- content -->
<style>
[data-composition-id="my-comp"] {
/* scoped styles */
}
</style>
<script src="https://cdn.jsdelivr.net/npm/gsap@<version>/dist/gsap.min.js"></script>
<script>
window.__timelines = window.__timelines || {};
const tl = gsap.timeline({ paused: true });
// tweens...
window.__timelines["my-comp"] = tl;
</script>
</div>
</template>
Load in root: <div id="el-1" data-composition-id="my-comp" data-composition-src="compositions/my-comp.html" data-start="0" data-duration="10" data-track-index="1"></div>
Video must be muted playsinline. Audio is always a separate <audio> element:
<video
id="el-v"
data-start="0"
data-duration="30"
data-track-index="0"
src="video.mp4"
muted
playsinline
></video>
<audio
id="el-a"
data-start="0"
data-duration="30"
data-track-index="2"
src="video.mp4"
data-volume="1"
></audio>
- Create every timeline with { paused: true }; the player controls playback
- Register every timeline: window.__timelines["<composition-id>"] = tl
- Duration comes from data-duration, not from GSAP timeline length

Deterministic: No Math.random(), Date.now(), or time-based logic. Use a seeded PRNG if you need pseudo-random values (e.g. mulberry32).
GSAP: Only animate visual properties (opacity, x, y, scale, rotation, color, backgroundColor, borderRadius, transforms). Do NOT animate visibility, display, or call video.play()/audio.play().
Animation conflicts: Never animate the same property on the same element from multiple timelines simultaneously.
No repeat: -1: Infinite-repeat timelines break the capture engine. Calculate the exact repeat count from composition duration: repeat: Math.ceil(duration / cycleDuration) - 1.
Synchronous timeline construction: Never build timelines inside async/await, setTimeout, or Promises. The capture engine reads window.__timelines synchronously after page load. Fonts are embedded by the compiler, so they're available immediately, no need to wait for font loading.
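The determinism and finite-repeat rules above can be sketched in plain JavaScript. mulberry32 is the widely circulated 32-bit seeded PRNG that the rule names; finiteRepeatCount, the seed 42, and the durations below are hypothetical names and values chosen for illustration, not part of HyperFrames or GSAP.

```javascript
// mulberry32: a small seeded PRNG. The same seed always yields the same
// sequence, so every render of the composition produces identical frames.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Hypothetical helper applying the rule's formula. GSAP's `repeat` counts
// extra plays after the first one, hence the "- 1". Clamped at 0 so a cycle
// longer than the composition simply plays once.
function finiteRepeatCount(compositionDuration, cycleDuration) {
  return Math.max(0, Math.ceil(compositionDuration / cycleDuration) - 1);
}

// Assumed example: seed 42, a 10s composition, a 3s animation cycle.
const rand = mulberry32(42);
const offsets = Array.from({ length: 3 }, () => Math.round(rand() * 100)); // identical on every run
const repeat = finiteRepeatCount(10, 3); // 3 extra plays: 4 cycles x 3s = 12s covers the 10s composition
console.log(offsets, repeat);
```

Both helpers are synchronous, so they can be called while building the timeline without violating the synchronous-construction rule.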
Never do:
- Skip window.__timelines registration
- Play audio from a <video> element; video is muted and audio is always a separate <audio>
- Use data-layer (use data-track-index) or data-end (use data-duration)
- […] data-composition-id
- Use repeat: -1 on any timeline or tween; always use finite repeats
- Build timelines asynchronously (async, setTimeout, Promise)
- Call gsap.set() on clip elements from later scenes; they don't exist in the DOM at page load. Use tl.set(selector, vars, timePosition) inside the timeline at or after the clip's data-start time instead.
- Use <br> in content text; forced line breaks don't account for actual rendered font width. Text that wraps naturally + a <br> produces an extra unwanted break, causing overlap. Let text wrap via max-width instead. Exception: short display titles where each word is deliberately on its own line (e.g., "THE\nIMMORTAL\nGAME" at 130px).

Every multi-scene composition MUST follow ALL of these rules. Violating any one of them is a broken composition.
- Every element enters via gsap.from(). No element may appear fully-formed. If a scene has 5 elements, it needs 5 entrance tweens.
- No gsap.to() that animates opacity to 0, y offscreen, scale to 0, or any other "out" animation before a transition fires. The transition IS the exit. The outgoing scene's content MUST be fully visible at the moment the transition starts.
- […] gsap.to(..., { opacity: 0 }) is allowed.

WRONG: exit animation before transition:
// BANNED, this empties the scene before the transition can use it
tl.to("#s1-title", { opacity: 0, y: -40, duration: 0.4 }, 6.5);
tl.to("#s1-subtitle", { opacity: 0, duration: 0.3 }, 6.7);
// transition fires on empty frame
RIGHT: entrance only, transition handles exit:
// Scene 1 entrance animations
tl.from("#s1-title", { y: 50, opacity: 0, duration: 0.7, ease: "power3.out" }, 0.3);
tl.from("#s1-subtitle", { y: 30, opacity: 0, duration: 0.5, ease: "power2.out" }, 0.6);
// NO exit tweens, transition at 7.2s handles the scene change
// Scene 2 entrance animations
tl.from("#s2-heading", { x: -40, opacity: 0, duration: 0.6, ease: "expo.out" }, 8.0);
- font-variant-numeric: tabular-nums on number columns

When no visual-style.md or animation direction is provided, follow house-style.md for aesthetic defaults.
- Write the font-family you want in CSS; the compiler embeds supported fonts automatically. If a font isn't supported, the compiler warns.
- Add crossorigin="anonymous" to external media
- […] window.__hyperframes.fitTextFontSize(text, { maxWidth, fontFamily, fontWeight })
- […] index.html; sub-compositions use ../
- npx hyperframes lint and npx hyperframes validate both pass

hyperframes validate runs a WCAG contrast audit by default. It seeks to 5 timestamps, screenshots the page, samples background pixels behind every text element, and computes contrast ratios. Failures appear as warnings:
⚠ WCAG AA contrast warnings (3):
· .subtitle "secondary text", 2.67:1 (need 4.5:1, t=5.3s)
If warnings appear:
- Re-run hyperframes validate until clean

Use --no-contrast to skip if iterating rapidly and you'll check later.
After authoring animations, run the animation map to verify choreography:
node skills/hyperframes/scripts/animation-map.mjs <composition-dir> \
--out <composition-dir>/.hyperframes/anim-map
Outputs a single animation-map.json with:
- Summaries such as "#card1 animates opacity+y over 0.50s. moves 23px up. fades in. ends at (120, 200)"
- […] ("3 elements stagger at 120ms")
- Flags: offscreen, collision, invisible, paced-fast (under 0.2s), paced-slow (over 2s)

Read the JSON. Scan summaries for anything unexpected. Check every flag: fix or justify. Verify the timeline shows the intended choreography rhythm. Re-run after fixes.
Skip on small edits (fixing a color, adjusting one duration). Run on new compositions and significant animation changes.
references/captions.md, Captions, subtitles, lyrics, karaoke synced to audio. Tone-adaptive style detection, per-word styling, text overflow prevention, caption exit guarantees, word grouping. Read when adding any text synced to audio timing.
references/tts.md, Text-to-speech with Kokoro-82M. Voice selection, speed tuning, TTS+captions workflow. Read when generating narration or voiceover.
references/audio-reactive.md, Audio-reactive animation: map frequency bands and amplitude to GSAP properties. Read when visuals should respond to music, voice, or sound.
references/css-patterns.md, CSS+GSAP marker highlighting: highlight, circle, burst, scribble, sketchout. Deterministic, fully seekable. Read when adding visual emphasis to text.
references/typography.md, Typography: font pairing, OpenType features, dark-background adjustments, font discovery script. Always read, every composition has text.
references/motion-principles.md, Motion design principles: easing as emotion, timing as weight, choreography as hierarchy, scene pacing, ambient motion, anti-patterns. Read when choreographing GSAP animations.
visual-styles.md, 8 named visual styles (Swiss Pulse, Velvet Standard, Deconstructed, Maximalist Type, Data Drift, Soft Signal, Folk Frequency, Shadow Cut) with hex palettes, GSAP easing signatures, and shader pairings. Read when user names a style or when generating DESIGN.md.
house-style.md, Default motion, sizing, and color palettes when no style is specified.
patterns.md, PiP, title cards, slide show patterns.
data-in-motion.md, Data, stats, and infographic patterns.
references/transcript-guide.md, Transcription commands, whisper models, external APIs, troubleshooting.
references/dynamic-techniques.md, Dynamic caption animation techniques (karaoke, clip-path, slam, scatter, elastic, 3D).
references/hebrew-rtl.md, Hebrew and RTL compositions: dir="rtl" scoping, Google Fonts auto-fetch for Heebo/Rubik/Assistant, GSAP x-axis mirroring, Hebrew captions via hyperframes transcribe --language he, Hebrew voiceover via external TTS, bidirectional text with <bdi>. Read for any composition with Hebrew text.
references/transitions.md, Scene transitions: crossfades, wipes, reveals, shader transitions. Energy/mood selection, CSS vs WebGL guidance. Always read for multi-scene compositions, scenes without transitions feel like jump cuts.
@hyperframes/shader-transitions (packages/shader-transitions/): read package source, not skill files.

For GSAP timeline patterns and easing, follow house-style.md and references/motion-principles.md in this skill, plus the official GSAP docs at https://gsap.com/docs/v3/.
These are agent failure modes specific to Hebrew/RTL HyperFrames work. Generic HyperFrames gotchas (see upstream) still apply.
- Do NOT add a <link rel="stylesheet"> tag or a CSS @import url(...) statement for Hebrew fonts. The compiler already fetches Google Fonts server-side via fetchGoogleFont() in packages/producer/src/services/deterministicFonts.ts, caches the WOFF2s at ~/.cache/hyperframes/fonts/<slug>/, and embeds them as base64 data URIs in the compiled HTML. An external stylesheet breaks determinism (network dependency at render time) and duplicates the font loading. Just write font-family: 'Heebo', sans-serif;.
- Do NOT use the hyperframes tts command for Hebrew narration. The bundled Kokoro-82M supports 8 languages via 9 voice-ID prefixes: a=American English, b=British English, e=Spanish, f=French, h=Hindi, i=Italian, j=Japanese, p=Brazilian Portuguese, z=Mandarin. Hebrew is not included. Generate the WAV/MP3 with an external service (ElevenLabs Hebrew voices, OpenAI TTS, Google Cloud TTS Hebrew) and drop the file into the composition as a normal <audio> clip.
- Do NOT use .en Whisper models on Hebrew audio. .en variants TRANSLATE non-English audio to English instead of transcribing it. For Hebrew captions use npx hyperframes transcribe audio.wav --model small --language he (or medium --language he / large-v3 --language he for noisy audio). The .en suffix is only correct when the user explicitly says the audio is English.
- Always set dir="rtl" on Hebrew text containers, even inside an RTL-defaulted composition. HyperFrames sub-compositions set their own direction context. GSAP x: tweens also don't auto-mirror: a title that uses gsap.from({x: -80}) enters from the left in both LTR and RTL. For Hebrew, flip to x: 80 so it enters from the right, matching reading direction.
- Wrap English brand names inside Hebrew text in <bdi> or use unicode-bidi: isolate. Without isolation, the Unicode bidi algorithm reorders mixed-direction runs and can place punctuation on the wrong side of the brand name or visually reverse it. Wrap brand names: הצטרפו ל־<bdi>HyperFrames</bdi> עכשיו.

The <bdi> rule above covers brand names. Mixed-direction Hebrew compositions need three more bidi habits:

- Give Hebrew text a max-width so it wraps naturally, and add word-break: keep-all (or overflow-wrap: normal) so the compiler does not break inside a Hebrew word. Do NOT use <br> to force breaks (see Rule 11). For deliberate one-word-per-line display titles, give each word its own element instead.
- A number such as 2025 next to a Hebrew word can render as 5202 or jump to the wrong side. Wrap any digit run with <bdi> or ... (LTR isolate): בשנת <bdi>2025</bdi>, <bdi>15%</bdi> הנחה, <bdi>₪199</bdi>. The currency symbol stays attached to the number this way.
- Parentheses mirror: ( visually becomes ) in an RTL run. In Hebrew captions and paragraphs let the browser mirror them by keeping the text in a proper dir="rtl" container. Do NOT hand-swap ( and ) to "fix" it, and when a parenthetical contains LTR content (a brand, a URL, a number) wrap that inner content in <bdi> so only the inner run is LTR while the parentheses stay correctly mirrored.

hyperframes command not found / render fails immediately

HyperFrames requires Node 22+ and FFmpeg on PATH. Confirm node --version is 22 or higher and ffmpeg -version resolves. On macOS install FFmpeg with brew install ffmpeg; on Debian/Ubuntu use apt install ffmpeg. Without FFmpeg the compiler cannot encode the MP4 and aborts before rendering any frames.
The compiler only embeds fonts it can fetch from Google Fonts. Use a Hebrew family that exists on Google Fonts (Heebo, Rubik, Assistant, Alef, Frank Ruhl Libre, Noto Sans Hebrew) and write it plainly in CSS: font-family: 'Heebo', sans-serif;. Do NOT add a <link rel="stylesheet"> or @import (see Gotchas), that breaks determinism without fixing the warning. If a custom non-Google font is required, the upstream docs cover local font embedding.
WCAG contrast warnings (hyperframes validate)

validate samples background pixels behind each text element at 5 timestamps and flags ratios under 4.5:1 (normal text) or 3:1 (large text). Fix by adjusting the failing color WITHIN the palette family: brighten it on dark backgrounds, darken it on light backgrounds. Do not invent a new color. Re-run hyperframes validate until clean. Use --no-contrast only while iterating, never as the final state.
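The 4.5:1 and 3:1 thresholds above are WCAG AA contrast ratios. As a rough sketch for testing a palette pair by hand before re-running validate, the standard WCAG 2.x formula can be written as below. The function names are hypothetical, the code only accepts six-digit "#rrggbb" colors, and nothing here claims to mirror how hyperframes validate samples pixels or computes its ratios internally.

```javascript
// WCAG 2.x relative luminance of a "#rrggbb" color (0 = black, 1 = white).
function relativeLuminance(hex) {
  const channels = [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16) / 255);
  const [r, g, b] = channels.map((c) =>
    c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4)
  );
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(foreground, background) {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  const [lighter, darker] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (lighter + 0.05) / (darker + 0.05);
}

// AA thresholds as quoted above: 4.5:1 for normal text, 3:1 for large text.
function passesAA(foreground, background, isLargeText = false) {
  return contrastRatio(foreground, background) >= (isLargeText ? 3 : 4.5);
}

// Example: mid-gray text on white lands just under the normal-text threshold.
console.log(contrastRatio("#777777", "#ffffff").toFixed(2));
console.log(passesAA("#777777", "#ffffff"), passesAA("#777777", "#ffffff", true));
```

When a pair fails, darkening or brightening the text color within the same palette family and re-checking follows the fix described above.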
The most common cause is a <template> wrapper on a standalone composition. The main index.html must put the data-composition-id div directly in <body>, not inside <template>. Also check that every timeline is registered via window.__timelines["<composition-id>"] = tl and built synchronously (not inside async, setTimeout, or a Promise), the capture engine reads window.__timelines synchronously after page load.
GSAP x: tweens do not auto-mirror for RTL. A gsap.from({x: -80}) enters from the left in both LTR and RTL. For Hebrew, flip to a positive value (x: 80) so the element enters from the right, matching reading direction. See references/hebrew-rtl.md.
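One way to avoid forgetting the sign flip is to derive the x offset from the text direction in one place. The sketch below is a suggestion, not part of GSAP or HyperFrames: entranceX and entranceVars are hypothetical helper names, and the default duration and ease simply reuse values from the entrance examples earlier in this skill.

```javascript
// Hypothetical helper: starting x offset for an entrance tween.
// LTR content enters from the left (negative x);
// RTL content enters from the right (positive x).
function entranceX(distance, direction) {
  const magnitude = Math.abs(distance);
  return direction === "rtl" ? magnitude : -magnitude;
}

// Builds the vars object to pass to tl.from(selector, vars, position).
// Defaults are assumptions for illustration; override them per element.
function entranceVars(distance, direction, overrides = {}) {
  return {
    x: entranceX(distance, direction),
    opacity: 0,
    duration: 0.6,
    ease: "power3.out",
    ...overrides,
  };
}

const hebrewTitle = entranceVars(80, "rtl"); // x: 80, enters from the right
const englishTitle = entranceVars(80, "ltr"); // x: -80, enters from the left
console.log(hebrewTitle.x, englishTitle.x);

// Inside a composition, with GSAP loaded and a paused timeline `tl`:
//   tl.from("#title-he", entranceVars(80, "rtl"), 0.3);
```

Keeping the direction as an explicit argument, rather than reading it from the DOM, keeps timeline construction synchronous and deterministic.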
| Source | URL | What to Check |
|---|---|---|
| HyperFrames GitHub | https://github.com/heygen-com/hyperframes | Upstream repo, issues, releases |
| HyperFrames docs | https://hyperframes.heygen.com/quickstart | CLI, Node 22+, FFmpeg requirement |
| Compiler font logic | https://github.com/heygen-com/hyperframes/blob/main/packages/producer/src/services/deterministicFonts.ts | Canonical font list, Google Fonts fallback, cache path |
| Kokoro TTS voices | https://github.com/heygen-com/hyperframes/blob/main/skills/hyperframes/references/narration.md | Kokoro voices across 8 languages (no Hebrew) |
| Whisper model guide | https://github.com/heygen-com/hyperframes/blob/main/skills/hyperframes/references/transcript-guide.md | .en vs multilingual models, --language flag |
| Google Fonts Hebrew | https://fonts.google.com/?subset=hebrew | Heebo, Rubik, Assistant, Alef, Frank Ruhl Libre, Noto Sans Hebrew |
| Unicode bidi spec | https://developer.mozilla.org/en-US/docs/Web/CSS/unicode-bidi | isolate, <bdi>, mixed-direction text |