skills/kinetic-video-creator/SKILL.md
---
name: kinetic-video-creator
description: "Create professional kinetic typography videos from scratch. Includes speech writing, TTS with emotional dynamics, music generation, and animated text. Use for: promo videos, explainers, social content, inspirational speeches, product launches."
argument-hint: [topic] [tone: inspirational/dramatic/energetic/calm]
enhancedBy:
  - speech-generator: "TTS with aviz's cloned voice - optimized for Hebrew"
  - transcribe: "Word-level timing for animation syn
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add aviz85/claude-skills-library skills/kinetic-video-creator
Create stunning kinetic typography videos with AI-generated speech, music, and dynamic animations.
- /speech-generator skill for TTS
- /transcribe skill for word timing
- /music-generator skill for background
- /youtube-uploader skill (optional)

Hebrew (Recommended for aviz's voice):
English:
aviz's cloned voice is optimized for Hebrew. Use these Hebrew directions:
| Direction | Effect |
|-----------|--------|
| [נשימה עמוקה] | Deep breath, pause |
| [בהתלהבות] | Enthusiastic |
| [ברצינות] | Serious tone |
| [בעצב] | Sad, emotional |
| [בשקט] | Quiet, intimate |
| [מהר] | Fast pace |
| [לאט ובבירור] | Slow and clear |
| [שאלה] | Question tone |
| [הפתעה] | Surprise |
| [צחוק קל] | Light laugh |
| [בחום] | Warm tone |
| [בכוח] | Powerful, emphatic |
- אממ... - hesitation
- אהה... - thinking
- כאילו... - like
- נו... - well
- יאללה... - come on
- בקיצור... - in short
- ... - pause

[נשימה עמוקה] יש רגע...
[לאט ובבירור] רגע שהכל משתנה.
[ברצינות] אבא שלי חלה בפוליו כשהיה תינוק.
כל חייו הוא היה על כיסא גלגלים.
[בהתלהבות] אבל אבא שלי? הוא היה ספורטאי מצטיין!
[בחום] הוא תמיד האמין... שאפשר להגשים כל חלום.
[בעצב] כשהייתי בן חמש עשרה... אבא נפטר.
[בכוח] והכאב הזה? הפך למשימה שלי.
[בחום] לעזור לאנשים אחרים להגשים את החלומות שלהם.
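To reuse a script like the one above for on-screen text, the bracketed directions need to be separated from the spoken words. The following is a minimal sketch, assuming each direction appears as a single `[...]` prefix at the start of a line. `ScriptLine` and `parseScriptLine` are hypothetical names for illustration and are not part of the speech-generator skill.

```typescript
// Hypothetical helper (not part of any skill): split a leading "[direction]"
// tag from the text that is actually spoken.
interface ScriptLine {
  direction: string | null; // text inside the brackets, or null if there is none
  text: string;             // the spoken words
}

function parseScriptLine(line: string): ScriptLine {
  // One bracketed tag at the start of the line, then the rest of the line.
  const match = line.match(/^\s*\[([^\]]+)\]\s*(.*)$/);
  if (match === null) {
    return { direction: null, text: line.trim() };
  }
  return { direction: match[1].trim(), text: match[2].trim() };
}

const withTag = parseScriptLine("[בכוח] והכאב הזה? הפך למשימה שלי.");
const withoutTag = parseScriptLine("כל חייו הוא היה על כיסא גלגלים.");

console.log(withTag.direction);    // the direction without brackets
console.log(withTag.text);         // the spoken sentence only
console.log(withoutTag.direction); // null
```

The same pattern applies to the English directions, since the regular expression only looks at the brackets and not at the language inside them.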
| Direction | Effect |
|-----------|--------|
| [pause] | Brief pause |
| [long pause] | Extended pause |
| [slowly] | Slower delivery |
| [faster] | Quickened pace |
| [whisper] | Softer, intimate |
| [emphatic] | Strong emphasis |
| [building] | Increasing intensity |
| [warm] | Friendly tone |
| [dramatic] | Theatrical |
| [matter-of-fact] | Conversational |
[HOOK - 5-10 seconds]
[dramatic pause] Opening line that grabs attention.
[slowly, with weight] The provocative statement.
[BUILD - 20-40 seconds]
[building intensity] Establish the context.
[pause for effect] Key insight moment.
[PEAK - 20-30 seconds]
[powerful, emphatic] The main message.
[pause] Let it land.
[RESOLVE - 15-25 seconds]
[warm, inspiring] Paint the vision.
[final beat] Memorable closing.
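The section ranges in this structure imply an overall speech length. As a quick check, the sketch below adds up the minimum and maximum of each section; the numbers are copied from the structure above, and `totalRange` is a hypothetical helper name.

```typescript
// Section time ranges in seconds, copied from the script structure.
interface SectionRange {
  name: string;
  min: number;
  max: number;
}

const sectionRanges: SectionRange[] = [
  { name: "HOOK", min: 5, max: 10 },
  { name: "BUILD", min: 20, max: 40 },
  { name: "PEAK", min: 20, max: 30 },
  { name: "RESOLVE", min: 15, max: 25 },
];

// Hypothetical helper: sum the lower and upper bounds separately.
function totalRange(parts: SectionRange[]): { min: number; max: number } {
  return parts.reduce(
    (acc, p) => ({ min: acc.min + p.min, max: acc.max + p.max }),
    { min: 0, max: 0 }
  );
}

const overall = totalRange(sectionRanges);
console.log(`${overall.min}-${overall.max} seconds`); // 60-105 seconds
```

A script that follows the structure therefore runs between one minute and one minute 45 seconds.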
Use the speech-generator skill:
/speech-generator [path/to/script.txt] -o [path/to/speech.mp3]
Or invoke directly:
cd ~/.claude/skills/speech-generator/scripts
npx ts-node generate_speech.ts -f script.txt -o speech.mp3
Important: The speech-generator uses aviz's cloned voice, which works best with Hebrew text and Hebrew emotional directions.
Use the transcribe skill:
/transcribe [path/to/speech.mp3] --json
Or invoke directly:
cd ~/.claude/skills/transcribe/scripts
npx ts-node transcribe.ts -i speech.mp3 -o transcript.srt --json
Output: transcript_transcript.json with word-level timing data.
Use the music-generator skill:
/music-generator [composition.json] -o background_music.mp3
{
  "duration_ms": 75000,
  "instrumental": true,
  "positive_global_styles": ["cinematic", "inspirational"],
  "negative_global_styles": ["aggressive", "chaotic"],
  "sections": [
    {
      "section_name": "Hook - Mysterious",
      "duration_ms": 12000,
      "positive_local_styles": ["suspenseful", "soft"],
      "negative_local_styles": ["loud"],
      "lines": []
    },
    {
      "section_name": "Build - Rising",
      "duration_ms": 25000,
      "positive_local_styles": ["hopeful", "building"],
      "negative_local_styles": ["slow"],
      "lines": []
    },
    {
      "section_name": "Peak - Triumphant",
      "duration_ms": 20000,
      "positive_local_styles": ["triumphant", "uplifting"],
      "negative_local_styles": ["quiet"],
      "lines": []
    }
  ]
}
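Before sending a composition to the music generator, it may help to compare the section durations against the top-level `duration_ms`. Whether the music-generator skill requires them to match is not stated here, so treat this as an optional sanity check. The sketch models only the two fields it needs, and `unallocatedMs` is a hypothetical helper name.

```typescript
// Only the fields needed for the check; the real composition has more.
interface CompositionSection {
  section_name: string;
  duration_ms: number;
}

interface Composition {
  duration_ms: number;
  sections: CompositionSection[];
}

// Hypothetical helper: time in the total that no section accounts for.
// A negative result means the sections run longer than the total.
function unallocatedMs(composition: Composition): number {
  const used = composition.sections.reduce((sum, s) => sum + s.duration_ms, 0);
  return composition.duration_ms - used;
}

// Values copied from the composition example.
const exampleComposition: Composition = {
  duration_ms: 75000,
  sections: [
    { section_name: "Hook - Mysterious", duration_ms: 12000 },
    { section_name: "Build - Rising", duration_ms: 25000 },
    { section_name: "Peak - Triumphant", duration_ms: 20000 },
  ],
};

console.log(unallocatedMs(exampleComposition)); // 18000
```

The three sections in the example cover 57000 ms of the 75000 ms total, which leaves 18000 ms unassigned.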
ffmpeg -y \
-i speech.mp3 \
-i background_music.mp3 \
-filter_complex "[0:a]volume=1.0[speech];[1:a]volume=0.15[music];[speech][music]amix=inputs=2:duration=first[out]" \
-map "[out]" -c:a libmp3lame -q:a 2 \
final_audio.mp3
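The `volume` values in the filter are linear gain factors, so `0.15` is easier to reason about once converted to decibels. The sketch below does that conversion in both directions; `gainToDb` and `dbToGain` are hypothetical helper names, and the choice of 0.15 comes from the command above rather than from any recommendation made here.

```typescript
// Convert a linear amplitude factor (as used by volume=0.15) to decibels.
function gainToDb(gain: number): number {
  // Math.log(x) / Math.LN10 is log10(x), written this way for older targets.
  return 20 * (Math.log(gain) / Math.LN10);
}

// Convert decibels back to a linear amplitude factor.
function dbToGain(db: number): number {
  return Math.pow(10, db / 20);
}

console.log(gainToDb(0.15).toFixed(1)); // -16.5
console.log(gainToDb(1.0));             // 0
console.log(dbToGain(-20).toFixed(2));  // 0.10
```

In other words, the music sits roughly 16.5 dB below the speech before mixing. If the music still competes with the voice, lowering the factor toward 0.10 (about -20 dB) is one option to try.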
cd /Users/aviz/remotion-assistant
Recommended: Use SequenceComposition for maximum impact. It displays one word at a time with full-screen typography.
import React from 'react';
import { SequenceComposition } from '../templates/SequenceComposition';
import transcriptData from '../../projects/[project]/transcript_transcript.json';

// Drop empty tokens and keep only the fields the composition needs.
const WORD_TIMINGS = transcriptData.words
  .filter((w) => w.word.trim() !== '')
  .map((w) => ({
    word: w.word,
    start: w.start,
    end: w.end,
  }));

// One word at a time, synced to the merged audio track.
export const MyVideo: React.FC = () => {
  return (
    <SequenceComposition
      wordTimings={WORD_TIMINGS}
      audioFile="[project]/final_audio.mp3"
      baseFontSize={200}
      dustEnabled={true}
      lightBeamsEnabled={true}
      centerGlowEnabled={true}
      glowIntensity={1}
      anticipationFrames={5}
      colorSchemeStart={0}
    />
  );
};
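Remotion works in frames, while the transcript stores word timings as `start` and `end` values. The sketch below shows one way the conversion could work. It assumes the timings are in seconds and the video runs at 30 fps, and it guesses that `anticipationFrames` means "show the word this many frames early". How `SequenceComposition` actually handles this is not shown in this document, and `toFrames` is a hypothetical helper name.

```typescript
// Shape taken from the WORD_TIMINGS mapping; the unit (seconds) is an assumption.
interface WordTiming {
  word: string;
  start: number;
  end: number;
}

interface WordFrames {
  word: string;
  from: number;             // first frame on which the word is visible
  durationInFrames: number; // always at least 1
}

// Hypothetical helper: convert second-based timings to frame ranges.
function toFrames(
  timings: WordTiming[],
  fps: number,
  anticipationFrames: number
): WordFrames[] {
  return timings.map((t) => {
    // Start early by anticipationFrames, but never before frame 0.
    const from = Math.max(0, Math.round(t.start * fps) - anticipationFrames);
    // Guarantee at least one visible frame even for very short words.
    const to = Math.max(from + 1, Math.round(t.end * fps));
    return { word: t.word, from: from, durationInFrames: to - from };
  });
}

const frames = toFrames([{ word: "hello", start: 0.5, end: 1.0 }], 30, 5);
console.log(frames[0].from);             // 10
console.log(frames[0].durationInFrames); // 20
```

A word that starts at 0.5 s lands on frame 15 at 30 fps, and the 5-frame lead moves it to frame 10.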
Use MultiWordComposition for faster-paced content with multiple words on screen:
import { MultiWordComposition } from '../templates/MultiWordComposition';
For Hebrew text, use Heebo font:
import { loadFont } from '@remotion/google-fonts/Heebo';
const { fontFamily } = loadFont('normal', {
  weights: ['400', '600', '700', '900'],
  subsets: ['hebrew', 'latin'],
});
Add RTL styling:
style={{
  direction: 'rtl',
  fontFamily,
}}
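When a project mixes Hebrew and English scripts, the direction can be chosen per text instead of being hard-coded. This is a minimal sketch that treats any character in the Unicode Hebrew block (U+0590 to U+05FF) as a signal for RTL; `textDirection` is a hypothetical helper and not part of the templates.

```typescript
// Hypothetical helper: pick a CSS direction based on the presence of Hebrew letters.
function textDirection(text: string): "rtl" | "ltr" {
  // U+0590 to U+05FF is the Unicode Hebrew block.
  return /[\u0590-\u05FF]/.test(text) ? "rtl" : "ltr";
}

console.log(textDirection("שלום")); // rtl
console.log(textDirection("hello")); // ltr
```

The result can be passed to the `direction` style property in place of the literal `'rtl'`.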
cd /Users/aviz/remotion-assistant
npx remotion render CompositionName output.mp4
Use the youtube-uploader skill:
/youtube-uploader [video.mp4] --title "Title" --description "Description"
remotion-assistant/
├── public/[project]/
│   └── final_audio.mp3              # Audio for Remotion
├── projects/[project]/
│   ├── speech.txt                   # Script
│   ├── speech.mp3                   # TTS output
│   ├── transcript_transcript.json   # Word timings
│   ├── music_composition.json
│   ├── background_music.mp3
│   ├── final_audio.mp3              # Merged audio
│   └── output.mp4                   # Final video
└── src/compositions/
    └── [ProjectName].tsx            # Composition
| Step | Skill/Command |
|------|---------------|
| Speech | /speech-generator script.txt -o speech.mp3 |
| Transcribe | /transcribe speech.mp3 --json |
| Music | /music-generator composition.json |
| Merge | ffmpeg (see above) |
| Render | npx remotion render Name output.mp4 |
| Upload | /youtube-uploader output.mp4 |