skills/acestep-simplemv/SKILL.md
Render music videos from audio files and lyrics using Remotion. Accepts audio + LRC/JSON lyrics + title to produce MP4 videos with waveform visualization and synced lyrics display. Use when users mention MV generation, music video rendering, creating video from audio/lyrics, or visualizing songs.
npx skillsauth add ace-step/ace-step-skills acestep-simplemv
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Render music videos with waveform visualization and synced lyrics from audio + lyrics input.
scripts/ directory within this skill
Before first use, check and install dependencies:
# 1. Check Node.js
node --version
# 2. Install npm dependencies
cd {project_root}/{.claude or .codex}/skills/acestep-simplemv/scripts && npm install
# 3. Check ffprobe
ffprobe -version
If ffprobe is not available, install ffmpeg (which includes ffprobe):
- Windows: choco install ffmpeg, or download from https://ffmpeg.org/download.html and add to PATH
- macOS: brew install ffmpeg
- Linux: sudo apt-get install ffmpeg (Debian/Ubuntu) or sudo dnf install ffmpeg (Fedora)

cd {project_root}/{.claude or .codex}/skills/acestep-simplemv/
./scripts/render-mv.sh --audio /path/to/song.mp3 --lyrics /path/to/song.lrc --title "Song Title"
Output: MP4 file at out/<audio_basename>.mp4 (or custom --output path).
./scripts/render-mv.sh --audio <file> --lyrics <lrc_file> --title "Title" [options]
Options:
--audio Audio file path (absolute paths supported)
--lyrics LRC format lyrics file (timestamped)
--lyrics-json JSON lyrics file [{start, end, text}] (alternative to --lyrics)
--title Video title (default: "Music Video")
--subtitle Subtitle text
--credit Bottom credit text
--offset Lyric timing offset in seconds (default: -0.5)
--output Output file path (default: out/<audio_basename>.mp4)
--codec h264|h265|vp8|vp9 (default: h264)
--background Background image file path (if omitted, uses animated gradient)
--browser Custom browser executable path (Chrome/Edge/Chromium)
--max-size Max output file size in MB (e.g. 24). Auto-compresses if exceeded.
Use for IM platforms (WhatsApp≤16MB, Discord≤25MB, Telegram≤50MB)
Environment variables:
BROWSER_EXECUTABLE Path to browser executable (overrides auto-detection)
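The two lyric inputs carry the same information in different shapes. As a rough sketch (not the skill's actual parser), the function below converts LRC text into the [{start, end, text}] shape that --lyrics-json accepts. The name lrcToJson, the rule that each line ends where the next one begins, the tailSeconds default, and the way the offset is applied are all assumptions made for illustration.

```javascript
// Hypothetical helper: converts LRC text to [{start, end, text}].
// Assumptions: a line ends when the next one starts, the last line
// lasts `tailSeconds`, and the offset is simply added to every timestamp.
// The skill's own parser may behave differently.
function lrcToJson(lrcText, offsetSeconds = -0.5, tailSeconds = 5) {
  const stamp = /\[(\d+):(\d+(?:\.\d+)?)\]/g;
  const entries = [];
  for (const line of lrcText.split(/\r?\n/)) {
    const matches = [...line.matchAll(stamp)];
    if (matches.length === 0) continue; // skips metadata tags such as [ti:...]
    const text = line.replace(stamp, "").trim();
    if (text === "") continue; // skips timestamps with no lyric text
    for (const m of matches) {
      const seconds = Number(m[1]) * 60 + Number(m[2]);
      // Clamp at zero so a negative offset cannot produce a negative start.
      entries.push({ start: Math.max(0, seconds + offsetSeconds), text });
    }
  }
  entries.sort((a, b) => a.start - b.start);
  return entries.map((entry, i) => ({
    start: entry.start,
    end: i + 1 < entries.length ? entries[i + 1].start : entry.start + tailSeconds,
    text: entry.text,
  }));
}

// Usage: two timestamped lines plus a metadata tag that is ignored.
console.log(lrcToJson("[ti:Demo]\n[00:01.50]first\n[00:04.00]second"));
```

With the default offset of -0.5, the line stamped 00:01.50 starts at 1 second in this sketch.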
Remotion requires a Chromium-based browser for rendering. The script auto-detects browsers in this priority order:
1. BROWSER_EXECUTABLE environment variable
2. --browser CLI argument
3. chrome-headless-shell (downloaded by Remotion)
4. Regular Chrome, Edge, or Chromium installs (each run with --chrome-mode=chrome-for-testing)

Important: Newer versions of Chrome and Edge removed the old headless mode. When using regular Chrome/Edge/Chromium, the script automatically sets --chrome-mode=chrome-for-testing (which uses --headless=new). When using chrome-headless-shell, it uses the default headless-shell mode (which uses --headless=old). The script handles this transparently.
If no browser is found, Remotion will attempt to download chrome-headless-shell from Google servers. This will fail if Google servers are inaccessible from your network.
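The priority order described above can be sketched as a small pure function. This is an illustration only, not the script's code: resolveBrowser, chromeModeFor, the cachedShell and systemBrowsers inputs, and the choice to skip an explicit path that does not exist are all hypothetical. The two mode names match the values of Remotion's --chrome-mode option.

```javascript
// Sketch of the documented priority order. `exists` is an injected
// predicate (for example fs.existsSync) so the logic stays testable.
// `cachedShell` and `systemBrowsers` are hypothetical inputs; real paths vary by OS.
function resolveBrowser({ env = {}, cliBrowser, cachedShell, systemBrowsers = [] }, exists) {
  // 1 and 2: explicit choices, environment variable first, then CLI argument.
  const explicit = [env.BROWSER_EXECUTABLE, cliBrowser].find((p) => p && exists(p));
  if (explicit) {
    return { executable: explicit, chromeMode: chromeModeFor(explicit) };
  }
  // 3: a chrome-headless-shell binary that Remotion downloaded earlier.
  if (cachedShell && exists(cachedShell)) {
    return { executable: cachedShell, chromeMode: "headless-shell" };
  }
  // 4: a regular Chrome/Edge/Chromium install needs the new headless mode.
  const installed = systemBrowsers.find((p) => exists(p));
  if (installed) {
    return { executable: installed, chromeMode: "chrome-for-testing" };
  }
  return null; // nothing found: Remotion would try to download chrome-headless-shell
}

// Assumption: a headless-shell binary can be recognized by its file name.
function chromeModeFor(executablePath) {
  return /headless[-_]shell/i.test(executablePath) ? "headless-shell" : "chrome-for-testing";
}
```

Passing the existence check in as a parameter keeps the ordering logic separate from the filesystem, which makes the fallback to null easy to exercise.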
Since Edge is pre-installed on Windows 10/11, it should be auto-detected without any manual configuration. The script automatically detects Chrome/Edge and uses the correct headless mode. If auto-detection fails:
# Option 1: Set environment variable
export BROWSER_EXECUTABLE="/path/to/msedge.exe"
# Option 2: Pass as CLI argument
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --browser "/path/to/msedge.exe"
# Option 3: Enable proxy and let Remotion download chrome-headless-shell
# Basic render
./scripts/render-mv.sh --audio /tmp/abc123_1.mp3 --lyrics /tmp/abc123.lrc --title "夜桜"
# Custom output path
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "My Song" --output /tmp/my_mv.mp4
# With subtitle and credit
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --subtitle "Artist Name" --credit "Generated by ACE-Step"
# With background image
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --background /path/to/cover.jpg
# Compress for Discord upload (max 25MB)
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --max-size 24
# Compress for WhatsApp (max 16MB)
./scripts/render-mv.sh --audio song.mp3 --lyrics song.lrc --title "Song" --max-size 15
When sending an MV to a chat platform, use --max-size to auto-compress it:
| Platform | Limit | Recommended --max-size |
|----------|-------|--------------------------|
| WhatsApp | 16MB | 15 |
| Discord (free) | 25MB | 24 |
| Telegram | 50MB | 48 |
| Slack (free) | 1GB | - |
The compression uses ffmpeg two-pass encoding to achieve the best quality within the size constraint.
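A size-capped two-pass encode starts from a bitrate budget: the size limit divided by the duration, minus whatever the audio track consumes. The sketch below shows that arithmetic only. The function name, the 128 kbps audio default, and the 1 MB = 1024 * 1024 bytes convention are assumptions, and the script's actual calculation is not documented here.

```javascript
// Back-of-the-envelope bitrate budget for a size-capped two-pass encode.
// Assumptions: audio is re-encoded at `audioKbps`, 1 MB = 1024 * 1024 bytes,
// and container overhead is ignored.
function videoBitrateKbps(maxSizeMb, durationSeconds, audioKbps = 128) {
  if (durationSeconds <= 0) {
    throw new RangeError("durationSeconds must be positive");
  }
  const totalKbps = (maxSizeMb * 1024 * 1024 * 8) / durationSeconds / 1000;
  const videoKbps = Math.floor(totalKbps - audioKbps);
  if (videoKbps <= 0) {
    throw new RangeError("size limit is too small for this duration");
  }
  return videoKbps;
}

// Usage: a 3-minute song that must fit in 24 MB.
console.log(videoBitrateKbps(24, 180)); // 990
```

A figure like this would typically be passed as the video bitrate for both passes (ffmpeg's -b:v together with -pass 1 and -pass 2). Leaving a margin below the platform limit, as the recommended values in the table do, absorbs container overhead that this estimate ignores.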
When running in containers (e.g. OpenClaw), CJK fonts may not be pre-installed, causing lyrics to render as □ boxes. The script automatically:
- Checks for installed CJK fonts (fc-list)
- Installs fonts-noto-cjk (Debian/Ubuntu), font-noto-cjk (Alpine), or google-noto-sans-cjk-fonts (Fedora/RHEL)

If auto-install doesn't work, manually install fonts before rendering:
# Debian/Ubuntu
apt-get install -y fonts-noto-cjk
# Alpine
apk add font-noto-cjk
# Fedora/RHEL
dnf install -y google-noto-sans-cjk-fonts
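Before rendering in a container, it can help to know whether the lyrics need CJK fonts at all. The check below is a minimal sketch using Unicode script properties. The function name containsCjk and the choice of scripts (Han, Hiragana, Katakana, Hangul) are assumptions, and the skill itself may decide differently.

```javascript
// Hypothetical pre-check: true if the text contains Chinese, Japanese,
// or Korean characters and therefore needs a CJK-capable font.
function containsCjk(text) {
  return /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Hangul}]/u.test(text);
}

// Usage: the Japanese title from the examples needs CJK fonts, the English one does not.
console.log(containsCjk("夜桜"));        // true
console.log(containsCjk("Song Title")); // false
```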
IMPORTANT: Name the output after the audio file's job ID so that renders do not overwrite each other. Do NOT use arbitrary names like --output my_song.mp4. Either keep the default naming (derived from the audio filename) or pass an --output path built from the job ID.
Default output uses the audio filename as base:
- Audio: acestep_output/{job_id}_1.mp3
- Lyrics: acestep_output/{job_id}_1.lrc
- MV: --output acestep_output/{job_id}.mp4 (use the job ID from the audio file)

Example: if audio is chatcmpl-abc123_1.mp3, pass --output acestep_output/chatcmpl-abc123.mp4
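The naming rule can be applied mechanically. The helper below is a sketch that derives the --output value from an audio path. The name mvOutputPath is hypothetical, and it assumes audio files always follow the {job_id}_{n}.{ext} pattern shown in the example.

```javascript
// Hypothetical helper: derives the MV output path from an ACE-Step audio path.
// Assumption: audio files are named {job_id}_{n}.{ext}; a name without the
// numeric suffix is used unchanged as the job ID.
function mvOutputPath(audioPath, outputDir = "acestep_output") {
  const fileName = audioPath.split(/[\\/]/).pop();    // strip any directory part
  const baseName = fileName.replace(/\.[^.]+$/, "");  // drop the extension
  const jobId = baseName.replace(/_\d+$/, "");        // drop the trailing _1, _2, ...
  return `${outputDir}/${jobId}.mp4`;
}

// Usage: matches the example in the text.
console.log(mvOutputPath("acestep_output/chatcmpl-abc123_1.mp3"));
// acestep_output/chatcmpl-abc123.mp4
```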
- Keep --title short and single-line (max ~50 chars, auto-truncated)
- Use --subtitle for additional info
- Do not put line breaks in --title

Good: --title "Open Source" --subtitle "ACE-Step v1.5"
Bad: --title "Open Source - ACE-Step v1.5\nCelebrating Music AI"
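A caller can enforce the title guidance before invoking the script rather than relying on auto-truncation. The function below is a sketch under stated assumptions: normalizeTitle is a hypothetical name, the limit is taken as exactly 50 characters, and overflow is cut with an ellipsis, which may not match how the script truncates.

```javascript
// Hypothetical pre-check for --title. Assumptions: the limit is 50 characters
// and overflow is cut with an ellipsis; the script's own truncation may differ.
function normalizeTitle(title, maxChars = 50) {
  const singleLine = title
    .replace(/\\n|\r?\n/g, " ") // literal "\n" sequences and real line breaks
    .replace(/\s+/g, " ")
    .trim();
  const chars = Array.from(singleLine); // count code points, not UTF-16 units
  if (chars.length <= maxChars) return singleLine;
  return chars.slice(0, maxChars - 1).join("") + "…";
}

// Usage: the "Bad" title above collapses to a single line.
console.log(normalizeTitle("Open Source - ACE-Step v1.5\\nCelebrating Music AI"));
// Open Source - ACE-Step v1.5 Celebrating Music AI
```

Collapsing a title this way keeps it on one line, but moving the extra text into --subtitle, as in the "Good" example, is usually the better result.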
public/ by render.mjs