/SKILL.md
Use when the user gives a topic and wants an automated video podcast created, or asks to learn visual design patterns from a reference video/image. Produces 4K video via research → script → TTS → Remotion → MP4 + BGM.
npx skillsauth add agents365-ai/video-podcast-maker video-podcast-makerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
REQUIRED: Load Remotion Best Practices First
This skill depends on
remotion-best-practices. You MUST invoke it before proceeding:Invoke the skill/tool named: remotion-best-practices
Open your coding agent and say: "Make a video podcast about $ARGUMENTS"
Or invoke directly: /video-podcast-maker AI Agent tutorial
Extract visual design patterns from reference videos or images and apply them to new video compositions. Skip this section unless the user provides a reference video/image or asks to save/list/delete style profiles.
→ See references/design-learning.md for commands, reference-library management, style-profile management, and integration with Pre-workflow / Step 9.
Agent behavior: Check for updates at most once per day (throttled by timestamp file).
Before any shell command that reads files from this skill, resolve SKILL_DIR to the directory containing SKILL.md.
If your agent exposes a built-in skill directory variable such as ${CLAUDE_SKILL_DIR}, you may map it to SKILL_DIR.
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
STAMP="${SKILL_DIR}/.last_update_check"
NOW=$(date +%s)
LAST=$(cat "$STAMP" 2>/dev/null || echo 0)
if [ ! -d "${SKILL_DIR}/.git" ]; then
echo "MANUAL_INSTALL"
elif [ $((NOW - LAST)) -gt 86400 ]; then
timeout 5 git -C "${SKILL_DIR}" fetch --quiet 2>/dev/null || true
LOCAL=$(git -C "${SKILL_DIR}" rev-parse HEAD 2>/dev/null)
REMOTE=$(git -C "${SKILL_DIR}" rev-parse origin/main 2>/dev/null)
echo "$NOW" > "$STAMP"
if [ -n "$LOCAL" ] && [ -n "$REMOTE" ] && [ "$LOCAL" != "$REMOTE" ]; then
echo "UPDATE_AVAILABLE"
else
echo "UP_TO_DATE"
fi
else
echo "SKIPPED_RECENT_CHECK"
fi
git -C "${SKILL_DIR}" pull. No → continue..git directory — skill was installed via tarball/zip/cp): Continue silently. Auto-update is disabled; the user must reinstall manually to update.Agent behavior: Run the pre-flight check below before Step 1. SKILL_DIR must already be resolved (see Auto Update Check section above for resolution rules).
SKILL_DIR="${SKILL_DIR:-${CLAUDE_SKILL_DIR}}"
python3 "${SKILL_DIR}/scripts/check_prereqs.py"
If MISSING reported, see README.md for full setup instructions (install commands, API key setup, Remotion project init). The check is backend-aware: backend is resolved as TTS_BACKEND env var → user_prefs.json (global.tts.backend) → edge default, then only env vars required by that backend are validated.
Automated pipeline for professional Bilibili horizontal knowledge videos from a topic.
Target: Bilibili horizontal video (16:9)
- Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
- Style: Clean white (default)
Tech stack: Coding agent + TTS backend + Remotion + FFmpeg
| Parameter | Horizontal (16:9) | Vertical (9:16) | |-----------|-------------------|-----------------| | Resolution | 3840×2160 (4K) | 2160×3840 (4K) | | Frame rate | 30 fps | 30 fps | | Encoding | H.264, 16Mbps | H.264, 16Mbps | | Audio | AAC, 192kbps | AAC, 192kbps | | Duration | 1-15 min | 60-90s (highlight) |
Agent behavior: Detect user intent at workflow start:
Full pipeline with sensible defaults. Mandatory stop at Step 9:
| Step | Decision | Auto Default | |------|----------|-------------| | 3 | Title position | top-center | | 5 | Media assets | Skip (text-only animations) | | 7 | Thumbnail method | Remotion-generated (16:9 + 4:3) | | 9 | Outro animation | Pre-made MP4 (white/black by theme) | | 9 | Preview method | Remotion Studio (mandatory) | | 12 | Subtitles | Skip | | 14 | Cleanup | Auto-clean temp files |
Users can override any default in their initial request:
Prompts at each decision point. Activated by:
Hard constraints for video production. Visual design remains the agent's creative freedom within these rules:
| Rule | Requirement |
|------|-------------|
| Single Project | All videos under videos/{name}/ in user's Remotion project. NEVER create a new project per video. |
| 4K Output | 3840×2160, use scale(2) wrapper over 1920×1080 design space |
| Content Width | ≥85% of screen width |
| Bottom Safe Zone | Bottom 100px reserved for subtitles |
| Audio Sync | All animations driven by timing.json timestamps |
| Thumbnail | MUST generate 16:9 (1920×1080) AND 4:3 (1200×900). Centered layout, title ≥120px, icons ≥120px, fill most of canvas. See design-guide.md. |
| Font | PingFang SC / Noto Sans SC for Chinese text |
| Studio Before Render | MUST launch remotion studio for user review. NEVER render 4K until user explicitly confirms ("render 4K", "render final"). |
Load these files on demand — do NOT load all at once:
timing.json format.All scripts in ${SKILL_DIR}/scripts/ are reachable through one hierarchical entry point:
python3 ${SKILL_DIR}/scripts/cli.py --help # list resources
python3 ${SKILL_DIR}/scripts/cli.py <resource> --help # list actions
python3 ${SKILL_DIR}/scripts/cli.py <resource> <action> --help # forwards to underlying script
python3 ${SKILL_DIR}/scripts/cli.py schema [<method>] # JSON parameter schema
Routes (all forward to the existing scripts; direct invocations keep working):
tts run|validate, verify, audit beats, shorts gen, design list|show|delete|add, prereqs, prefs get|migrate|backend|bgm-path, schema [<method>].
The dispatcher is optional — every snippet in this file uses the direct script path for back-compat. Agents that prefer one entry point with discoverable help should prefer cli.py.
Before launching Studio for Step 9 review:
python3 ${SKILL_DIR}/scripts/audit_beat_sync.py <Video.tsx> <timing.json>
Prints a beat-vs-narration alignment table and flags drift > 1.5s. Especially important for kinetic-typography style where each beat must hit a specific narration moment.
After Step 13, before declaring the video done:
python3 ${SKILL_DIR}/scripts/verify_output.py videos/{name}/
Auto-fixes common omissions (creates final_video.mp4 if missing; disable with --no-fix), validates resolution/codec/duration, checks audio-timing drift, sanity-checks publish_info.md. Exit 0 = green light to publish; exit 2 = warnings only, still publishable. For machine-readable output add --format json (auto when piped). Full flag reference in references/workflow-publish.md.
project-root/ # Remotion project root
├── src/remotion/ # Remotion source
│ ├── compositions/ # Video composition definitions
│ ├── Root.tsx # Remotion entry
│ └── index.ts # Exports
│
├── public/ # Remotion default (unused — use --public-dir videos/{name}/)
│
├── videos/{video-name}/ # Video project assets
│ ├── topic_definition.md # Step 1
│ ├── topic_research.md # Step 2
│ ├── podcast.txt # Step 4: narration script
│ ├── podcast_audio.wav # Step 8: TTS audio
│ ├── podcast_audio.srt # Step 8: subtitles
│ ├── timing.json # Step 8: timeline
│ ├── thumbnail_*.png # Step 7
│ ├── output.mp4 # Step 10
│ ├── video_with_bgm.mp4 # Step 11
│ ├── final_video.mp4 # Step 12: final output
│ └── bgm.mp3 # Background music
│
└── remotion.config.ts
Important: Always use
--public-dirand full output path for Remotion render:npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/
Video name {video-name}: lowercase English, hyphen-separated (e.g., reference-manager-comparison)
Section name {section}: lowercase English, underscore-separated, matches [SECTION:xxx]
Thumbnail naming (16:9 AND 4:3 both required):
| Type | 16:9 | 4:3 |
|------|------|-----|
| Remotion | thumbnail_remotion_16x9.png | thumbnail_remotion_4x3.png |
| AI | thumbnail_ai_16x9.png | thumbnail_ai_4x3.png |
Use --public-dir videos/{name}/ for all Remotion commands. Each video's assets (timing.json, podcast_audio.wav, bgm.mp3) stay in its own directory — no copying to public/ needed. This enables parallel renders of different videos.
# All render/studio/still commands use --public-dir
npx remotion studio src/remotion/index.ts --public-dir videos/{name}/
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --public-dir videos/{name}/ --video-bitrate 16M
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail.png --public-dir videos/{name}/
At Step 1 start, use your agent's task tracker (Claude Code TaskCreate / Codex todo list / equivalent) to create one task per step. Mark in_progress on start, completed on finish. Files in videos/{name}/ (e.g. podcast.txt, timing.json, output.mp4) act as the durable record of what completed — if a session is interrupted, inspect the directory to determine where to resume.
1. Define topic direction → topic_definition.md
2. Research topic → topic_research.md
3. Design video sections (5-7 chapters)
4. Write narration script → podcast.txt
4.5. Pronunciation pre-flight (zh-CN only) → videos/{name}/phonemes.json
5. Collect media assets → media_manifest.json
6. Generate publish info (Part 1) → publish_info.md
7. Generate thumbnails (16:9 + 4:3) → thumbnail_*.png
8. Generate TTS audio → podcast_audio.wav, timing.json
9. Create Remotion composition + Studio preview (mandatory stop)
10. Render 4K video (only on user request) → output.mp4
11. Mix background music → video_with_bgm.mp4
12. Add subtitles (optional) → final_video.mp4
13. Complete publish info (Part 2) → chapter timestamps
14. Verify output & cleanup
15. Generate vertical shorts (optional) → shorts/
After Step 8 (TTS):
podcast_audio.wav exists and plays correctlytiming.json has all sections with correct timestampspodcast_audio.srt encoding is UTF-8After Step 10 (Render):
output.mp4 resolution is 3840x2160See CLAUDE.md for the full command reference (TTS, Remotion, FFmpeg, shorts generation).
Skill learns and applies preferences automatically. See references/troubleshooting.md for commands and learning details.
| File | Purpose |
|------|---------|
| user_prefs.json | Learned preferences (auto-created from template) |
| user_prefs.template.json | Default values |
| prefs_schema.json | JSON schema definition |
Final = merge(Root.tsx defaults < global < topic_patterns[type] < current instructions)
| Command | Effect | |---------|--------| | "show preferences" | Show current preferences | | "reset preferences" | Reset to defaults | | "save as X default" | Save to topic_patterns |
Full reference: Read references/troubleshooting.md on errors, preference questions, or BGM options.
development
Use when the user gives a topic and wants an automated topic-driven narrated explainer, podcast, or knowledge-summary video (Bilibili / YouTube / Xiaohongshu / Douyin / WeChat Channels), or asks to learn visual design patterns from a reference video/image. Trigger when the user mentions creating a knowledge video, narrated explainer, video podcast, or talking-head topic video from a topic — even if they don't say "video podcast" explicitly. Do NOT trigger for generic video editing, trimming, format conversion, color grading, or non-narrative video tasks. Produces 4K video via research → script → TTS → Remotion → MP4 + BGM.
content-media
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).
content-media
QQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
content-media
Summarize or extract text/transcripts from URLs, podcasts, and local files (great fallback for “transcribe this YouTube/video”).