tools/video/happyhorse/SKILL.md
Generate and edit videos with Alibaba HappyHorse 1.0 models via inference.sh CLI. Models: HappyHorse T2V, I2V, R2V, Video Edit. Capabilities: text-to-video, image-to-video, reference-to-video, video editing with natural language, character preservation, 720P/1080P, up to 15 seconds. Use for: physically realistic video, video editing, character-consistent content, product demos, social media. Triggers: happyhorse, happy horse, alibaba video, happyhorse 1.0, dashscope video, alibaba happyhorse, video editing ai, ai video editor
npx skillsauth add inference-sh-0/skills happyhorse
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Install the belt CLI skill:
npx skills add belt-sh/cli
Generate and edit physically realistic videos with Alibaba's HappyHorse 1.0 models via inference.sh CLI.
Requires the inference.sh CLI (belt). Install instructions
belt login
belt app run alibaba/happyhorse-1-0-t2v --input '{"prompt": "a horse galloping across a sunlit meadow"}'
| Model | App ID | Best For |
|-------|--------|----------|
| T2V | alibaba/happyhorse-1-0-t2v | Text-to-video, physically realistic motion |
| I2V | alibaba/happyhorse-1-0-i2v | Animate a single image |
| R2V | alibaba/happyhorse-1-0-r2v | Preserve characters from up to 9 reference images |
| Video Edit | alibaba/happyhorse-1-0-video-edit | Edit existing videos with natural language |
All models support 720P/1080P resolution, up to 15 seconds duration.
belt app run alibaba/happyhorse-1-0-t2v --input '{
"prompt": "a golden retriever running through autumn leaves in a park, slow motion",
"duration": 10,
"resolution": "1080P",
"ratio": "16:9"
}'
Animate a still image:
belt app run alibaba/happyhorse-1-0-i2v --input '{
"first_frame": "https://your-image.jpg",
"prompt": "gentle camera zoom, clouds moving in the sky",
"duration": 8,
"resolution": "720P"
}'
Generate videos that preserve characters from reference images (up to 9):
belt app run alibaba/happyhorse-1-0-r2v --input '{
"prompt": "a woman walking through a busy market street",
"reference_images": ["https://portrait.jpg"],
"duration": 10,
"resolution": "720P"
}'
belt app run alibaba/happyhorse-1-0-r2v --input '{
"prompt": "two friends sitting at a cafe having coffee",
"reference_images": ["https://person1.jpg", "https://person2.jpg"],
"ratio": "16:9"
}'
Edit existing videos using natural language instructions:
belt app run alibaba/happyhorse-1-0-video-edit --input '{
"video": "https://your-video.mp4",
"prompt": "change the background to a snowy mountain landscape"
}'
belt app run alibaba/happyhorse-1-0-video-edit --input '{
"video": "https://your-video.mp4",
"prompt": "replace the person with the character from the reference image",
"reference_images": ["https://character.jpg"]
}'
belt app run alibaba/happyhorse-1-0-video-edit --input '{
"video": "https://your-video.mp4",
"prompt": "make the scene look like a rainy day",
"audio_setting": "generate"
}'
| Resolution | Price |
|------------|-------|
| 720P | $0.14 per second |
| 1080P | $0.24 per second |
Video Edit is billed on input + output duration.
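As a rough worked example of the pricing above, the sketch below estimates cost from duration and resolution. The function names (rate_for, estimate_cost, estimate_edit_cost) are hypothetical helpers, not part of the belt CLI. It also assumes that Video Edit applies the same per-second rate to the sum of input and output seconds, which the pricing note implies but does not spell out.

```shell
#!/bin/sh
# Per-second rates copied from the pricing table.
rate_for() {
  case "$1" in
    720P)  echo 0.14 ;;
    1080P) echo 0.24 ;;
    *)     echo "unknown resolution: $1" >&2; return 1 ;;
  esac
}

# estimate_cost DURATION_SECONDS RESOLUTION
# Prints the estimated cost in USD with two decimals.
estimate_cost() {
  rate=$(rate_for "$2") || return 1
  awk -v s="$1" -v r="$rate" 'BEGIN { printf "%.2f\n", s * r }'
}

# estimate_edit_cost INPUT_SECONDS OUTPUT_SECONDS RESOLUTION
# Assumption: billed seconds = input + output, charged at one rate.
estimate_edit_cost() {
  estimate_cost "$(( $1 + $2 ))" "$3"
}

estimate_cost 10 1080P        # prints 2.40
estimate_edit_cost 8 8 720P   # prints 2.24
```

The second call assumes the edited output is as long as the 8-second input; actual billed durations may differ.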
T2V parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| prompt | string | required | Text description of the video |
| duration | integer | 5 | Duration in seconds (3–15) |
| resolution | enum | 720P | 720P or 1080P |
| ratio | enum | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4, 21:9 |
| seed | integer | random | Reproducible generation |
| watermark | boolean | false | Add HappyHorse watermark |
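The ranges in this table can be checked locally before a request is sent. The sketch below builds a T2V payload and rejects values outside the listed ranges. The t2v_input function is a hypothetical helper, not a belt command, and its escaping only covers backslashes and double quotes in the prompt, so prompts containing newlines or other control characters would need more care.

```shell
#!/bin/sh
# t2v_input PROMPT DURATION RESOLUTION RATIO SEED
# Prints a JSON payload for alibaba/happyhorse-1-0-t2v, or returns 1 if a
# value falls outside the ranges listed in the parameter table.
t2v_input() {
  prompt=$1 duration=$2 resolution=$3 ratio=$4 seed=$5

  case "$duration" in
    ''|*[!0-9]*) echo "duration must be an integer" >&2; return 1 ;;
  esac
  if [ "$duration" -lt 3 ] || [ "$duration" -gt 15 ]; then
    echo "duration must be between 3 and 15 seconds" >&2; return 1
  fi
  case "$resolution" in
    720P|1080P) ;;
    *) echo "resolution must be 720P or 1080P" >&2; return 1 ;;
  esac
  case "$ratio" in
    16:9|9:16|1:1|4:3|3:4|21:9) ;;
    *) echo "unsupported ratio: $ratio" >&2; return 1 ;;
  esac
  case "$seed" in
    ''|*[!0-9]*) echo "seed must be a non-negative integer" >&2; return 1 ;;
  esac

  # Escape backslashes and double quotes so the prompt stays valid JSON.
  escaped=$(printf '%s' "$prompt" | sed 's/\\/\\\\/g; s/"/\\"/g')

  printf '{"prompt": "%s", "duration": %s, "resolution": "%s", "ratio": "%s", "seed": %s}\n' \
    "$escaped" "$duration" "$resolution" "$ratio" "$seed"
}

payload=$(t2v_input "a horse galloping across a sunlit meadow" 10 1080P 16:9 42) &&
  printf '%s\n' "$payload"
# To submit the payload:
#   belt app run alibaba/happyhorse-1-0-t2v --input "$payload"
```

Passing the same seed with the same prompt and settings is what the table describes as reproducible generation; how closely two runs match in practice is not stated in the source.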
I2V parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| first_frame | file | required | First frame image (JPEG, PNG, WebP) |
| prompt | string | - | Optional text description |
| duration | integer | 5 | Duration in seconds (3–15) |
| resolution | enum | 720P | 720P or 1080P |
| seed | integer | random | Reproducible generation |
R2V parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| prompt | string | required | Text description of the scene |
| reference_images | array | required | Up to 9 character reference images |
| duration | integer | 5 | Duration in seconds (3–15) |
| resolution | enum | 720P | 720P or 1080P |
| ratio | enum | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4, 21:9 |
| seed | integer | random | Reproducible generation |
Video Edit parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| video | file | required | Video to edit (MP4/MOV, H.264) |
| prompt | string | required | Editing instruction |
| reference_images | array | - | Up to 5 reference images |
| audio_setting | enum | auto | auto, generate, or keep_original |
| resolution | enum | 720P | 720P or 1080P |
| seed | integer | random | Reproducible generation |
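The two reference image limits differ (up to 9 for R2V, up to 5 for Video Edit), so a small guard can catch an oversized list before a request is made. The sketch below is a hypothetical helper, ref_images_json, that formats URLs as a JSON array and enforces a caller-supplied maximum. It assumes the URLs contain no double quotes or backslashes, and it requires at least one URL, so for Video Edit (where the field is optional) it would simply not be called when there are no references.

```shell
#!/bin/sh
# ref_images_json MAX URL...
# Prints the URLs as a JSON array. Returns 1 if no URL is given or if the
# number of URLs exceeds MAX.
ref_images_json() {
  max=$1; shift
  if [ "$#" -lt 1 ] || [ "$#" -gt "$max" ]; then
    echo "expected 1 to $max reference images, got $#" >&2
    return 1
  fi
  out=""
  for url in "$@"; do
    # Append a comma separator only after the first element.
    out="$out${out:+, }\"$url\""
  done
  printf '[%s]\n' "$out"
}

ref_images_json 9 "https://person1.jpg" "https://person2.jpg"   # R2V limit
ref_images_json 5 "https://character.jpg"                       # Video Edit limit
```

The printed array can be dropped into the reference_images field of the payloads shown in the R2V and Video Edit examples.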
belt app store search "happyhorse"
# Full platform skill (all 250+ apps)
npx skills add inference-sh/skills@infsh-cli
# All video generation models
npx skills add inference-sh/skills@ai-video-generation
# Seedance 2.0
npx skills add inference-sh/skills@seedance
# Google Veo
npx skills add inference-sh/skills@google-veo
# Image generation (for image-to-video)
npx skills add inference-sh/skills@ai-image-generation
Browse all video apps: belt app store --category video