tools/video/p-video-avatar/SKILL.md
Generate talking head avatar videos with Pruna P-Video-Avatar via inference.sh CLI. Turn a portrait image into a realistic speaking video with built-in TTS. 18x faster and 6x cheaper than competitors. Models: P-Video-Avatar, P-Image (for portrait generation). Capabilities: text-to-avatar, audio-driven avatars, 30 voices, 10 languages, 720p/1080p, built-in TTS, dynamic backgrounds, full-body control. Use for: AI presenters, product demos, explainer videos, virtual influencers, marketing, education, multilingual content, UGC, gaming avatars. Triggers: avatar video, talking head, ai avatar, p-video-avatar, pruna avatar, video avatar, ai presenter, digital human, virtual presenter, lipsync, talking avatar, ai spokesperson, heygen alternative, synthesia alternative, veed alternative, fabric alternative, omnihuman alternative
npx skillsauth add inference-sh-8/skills p-video-avatarInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Install the belt CLI skill:
npx skills add belt-sh/cli
Generate talking head avatar videos from a single portrait image via inference.sh CLI.
P-Video-Avatar is the fastest and most cost-effective avatar video model available. Quality on par with Veo 3.0, 18x faster and 6x cheaper than alternatives like Fabric, OmniHuman, and HeyGen.
Requires inference.sh CLI (
belt). Install instructions
belt login
# Generate avatar from portrait + text script
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"voice_script": "Hello, welcome to our product demo!",
"voice": "Zephyr (Female)"
}'
Use Pruna P-Image to generate the portrait, then P-Video-Avatar to animate it:
# 1. Generate a portrait image with P-Image
belt app run pruna/p-image --input '{
"prompt": "professional headshot portrait of a young woman, neutral background, looking at camera, studio lighting, photorealistic",
"aspect_ratio": "9:16"
}'
# 2. Use the generated image URL to create the avatar video
belt app run pruna/p-video-avatar --input '{
"image": "<image-url-from-step-1>",
"voice_script": "Hi there! Let me walk you through our latest features.",
"voice": "Zephyr (Female)",
"resolution": "720p"
}'
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"voice_script": "Welcome to our product walkthrough. Today I will show you three key features.",
"voice": "Puck (Male)",
"voice_language": "English (US)",
"resolution": "720p"
}'
Provide your own audio file instead of using built-in TTS:
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"audio": "https://speech.mp3"
}'
When both audio and voice_script are provided, audio takes priority.
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"voice_script": "This is exciting news for our community!",
"voice": "Aoede (Female)",
"voice_prompt": "Enthusiastic and energetic tone, slightly faster pace",
"video_prompt": "The person is presenting on stage with dramatic lighting",
"resolution": "1080p"
}'
# Spanish
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"voice_script": "Bienvenidos a nuestra demostración de producto.",
"voice": "Kore (Female)",
"voice_language": "Spanish"
}'
# Japanese
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"voice_script": "こんにちは、製品デモへようこそ。",
"voice": "Leda (Female)",
"voice_language": "Japanese"
}'
belt app run pruna/p-video-avatar --input '{
"image": "https://portrait.jpg",
"voice_script": "Consistent results every time.",
"seed": 42
}'
Female: Zephyr, Kore, Leda, Aoede, Callirrhoe, Autonoe, Despina, Erinome, Laomedeia, Achernar, Gacrux, Pulcherrima, Vindemiatrix, Sulafat
Male: Puck, Charon, Fenrir, Orus, Enceladus, Iapetus, Umbriel, Algenib, Algieba, Schedar, Achird, Zubenelgenubi, Sadachbia, Sadaltager, Alnilam, Rasalgethi
English (US), English (UK), Spanish, French, German, Italian, Portuguese (Brazil), Japanese, Korean, Hindi
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| image | file | required | Portrait image (jpg, jpeg, png, webp) |
| voice_script | string | - | Text for the person to say |
| audio | file | - | Audio file (overrides voice_script) |
| voice | string | "Zephyr (Female)" | Voice selection |
| voice_language | string | "English (US)" | Output language |
| resolution | string | "720p" | 720p or 1080p |
| video_prompt | string | "The person is talking." | Control avatar behavior and background |
| voice_prompt | string | "Say the following." | Control tone, pacing, emotion |
| seed | int | random | Reproducible generation |
| disable_safety_filter | bool | true | Disable content filter |
| disable_prompt_upsampling | bool | false | Skip prompt enhancement |
| Resolution | Price | |------------|-------| | 720p | $0.025 per second of output video | | 1080p | $0.045 per second of output video |
Example: 30-second 720p video = $0.75
P-Video-Avatar is completely free from Thursday May 1, 2026 4:00 PM CET through Sunday May 4, 2026 11:59 PM CET. All costs are on us during this window — no billing, no limits on resolution.
| Feature | P-Video-Avatar | Fabric 1.0 | OmniHuman 1.5 | HeyGen Avatar 4 | |---------|---------------|------------|---------------|-----------------| | Speed (per sec of video) | ~1.83s/s | ~34s/s (18x slower) | ~28s/s (15x slower) | ~26s/s (14x slower) | | Cost per second | $0.025 | $0.14 (5.6x more) | $0.16 (6.4x more) | $0.075 (3x more) | | Built-in TTS | Yes | Yes | No | Yes | | Dynamic Background | Yes | Yes | No | Yes | | 1080p Support | Yes | No | No | Yes |
video_prompt to control dynamic backgrounds and body languagevoice_prompt to control speaking style, emotion, and pacingpruna/p-image using aspect ratio 9:16 for vertical avatar videos# Generate portrait images
belt app run pruna/p-image --input '{"prompt": "professional headshot portrait"}'
# General video generation
belt app run pruna/p-video --input '{"prompt": "cinematic scene"}'
# Image editing
belt app run pruna/p-image-edit --input '{"prompt": "change background", "image": "https://photo.jpg"}'
# Full platform skill (all 250+ apps)
npx skills add inference-sh/skills@infsh-cli
# Pruna video generation
npx skills add inference-sh/skills@p-video
# Pruna image generation
npx skills add inference-sh/skills@p-image
# All video generation models
npx skills add inference-sh/skills@ai-video-generation
# Image generation (for creating portraits)
npx skills add inference-sh/skills@ai-image-generation
Browse all Pruna apps: belt app store --search "pruna"
data-ai
Generate multi-person talking head podcast videos from scratch using AI — character creation, TTS, avatar animation, and video stitching. Use when the user wants to create a podcast, talking head video, or multi-speaker conversation video.
development
Declarative UI widgets from JSON for React/Next.js from ui.inference.sh. Render rich interactive UIs from structured agent responses. Capabilities: forms, buttons, cards, layouts, inputs, selects, checkboxes. Use for: agent-generated UIs, dynamic forms, data display, interactive cards. Triggers: widgets, declarative ui, json ui, widget renderer, agent widgets, dynamic ui, form widgets, card widgets, shadcn widgets, structured output ui
tools
Tool lifecycle UI components for React/Next.js from ui.inference.sh. Display tool calls: pending, progress, approval required, results. Capabilities: tool status, progress indicators, approval flows, results display. Use for: showing agent tool calls, human-in-the-loop approvals, tool output. Triggers: tool ui, tool calls, tool status, tool approval, tool results, agent tools, mcp tools ui, function calling ui, tool lifecycle, tool pending
development
Chat UI building blocks for React/Next.js from ui.inference.sh. Components: container, messages, input, typing indicators, avatars. Capabilities: chat interfaces, message lists, input handling, streaming. Use for: building custom chat UIs, messaging interfaces, AI assistants. Triggers: chat ui, chat component, message list, chat input, shadcn chat, react chat, chat interface, messaging ui, conversation ui, chat building blocks