happycapy-ai/capy-video-gen-skill

Name: capy-video-gen-skill
Author: happycapy-ai

skills/capy-video-gen-skill/SKILL.md

npx skillsauth add happycapy-ai/happycapy-skills capy-video-gen-skill

Capy Video Gen Skill - Script-to-Video Pipeline

Generate complete multi-shot videos from scripts or ideas with consistent character faces across all scenes. Built for the HappyCapy AI Gateway. Validated in 300 experiments, with a 70% improvement in face distance.

Overview

ViMax converts text scripts into full videos through an automated pipeline:

Extract characters from script with detailed physical features
Generate front/side/back character portraits
Design shot-by-shot storyboard
Decompose each shot into first_frame, last_frame, and motion descriptions
Build camera tree for shot relationships
Generate frames with reference image selection (face identity as top priority)
Generate video clips from frames
Concatenate into final video
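
The source does not say how the final concatenation step is implemented. As a rough sketch only, one common approach is ffmpeg's concat demuxer, which reads a list file of clips and joins them without re-encoding. The helper names `write_concat_list` and `concat_command` are hypothetical, and stream copy (`-c copy`) assumes all clips share the same codec and resolution.

```python
from pathlib import Path


def write_concat_list(clips, list_path):
    """Write an ffmpeg concat-demuxer list file, one clip per line."""
    lines = []
    for clip in clips:
        # Single quotes inside a quoted path must be written as '\''.
        escaped = str(Path(clip).resolve()).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def concat_command(list_path, output_path):
    """Build the ffmpeg argument list that joins the listed clips."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",  # -safe 0 allows absolute paths in the list
        "-i", str(list_path),
        "-c", "copy",                  # Stream copy: no re-encoding
        str(output_path),
    ]
```

Passing the result of `concat_command(...)` to `subprocess.run(..., check=True)` would perform the join, provided ffmpeg is installed.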

Installation Location

The ViMax pipeline code is at: /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax/

All commands must be run from this directory using the venv:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax

Prerequisites

AI_GATEWAY_API_KEY environment variable (auto-configured in HappyCapy)
Python venv at .venv/ (already set up)

Quick Start

Script-to-Video

Edit the script, requirements, and style in the entry script (main_happycapy_script2video.py), then run:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax
.venv/bin/python main_happycapy_script2video.py

Idea-to-Video

To generate from a brief idea (a script is generated automatically first):

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax
.venv/bin/python main_happycapy_idea2video.py

Programmatic Usage

import asyncio
from langchain.chat_models import init_chat_model
from tools.render_backend import RenderBackend
from utils.config_loader import load_config
from pipelines.script2video_pipeline import Script2VideoPipeline

config = load_config("configs/happycapy_script2video.yaml")
chat_model = init_chat_model(**config["chat_model"]["init_args"])
backend = RenderBackend.from_config(config)

pipeline = Script2VideoPipeline(
    chat_model=chat_model,
    image_generator=backend.image_generator,
    video_generator=backend.video_generator,
    working_dir=config["working_dir"],
)

# Run the pipeline
asyncio.run(pipeline(
    script="Your script here...",
    user_requirement="No more than 8 shots total.",
    style="Cinematic, warm lighting"
))

Pipelines

Script2VideoPipeline

Input: A formatted screenplay/script with character dialogue and scene descriptions
Output: Concatenated video at {working_dir}/final_video.mp4
Config: configs/happycapy_script2video.yaml

Idea2VideoPipeline

Input: A brief idea/concept (1-3 paragraphs)
Output: Auto-generates a script, then produces video
Config: configs/happycapy_idea2video.yaml

Configuration

The HappyCapy config for the script-to-video pipeline is at configs/happycapy_script2video.yaml:

chat_model:
  init_args:
    model: gpt-4.1
    model_provider: openai
    api_key: ${AI_GATEWAY_API_KEY}
    base_url: https://ai-gateway.happycapy.ai/api/v1/openai/v1

image_generator:
  class_path: tools.ImageGeneratorHappyCapyAPI
  init_args:
    api_key: ${AI_GATEWAY_API_KEY}
    model: google/gemini-3.1-flash-image-preview

video_generator:
  class_path: tools.VideoGeneratorHappyCapyAPI
  init_args:
    api_key: ${AI_GATEWAY_API_KEY}
    model: google/veo-3.1-generate-preview

working_dir: .working_dir/script2video
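
The config above refers to the API key through a `${AI_GATEWAY_API_KEY}` placeholder. How `load_config` resolves it is not shown in the source. The following is a minimal sketch of one way such placeholders could be expanded from the environment; `expand_env` is a hypothetical helper, not part of ViMax.

```python
import os
import re

# Matches ${NAME} placeholders such as ${AI_GATEWAY_API_KEY}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${NAME} placeholders in strings, dicts, and lists.

    Hypothetical helper for illustration; raises KeyError if a variable is unset.
    """
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"Environment variable {name} is not set")
        return env[name]

    if isinstance(value, str):
        return _PLACEHOLDER.sub(substitute, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # Numbers, booleans, and None pass through unchanged


config = {
    "chat_model": {"init_args": {"model": "gpt-4.1", "api_key": "${AI_GATEWAY_API_KEY}"}},
    "working_dir": ".working_dir/script2video",
}
resolved = expand_env(config, env={"AI_GATEWAY_API_KEY": "example-key"})
print(resolved["chat_model"]["init_args"]["api_key"])  # example-key
```

Failing loudly on an unset variable is a deliberate choice here: a silently empty API key would otherwise surface much later as an authentication error.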

Key Components

Agents (AI Processing)

| Agent | File | Purpose |
|-------|------|---------|
| CharacterExtractor | agents/character_extractor.py | Extract characters with static/dynamic features from script |
| CharacterPortraitsGenerator | agents/character_portraits_generator.py | Generate front/side/back portraits for each character |
| StoryboardArtist | agents/storyboard_artist.py | Design shot-by-shot storyboard with first/last frames and motion |
| ReferenceImageSelector | agents/reference_image_selector.py | Select best reference images for each frame (face identity #1 priority) |
| CameraImageGenerator | agents/camera_image_generator.py | Build camera trees and generate transition videos |
| BestImageSelector | agents/best_image_selector.py | Select best generated image from candidates |
| Screenwriter | agents/screenwriter.py | Generate scripts from ideas |

Tools (Generation Backends)

| Tool | File | Purpose |
|------|------|---------|
| ImageGeneratorHappyCapyAPI | tools/image_generator_happycapy_api.py | Image generation via HappyCapy Gateway (Gemini) |
| VideoGeneratorHappyCapyAPI | tools/video_generator_happycapy_api.py | Video generation via HappyCapy Gateway (Veo) |
| RenderBackend | tools/render_backend.py | Factory for instantiating generators from config |

Interfaces (Data Models)

CharacterInScene - Character with identifier, static_features, dynamic_features
ShotDescription - Shot with ff_desc, lf_desc, motion_desc, variation_type
Camera - Camera with parent-child relationships
Frame - Frame with shot_idx, frame_type, visible characters
ImageOutput / VideoOutput - Generation outputs with save methods
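
To make the shape of these models concrete, here is a sketch using plain dataclasses. The class and field names `identifier`, `static_features`, `dynamic_features`, `ff_desc`, `lf_desc`, `motion_desc`, and `variation_type` come from the list above. The field types, the `Camera` fields `idx`, `parent_idx`, and `children`, and the helper `needs_last_frame` are assumptions made for illustration; the real definitions in ViMax may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CharacterInScene:
    identifier: str
    static_features: str   # Assumed: traits that stay fixed, such as face shape
    dynamic_features: str  # Assumed: traits that can change per scene, such as clothing


@dataclass
class ShotDescription:
    ff_desc: str         # First-frame description
    lf_desc: str         # Last-frame description
    motion_desc: str     # Motion between the two frames
    variation_type: str  # "medium" and "large" appear in the docs; other values are assumed


@dataclass
class Camera:
    idx: int                          # Hypothetical field name
    parent_idx: Optional[int] = None  # Assumed: None marks a root camera
    children: List[int] = field(default_factory=list)


def needs_last_frame(shot: ShotDescription) -> bool:
    """A last frame is only produced for medium or large variation shots."""
    return shot.variation_type in {"medium", "large"}


shot = ShotDescription(
    ff_desc="Alice stands at the door",
    lf_desc="Alice sits at the table",
    motion_desc="Alice walks across the room and sits down",
    variation_type="large",
)
print(needs_last_frame(shot))  # True
```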

Face Identity Consistency (CRITICAL)

This pipeline includes face identity improvements validated through 257 experiments. Face distance fell by 70%, from 0.74 to 0.22 (lower is better).

Built-In Protections

Reference Image Selector: Face identity is the #1 priority when selecting reference images. The front-view portrait is always included when a character's face is visible.
Character Portraits: Enhanced prompts generate identity-critical details (exact nose shape, eye spacing, jawline, distinguishing marks) for cross-scene recognition.
Video Prompt Face Lock: Every video generation prompt is prepended with a face identity instruction requiring the character's face to remain identical to the starting frame throughout the clip.

Best Practices When Using ViMax

Hyper-detailed character descriptions: Include ethnicity, age, hair texture/style/color, eye shape, facial hair, glasses, skin tone, build, and distinguishing marks in your script's character introductions
Extreme close-up shots: Include at least one extreme close-up per character to anchor identity
Consistent lighting: Specify similar lighting across scenes to prevent face drift
User-provided reference photos: Place photos in the working directory and pass them as character_portraits_registry to skip AI portrait generation

What Does NOT Work

Complex prompt engineering (viseme morphing, phoneme anchoring) does not improve face identity
Simple, direct prompts with detailed physical descriptions outperform clever prompts
Lip-sync to external audio is NOT possible (Veo generates its own internal audio)

See FACE_IDENTITY_GUIDE.md in the ViMax directory for full details.
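
The 70% figure quoted at the start of this section follows directly from the two face-distance values given there. A short check, where `relative_reduction` is just an illustrative helper name:

```python
def relative_reduction(before, after):
    """Fractional reduction when a value drops from `before` to `after`."""
    return (before - after) / before


print(f"{relative_reduction(0.74, 0.22):.1%}")  # 70.3%
```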

Output Structure

After a run, the working directory contains:

.working_dir/script2video/
  characters.json                      # Extracted characters
  character_portraits_registry.json    # Portrait paths registry
  character_portraits/                 # Generated portraits
    0_CharacterName/
      front.png
      side.png
      back.png
  storyboard.json                     # Shot descriptions
  camera_tree.json                    # Camera relationships
  shots/
    0/
      shot_description.json
      first_frame.png
      last_frame.png                  # Only for medium/large variation
      video.mp4
    1/
      ...
  final_video.mp4                     # Final concatenated output
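
Given the layout above, a run can be checked for missing outputs with a few path lookups. This is a sketch only: `summarize_run` is a hypothetical helper, not part of ViMax, and it relies solely on the file names listed in the tree.

```python
from pathlib import Path


def summarize_run(working_dir):
    """Report which expected outputs exist under a working directory."""
    root = Path(working_dir)
    shots_dir = root / "shots"
    shot_dirs = []
    if shots_dir.is_dir():
        # Shot folders are named by index (0, 1, 2, ...), so sort numerically.
        shot_dirs = sorted(
            (p for p in shots_dir.iterdir() if p.is_dir() and p.name.isdigit()),
            key=lambda p: int(p.name),
        )
    return {
        "has_characters": (root / "characters.json").is_file(),
        "has_storyboard": (root / "storyboard.json").is_file(),
        "has_final_video": (root / "final_video.mp4").is_file(),
        "shot_count": len(shot_dirs),
        # Shots whose clip was never written, useful when a run fails midway.
        "shots_missing_video": [
            int(p.name) for p in shot_dirs if not (p / "video.mp4").is_file()
        ],
    }


print(summarize_run(".working_dir/script2video"))
```

A non-empty `shots_missing_video` list points at the shots to inspect first when `final_video.mp4` is absent.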

Customization

Using Your Own Reference Photos

To use real photos instead of AI-generated portraits:

# Build a portrait registry pointing to your photos
character_portraits_registry = {
    "Alice": {
        "front": {"path": "/path/to/alice_front.png", "description": "Front view of Alice"},
        "side": {"path": "/path/to/alice_side.png", "description": "Side view of Alice"},
        "back": {"path": "/path/to/alice_back.png", "description": "Back view of Alice"},
    }
}

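# Pass the registry to the pipeline (skips portrait generation).
# `await` is only valid inside a coroutine, so a plain script wraps the call
# in asyncio.run(), as in Programmatic Usage.
asyncio.run(pipeline(
    script=script,
    user_requirement=user_requirement,
    style=style,
    character_portraits_registry=character_portraits_registry,
))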
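
When there are several characters, writing the registry by hand gets tedious. The sketch below builds the same structure from a folder of photos. The file naming scheme `<name>_<view>.png` (lowercase name) and the helper `build_portrait_registry` are assumptions for illustration; only the registry layout itself is taken from the example above.

```python
from pathlib import Path

VIEWS = ("front", "side", "back")


def build_portrait_registry(photo_dir, names):
    """Build a portrait registry from files named <name>_<view>.png.

    Raises FileNotFoundError if any of the three views is missing, so a
    gap is caught before the pipeline starts.
    """
    registry = {}
    for name in names:
        views = {}
        for view in VIEWS:
            path = Path(photo_dir) / f"{name.lower()}_{view}.png"
            if not path.is_file():
                raise FileNotFoundError(f"Missing {view} photo for {name}: {path}")
            views[view] = {
                "path": str(path),
                "description": f"{view.capitalize()} view of {name}",
            }
        registry[name] = views
    return registry
```

For example, `build_portrait_registry("/path/to/photos", ["Alice"])` would expect `alice_front.png`, `alice_side.png`, and `alice_back.png` in that folder.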

Changing Models

Edit the YAML config to use different models:

Image: google/gemini-3.1-flash-image-preview (recommended for face identity)
Video: google/veo-3.1-generate-preview (recommended) or openai/sora-2
Chat: gpt-4.1 (recommended) or any OpenAI-compatible model

Troubleshooting

"No module named 'tools'" or similar import errors

Run from the ViMax root directory:

cd /home/node/a0/workspace/527fb591-1439-4b5b-ad5d-90f972773f95/workspace/tmp/ViMax
.venv/bin/python main_happycapy_script2video.py

API rate limit errors

Reduce max_requests_per_minute in the YAML config.

Face identity drift in generated videos

Add more physical detail to character descriptions in your script
Use user-provided reference photos instead of AI-generated portraits
Include extreme close-up shots for important characters
Keep lighting consistent across scenes

Multi-shot AI video generation pipeline with face identity consistency. Converts scripts or ideas into complete videos using character extraction, storyboarding, frame generation, and video assembly. 300 experiments validated, 70% face distance improvement. Use when the user asks to create a video from a script, story, idea, or wants multi-shot video with consistent characters.

96 stars

testing

Updated Apr 25, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percent |
|---------|-------------|--------|---------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 25, 2026, 3:41 AM (124.0s, 61 files scanned)

SKILL.md

name: capy-video-gen-skill
description: Multi-shot AI video generation pipeline with face identity consistency. Converts scripts or ideas into complete videos using character extraction, storyboarding, frame generation, and video assembly. 300 experiments validated, 70% face distance improvement. Use when the user asks to create a video from a script, story, idea, or wants multi-shot video with consistent characters.
allowed-tools: Bash, Read, Write, Edit

Install manually

# Clone the repo
git clone https://github.com/happycapy-ai/happycapy-skills.git

# Copy into the Claude Code skills folder (global); create it first if it does not exist
mkdir -p ~/.claude/skills
cp -r happycapy-skills/skills/capy-video-gen-skill ~/.claude/skills/

Repository: happycapy-ai/happycapy-skills (96 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT