skills/video-translate/SKILL.md
Translate and dub existing videos into multiple languages using HeyGen. Use when: (1) Translating a video into another language, (2) Dubbing video content with lip-sync, (3) Creating multi-language versions of existing videos, (4) Audio-only translation without lip-sync, (5) Working with HeyGen's /v2/video_translate endpoint.
npx skillsauth add heygen-com/skills video-translate
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Translate and dub existing videos into multiple languages, preserving lip-sync and natural speech patterns. Provide a video URL or HeyGen video ID — no need to create the video on HeyGen first.
All requests require the X-Api-Key header. Set the HEYGEN_API_KEY environment variable.
curl -X POST "https://api.heygen.com/v2/video_translate" \
-H "X-Api-Key: $HEYGEN_API_KEY" \
-H "Content-Type: application/json" \
-d '{"video_url": "https://example.com/video.mp4", "output_language": "es-ES"}'
1. POST /v2/video_translate with the target language
2. GET /v2/video_translate/{translate_id} until status is completed

| Field | Type | Req | Description |
|-------|------|:---:|-------------|
| video_url | string | Y* | URL of video to translate (or video_id) |
| video_id | string | Y* | HeyGen video ID (or video_url) |
| output_language | string | Y | Target language code (e.g., "es-ES") |
| title | string | | Name for the translated video |
| translate_audio_only | boolean | | Audio only, no lip-sync (faster) |
| speaker_num | number | | Number of speakers in video |
| callback_id | string | | Custom ID for webhook tracking |
| callback_url | string | | URL for completion notification |
Either video_url or video_id must be provided.
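The either-or rule above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any HeyGen SDK: `validateTranslateRequest` and `TranslateSource` are hypothetical names, and the sketch only checks the two rules stated in the table (a source is present, and `output_language` is set). Whether the API rejects a request that supplies both `video_url` and `video_id` is not stated in the source, so that case is left alone.

```typescript
// Hypothetical client-side check; field names mirror the request table above.
interface TranslateSource {
  video_url?: string;
  video_id?: string;
  output_language?: string;
}

// Returns a list of problems; an empty list means the request looks sendable.
function validateTranslateRequest(req: TranslateSource): string[] {
  const problems: string[] = [];
  const hasUrl = typeof req.video_url === "string" && req.video_url.trim() !== "";
  const hasId = typeof req.video_id === "string" && req.video_id.trim() !== "";
  if (!hasUrl && !hasId) {
    problems.push("Provide either video_url or video_id.");
  }
  if (!req.output_language) {
    problems.push("output_language is required.");
  }
  return problems;
}

// Example: a request with no source yields one problem.
console.log(validateTranslateRequest({ output_language: "es-ES" }));
```

Returning a list rather than throwing on the first problem lets a caller report every issue at once.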
curl -X POST "https://api.heygen.com/v2/video_translate" \
-H "X-Api-Key: $HEYGEN_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"video_url": "https://example.com/original-video.mp4",
"output_language": "es-ES",
"title": "Spanish Version"
}'
interface VideoTranslateRequest {
video_url?: string;
video_id?: string;
output_language: string;
title?: string;
translate_audio_only?: boolean;
speaker_num?: number;
callback_id?: string;
callback_url?: string;
}
interface VideoTranslateResponse {
error: null | string;
data: {
video_translate_id: string;
};
}
async function translateVideo(config: VideoTranslateRequest): Promise<string> {
const response = await fetch("https://api.heygen.com/v2/video_translate", {
method: "POST",
headers: {
"X-Api-Key": process.env.HEYGEN_API_KEY!,
"Content-Type": "application/json",
},
body: JSON.stringify(config),
});
const json: VideoTranslateResponse = await response.json();
if (json.error) {
throw new Error(json.error);
}
return json.data.video_translate_id;
}
import os

import requests


def translate_video(config: dict) -> str:
    response = requests.post(
        "https://api.heygen.com/v2/video_translate",
        headers={
            "X-Api-Key": os.environ["HEYGEN_API_KEY"],
            "Content-Type": "application/json",
        },
        json=config,
    )
    data = response.json()
    if data.get("error"):
        raise Exception(data["error"])
    return data["data"]["video_translate_id"]
| Language | Code | Notes |
|----------|------|-------|
| English (US) | en-US | Default source |
| Spanish (Spain) | es-ES | European Spanish |
| Spanish (Mexico) | es-MX | Latin American |
| French | fr-FR | Standard French |
| German | de-DE | Standard German |
| Italian | it-IT | Standard Italian |
| Portuguese (Brazil) | pt-BR | Brazilian Portuguese |
| Japanese | ja-JP | Standard Japanese |
| Korean | ko-KR | Standard Korean |
| Chinese (Mandarin) | zh-CN | Simplified Chinese |
| Hindi | hi-IN | Standard Hindi |
| Arabic | ar-SA | Modern Standard Arabic |
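The codes in the table follow a `language-REGION` pattern, so a small helper can catch typos such as `es_mx` before a job is submitted. This is a sketch under assumptions: `KNOWN_CODES` is copied from the table and may not be the API's full list, and `normalizeLanguageCode` is a hypothetical helper name.

```typescript
// Codes copied from the table above; the API may accept others not listed here.
const KNOWN_CODES = [
  "en-US", "es-ES", "es-MX", "fr-FR", "de-DE", "it-IT",
  "pt-BR", "ja-JP", "ko-KR", "zh-CN", "hi-IN", "ar-SA",
] as const;

type KnownCode = (typeof KNOWN_CODES)[number];

// Normalizes separator and casing ("es_mx" -> "es-MX") and returns the code
// only if it appears in the table; otherwise returns null.
function normalizeLanguageCode(input: string): KnownCode | null {
  const match = /^([a-zA-Z]{2})[-_]([a-zA-Z]{2})$/.exec(input.trim());
  if (!match) return null;
  const candidate = `${match[1].toLowerCase()}-${match[2].toUpperCase()}`;
  return (KNOWN_CODES as readonly string[]).includes(candidate)
    ? (candidate as KnownCode)
    : null;
}

normalizeLanguageCode("es_mx");   // returns "es-MX"
normalizeLanguageCode("spanish"); // returns null
```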
const config = {
video_url: "https://example.com/original.mp4",
output_language: "es-ES",
title: "Spanish Translation",
};
const config = {
video_url: "https://example.com/original.mp4",
output_language: "es-ES",
translate_audio_only: true,
};
const config = {
video_url: "https://example.com/interview.mp4",
output_language: "fr-FR",
speaker_num: 2,
};
For more control over translation:
interface VideoTranslateV4Request {
input_video_id?: string;
google_url?: string;
output_languages: string[]; // Multiple languages in one call
name: string;
srt_key?: string; // Custom SRT subtitles
instruction?: string;
vocabulary?: string[]; // Terms to preserve as-is
brand_voice_id?: string;
speaker_num?: number;
keep_the_same_format?: boolean;
input_language?: string;
enable_video_stretching?: boolean;
disable_music_track?: boolean;
enable_speech_enhancement?: boolean;
srt_role?: "input" | "output";
translate_audio_only?: boolean;
}
const config = {
input_video_id: "original_video_id",
output_languages: ["es-ES", "fr-FR", "de-DE"],
name: "Multi-language translations",
};
const config = {
video_url: "https://example.com/product-demo.mp4",
output_language: "ja-JP",
vocabulary: ["SuperWidget", "Pro Max", "TechCorp"],
};
const config = {
video_url: "https://example.com/video.mp4",
output_language: "es-ES",
srt_key: "path/to/custom-subtitles.srt",
srt_role: "input",
};
curl -X GET "https://api.heygen.com/v2/video_translate/{translate_id}" \
-H "X-Api-Key: $HEYGEN_API_KEY"
interface TranslateStatusResponse {
error: null | string;
data: {
id: string;
status: "pending" | "processing" | "completed" | "failed";
video_url?: string;
message?: string;
};
}
async function getTranslateStatus(translateId: string): Promise<TranslateStatusResponse["data"]> {
const response = await fetch(
`https://api.heygen.com/v2/video_translate/${translateId}`,
{ headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! } }
);
const json: TranslateStatusResponse = await response.json();
if (json.error) {
throw new Error(json.error);
}
return json.data;
}
Translations take longer than standard video generation — allow up to 30 minutes.
async function waitForTranslation(
translateId: string,
maxWaitMs = 1800000,
pollIntervalMs = 30000
): Promise<string> {
const startTime = Date.now();
while (Date.now() - startTime < maxWaitMs) {
const status = await getTranslateStatus(translateId);
switch (status.status) {
case "completed":
return status.video_url!;
case "failed":
throw new Error(status.message || "Translation failed");
default:
console.log(`Status: ${status.status}...`);
await new Promise((r) => setTimeout(r, pollIntervalMs));
}
}
throw new Error("Translation timed out");
}
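The fixed 30-second interval above is simple and predictable. For long jobs, a growing interval sends fewer status requests. The sketch below is one possible variation, not HeyGen's recommended approach: `pollWithBackoff`, `PollOptions`, and the delay values are all assumptions chosen for illustration. The status fetcher and the sleep function are injected so the logic can be exercised without network calls or real waiting.

```typescript
// Shape of the status payload, matching TranslateStatusResponse["data"] above.
type JobStatus = {
  status: "pending" | "processing" | "completed" | "failed";
  video_url?: string;
  message?: string;
};

interface PollOptions {
  maxWaitMs?: number;      // overall budget (default: 30 minutes)
  initialDelayMs?: number; // first wait between polls (arbitrary default)
  maxDelayMs?: number;     // cap for the growing delay (arbitrary default)
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waits
}

async function pollWithBackoff(
  fetchStatus: () => Promise<JobStatus>,
  options: PollOptions = {}
): Promise<string> {
  const {
    maxWaitMs = 30 * 60 * 1000,
    initialDelayMs = 10_000,
    maxDelayMs = 60_000,
    sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
  } = options;

  // Tracks scheduled delay only; time spent inside fetchStatus is not counted.
  let waited = 0;
  let delay = initialDelayMs;
  while (waited < maxWaitMs) {
    const job = await fetchStatus();
    if (job.status === "completed") {
      if (!job.video_url) throw new Error("Completed without a video_url");
      return job.video_url;
    }
    if (job.status === "failed") {
      throw new Error(job.message || "Translation failed");
    }
    await sleep(delay);
    waited += delay;
    delay = Math.min(delay * 2, maxDelayMs); // double the wait, up to the cap
  }
  throw new Error("Translation timed out");
}
```

With the helpers defined earlier, a call could look like `pollWithBackoff(() => getTranslateStatus(translateId))`.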
async function translateAndDownload(
videoUrl: string,
targetLanguage: string
): Promise<string> {
console.log(`Starting translation to ${targetLanguage}...`);
const translateId = await translateVideo({
video_url: videoUrl,
output_language: targetLanguage,
});
console.log(`Translation ID: ${translateId}`);
console.log("Processing translation...");
const translatedVideoUrl = await waitForTranslation(translateId);
console.log(`Translation complete: ${translatedVideoUrl}`);
return translatedVideoUrl;
}
const spanishVideo = await translateAndDownload(
"https://example.com/my-video.mp4",
"es-ES"
);
Translate to multiple languages in parallel:
async function translateToMultipleLanguages(
sourceVideoUrl: string,
targetLanguages: string[]
): Promise<Record<string, string>> {
const results: Record<string, string> = {};
const translatePromises = targetLanguages.map(async (lang) => {
const translateId = await translateVideo({
video_url: sourceVideoUrl,
output_language: lang,
});
return { lang, translateId };
});
const translationJobs = await Promise.all(translatePromises);
for (const job of translationJobs) {
try {
const videoUrl = await waitForTranslation(job.translateId);
results[job.lang] = videoUrl;
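} catch (error) {
// `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`.
const message = error instanceof Error ? error.message : String(error);
results[job.lang] = `error: ${message}`;
}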
}
return results;
}
const translations = await translateToMultipleLanguages(
"https://example.com/original.mp4",
["es-ES", "fr-FR", "de-DE", "ja-JP"]
);
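The function above stores error strings in the same map as video URLs, so a caller has to inspect each value to tell them apart. A hedged alternative is to return a tagged result per language and let `Promise.allSettled` collect successes and failures together. In this sketch, `translateAll` and `LangResult` are hypothetical names, and `translateOne` stands in for any submit-and-wait function such as `translateAndDownload` above.

```typescript
type LangResult =
  | { lang: string; ok: true; url: string }
  | { lang: string; ok: false; error: string };

// `translateOne` is a stand-in for submit + wait for a single language.
async function translateAll(
  languages: string[],
  translateOne: (lang: string) => Promise<string>
): Promise<LangResult[]> {
  // All jobs start and are awaited concurrently; one failure does not abort the rest.
  const settled = await Promise.allSettled(
    languages.map((lang) => translateOne(lang))
  );
  return settled.map((outcome, i): LangResult => {
    const lang = languages[i];
    if (outcome.status === "fulfilled") {
      return { lang, ok: true, url: outcome.value };
    }
    const reason: unknown = outcome.reason;
    return {
      lang,
      ok: false,
      error: reason instanceof Error ? reason.message : String(reason),
    };
  });
}
```

A call might look like `translateAll(["es-ES", "fr-FR"], (lang) => translateAndDownload(sourceUrl, lang))`, where `sourceUrl` is the caller's own variable. Results come back in the same order as the input languages.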
- disable_music_track: true
- enable_speech_enhancement: true

Common errors and how to handle them:
async function safeTranslate(
videoUrl: string,
targetLanguage: string
): Promise<{ success: boolean; result?: string; error?: string }> {
try {
const url = await translateAndDownload(videoUrl, targetLanguage);
return { success: true, result: url };
} catch (error) {
// `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`.
const message = error instanceof Error ? error.message : String(error);
if (message.includes("quota")) {
return { success: false, error: "Insufficient credits" };
}
if (message.includes("duration")) {
return { success: false, error: "Video too long" };
}
if (message.includes("format")) {
return { success: false, error: "Unsupported video format" };
}
return { success: false, error: message };
}
}
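The request helpers earlier call `response.json()` without looking at the HTTP status, so a non-JSON failure body would surface as a parse error rather than a useful message. The sketch below shows one way to handle that, assuming the `{ error, data }` envelope from the response interfaces above. `unwrapEnvelope` is a hypothetical helper, and the idea that some failures return non-JSON bodies (for example from a proxy) is an assumption, not something the source confirms.

```typescript
// Envelope shape taken from the response interfaces above.
interface ApiEnvelope<T> {
  error: null | string;
  data: T;
}

// Hypothetical helper: turns an HTTP status and raw body text into data,
// or throws an Error that includes the status code.
function unwrapEnvelope<T>(status: number, bodyText: string): T {
  let parsed: unknown;
  try {
    parsed = JSON.parse(bodyText);
  } catch {
    // Non-JSON bodies (assumed possible) get a readable message instead of a SyntaxError.
    throw new Error(`HTTP ${status}: response was not valid JSON`);
  }
  if (parsed === null || typeof parsed !== "object") {
    throw new Error(`HTTP ${status}: unexpected response shape`);
  }
  const envelope = parsed as ApiEnvelope<T>;
  if (envelope.error) {
    throw new Error(`HTTP ${status}: ${envelope.error}`);
  }
  if (status < 200 || status >= 300) {
    throw new Error(`HTTP ${status}: request failed without an error message`);
  }
  return envelope.data;
}
```

Inside a fetch-based helper this could be used as `unwrapEnvelope<{ video_translate_id: string }>(response.status, await response.text())`. Because the status code is part of the message, the substring checks in `safeTranslate` would still match.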