skills/youtube-publish/SKILL.md
End-to-end YouTube publishing workflow using ordered scripts: prepare/concat video, upload draft, transcribe with Parakeet, generate copy with the calling model, optionally prepare English dubbing assets, render thumbnails, update YouTube metadata, then schedule socials (PostFlow) 15 minutes after publish.
Use scripts in order. Stop for validation after copy + thumbnail generation. If the user did not already specify them, ask up front for:
Rule: the English X variant depends on the English YouTube dubbing pack. It is valid to do English dubbing for YouTube without doing the English X variant, but not the other way around.
- Avoid RIP, Increíble, Brutal, Locura, Definitivo, ¿El fin de...?, and crown/fire emojis.
- Express the schedule as `YYYY-MM-DD HH:MM` using system time and pass `--publish-at` + `--timezone`. Always determine and pass `--timezone`.
- The English description must include a `Chapters:` block with the same timestamps as `## Capítulos (final)`, translated naturally. Do not leave chapters only in the Spanish description.
- The default presenter is `antonio` (`assets/antonio-1.png`, `antonio-2.png`, `antonio-3.png`). If the user indicates the video is from Nino, switch presenter to `nino` (`assets/nino-1.png`, `nino-2.png`, `nino-3.png`). Keep only two non-negotiables: (1) massive bold white text (max 3-4 words), (2) cinematic dark look with cyan/magenta accents. Everything else should adapt to the video's narrative with maximum creative freedom.
- When the video features a tool, save its logo under `<workdir>/images/refs/`, and use it as a visual reference for thumbnail generation. If the logo cannot be found quickly, explicitly instruct the image model to include the tool's logo/app icon by name, while avoiding fake or distorted brand marks when accuracy matters.
- `imagegen` preserves presenter identity and brand marks much better when reference images are manually attached in the chat input. Before any thumbnail generation or edit that must preserve Antonio/Nino identity or a tool logo, open `<workdir>/images/refs/` for the user and explicitly ask them to attach the 3 presenter reference images and any relevant tool/logo reference in the message box. Do not call `imagegen` until the user confirms they have attached those images or explicitly asks to continue without them. Showing local Markdown images in chat is not enough for this gate.
- Use `imagegen` (`image_gen`) as the generation engine for the 3 Spanish thumbnail drafts.
Exception: use the `nano-banana-pro` skill for English thumbnail localization/editing from the selected Spanish thumbnail.
- For requested edits to a selected thumbnail, use `imagegen` for a localized edit with hard invariants: same person, same layout, same headline unless changing text, same crop, same lighting, and no reinterpretation.
- For the English thumbnail, use `nano-banana-pro` (`uv run /Users/antonio/Projects/antoniolg/agent-kit/skills/nano-banana-pro/scripts/generate_image.py --input-image ...`). Edit only the main headline text into English. Keep the composition, styling, and identity intact.
- For the English X variant, combine the video with `audio/dubbed.en.m4a`, then build the English native X video with the English-edited thumbnail.
- In the YouTube update, set the video to `unlisted`, insert promo comment (Domina la IA...), then set final status (`private` with `publishAt` if scheduled, otherwise `private`).
- `Programación (final)`: either a date `YYYY-MM-DD HH:MM` or `private`.
- `prepare_video.py` runs audio normalization in `auto` mode by default. It analyzes LUFS/true peak/LRA and only re-encodes audio when out of target.
- Before running `update_youtube.py`, ensure the final thumbnail is accepted by YouTube: JPG/PNG, 16:9, preferably 1280x720, and under 2 MB. If a generated PNG is too large, create a compressed JPG sibling and upload that.
- After scheduling, verify YouTube `privacyStatus` and `publishAt`, then verify PostFlow using `postflow schedule list --view posts` for the target date. Do not rely on guessed CLI commands.
- Before finishing, confirm these artifacts exist: `content.md`, `.state/video_id.txt`, `.state/video_url.txt`, `images/final/thumb.es.png`, `video/video-x.es.mp4`, `transcripts/transcript.es.cleaned.srt`, and, for multilingual runs, `transcripts/transcript.en.srt`, `audio/dubbed.en.m4a`, `images/final/thumb.en.png`. If any required artifact is missing, create it or explicitly report why it is missing.
- Save verification results under `.state/` whenever practical, for example `.state/youtube_verification.json` and `.state/postflow_verification.json`, so later checks do not depend only on chat history.
- Segment subtitles at transcription time (`--max-words 14 --max-duration 5.2 --silence-gap 0.35` by default), then keep the post-processing readability splitter as a safety net.
Prefer max ~2 lines, about 80-90 characters per cue, and avoid long 10+ second cues with full paragraphs. Keep the dubbing SRT separate because dubbing benefits from longer natural speech units.

`content.md` is the human/editorial source of truth for all copy in Spanish and English. Technical state for scripts may live outside it, but must be isolated under `.state/`. Media artifacts must be grouped by type:
- `.state/` for script state such as `video_id.txt`, `video_url.txt`, YouTube verification JSON, and PostFlow verification JSON.
- `transcripts/` for all SRT files.
- `images/refs/` for presenter/logo/reference images.
- `images/drafts/` for generated thumbnail drafts and their prompts.
- `images/final/` for accepted Spanish and English thumbnails.
- `audio/` for dubbed audio exports (`.wav`, `.m4a`).
- `video/` for derived videos such as dubbed previews and native X variants.

Do not treat `title.en.txt`, `description.en.txt`, `linkedin.final.txt`, or `description.final.es.txt` as canonical artifacts. Put those texts in `content.md`. If a downstream script requires a plain text file, create it as a temporary/generated helper under `.state/` or `tmp/`, and keep `content.md` as the canonical copy.

Prepare video
python scripts/prepare_video.py --videos /path/v1.mp4 [/path/v2.mp4 ...]
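The `auto` mode is described as analyzing LUFS, true peak, and LRA, and re-encoding only when the audio is out of target (-14 LUFS, -1 dBTP, max LRA 9). A minimal sketch of that decision follows. It is not the code in `prepare_video.py`: the `Loudness` type, the `needs_normalization` name, and in particular the 1 LU tolerance around the loudness target are assumptions made for illustration.

```python
from dataclasses import dataclass

# Targets quoted in this document for --audio-normalization auto.
TARGET_LUFS = -14.0
MAX_TRUE_PEAK_DBTP = -1.0
MAX_LRA = 9.0
# ASSUMPTION: the document does not say how close to -14 LUFS counts as
# "in range"; 1 LU is an illustrative tolerance only.
LUFS_TOLERANCE = 1.0


@dataclass
class Loudness:
    """Hypothetical container for measured loudness values."""
    integrated_lufs: float
    true_peak_dbtp: float
    lra: float


def needs_normalization(measured: Loudness, mode: str = "auto") -> bool:
    """Return True when the audio should be re-encoded for the given mode."""
    if mode == "off":
        return False  # skip analysis and normalization
    if mode == "always":
        return True  # force normalization
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    in_range = (
        abs(measured.integrated_lufs - TARGET_LUFS) <= LUFS_TOLERANCE
        and measured.true_peak_dbtp <= MAX_TRUE_PEAK_DBTP
        and measured.lra <= MAX_LRA
    )
    return not in_range
```

The three modes map directly onto the `auto`, `always`, and `off` values of `--audio-normalization`.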
- `--audio-normalization auto` targets -14 LUFS, -1 dBTP, and max LRA 9; it skips normalization when already in range.
- Use `--audio-normalization always` to force normalization.
- Use `--audio-normalization off` to skip analysis and normalization.
- Outputs: `workdir`, `video`, `slug`.

mkdir -p <workdir>/.state <workdir>/transcripts <workdir>/images/refs <workdir>/images/drafts <workdir>/images/final <workdir>/audio <workdir>/video <workdir>/tmp

- `prepare_video.py`. Derived videos should go under `<workdir>/video/`.

Prepare reference images and manual attachment gate

Copy the presenter reference images into `<workdir>/images/refs/`:
cp assets/antonio-1.png assets/antonio-2.png assets/antonio-3.png <workdir>/images/refs/
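- Use `nino-1.png`, `nino-2.png`, `nino-3.png` only when the user explicitly says the video is from Nino.
- Save any tool/logo reference under `<workdir>/images/refs/` with a clear name, for example `codex-logo.png`.
- Open the folder so the user can attach the reference images manually:

open <workdir>/images/refs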
Upload draft (private)
python scripts/upload_draft.py --video <video> --output-video-id <workdir>/.state/video_id.txt --client-secret <path>
Save the video ID to `.state/video_id.txt` and create `.state/video_url.txt`. Also record the video ID and URL in `content.md` under the YouTube section for human review.

Transcribe + clean
python scripts/transcribe_parakeet.py --video <video> --out-dir <workdir> --max-words 14 --max-duration 5.2 --silence-gap 0.35
Outputs:
- `<workdir>/transcripts/transcript.es.cleaned.srt`
- `<workdir>/transcripts/transcript.es.dub.srt` (same transcript resegmented into more natural dubbing units)
- `transcript.es.cleaned.srt` is the user-facing subtitle file. It must be resegmented into readable cues before upload or translation. Do not use the long Parakeet raw cues directly for YouTube subtitles.
- `transcript.es.dub.srt` is only for dubbing and may use longer speech units.
- If the SRT files were written elsewhere, move them into `<workdir>/transcripts/` before continuing and use the organized paths from then on.

cp <workdir>/transcripts/transcript.es.cleaned.srt ~/Documents/aipal/transcripts/<YYYY-MM-DD>-<slug>.srt
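The readability guidance for user-facing subtitles (max ~2 lines, about 80-90 characters per cue, no 10+ second cues) can be checked mechanically. The sketch below is a standalone illustration, not the splitter used by `transcribe_parakeet.py`. The `Cue` type and the function names are hypothetical, and the default limits simply restate the numbers from this document.

```python
import re
from dataclasses import dataclass

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


@dataclass
class Cue:
    index: int
    start: float
    end: float
    lines: list


def parse_srt(text: str) -> list:
    """Parse SRT text into cues; blocks without a timing row are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        rows = [row for row in block.splitlines() if row.strip()]
        if len(rows) < 2 or "-->" not in rows[1]:
            continue
        start, end = rows[1].split("-->")
        cues.append(Cue(int(rows[0].strip()), to_seconds(start), to_seconds(end), rows[2:]))
    return cues


def readability_issues(cue: Cue, max_lines: int = 2, max_chars: int = 90,
                       max_seconds: float = 10.0) -> list:
    """List the readability guidelines a cue violates (empty list if none)."""
    issues = []
    if len(cue.lines) > max_lines:
        issues.append(f"more than {max_lines} lines")
    if sum(len(line) for line in cue.lines) > max_chars:
        issues.append(f"more than {max_chars} characters")
    if cue.end - cue.start >= max_seconds:
        issues.append(f"{max_seconds:g}+ seconds long")
    return issues
```

Running `readability_issues` over every cue of `transcript.es.cleaned.srt` gives a quick list of cues to resegment. The dubbing SRT is expected to fail these checks, since it intentionally uses longer units.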
Prepare English transcript + dubbing assets (when multilingual output is requested)

Use `<workdir>/transcripts/transcript.es.dub.srt` when present (fallback: `transcripts/transcript.es.cleaned.srt`) and create:
- `<workdir>/transcripts/transcript.en.srt` translated to natural English while preserving timestamps.
- English title and description in `content.md` under `## Title (EN)` and `## Description (EN)`, not as canonical standalone files.
- In `## Description (EN)`, include an English `Chapters:` block using the exact timestamps from `## Capítulos (final)`.
- Dub with the `youtube-dubber` project (`scripts/dub_voxtral.py`, `voxtral-mini-tts-latest`).
- `<workdir>/audio/dubbed.en.m4a` as the upload-friendly audio-only export for YouTube Studio.
- Do not create `<workdir>/video/dubbed.en.mp4` by default. Only create a muxed English preview video if explicitly requested.
- Prefer `.m4a`/AAC unless the user asked for another format.

Generate copy with the calling model

Read `<workdir>/transcripts/transcript.es.cleaned.srt` directly and generate:
- `<workdir>/content.md`.
- Thumbnail ideas in `content.md` under `## Ideas de thumbnails`. If a script needs JSON, create `<workdir>/.state/ideas.json` as technical state with this shape:
{
  "titles": ["...", "...", "..."],
  "thumbnails": [
    {"text": "...", "artifact": "...", "concept": "..."},
    {"text": "...", "artifact": "...", "concept": "..."},
    {"text": "...", "artifact": "...", "concept": "..."}
  ]
}
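When `.state/ideas.json` is generated, a quick shape check can catch missing fields before a script consumes it. This is a hedged sketch: `validate_ideas` and `load_ideas` are hypothetical helpers, and the count of three titles and three thumbnails is an assumption read off the sample shape above.

```python
import json

THUMBNAIL_KEYS = ("text", "artifact", "concept")


def load_ideas(path: str) -> dict:
    """Read an ideas JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def validate_ideas(data: dict, expected_count: int = 3) -> list:
    """Return a list of shape problems; an empty list means the shape matches."""
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = []
    titles = data.get("titles")
    if not isinstance(titles, list) or not all(isinstance(t, str) for t in titles):
        errors.append("titles must be a list of strings")
    elif len(titles) != expected_count:  # ASSUMPTION: count taken from the sample
        errors.append(f"expected {expected_count} titles, found {len(titles)}")
    thumbnails = data.get("thumbnails")
    if not isinstance(thumbnails, list):
        errors.append("thumbnails must be a list")
        return errors
    if len(thumbnails) != expected_count:
        errors.append(f"expected {expected_count} thumbnails, found {len(thumbnails)}")
    for number, thumb in enumerate(thumbnails, start=1):
        if not isinstance(thumb, dict):
            errors.append(f"thumbnail {number} must be an object")
            continue
        for key in THUMBNAIL_KEYS:
            if not isinstance(thumb.get(key), str):
                errors.append(f"thumbnail {number}: missing string field '{key}'")
    return errors
```

For example, `validate_ideas(load_ideas("<workdir>/.state/ideas.json"))` should return an empty list for a file that follows the shape.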
Ensure `content.md` contains at least these sections so downstream validation stays compatible:
# Pack YouTube — <slug>
## YouTube
Video ID:
URL:
Status:
Publish at:
## Títulos
- ...
- ...
- ...
## Ideas de thumbnails
1. Texto: ...
Artifact: ...
Concept: ...
## Descripción
...
## Capítulos
00:00 ...
## LinkedIn
...
## Título (final)
## Descripción (final)
## Capítulos (final)
## Post LinkedIn (final)
## Thumbnail (final)
## Programación (final)
(YYYY-MM-DD HH:MM o "private")
## Title (EN)
## Description (EN)
## Assets
Thumbnail ES:
Thumbnail EN:
Transcript ES:
Transcript EN:
Audio EN:
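Since downstream validation depends on these sections, their presence can be checked before the validation stop. The following is a minimal sketch, assuming that an exact match on the `## ` heading line is enough. `missing_sections` is a hypothetical helper, not one of the skill's scripts.

```python
# Section headings copied from the template above.
REQUIRED_SECTIONS = [
    "## YouTube",
    "## Títulos",
    "## Ideas de thumbnails",
    "## Descripción",
    "## Capítulos",
    "## LinkedIn",
    "## Título (final)",
    "## Descripción (final)",
    "## Capítulos (final)",
    "## Post LinkedIn (final)",
    "## Thumbnail (final)",
    "## Programación (final)",
    "## Title (EN)",
    "## Description (EN)",
    "## Assets",
]


def missing_sections(markdown: str) -> list:
    """Return required headings that do not appear as a line of their own."""
    present = {line.strip() for line in markdown.splitlines() if line.startswith("## ")}
    return [section for section in REQUIRED_SECTIONS if section not in present]
```

Matching whole lines keeps `## Capítulos` and `## Capítulos (final)` distinct, which a substring search would not.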
Generate 3 thumbnails
- Use presenter `antonio`, and `nino` only when explicitly requested for a Nino video. Copy the reference presenter photos into `<workdir>/images/refs/`. Create 3 images into `<workdir>/images/drafts/thumb-1.png`, `thumb-2.png`, `thumb-3.png`.
- Save the prompts as `<workdir>/images/drafts/thumb-1.prompt.txt`, `thumb-2.prompt.txt`, `thumb-3.prompt.txt`.
- Use `imagegen` for each thumbnail. Before generation, confirm the user has manually attached the presenter reference images in the message box. You may also load/show the three presenter reference images in chat for visual confirmation, but that does not replace manual attachment.
- `imagegen` saves under `~/.codex/generated_images/...` by default. After each generation, copy the selected output into `<workdir>/images/drafts/thumb-N.png`; leave the original generated file in place.
- Do not generate these drafts with `nano-banana-pro`.

7b. Edit selected thumbnail when requested
- Use `imagegen` as an edit, with hard invariants: same person, same layout, same crop, same lighting, same main text unless that text is the requested edit, and no reinterpretation.
- Save the edited versions under `<workdir>/images/drafts/` and visually inspect them against the original.

7c. Create English thumbnail localization
- Use the `nano-banana-pro` skill, not `imagegen`.
- Pass the selected Spanish thumbnail with `--input-image` and save the result as `<workdir>/images/final/thumb.en.png`.

uv run /Users/antonio/Projects/antoniolg/agent-kit/skills/nano-banana-pro/scripts/generate_image.py --prompt 'Edit this YouTube thumbnail. Change ONLY the main headline text from Spanish to English: replace it with exactly "<ENGLISH HEADLINE>". Keep identical: the same person identity and face, pose, crop, composition, logo, dark cinematic background, lighting, typography style, text size, text placement, and overall thumbnail design. Do not add new objects. Do not change the person or reinterpret the scene.' --filename <workdir>/images/final/thumb.en.png --input-image <workdir>/images/final/thumb.es.png --resolution 2K
Stop to ask for validation of:
- `## Title (EN)` in `<workdir>/content.md`
- `## Description (EN)` in `<workdir>/content.md`
- `## Description (EN)` includes translated chapters with the same timestamps as `## Capítulos (final)`.
- Create the English thumbnail with `nano-banana-pro` after the user confirms the final thumbnail and before any final X-video build. Save the accepted Spanish thumbnail as `<workdir>/images/final/thumb.es.png` and the English thumbnail as `<workdir>/images/final/thumb.en.png`.

Update YouTube
ffmpeg -y -i <thumb.png> -vf scale=1280:-2 -q:v 3 <thumb-upload.jpg>
python scripts/update_youtube.py --video-id <id> --title "..." --description-file <workdir>/.state/description.final.es.txt --thumbnail <workdir>/images/final/thumb.es.png --publish-at "YYYY-MM-DD HH:MM" --timezone <IANA> --client-secret <path>
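Before the update command is run, the thumbnail has to meet the constraints listed earlier: JPG/PNG, 16:9, preferably 1280x720, and under 2 MB. A hedged sketch of that check is below. `thumbnail_problems` is a hypothetical helper, the width and height are assumed to come from another tool (an image library or `ffprobe`, not shown), and reading "2 MB" as 2 MiB is an assumption.

```python
from pathlib import Path

# ASSUMPTION: "under 2 MB" is interpreted as under 2 MiB.
MAX_BYTES = 2 * 1024 * 1024
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def thumbnail_problems(filename: str, width: int, height: int, size_bytes: int) -> list:
    """Return the constraints a thumbnail violates (empty list if none)."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("not a JPG/PNG file")
    if width * 9 != height * 16:  # exact 16:9 test without floating point
        problems.append("not 16:9")
    if size_bytes >= MAX_BYTES:
        problems.append("not under 2 MB")
    return problems
```

If the only problem reported is the file size, the compressed JPG sibling produced by the `ffmpeg` command is the file to pass to `--thumbnail`.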
- If the script requires `description.final.es.txt`, generate it from the final description and chapters in `content.md` under `.state/` immediately before the update. Do not treat it as canonical copy.
- After the update, verify:
  - `privacyStatus` is `private`
  - `publishAt` matches the requested schedule in UTC
- Save the verification to `<workdir>/.state/youtube_verification.json`.

Build native X video variant (after thumbnail choice)
python scripts/build_x_native_video.py --video <video.mp4> --thumbnail <workdir>/images/final/thumb.es.png --output <workdir>/video/video-x.es.mp4 --intro-ms 500
For the English variant, first create `<workdir>/tmp/dubbed.en.for-x.mp4` from the video and `<workdir>/audio/dubbed.en.m4a`, then build:
python scripts/build_x_native_video.py --video <workdir>/tmp/dubbed.en.for-x.mp4 --thumbnail <workdir>/images/final/thumb.en.png --output <workdir>/video/video-x.en.mp4 --intro-ms 500
python scripts/schedule_socials.py --text-file <workdir>/.state/linkedin.final.txt --scheduled-date <ISO8601+offset> --comment-url <video_url> --image <workdir>/images/final/thumb.es.png
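The scheduling inputs follow from the YouTube publish time: `publishAt` is verified in UTC, socials go out 15 minutes after publish, and `--scheduled-date` expects ISO 8601 with an offset. The sketch below illustrates those conversions plus the underscore encoding that `schedule_socials.py` is described as applying to `--comment-url`. All function names are hypothetical, and the exact string formats the scripts accept are assumptions.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def parse_local(local: str, tz: tzinfo) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM' as wall-clock time in the given timezone."""
    return datetime.strptime(local, "%Y-%m-%d %H:%M").replace(tzinfo=tz)


def publish_at_utc(local: str, tz: tzinfo) -> str:
    """UTC form of the publish time, for comparing against publishAt."""
    return parse_local(local, tz).astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def socials_scheduled_date(local: str, tz: tzinfo, delay_minutes: int = 15) -> str:
    """ISO 8601 time with offset, delay_minutes after the publish time."""
    publish_utc = parse_local(local, tz).astimezone(timezone.utc)
    # Add the delay in UTC so DST transitions cannot skew the result.
    return (publish_utc + timedelta(minutes=delay_minutes)).astimezone(tz).isoformat()


def encode_comment_url(url: str) -> str:
    """Percent-encode underscores (_ -> %5F), as described for --comment-url."""
    return url.replace("_", "%5F")
```

With Python 3.9+, an IANA zone such as `zoneinfo.ZoneInfo("Europe/Madrid")` can be passed as `tz`, matching the value given to `--timezone`.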
- If the script requires `linkedin.final.txt`, generate it from `## Post LinkedIn (final)` in `content.md` under `.state/` immediately before scheduling. Do not treat it as canonical copy.
- `schedule_socials.py` percent-encodes underscores in the `--comment-url` (e.g. `_` -> `%5F`) to avoid LinkedIn URL formatting issues.
- Verify the scheduled posts:

postflow --json schedule list --view posts --from <day-start-iso> --to <day-end-iso>
- Do not use `postflow posts list`.
- Save the verification to `<workdir>/.state/postflow_verification.json`.
- Upload the English audio track (`<workdir>/audio/dubbed.en.m4a`) and apply the English title/description from `content.md` in the YouTube Studio multi-language UI.
- Native X videos: the Spanish `video/video-x.es.mp4` and the English `video/video-x.en.mp4`.