Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

gitroomhq/agent-media-v2

Name: agent-media-v2
Author: gitroomhq

/SKILL.md

npx skillsauth add gitroomhq/agent-media agent-media-v2

agent-media — Claude skill

agent-media is a CLI for AI UGC video generation. This skill tells you how to drive it. Loaded files are intentionally small — open the right reference file for the task you have, don't try to memorize everything.

🛑 HARD GATE — read this first, every conversation

Before calling ANY agent-media shell command, you MUST:

1. Read reference/conversation-flow.md — the full 4-gate protocol with templates.
2. Walk the user through 4 gates IN ORDER, one message each — do not bulk-fire:
   - Gate 1: confirm the exact script (verbatim — typos land in the video)
   - Gate 2: confirm character. YOU run agent-media character list --json (do not ask the user "do you have a saved character?" — they don't know that's a thing). If the list is empty, just confirm the description from their original prompt. If non-empty, present each saved character BY NAME (not by char_xxx id — that format is internal). The user picks by NAME or says "new"; you map name → id internally. 🛑 NEVER auto-pick. NEVER show char_xxx ids to the user. Never ask for a photo by default.
   - Gate 3: propose a full director's brief with pre-filled fields in 3 sections — A. Intent+Performance, B. Scene+Look, C. Output. Put visual direction into --description and action/product handling into --scene-action; there are no --preset, --vibe, or --voice-brief flags in the current Selfie API.
   - Gate 4: duration + script-pacing check. Count the words in the script and PROPOSE the duration that fits (5s ≈ 10-20 words, 10s ≈ 20-40 words, 15s ≈ 30-60 words at the natural 2-4 words/sec pace).
3. Only then call the CLI.
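The Gate 4 pacing check can be sketched as a small shell function. This is a minimal sketch, not part of the agent-media CLI: the name propose_duration is hypothetical, and the selection rule (pick the shortest of 5, 10, or 15 seconds that still fits the script at the 4 words/sec upper bound) is an assumption, since the word ranges above overlap.

```shell
# Hypothetical helper, not shipped with agent-media.
# Prints the shortest duration (5, 10, or 15 seconds) that fits the script
# at no more than 4 words per second. Returns 1 if even 15s is too short.
propose_duration() {
  local script="$1"
  local words seconds
  # wc -w counts whitespace-separated words; strip padding some platforms add.
  words=$(printf '%s' "$script" | wc -w | tr -d '[:space:]')
  for seconds in 5 10 15; do
    if [ "$words" -le $((seconds * 4)) ]; then
      echo "$seconds"
      return 0
    fi
  done
  echo "script has $words words: too long for 15s, propose a trim" >&2
  return 1
}
```

For example, propose_duration "Hey guys, this serum changed my skin in a week" counts 10 words and prints 5. The result is only a proposal to put in front of the user at Gate 4, not a value to apply silently.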

The director's brief at Gate 3 is non-optional. It's where quality lives. Skipping it = generic output. PROPOSE smart defaults from the script + description; don't ask blank questions.

Calling the CLI before completing all 4 gates is a protocol violation: the user gets a generic, mid video. Ask the extra questions.

NEVER discuss pricing

Do NOT mention credit costs, USD amounts, or pricing tiers in any reply. Do NOT ask the user to "confirm cost". The API handles billing transparently. If the user asks about cost, point them at https://agent-media.ai/pricing. That is the only acceptable surface for pricing.

What agent-media can do (router)

| Command | Use when | Deep-dive |
|---|---|---|
| selfie | AI UGC selfie video with generated actor, character sheet, storyboard board, and Seedance. | reference/generators/selfie.md |
| character create | Create a reusable AI character from a single photo. | reference/generators/character_create.md |
| subs | Burn styled subtitles onto an existing video. | reference/generators/subtitle.md |

agent-media skill update — pull the latest skill files into ~/.claude/skills/agent-media-v2/.
agent-media skill status — print local vs remote version.

What agent-media CANNOT do

These legacy v1 commands exist in the CLI binary for backwards compat but produce inferior output. They are hidden from agent-media --help for a reason. Never call them.

❌ agent-media ugc — uses a stale fixed actor library (200 actors picked at random). The actors look dated. Use agent-media selfie — it generates an on-model character from your description on every run.
❌ agent-media show-your-app — built on the v1 actor pool + manual screen-composite step. The v2 product is on the roadmap. For now, run agent-media selfie for the talking head and capture the screen separately.
❌ agent-media laptop-ugc — v1 only. Same story as show-your-app; v2 product coming.
❌ agent-media character-video — superseded by agent-media selfie --character <id>. The new command uses the current portrait → sheet → wireframe → Seedance pipeline.
❌ agent-media text-to-video — no character control; output is generic and off-brand. Use agent-media selfie with a saved character.
❌ agent-media subtitle (singular) — v1 burner with fewer styles and shakier sync. Use agent-media subs (plural).
❌ agent-media review — SaaS-review generator built on v1 actors. Compose with agent-media selfie + a script you write.
❌ agent-media product-acting — v1 product-in-hand generator. For now, use agent-media selfie with a strong --scene-action describing the product hold, demo, and interaction.
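The legacy list above can be turned into a guard that runs before any agent-media call. This is a minimal sketch under stated assumptions: check_subcommand is a hypothetical wrapper function, not something the CLI provides, and the replacement hints only restate the guidance in the list.

```shell
# Hypothetical guard, not part of agent-media.
# Returns 0 for subcommands that are not on the legacy v1 list.
# For a legacy subcommand, prints the suggested replacement and returns 1.
check_subcommand() {
  case "$1" in
    ugc|show-your-app|laptop-ugc|text-to-video|review|product-acting)
      echo "use: agent-media selfie"
      return 1 ;;
    character-video)
      echo "use: agent-media selfie --character <id>"
      return 1 ;;
    subtitle)
      echo "use: agent-media subs"
      return 1 ;;
    *)
      return 0 ;;
  esac
}
```

A caller would run check_subcommand "$subcommand" first and only invoke the CLI when it returns 0. For show-your-app and laptop-ugc the hint covers only the talking-head part; the screen capture still has to be done separately, as noted above.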

If the user wants a feature not listed in the router above, offer agent-media selfie when the request can be expressed as one actor, one setting, dialogue/action, and optional props/product handling.

Reference files (lazy-loaded)

Open these only when you need them:

reference/conversation-flow.md — the 4 gate questions, in order, with example wording
reference/subtitle-styles.md — all 17 subtitle styles
reference/realism-rubric.md — visual-quality guard the pipeline enforces
reference/errors.md — common errors + remediation
reference/generators/selfie.md — AI UGC selfie video with generated actor, character sheet, storyboard board, and Seedance.
reference/generators/character_create.md — Create a reusable AI character from a single photo.
reference/generators/subtitle.md — Burn styled subtitles onto an existing video.

AI UGC video production via agent-media (selfie, character, subs, plus more soon). BEFORE running ANY agent-media command you MUST first Read reference/conversation-flow.md and walk the user through the 4 gates IN ORDER — (1) confirm script OR scene_action; if no speech, also propose background_music, (2) RUN `agent-media character list --json` YOURSELF (don't ask the user, don't mention char_xxx ids — present saved characters BY NAME if any, otherwise confirm the new description), (3) propose a director's brief with setting, lighting, wardrobe, props/product, and action; pass non-default motion/product handling through --scene-action, (4) duration with script-pacing awareness (2-4 words/sec). While jobs run, poll status and open portrait, character sheet, wireframe, and final video as each URL appears. When user says "no subs" → pass --subtitles false. When no script → pass --background-music. NEVER auto-pick a character. NEVER expose char_xxx ids. NEVER mention pricing/credits/USD.
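The two flag rules in this description ("no subs" and "no script") can be sketched as a helper that prints the optional flags to append to a selfie call. This is an illustrative sketch only: optional_selfie_flags and its two arguments are hypothetical, and --background-music is shown as a bare flag exactly as written above, because the text does not say whether it takes a value.

```shell
# Hypothetical helper, not part of agent-media.
# $1 = script text (empty when the video has no speech)
# $2 = "yes" when the user asked for no subtitles
# Prints the optional flags implied by the rules in the description.
optional_selfie_flags() {
  local script="$1" no_subs="$2" flags=""
  if [ "$no_subs" = "yes" ]; then
    flags="--subtitles false"
  fi
  if [ -z "$script" ]; then
    # Assumption: bare flag, as written in the skill description.
    flags="${flags:+$flags }--background-music"
  fi
  printf '%s\n' "$flags"
}
```

Calling optional_selfie_flags "" yes prints "--subtitles false --background-music", while a call with a non-empty script and "no" prints an empty line. How the script itself is passed to the CLI is not stated here, so the sketch leaves that part out.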

31 stars

testing

Updated May 24, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add gitroomhq/agent-media agent-media-v2

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Reported value |
|---|---|---|---|
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: May 24, 2026, 6:46 AM · 101.5s · 12 files scanned

SKILL.md

name: agent-media-v2
description: AI UGC video production via agent-media (selfie, character, subs, plus more soon). BEFORE running ANY agent-media command you MUST first Read reference/conversation-flow.md and walk the user through the 4 gates IN ORDER — (1) confirm script OR scene_action; if no speech, also propose background_music, (2) RUN `agent-media character list --json` YOURSELF (don't ask the user, don't mention char_xxx ids — present saved characters BY NAME if any, otherwise confirm the new description), (3) propose a director's brief with setting, lighting, wardrobe, props/product, and action; pass non-default motion/product handling through --scene-action, (4) duration with script-pacing awareness (2-4 words/sec). While jobs run, poll status and open portrait, character sheet, wireframe, and final video as each URL appears. When user says "no subs" → pass --subtitles false. When no script → pass --background-music. NEVER auto-pick a character. NEVER expose char_xxx ids. NEVER mention pricing/credits/USD.
version: 3.4.0

Related Skills

gitroomhq/Make Product In Hands

content-media

Verified · Trusted · Community

Generate a 5/10/15s vertical UGC video where your character holds, wears, and shows a product. Provide a character_sheet_url (R2-hosted) and the product image (product_image_url — any https URL — OR product_image_base64; re-hosted to R2 automatically). Two modes: script for a lip-synced talking-head product review (2-4 words/sec), OR scene_action for a silent demo / b-roll. Set subject (e.g. "a young woman") to lock the person's gender/appearance so a gendered product can't drift it. framing: "close_up" (chest-up, default) or "full_body" (head-to-toe, for turn-arounds / showing the whole outfit). Both the person and the exact product are locked from the reference images.

41 · SKILL.md · Updated Jun 4, 2026

gitroomhq/Publish to Social

development

Verified · Trusted · Community

Publish a generated agent-media video to the user's connected TikTok, Instagram, or X. Connect channels (OAuth) and post or schedule via the REST API. Use after producing a video with make_ugc_video / make_simple_selfie.

41 · SKILL.md · Updated May 31, 2026

gitroomhq/Agent-Media UGC Playbook

testing

Verified · Trusted · Community

Playbook for orchestrating an end-to-end UGC video on the agent-media vNext runtime. Read this before deciding whether to call the one-shot make_ugc_video skill or to chain the four primitives (make_portrait → make_character_sheet → make_simple_selfie → make_subtitles) manually.

41 · SKILL.md · Updated May 29, 2026

gitroomhq/Make Wireframe

tools

Verified · Trusted · Community

Generate a photographic storyboard / wireframe board from a character sheet (R2-hosted) + script. Multi-panel grid showing the same person performing the action progression, 4 / 6 / 8 / 10 numbered panels.

40 · SKILL.md · Updated May 31, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

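# Clone the repo
git clone https://github.com/gitroomhq/agent-media.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global).
# No trailing slash on the source, so both GNU and BSD cp copy the directory itself.
cp -r agent-media ~/.claude/skills/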

Claude Code Skills — official skills path docs.

Repository

gitroomhq/agent-media

31 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT