library/motion/ai-fight-scene/SKILL.md
Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard. Stacks GPT-Image-2 (character sheet + storyboard), Nano-Banana-2 (environment concept), and Seedance 2.0 i2v.
npx skillsauth add samuraigpt/embedai muapi-ai-fight-sceneInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Generate a high-cut-density action / fight scene by first composing a 16-cell storyboard image, then driving Seedance 2.0 image-to-video off that storyboard.
The core idea: action tension comes from cut density, not single-shot quality. Forcing the video model to follow a pre-drawn 4×4 storyboard grid gives you 16 distinct shots in a 15-second clip — landing punches, reverse angles, ECUs, whip-pans — that no t2v prompt could choreograph on its own.
| Name | Type | Required | Default | Description |
|:---|:---|:---|:---|:---|
| character_description | text | yes | — | Full physical description of the fighter(s). Asymmetric details (eye colour, scar side, holster on left hip) help the model preserve identity across panels. |
| environment_description | text | yes | — | The scene setting — e.g. "cyberpunk wet back-alley, neon kanji signage, Stray-game aesthetic, rain on chrome." |
| action_script | text | yes | — | The action beat — prose or numbered beats. E.g. "Hero is cornered → blocks first punch → counter-elbow → throw opponent into trash cans → finisher." |
| style_direction | text | no | cinematic action film, anamorphic lens, high contrast, motion blur on hits | Aesthetic / look tags applied to every frame. |
| duration | int | no | 15 | Final video length in seconds. The storyboard's 16 cells map roughly 1 shot per second at default. |
| aspect_ratio | text | no | 16:9 | Output aspect — 16:9 cinematic, 9:16 vertical, 1:1 square. |
Generate a clean turnaround-style character sheet using muapi image generate (model=gpt-image-2-text-to-image):
Character reference sheet of {{character_description}}. Three views — front, 3/4, profile — on a neutral grey backdrop. Studio lighting, full body, no text overlays, photoreal. Asymmetric identifying details preserved on the correct side. {{style_direction}}.3:2Present the character sheet and confirm identity details look right before proceeding. This image becomes reference #1 for later phases.
Use muapi image generate (model=nano-banana-2) to design the scene/world:
Wide establishing shot of {{environment_description}}. No characters in frame — environment only. Strong perspective lines, depth, atmospheric haze. {{style_direction}}. Production-design concept art.{{aspect_ratio}}Nano-Banana-2 is chosen here for its reasoning-driven composition — it's better than text-to-image-only models at producing locations with believable spatial logic (chokepoints, cover, sightlines) that an action scene can use. Present for approval. This becomes reference #2.
Compose the action onto a single 4×4 storyboard image using muapi image edit (model=gpt-image-2-image-to-image):
Compose a 4×4 storyboard grid (16 numbered cells) for the following action sequence:
{{action_script}}
CHARACTER (use reference image 1 identity throughout, asymmetric details preserved):
{{character_description}}
LOCATION (use reference image 2 spatial layout):
{{environment_description}}
Each cell labels: SHOT # (1–16) · SIZE (WIDE / MS / CU / ECU) · CAMERA-MOVE arrow (push, pull, whip, dolly, crash-zoom, handheld) · 1-word RHYTHM note (BEAT / IMPACT / RECOVERY / RESET).
Vary shot size aggressively — never two WIDEs in a row. Land every IMPACT on a CU or ECU.
Hand-drawn comic-book ink-and-wash style, monochrome with selective red accents on hits.
Numbered cells, clear gutters between panels.
Aesthetic: {{style_direction}}.
1:1 (square works best for a 4×4 grid)Present the storyboard to the user. Confirm:
If a panel reads poorly, regenerate just the storyboard with that cell's note bolded ("CELL 7 must be an ECU on the right fist").
Hand the storyboard to muapi video from-image (model=seedance-v2.0-i2v):
Generate a {{duration}}-second action sequence that strictly follows the 16-cell storyboard reference image, cell-by-cell, top-left to bottom-right.
- Honour each cell's labelled SHOT SIZE and CAMERA-MOVE — match cuts to the storyboard's rhythm notes.
- Strong cinematic feel and shot language. Exaggerated dynamics. Hits land hard with motion blur and impact frames.
- Camera language: anamorphic, handheld where the storyboard calls for it, locked-off where it doesn't.
- Native audio: impact sfx on every IMPACT cell, footsteps, fabric/Foley, restrained low score under the action.
Action being rendered: {{action_script}}.
Aesthetic: {{style_direction}}.
{{duration}} (default 15){{aspect_ratio}}After generation, present the final video. If the cut density feels too low or shots don't match the storyboard, regenerate Phase D first (cheaper than rebuilding the storyboard) with the prompt emphasising "strict cell-by-cell adherence" more aggressively.
seedance-2.0-i2v-480p to draft. Cheaper preview pass before committing to the full-res seedance-v2.0-i2v run.fight scene, action sequence, storyboard to video, cut density, cinematic action, combat choreography, seedance 2 storyboard
character_description ──► [GPT-Image-2 t2i] ─► character sheet ──┐
│
environment_description ─► [Nano-Banana-2 t2i] ─► environment plate ┼─► [GPT-Image-2 i2i] ─► 16-cell storyboard ─► [Seedance 2.0 i2v] ─► 15s action video
│
action_script + style_direction ───────────────────────────────────►┘
muapi CLI commands. Use muapi auth configure first if MUAPI_API_KEY is unset.curl -X POST https://api.muapi.ai/api/v1/<endpoint> -H "x-api-key: $MUAPI_API_KEY" -H 'content-type: application/json' -d '{...}' and poll with muapi predict wait <request_id>.gpt-image-2-image-to-image, pass them as a list under images_list (or the model's documented multi-ref field).{{input_name}} placeholders with the user's actual inputs before issuing each call.development
Turn a portrait photo into a high-end editorial "Color Analysis Board" in a luxury fashion-magazine style (Dior / Ralph Lauren aesthetic) — best colors, undertone, makeup guide, capsule wardrobe, hair & jewelry recommendations, all laid out on a clean beige/ivory grid.
development
Turn a person photo + a product photo + an optional script into a vertical 9:16 UGC-style video ad. Generates a lifestyle hero image (Nano-Banana Pro Edit), then animates it with native audio using Seedance 2.0 VIP image-to-video.
development
Generate a cinematic "freeze effect" video where time stops mid-scene, the subject walks through the frozen world, then time resumes with a snap.
development
Design a high-CTR YouTube thumbnail — striking imagery, bold text placement, and emotional face/subject if needed.