Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

LUAgam/fal-ai-media

Name: fal-ai-media
Author: LUAgam

.cursor/.agents/skills/fal-ai-media/SKILL.md

npx skillsauth add LUAgam/stage-harness fal-ai-media

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

fal.ai Media Generation

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

User wants to generate images from text prompts
Creating videos from text or images
Generating speech, music, or sound effects
Any media generation task
User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

fal.ai MCP server must be configured. Add to ~/.claude.json:

"fal-ai": {
  "command": "npx",
  "args": ["-y", "fal-ai-mcp-server"],
  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}

Get an API key at fal.ai.

MCP Tools

The fal.ai MCP provides these tools:

search — Find available models by keyword
find — Get model details and parameters
generate — Run a model with parameters
result — Check async generation status
status — Check job status
cancel — Cancel a running job
estimate_cost — Estimate generation cost
models — List popular models
upload — Upload files for use as inputs

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

generate(
  model_name: "fal-ai/nano-banana-2",
  input: {
    "prompt": "a futuristic cityscape at sunset, cyberpunk style",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "seed": 42
  }
)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.

generate(
  model_name: "fal-ai/nano-banana-pro",
  input: {
    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
    "image_size": "square",
    "num_images": 1,
    "guidance_scale": 7.5
  }
)

Common Image Parameters

| Param | Type | Options | Notes | |-------|------|---------|-------| | prompt | string | required | Describe what you want | | image_size | string | square, portrait_4_3, landscape_16_9, portrait_16_9, landscape_4_3 | Aspect ratio | | num_images | number | 1-4 | How many to generate | | seed | number | any integer | Reproducibility | | guidance_scale | number | 1-20 | How closely to follow the prompt (higher = more literal) |

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

# First upload the source image
upload(file_path: "/path/to/image.png")

# Then generate with image input
generate(
  model_name: "fal-ai/nano-banana-2",
  input: {
    "prompt": "same scene but in watercolor style",
    "image_url": "<uploaded_url>",
    "image_size": "landscape_16_9"
  }
)

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

generate(
  model_name: "fal-ai/seedance-1-0-pro",
  input: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s",
    "aspect_ratio": "16:9",
    "seed": 42
  }
)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

generate(
  model_name: "fal-ai/kling-video/v3/pro",
  input: {
    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
    "duration": "5s",
    "aspect_ratio": "16:9"
  }
)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.

generate(
  model_name: "fal-ai/veo-3",
  input: {
    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
    "aspect_ratio": "16:9"
  }
)

Image-to-Video

Start from an existing image:

generate(
  model_name: "fal-ai/seedance-1-0-pro",
  input: {
    "prompt": "camera slowly zooms out, gentle wind moves the trees",
    "image_url": "<uploaded_image_url>",
    "duration": "5s"
  }
)

Video Parameters

| Param | Type | Options | Notes | |-------|------|---------|-------| | prompt | string | required | Describe the video | | duration | string | "5s", "10s" | Video length | | aspect_ratio | string | "16:9", "9:16", "1:1" | Frame ratio | | seed | number | any integer | Reproducibility | | image_url | string | URL | Source image for image-to-video |

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

generate(
  model_name: "fal-ai/csm-1b",
  input: {
    "text": "Hello, welcome to the demo. Let me show you how this works.",
    "speaker_id": 0
  }
)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.

generate(
  model_name: "fal-ai/thinksound",
  input: {
    "video_url": "<video_url>",
    "prompt": "ambient forest sounds with birds chirping"
  }
)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

import os
import requests

resp = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json"
    },
    json={
        "text": "Your text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
    }
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")

Cost Estimation

Before generating, check estimated cost:

estimate_cost(model_name: "fal-ai/nano-banana-pro", input: {...})

Model Discovery

Find models for specific tasks:

search(query: "text to video")
find(model_name: "fal-ai/seedance-1-0-pro")
models()

Tips

Use seed for reproducible results when iterating on prompts
Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals
For video, keep prompts descriptive but concise — focus on motion and scene
Image-to-video produces more controlled results than pure text-to-video
Check estimate_cost before running expensive video generations

Related Skills

videodb — Video processing, editing, and streaming
video-editing — AI-powered video editing workflows
content-engine — Content creation for social platforms

LUAgam/fal-ai-media

.cursor/.agents/skills/fal-ai-media/SKILL.md

Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.

tools

Updated Apr 16, 2026

$ install --global

skillsauth

npx skillsauth add LUAgam/stage-harness fal-ai-media

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Mar 29, 2026, 4:35 AM37.1s2 files scanned

SKILL.md

name:: fal-ai-media
description:: Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.
origin:: ECC

fal.ai Media Generation

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

User wants to generate images from text prompts
Creating videos from text or images
Generating speech, music, or sound effects
Any media generation task
User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

fal.ai MCP server must be configured. Add to ~/.claude.json:

"fal-ai": {
  "command": "npx",
  "args": ["-y", "fal-ai-mcp-server"],
  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}

Get an API key at fal.ai.

MCP Tools

The fal.ai MCP provides these tools:

search — Find available models by keyword
find — Get model details and parameters
generate — Run a model with parameters
result — Check async generation status
status — Check job status
cancel — Cancel a running job
estimate_cost — Estimate generation cost
models — List popular models
upload — Upload files for use as inputs

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

generate(
  model_name: "fal-ai/nano-banana-2",
  input: {
    "prompt": "a futuristic cityscape at sunset, cyberpunk style",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "seed": 42
  }
)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.

generate(
  model_name: "fal-ai/nano-banana-pro",
  input: {
    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
    "image_size": "square",
    "num_images": 1,
    "guidance_scale": 7.5
  }
)

Common Image Parameters

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

# First upload the source image
upload(file_path: "/path/to/image.png")

# Then generate with image input
generate(
  model_name: "fal-ai/nano-banana-2",
  input: {
    "prompt": "same scene but in watercolor style",
    "image_url": "<uploaded_url>",
    "image_size": "landscape_16_9"
  }
)

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

generate(
  model_name: "fal-ai/seedance-1-0-pro",
  input: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s",
    "aspect_ratio": "16:9",
    "seed": 42
  }
)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

generate(
  model_name: "fal-ai/kling-video/v3/pro",
  input: {
    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
    "duration": "5s",
    "aspect_ratio": "16:9"
  }
)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.

generate(
  model_name: "fal-ai/veo-3",
  input: {
    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
    "aspect_ratio": "16:9"
  }
)

Image-to-Video

Start from an existing image:

generate(
  model_name: "fal-ai/seedance-1-0-pro",
  input: {
    "prompt": "camera slowly zooms out, gentle wind moves the trees",
    "image_url": "<uploaded_image_url>",
    "duration": "5s"
  }
)

Video Parameters

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

generate(
  model_name: "fal-ai/csm-1b",
  input: {
    "text": "Hello, welcome to the demo. Let me show you how this works.",
    "speaker_id": 0
  }
)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.

generate(
  model_name: "fal-ai/thinksound",
  input: {
    "video_url": "<video_url>",
    "prompt": "ambient forest sounds with birds chirping"
  }
)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

import os
import requests

resp = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json"
    },
    json={
        "text": "Your text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
    }
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")

Cost Estimation

Before generating, check estimated cost:

estimate_cost(model_name: "fal-ai/nano-banana-pro", input: {...})

Model Discovery

Find models for specific tasks:

search(query: "text to video")
find(model_name: "fal-ai/seedance-1-0-pro")
models()

Tips

Use seed for reproducible results when iterating on prompts
Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals
For video, keep prompts descriptive but concise — focus on motion and scene
Image-to-video produces more controlled results than pure text-to-video
Check estimate_cost before running expensive video generations

Related Skills

videodb — Video processing, editing, and streaming
video-editing — AI-powered video editing workflows
content-engine — Content creation for social platforms

Related Skills

LUAgam/verify-and-fix-cases

development

VerifiedTrustedCommunity

在 generate-test-cases 阶段之后执行，逐个验证测试用例并在失败时修复项目代码、重新编译部署、再次验证，直到通过或达到最大修复次数。覆盖 UI / API / API+UI / 性能测试四个维度，UI 测试通过浏览器真实模拟用户操作并截图， API 测试根据项目代码生成可执行的接口脚本，性能测试调用现有性能/质量技能全量执行。涉及真实用户登录信息（如手机号+验证码、账号密码、JWT）时必须中断要求用户提供，禁止编造无效凭证。所有 case 状态变更必须通过 e2e-case-tracker.sh 脚本持久化，确保中途崩溃可恢复、无 case 遗漏。

SKILL.mdUpdated May 22, 2026

LUAgam/verify-and-fix-cases

LUAgam/skills/e2e

development

VerifiedTrustedCommunity

# SKILL: e2e > **核心原则**： > 1. 测试范围跟着本次变动走。后端接口改了，对应的前端流程必须做联调验证；与本次需求无关的功能不测。对于涉及算法、转换准确率等质量敏感型需求，需额外生成专项质量测试。 > 2. **覆盖完整性优先于执行便利性**。不得以"链路复杂"、"需要外部依赖"为由跳过本次变动相关的用例；凡是受变动影响的接口和 UI 流程，都必须生成真实调用/操作用例。 > 3. **UI 测试必须模拟真实用户操作**（定位元素、点击、键入、等待渲染、断言可见文本/状态）。**禁止**将 UI 套件退化为浏览器上下文里的 `page.evaluate(fetch(...))` API 验证——那只是把 API 测试换了执行环境，没有额外价值，不算 UI 测试。 > 4. **通用性**：本 skill 不假设具体业务域，所有规则均以抽象变动面（文件、接口、页面、用户动作）为单位组织，不针对任何特定项目的数据库/领域词汇。 > 5. **E2E 套件必须验证运行时行为**。严禁把"读取源码/配置文件并做字符串/结构匹配"的检查封装成独立 E2E 套件——这类检

SKILL.mdUpdated May 22, 2026

LUAgam/skills/deploy

tools

VerifiedTrustedCommunity

# SKILL: deploy ## CLI Bootstrap 在执行任何 `harnessctl` 命令前，先解析本地 CLI 路径： ```bash if [ -z "${HARNESSCTL:-}" ]; then candidates=( "./stage-harness/scripts/harnessctl" "../stage-harness/scripts/harnessctl" "$(git rev-parse --show-toplevel 2>/dev/null)/stage-harness/scripts/harnessctl" ) for candidate in "${candidates[@]}"; do if [ -n "$candidate" ] && [ -x "$candidate" ]; then HARNESSCTL="$candidate" break fi done fi test -n "${HARNESSCTL:-}" && test -x "$H

SKILL.mdUpdated May 22, 2026

LUAgam/skills/build

tools

VerifiedTrustedCommunity

# SKILL: build ## CLI Bootstrap 在执行任何 `harnessctl` 命令前，先解析本地 CLI 路径： ```bash if [ -z "${HARNESSCTL:-}" ]; then candidates=( "./stage-harness/scripts/harnessctl" "../stage-harness/scripts/harnessctl" "$(git rev-parse --show-toplevel 2>/dev/null)/stage-harness/scripts/harnessctl" ) for candidate in "${candidates[@]}"; do if [ -n "$candidate" ] && [ -x "$candidate" ]; then HARNESSCTL="$candidate" break fi done fi test -n "${HARNESSCTL:-}" && test -x "$HA

SKILL.mdUpdated May 19, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/LUAgam/stage-harness.git

# Copy into Claude Code skills folder (global)
cp -r stage-harness/.cursor/.agents/skills/fal-ai-media ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

LUAgam/stage-harness

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT