skills/baoyu-image-gen/SKILL.md
AI image generation with OpenAI, Google, DashScope, Replicate and APIMart APIs. Supports text-to-image, reference images, aspect ratios, and batch generation from saved prompt files. Sequential by default; use batch parallel generation when the user already has multiple prompts or wants stable multi-image throughput. Use when user asks to generate, create, or draw images.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add stonehah/qunz-skills baoyu-image-gen
Official API-based image generation. Supports OpenAI, Google, DashScope (Alibaba Tongyi Wanxiang), Replicate and APIMart providers.
Agent Execution:
- {baseDir} = this SKILL.md file's directory
- Script path: {baseDir}/scripts/main.ts
- ${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun
CRITICAL: This step MUST complete BEFORE any image generation. Do NOT skip or defer.
Check EXTEND.md existence (priority: project → user):
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-image-gen/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-image-gen/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md") { "user" }
| Result | Action |
|--------|--------|
| Found | Load, parse, apply settings. If default_model.[provider] is null → ask model only (Flow 2) |
| Not found | ⛔ Run first-time setup (references/config/first-time-setup.md) → Save EXTEND.md → Then continue |
CRITICAL: If not found, complete the full setup (provider + model + quality + save location) using AskUserQuestion BEFORE generating any images. Generation is BLOCKED until EXTEND.md is created.
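| Path | Location |
|------|----------|
| .baoyu-skills/baoyu-image-gen/EXTEND.md | Project directory |
| ${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-image-gen/EXTEND.md | XDG config directory |
| $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md | User home |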
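The lookup order used by the shell checks above can be sketched in TypeScript. This is a minimal illustration, not the skill's actual code: `ExtendLookupEnv`, `extendCandidates`, and `findExtend` are hypothetical names, and the existence check is passed in as a function so the sketch stays free of filesystem calls.

```typescript
// Hypothetical sketch: names here are illustrative and not taken from scripts/main.ts.
type ExtendSource = "project" | "xdg" | "user";

interface ExtendLookupEnv {
  cwd: string;            // project directory
  home: string;           // value of $HOME
  xdgConfigHome?: string; // value of $XDG_CONFIG_HOME, if set
}

interface ExtendCandidate {
  source: ExtendSource;
  path: string;
}

const EXTEND_SUFFIX = "baoyu-skills/baoyu-image-gen/EXTEND.md";

// Candidate paths in the same order as the shell checks: project, XDG, user home.
function extendCandidates(env: ExtendLookupEnv): ExtendCandidate[] {
  // Mirrors ${XDG_CONFIG_HOME:-$HOME/.config}: fall back when unset or empty.
  const xdgBase = env.xdgConfigHome || `${env.home}/.config`;
  return [
    { source: "project", path: `${env.cwd}/.${EXTEND_SUFFIX}` },
    { source: "xdg", path: `${xdgBase}/${EXTEND_SUFFIX}` },
    { source: "user", path: `${env.home}/.${EXTEND_SUFFIX}` },
  ];
}

// Returns the first candidate that exists, or null when first-time setup is required.
function findExtend(
  env: ExtendLookupEnv,
  exists: (path: string) => boolean,
): ExtendCandidate | null {
  for (const candidate of extendCandidates(env)) {
    if (exists(candidate.path)) return candidate;
  }
  return null;
}
```

A `null` result corresponds to the "Not found" row: generation stays blocked until setup has written an EXTEND.md.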
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio | Default image size | Default models | Batch worker cap | Provider-specific batch limits
Schema: references/config/preferences-schema.md
# Basic
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image cat.png
# With aspect ratio
${BUN_X} {baseDir}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9
# High quality
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k
# From prompt files
${BUN_X} {baseDir}/scripts/main.ts --promptfiles system.md content.md --image out.png
# With reference images (Google multimodal or OpenAI edits)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png
# With reference images (explicit provider/model)
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make blue" --image out.png --provider google --model gemini-3-pro-image-preview --ref source.png
# Specific provider
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider openai
# DashScope (Alibaba Tongyi Wanxiang)
${BUN_X} {baseDir}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope
# Replicate (google/nano-banana-pro)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Replicate with specific model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
# APIMart (Gemini-3-Pro-Image-preview, async task polling)
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat in watercolor style" --image out.png --provider apimart
# APIMart with reference image
${BUN_X} {baseDir}/scripts/main.ts --prompt "Make this into anime style" --image out.png --provider apimart --ref source.png
# APIMart with doubao-seedream-5-0-lite model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cinematic mountain sunrise" --image out.png --provider apimart --model doubao-seedream-5-0-lite
# Batch mode with saved prompt files
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json
# Batch mode with explicit worker count
${BUN_X} {baseDir}/scripts/main.ts --batchfile batch.json --jobs 4 --json
{
  "jobs": 4,
  "tasks": [
    {
      "id": "hero",
      "promptFiles": ["prompts/hero.md"],
      "image": "out/hero.png",
      "provider": "replicate",
      "model": "google/nano-banana-pro",
      "ar": "16:9",
      "quality": "2k"
    },
    {
      "id": "diagram",
      "promptFiles": ["prompts/diagram.md"],
      "image": "out/diagram.png",
      "ref": ["references/original.png"]
    }
  ]
}
Paths in promptFiles, image, and ref are resolved relative to the batch file's directory. jobs is optional (overridden by CLI --jobs). Top-level array format (without jobs wrapper) is also accepted.
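The batch-file rules above (two accepted shapes, paths relative to the batch file, CLI `--jobs` overriding the file's `jobs`) might be handled roughly as follows. This is a sketch under stated assumptions: `BatchTask`, `normalizeBatch`, and `resolveFrom` are hypothetical names, which fields are optional is guessed from the example, and the path join is a simplified POSIX-only stand-in for a real path library.

```typescript
// Hypothetical sketch: field optionality is inferred from the example batch file.
interface BatchTask {
  id?: string;
  promptFiles?: string[];
  image: string;
  ref?: string[];
  provider?: string;
  model?: string;
  ar?: string;
  quality?: string;
}

type BatchFile = BatchTask[] | { jobs?: number; tasks: BatchTask[] };

// Simplified POSIX-style resolution: absolute paths are kept, others are joined to dir.
function resolveFrom(dir: string, p: string): string {
  return p.startsWith("/") ? p : `${dir}/${p}`;
}

function normalizeBatch(
  parsed: BatchFile,
  batchDir: string,
  cliJobs?: number,
): { jobs?: number; tasks: BatchTask[] } {
  // Accept both the wrapped form ({ jobs, tasks }) and a bare top-level array.
  const tasks = Array.isArray(parsed) ? parsed : parsed.tasks;
  const fileJobs = Array.isArray(parsed) ? undefined : parsed.jobs;
  return {
    jobs: cliJobs ?? fileJobs, // CLI --jobs overrides the value in the file
    tasks: tasks.map((task) => ({
      ...task,
      image: resolveFrom(batchDir, task.image),
      promptFiles: task.promptFiles?.map((p) => resolveFrom(batchDir, p)),
      ref: task.ref?.map((p) => resolveFrom(batchDir, p)),
    })),
  };
}
```

Normalizing both shapes into one structure up front keeps the rest of a batch runner independent of how the file was written.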
| Option | Description |
|--------|-------------|
| --prompt <text>, -p | Prompt text |
| --promptfiles <files...> | Read prompt from files (concatenated) |
| --image <path> | Output image path (required in single-image mode) |
| --batchfile <path> | JSON batch file for multi-image generation |
| --jobs <count> | Worker count for batch mode (default: auto, max from config, built-in default 10) |
| --provider google\|openai\|dashscope\|replicate\|apimart | Force provider (default: auto-detect) |
| --model <id>, -m | Model ID (Google: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; OpenAI: gpt-image-1.5, gpt-image-1; APIMart: gemini-3-pro-image-preview, doubao-seedream-5-0-lite) |
| --ar <ratio> | Aspect ratio (e.g., 16:9, 1:1, 4:3) |
| --size <WxH> | Size (e.g., 1024x1024) |
| --quality normal\|2k | Quality preset (default: 2k) |
| --imageSize 1K\|2K\|4K | Image size for Google/APIMart (default: from quality) |
| --ref <files...> | Reference images. Supported by Google multimodal, OpenAI GPT Image edits, Replicate, and APIMart |
| --n <count> | Number of images for the current task (APIMart currently supports only 1) |
| --json | JSON output |
| Variable | Description |
|----------|-------------|
| OPENAI_API_KEY | OpenAI API key |
| GOOGLE_API_KEY | Google API key |
| DASHSCOPE_API_KEY | DashScope API key (Alibaba Cloud) |
| REPLICATE_API_TOKEN | Replicate API token |
| APIMART_API_KEY | APIMart API key |
| OPENAI_IMAGE_MODEL | OpenAI model override |
| GOOGLE_IMAGE_MODEL | Google model override |
| DASHSCOPE_IMAGE_MODEL | DashScope model override (default: z-image-turbo) |
| REPLICATE_IMAGE_MODEL | Replicate model override (default: google/nano-banana-pro) |
| APIMART_IMAGE_MODEL | APIMart model override (default: gemini-3-pro-image-preview, e.g. doubao-seedream-5-0-lite) |
| OPENAI_BASE_URL | Custom OpenAI endpoint |
| GOOGLE_BASE_URL | Custom Google endpoint |
| DASHSCOPE_BASE_URL | Custom DashScope endpoint |
| REPLICATE_BASE_URL | Custom Replicate endpoint |
| APIMART_BASE_URL | Custom APIMart endpoint (default: https://api.apimart.ai) |
| APIMART_TASK_LANGUAGE | Task status language for polling (default: en) |
| BAOYU_IMAGE_GEN_MAX_WORKERS | Override batch worker cap |
| BAOYU_IMAGE_GEN_<PROVIDER>_CONCURRENCY | Override provider concurrency, e.g. BAOYU_IMAGE_GEN_REPLICATE_CONCURRENCY |
| BAOYU_IMAGE_GEN_<PROVIDER>_START_INTERVAL_MS | Override provider start gap, e.g. BAOYU_IMAGE_GEN_REPLICATE_START_INTERVAL_MS |
Load Priority: CLI args > EXTEND.md > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env
Model priority (highest → lowest), applies to all providers:
1. CLI flag: --model <id>
2. EXTEND.md: default_model.[provider]
3. Env var: <PROVIDER>_IMAGE_MODEL (e.g., GOOGLE_IMAGE_MODEL)
EXTEND.md overrides env vars. If both EXTEND.md default_model.google: "gemini-3-pro-image-preview" and env var GOOGLE_IMAGE_MODEL=gemini-3.1-flash-image-preview exist, EXTEND.md wins.
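The model priority above can be illustrated with a small resolver. This is a hedged sketch, not the script's implementation: `ModelSources` and `resolveModel` are hypothetical names, and it assumes that a `null` entry in EXTEND.md simply falls through to the environment variable, which the source does not confirm.

```typescript
// Hypothetical sketch of the --model > EXTEND.md > env var order.
interface ModelSources {
  cliModel?: string;                               // value of --model <id>
  extendDefaults?: Record<string, string | null>;  // default_model.[provider] from EXTEND.md
  envVars?: Record<string, string | undefined>;    // e.g. the process environment
}

interface ResolvedModel {
  model: string;
  from: "cli" | "extend" | "env";
}

function resolveModel(provider: string, sources: ModelSources): ResolvedModel | null {
  if (sources.cliModel) return { model: sources.cliModel, from: "cli" };

  // Assumption: a null or missing EXTEND.md entry falls through to the env var.
  const fromExtend = sources.extendDefaults?.[provider];
  if (fromExtend) return { model: fromExtend, from: "extend" };

  const fromEnv = sources.envVars?.[`${provider.toUpperCase()}_IMAGE_MODEL`];
  if (fromEnv) return { model: fromEnv, from: "env" };

  // Nothing configured: the caller would use a built-in default or ask the user.
  return null;
}

// The conflict case described above: EXTEND.md and the env var disagree.
const resolved = resolveModel("google", {
  extendDefaults: { google: "gemini-3-pro-image-preview" },
  envVars: { GOOGLE_IMAGE_MODEL: "gemini-3.1-flash-image-preview" },
});
if (resolved) console.log(`Using google / ${resolved.model}`);
// prints "Using google / gemini-3-pro-image-preview"
```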
Agent MUST display model info before each generation:
Using [provider] / [model]
Switch model: --model <id> | EXTEND.md default_model.[provider] | env <PROVIDER>_IMAGE_MODEL
Supported model formats (Replicate):
- owner/name (recommended for official models), e.g. google/nano-banana-pro
- owner/name:version (community models by version), e.g. stability-ai/sdxl:<version>
Examples:
# Use Replicate default model
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate
# Override model explicitly
${BUN_X} {baseDir}/scripts/main.ts --prompt "A cat" --image out.png --provider replicate --model google/nano-banana
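The two Replicate model formats, `owner/name` and `owner/name:version`, could be validated with a parser along these lines. `ReplicateModelRef` and `parseReplicateModel` are hypothetical names used only for illustration, and the allowed character set is an assumption rather than Replicate's documented rule.

```typescript
// Hypothetical sketch: splits a Replicate model ID into its parts.
interface ReplicateModelRef {
  owner: string;
  name: string;
  version?: string;
}

function parseReplicateModel(id: string): ReplicateModelRef | null {
  // owner/name with an optional :version suffix; no extra slashes, colons, or spaces.
  const match = /^([^/:\s]+)\/([^/:\s]+)(?::([^/:\s]+))?$/.exec(id.trim());
  if (!match) return null;
  const owner = match[1];
  const name = match[2];
  const version = match[3];
  if (!owner || !name) return null;
  return version ? { owner, name, version } : { owner, name };
}
```

Rejecting malformed IDs before a request is sent gives a clearer error than a failed API call.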
- --ref provided + no --provider → auto-select Google first, then OpenAI, then Replicate, then APIMart
- --provider specified → use it (if --ref, must be google, openai, replicate, or apimart)
| Preset | Google imageSize | OpenAI Size | Replicate resolution | Use Case |
|--------|------------------|-------------|----------------------|----------|
| normal | 1K | 1024px | 1K | Quick previews |
| 2k (default) | 2K | 2048px | 2K | Covers, illustrations, infographics |
Google/APIMart imageSize: Can be overridden with --imageSize 1K|2K|4K
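The preset table and the `--imageSize` override can be expressed as a lookup with an optional override. This is an illustrative sketch: `QUALITY_PRESETS` and `resolveImageSize` are hypothetical names, and the values are copied from the table above.

```typescript
// Hypothetical sketch: quality presets as a lookup table, values taken from the table above.
type Quality = "normal" | "2k";
type ImageSize = "1K" | "2K" | "4K";

interface QualityPreset {
  imageSize: ImageSize;        // Google / APIMart imageSize
  openaiSize: string;          // OpenAI size
  replicateResolution: string; // Replicate resolution
}

const QUALITY_PRESETS: Record<Quality, QualityPreset> = {
  normal: { imageSize: "1K", openaiSize: "1024px", replicateResolution: "1K" },
  "2k": { imageSize: "2K", openaiSize: "2048px", replicateResolution: "2K" },
};

// An explicit --imageSize wins; otherwise the size follows the quality preset (default 2k).
function resolveImageSize(quality: Quality = "2k", override?: ImageSize): ImageSize {
  return override ?? QUALITY_PRESETS[quality].imageSize;
}
```

For example, `resolveImageSize("normal", "4K")` yields `"4K"`, since the explicit override takes precedence over the preset.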
Supported aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
- imageConfig.aspectRatio
- aspect_ratio to model; when --ref is provided without --ar, defaults to match_input_image
Default: Sequential generation.
Batch Parallel Generation: When --batchfile contains 2 or more pending tasks, the script automatically enables parallel generation.
| Mode | When to Use |
|------|-------------|
| Sequential (default) | Normal usage, single images, small batches |
| Parallel batch | Batch mode with 2+ tasks |
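The switch between sequential and parallel execution can be sketched as a pure planning function. This is an assumption-laden illustration: `planBatch` is a hypothetical name, "auto" is interpreted here as "use up to the worker cap", and the cap of 10 is the built-in default mentioned for `--jobs`.

```typescript
// Hypothetical sketch: decides the mode and worker count for a batch run.
interface BatchPlan {
  mode: "sequential" | "parallel";
  workers: number;
}

function planBatch(pendingTasks: number, requestedJobs?: number, maxWorkers = 10): BatchPlan {
  // Fewer than two pending tasks: nothing to parallelize.
  if (pendingTasks < 2) return { mode: "sequential", workers: 1 };

  // Assumption: without --jobs, "auto" means using as many workers as the cap allows.
  const wanted = requestedJobs ?? maxWorkers;
  // Never exceed the cap or the number of tasks, and never drop below one worker.
  const workers = Math.max(1, Math.min(wanted, maxWorkers, pendingTasks));
  return { mode: "parallel", workers };
}
```

Under these assumptions, `planBatch(5, 4)` returns `{ mode: "parallel", workers: 4 }`, and a request for 20 workers with a cap of 10 is clamped to 10.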
Execution choice:
| Situation | Preferred approach | Why |
|-----------|--------------------|-----|
| One image, or 1-2 simple images | Sequential | Lower coordination overhead and easier debugging |
| Multiple images already have saved prompt files | Batch (--batchfile) | Reuses finalized prompts, applies shared throttling/retries, and gives predictable throughput |
| Each image still needs separate reasoning, prompt writing, or style exploration | Subagents | The work is still exploratory, so each image may need independent analysis before generation |
| Output comes from baoyu-article-illustrator with outline.md + prompts/ | Batch (build-batch.ts -> --batchfile) | That workflow already produces prompt files, so direct batch execution is the intended path |
Rule of thumb:
Parallel behavior:
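--jobs <count> sets the worker count for batch mode.
Reference images require a provider that supports them (Google multimodal: gemini-3-pro-image-preview, gemini-3.1-flash-image-preview; OpenAI GPT Image edits; Replicate; or APIMart).
Custom configurations via EXTEND.md. See Preferences section for paths and supported options.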