skills/image-generator-sd-webui/SKILL.md
Generate images via the Stable Diffusion WebUI / Forge HTTP API (AUTOMATIC1111-compatible `/sdapi/v1/*`). Use when the user wants to (1) discover or pick a model / extra module (TE/VAE) / sampler / scheduler / style preset from a running sd-webui server, (2) generate an image with a given prompt (txt2img), (3) check generation progress, (4) cancel/interrupt an in-flight generation, (5) inspect or change a global sd-webui option (e.g. active checkpoint), or (6) test connectivity. This skill talks to a *generic* sd-webui-compatible server (AUTOMATIC1111, Forge, reForge, sd-webui-forge-classic). Do NOT trigger for requests that are purely writing the prompt itself.
Drive a Stable Diffusion WebUI / Forge server through its REST API to enumerate available resources, run txt2img, poll progress, and interrupt jobs. All scripts under scripts/ are thin curl wrappers; they print JSON or extracted fields to stdout so the agent can pipe / parse them.
Before doing anything, confirm the server URL (and optional HTTP Basic Auth) with the user. Pass them as environment variables to every script:
export SD_WEBUI_URL="http://localhost:7860" # required, no trailing slash
export SD_WEBUI_USER="" # optional, HTTP Basic Auth
export SD_WEBUI_PASS="" # optional
If unset, scripts default to http://localhost:7860 with no auth.
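The environment contract above can be sketched as a small wrapper. This is a minimal illustration, not the code of the scripts under scripts/: the helper names `sd_endpoint` and `sd_get` are hypothetical, and stripping a trailing slash is an extra safety step that the real scripts may not perform.

```shell
#!/usr/bin/env bash
# Sketch only: the real wrappers under scripts/ may be written differently.

# Hypothetical helper: build a full URL from SD_WEBUI_URL (or the default) and an API path.
sd_endpoint() {
  local base="${SD_WEBUI_URL:-http://localhost:7860}"
  base="${base%/}"                      # tolerate one accidental trailing slash
  printf '%s%s\n' "$base" "$1"
}

# Hypothetical helper: GET an API path, sending Basic Auth only when SD_WEBUI_USER is non-empty.
sd_get() {
  local url
  url="$(sd_endpoint "$1")"
  set -- --fail --silent --show-error
  if [ -n "${SD_WEBUI_USER:-}" ]; then
    set -- "$@" --user "${SD_WEBUI_USER}:${SD_WEBUI_PASS:-}"
  fi
  curl "$@" "$url"
}
```

With a server running, `sd_get /sdapi/v1/samplers` would print the raw sampler list; `--fail` makes curl exit non-zero on an HTTP error, which is the behaviour a connectivity check needs.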
Quick connectivity test (returns OK <url> on success, exits non-zero on failure):
scripts/probe.sh

1. Test connectivity (scripts/probe.sh). On failure, ask the user for the correct URL / credentials.
2. List the available resources and pick identifiers. Always use the canonical English name / title / model_name: sd-webui matches exactly, so do not translate or rename.
3. Call scripts/generate.sh with a request JSON. It returns a JSON object containing the base64 PNG image and the generation info.
4. Optionally, use scripts/progress.sh to print progress (0–1), eta_relative, and state.
5. Optionally, use scripts/cancel.sh to interrupt the current job.

| User wants | Command | API endpoint |
|---|---|---|
| Checkpoints (models) | scripts/list.sh models | GET /sdapi/v1/sd-models → array of {title, model_name, hash, ...} |
| Extra modules (TE / VAE, Forge-only) | scripts/list.sh modules | GET /sdapi/v1/sd-modules → array of {model_name, ...} |
| Samplers | scripts/list.sh samplers | GET /sdapi/v1/samplers → array of {name, aliases} |
| Schedulers | scripts/list.sh schedulers | GET /sdapi/v1/schedulers → array of {name, label} |
| Style presets | scripts/list.sh styles | GET /sdapi/v1/prompt-styles → array of {name, prompt, negative_prompt} |
| Upscalers | scripts/list.sh upscalers | GET /sdapi/v1/upscalers |
| LoRAs | scripts/list.sh loras | GET /sdapi/v1/loras |
| Embeddings | scripts/list.sh embeddings | GET /sdapi/v1/embeddings |
scripts/list.sh <kind> prints the canonical English identifier for each entry, one per line — pipe to column, fzf, etc. Add --json for the raw JSON.
After listing, present the options to the user (use ask_user with an enum if the list is short). For models, prefer the full title (which embeds the hash suffix, e.g. anima/animaika_v36.safetensors [d50fb5b9a0]) over model_name because the title is unambiguous — if the user supplies a bare filename without the hash, verify it via list.sh models and substitute the exact title before sending it to the API. For schedulers, list.sh schedulers prints the human-readable label (e.g. Beta); both label and the lowercase name (beta) are accepted by the txt2img scheduler field.
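The filename-to-title substitution described above could be sketched like this. It assumes that `scripts/list.sh models` prints one full title per line in the form `path/name.safetensors [hash]`; the document does not state this explicitly. The helper name `resolve_model_title` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a bare checkpoint filename to the exact title.
# Reads candidate titles on stdin, one per line, e.g. "dir/name.safetensors [hash]".
resolve_model_title() {
  local wanted="$1"
  awk -v want="$wanted" '
    {
      title = $0
      name = $0
      sub(/ \[[0-9a-fA-F]+\]$/, "", name)   # drop the trailing " [hash]"
      base = name
      sub(/^.*\//, "", base)                # drop any directory prefix
      if (title == want || name == want || base == want) { print title; found++ }
    }
    END { exit (found == 1 ? 0 : 1) }       # fail on no match or an ambiguous match
  '
}
```

Under that assumption, `scripts/list.sh models | resolve_model_title animaika_v36.safetensors` would print the exact title to put in the request. A non-zero exit status signals that the user should be asked to choose again.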
Required field: prompt. Recommended: negative_prompt, steps, cfg_scale, width, height, sampler_name, scheduler, styles (array of style names), and override_settings.sd_model_checkpoint (model title) / override_settings.forge_additional_modules (array of module names, Forge only). See references/txt2img-parameters.md for every field.

scripts/generate.sh request.json > result.json
# or pipe:
cat request.json | scripts/generate.sh - > result.json
jq -r '.images[0]' result.json | base64 -d > out.png
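For reference, a request.json for the commands above might look like the following. Every value is an illustrative placeholder, not a recommendation: the prompt text, step count, and resolution are made up, and the sampler, scheduler, and checkpoint title must match what the target server actually lists.

```shell
# Write an illustrative request body; all values are placeholders.
cat > request.json <<'EOF'
{
  "prompt": "a lighthouse on a cliff at sunset",
  "negative_prompt": "lowres, blurry",
  "steps": 28,
  "cfg_scale": 5,
  "width": 1024,
  "height": 1024,
  "sampler_name": "Euler a",
  "scheduler": "Beta",
  "styles": [],
  "override_settings": {
    "sd_model_checkpoint": "anima/animaika_v36.safetensors [d50fb5b9a0]"
  }
}
EOF
```

Putting the checkpoint under override_settings keeps the model switch scoped to this one request instead of changing the server's global option.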
The info field is a JSON string with seed, all_prompts, sampler_name, etc.; parse it with jq -r '.info | fromjson'.

Important behaviour notes:

- samples_format pre-pin: sd-webui/Forge validates samples_format before applying override_settings, so if the server's persistent value is unsupported (e.g. avif), txt2img fails. generate.sh preemptively POSTs samples_format=png to /sdapi/v1/options and redundantly injects override_settings.samples_format=png. ⚠️ The pre-pin mutates the server's persistent default to "png", and override_settings_restore_afterwards cannot undo it. If the user shares the server with clients expecting a different default, restore it manually afterwards: scripts/options.sh set samples_format '"webp"'. Convert locally if you need non-PNG output (see "Converting to another format" below).
- override_settings_restore_afterwards: true is forced on by generate.sh so the other override_settings keys (model checkpoint, modules, VAE) do not stick.
- Timeout: to allow more time for a slow generation, set SD_WEBUI_TIMEOUT, e.g. SD_WEBUI_TIMEOUT=900 scripts/generate.sh ....

Converting to another format: if the user wants the output in a non-PNG format (WebP, AVIF, JPEG, etc.), do not try to re-enable a different samples_format on the server. Instead, convert locally while preserving the embedded sd-webui generation metadata:

1. Check that format-converter.sh and copy-info.sh are available on PATH (e.g. command -v format-converter.sh && command -v copy-info.sh).
2. Run format-converter.sh on the PNG; it calls copy-info.sh internally to carry the parameters over. Run format-converter.sh -h to see the current usage.
3. If either script is missing, make it available first; once both are on PATH, -h will show usage.

To check progress, call from another terminal (or background the generate.sh call with & first):
scripts/progress.sh # one-shot, prints JSON
scripts/progress.sh --watch # poll every 1s until progress reaches 1.0 or state.job is empty
scripts/progress.sh --watch --interval 2
scripts/progress.sh --field progress # just the numeric 0..1 value
scripts/progress.sh --field state.job
Endpoint: GET /sdapi/v1/progress?skip_current_image=true. Key response fields:

- progress: float 0..1, fraction of current job complete.
- eta_relative: estimated seconds remaining.
- state.job: current job name (empty string when idle).
- state.sampling_step / state.sampling_steps: current step index / total.
- current_image: base64 PNG preview of the in-progress image (omitted by the script via skip_current_image=true to keep responses small; fetch raw with curl if needed).

To cancel:

scripts/cancel.sh # POST /sdapi/v1/interrupt: stop current job, return current partial result
scripts/cancel.sh --skip # POST /sdapi/v1/skip: skip current job in a batch
Note: interrupt is cooperative — it tells the sampler to stop at the next step. The pending generate.sh call will return with whatever the model produced so far (often a usable but partial image). It does not raise an HTTP error on the txt2img call.
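The background-and-watch pattern mentioned earlier can be sketched as one function. It uses only the script invocations documented above. `SCRIPTS_DIR` and `generate_with_progress` are hypothetical names introduced for this illustration, and whether `--watch` exits early when it starts before the job is registered is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only. SCRIPTS_DIR is a hypothetical variable pointing at this skill's scripts/ directory.
SCRIPTS_DIR="${SCRIPTS_DIR:-scripts}"

generate_with_progress() {
  local request="$1" result="$2" status=0

  # txt2img blocks until the server answers, so run it in the background.
  "$SCRIPTS_DIR/generate.sh" "$request" > "$result" &
  local gen_pid=$!

  # Progress goes to stderr so stdout stays clean. If the watcher starts before
  # the job is registered it may exit early; that is harmless here.
  "$SCRIPTS_DIR/progress.sh" --watch --interval 2 >&2 || true

  # Wait for generate.sh and hand its exit status back to the caller.
  wait "$gen_pid" || status=$?
  return "$status"
}
```

Calling `generate_with_progress request.json result.json` and then running scripts/cancel.sh from another terminal should, per the note above, let the function return normally with a partial image in result.json instead of an error.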
scripts/options.sh wraps GET /sdapi/v1/options and POST /sdapi/v1/options:
scripts/options.sh get # print all options as JSON
scripts/options.sh get sd_model_checkpoint # print one key
scripts/options.sh set sd_model_checkpoint '"<title>"' # set one key (value is JSON; string must be quoted)
scripts/options.sh set-json '{"k1":"v1","k2":"v2"}' # set multiple keys
scripts/options.sh refresh-checkpoints # POST /sdapi/v1/refresh-checkpoints
Prefer override_settings inside the txt2img request over options set — override_settings is request-scoped and reverts after the call, while options set persists globally and affects every other client.
This skill does not generate or refine prompts. When the user asks for prompt help:

1. Check whether a dedicated prompt-writing skill is available (e.g. sd-prompt-builder, danbooru-prompt, image-prompt-*). If so, delegate to it.
2. Otherwise, pass the user's text through unchanged as the prompt field. Do not invent Danbooru tags or stylistic modifiers on your own.

References:

- references/api-endpoints.md: full sd-webui / Forge endpoint reference with request / response shapes for every endpoint this skill uses, plus useful adjacent ones (/sdapi/v1/memory, /sdapi/v1/png-info, etc.).
- references/txt2img-parameters.md: every txt2img request field including HiRes-fix, refiner, Forge-specific extensions (forge_additional_modules, forge_inference_memory, forge_preset), and override_settings keys.

Read these only when constructing a non-trivial request or hitting an error that needs deeper investigation.