plugins/ollama/skills/image-gen/SKILL.md
Generate images from text prompts using Ollama's local image generation models.
Generate images locally with a single helper that supports both Ollama CLI and REST backends. This feature is currently macOS-only and uses Ollama's experimental image generation support.
Before generating images, check which models are locally available by calling the Ollama API:
```bash
curl -s http://localhost:11434/api/tags | jq -r '.models[].name'
```
Important: If you see a quantized variant with a suffix (e.g. x/z-image-turbo:q4_K_M), use that exact ID rather than the base name.
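One way to apply this rule in a script is to pick the ID out of the model list instead of hard-coding the base name. The sketch below is illustrative only: resolve_model_id is a hypothetical helper, not part of generate-image.sh, and preferring any tag other than :latest is an assumption about what "quantized variant" looks like in the list.

```bash
# resolve_model_id BASE: read model names on stdin (one per line) and print
# the ID to use for BASE. A tagged variant other than ":latest" wins
# (assumed to be a quantized build); otherwise the first plain match is used.
# Prints nothing and returns 1 when no name matches BASE.
resolve_model_id() {
  local base="$1" name fallback=""
  while IFS= read -r name; do
    case "$name" in
      "$base":latest|"$base") [ -n "$fallback" ] || fallback="$name" ;;
      "$base":*) printf '%s\n' "$name"; return 0 ;;
    esac
  done
  [ -n "$fallback" ] && printf '%s\n' "$fallback"
}

# Usage (requires a running Ollama server and jq):
#   curl -s http://localhost:11434/api/tags | jq -r '.models[].name' \
#     | resolve_model_id x/z-image-turbo
```

The result can then be passed to --model unchanged.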
x/z-image-turbo: A 6B-parameter model from Alibaba's Tongyi Lab. Best for photorealistic images and bilingual (English/Chinese) text rendering. Apache 2.0 licensed.
x/flux2-klein: Black Forest Labs' fast image-generation model (4B and 9B sizes). Best for readable text in images, UI mockups, and typography-heavy designs.
| Need | Recommended Model |
| ----------------------------------- | ----------------------------------------- |
| Photorealistic portraits/scenes | x/z-image-turbo |
| Chinese text rendering | x/z-image-turbo |
| Readable text in images (signs, UI) | x/flux2-klein |
| Commercial use | x/z-image-turbo or x/flux2-klein (4B) |
| General purpose | x/z-image-turbo |
Default to x/z-image-turbo unless the user has a specific need for text rendering in images.
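The selection rule above can be written as a small lookup. This is a minimal sketch: choose_model and its category names (text, signage, ui, typography) are hypothetical labels for illustration, not options the helper script accepts.

```bash
# choose_model NEED: print the recommended model for a request category.
# Only typography-sensitive categories switch away from the default;
# everything else (portraits, Chinese text, general use) keeps it.
choose_model() {
  case "$1" in
    text|signage|ui|typography) printf '%s\n' "x/flux2-klein" ;;
    *) printf '%s\n' "x/z-image-turbo" ;;
  esac
}

# Usage:
#   model="$(choose_model ui)"
#   ./scripts/generate-image.sh --model "$model" --prompt "Settings page mockup"
```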
Use the helper script (backend defaults to auto, which prefers CLI when available):
```bash
./scripts/generate-image.sh --prompt "Young woman in a cozy coffee shop, natural window lighting, wearing a cream knit sweater, holding a ceramic mug, soft bokeh background"
```
Choose a specific model and image size:
```bash
./scripts/generate-image.sh \
  --model x/flux2-klein \
  --size 512x512 \
  --output my-image.png \
  --prompt "A neon sign reading OPEN 24 HOURS in a rainy alley"
```
Use richer generation controls (CLI backend):
```bash
./scripts/generate-image.sh \
  --backend cli \
  --model x/flux2-klein \
  --width 1024 \
  --height 1024 \
  --steps 20 \
  --seed 42 \
  --negative-prompt "blurry, low quality, distorted" \
  --output detailed.png \
  --prompt "UI dashboard mockup with clean typography and clear labels"
```
Images are saved to the current working directory by default unless --output is provided. Always tell the user where the generated image was saved.
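To report the saved location reliably, it can help to decide the output path before calling the helper and pass it with --output. The sketch below assumes the helper's default name follows the pattern image_YYYYMMDD_HHMMSS.png; default_output_path and resolve_output are hypothetical names, not functions from the script.

```bash
# default_output_path: print a timestamped PNG path in the current directory.
default_output_path() {
  printf '%s/image_%s.png\n' "$PWD" "$(date +%Y%m%d_%H%M%S)"
}

# resolve_output [PATH]: print PATH when one is given, else the default path.
resolve_output() {
  local out="${1:-}"
  [ -n "$out" ] || out="$(default_output_path)"
  printf '%s\n' "$out"
}

# Usage:
#   out="$(resolve_output "")"
#   ./scripts/generate-image.sh --output "$out" --prompt "A red bicycle" \
#     && echo "Saved image to $out"
```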
| Backend | Description |
| ------- | ----------- |
| auto (default) | Prefers ollama run when available; falls back to REST |
| cli | Uses ollama run directly (supports richer options) |
| rest | Uses POST /v1/images/generations |
If the requested model is missing, the script attempts ollama pull <model>.
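The backend choice and the pull-on-demand behavior could look roughly like the following. This is a sketch of the described behavior, not the script's actual code: pick_backend and ensure_model are hypothetical names, and using ollama show as the "is it installed" check is an assumption.

```bash
# pick_backend [REQUESTED]: resolve "auto" to a concrete backend.
# "auto" picks cli when an ollama command is found, otherwise rest.
pick_backend() {
  local requested="${1:-auto}"
  case "$requested" in
    cli|rest) printf '%s\n' "$requested" ;;
    auto)
      if command -v ollama >/dev/null 2>&1; then
        printf 'cli\n'
      else
        printf 'rest\n'
      fi ;;
    *) printf 'unknown backend: %s\n' "$requested" >&2; return 2 ;;
  esac
}

# ensure_model MODEL: pull MODEL only when it is not available locally.
ensure_model() {
  local model="$1"
  ollama show "$model" >/dev/null 2>&1 || ollama pull "$model"
}

# Usage:
#   backend="$(pick_backend auto)"
#   ensure_model x/z-image-turbo
```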
| Option | Default | Description |
| ------ | ------- | ----------- |
| --prompt | (required) | The text prompt |
| --model | x/z-image-turbo | Ollama model name |
| --size | 1024x1024 | Image dimensions (WxH) |
| --width / --height | unset | Size aliases (must be used together) |
| --output | image_YYYYMMDD_HHMMSS.png | Output file path |
| --backend | auto | auto, cli, or rest |
| --steps | unset | Denoising steps (CLI backend only) |
| --seed | unset | Random seed (CLI backend only) |
| --negative-prompt | unset | Negative prompt text (CLI backend only) |
| Capability | CLI backend | REST backend |
| ---------- | ----------- | ------------ |
| Basic prompt + model + size | ✅ | ✅ |
| steps / seed / negative prompt | ✅ | ❌ |
| width / height aliases | ✅ | ✅ (normalized to size) |
If --backend rest is forced with CLI-only options, the script fails with an actionable error.
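A check of this kind can be sketched as below. The function name check_backend_options and the error wording are illustrative assumptions; the real script's message may differ.

```bash
# check_backend_options BACKEND [FLAG...]: return 1 with a message on stderr
# when a CLI-only flag is combined with the REST backend. FLAG values are
# the option names as typed on the command line.
check_backend_options() {
  local backend="$1" opt
  shift
  [ "$backend" = "rest" ] || return 0
  for opt in "$@"; do
    case "$opt" in
      --steps|--seed|--negative-prompt)
        printf 'error: %s is only supported with --backend cli\n' "$opt" >&2
        return 1 ;;
    esac
  done
}

# Usage:
#   check_backend_options rest --prompt --seed || exit 1
```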
You can still call the REST API directly:
```bash
curl -s http://localhost:11434/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "x/z-image-turbo",
    "prompt": "A sunset over the ocean with dramatic clouds",
    "size": "1024x1024",
    "response_format": "b64_json"
  }' | jq -r '.data[0].b64_json' | base64 -d > output.png
```
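Hand-written JSON breaks as soon as a prompt contains quotes or newlines. A possible workaround, sketched here with a hypothetical build_image_request helper, is to let jq assemble the request body; the field names are the same ones used in the request above.

```bash
# build_image_request MODEL PROMPT SIZE: print the JSON request body.
# jq --arg escapes each value, so prompts may contain quotes or newlines.
build_image_request() {
  jq -n --arg model "$1" --arg prompt "$2" --arg size "$3" \
    '{model: $model, prompt: $prompt, size: $size, response_format: "b64_json"}'
}

# Usage (requires a running Ollama server):
#   build_image_request x/z-image-turbo 'A sign that says "OPEN"' 512x512 \
#     | curl -s http://localhost:11434/v1/images/generations \
#         -H "Content-Type: application/json" -d @- \
#     | jq -r '.data[0].b64_json' | base64 -d > output.png
```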
Control the output dimensions with --size (or --width + --height). Format is WxH. Smaller images generate faster and use less memory.
Common sizes: 512x512, 768x768, 1024x1024
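The size handling described above (WxH format, width and height used together) can be sketched as a small validator. normalize_size is a hypothetical helper; letting --width/--height override --size when both are given is an assumption, as is the error wording.

```bash
# normalize_size SIZE WIDTH HEIGHT: print the dimensions as WxH.
# Pass an empty string for any value the user did not supply.
# Falls back to 1024x1024 when nothing is given.
normalize_size() {
  local size="$1" width="$2" height="$3"
  if [ -n "$width" ] || [ -n "$height" ]; then
    if [ -z "$width" ] || [ -z "$height" ]; then
      printf 'error: --width and --height must be used together\n' >&2
      return 1
    fi
    size="${width}x${height}"
  fi
  size="${size:-1024x1024}"
  case "$size" in
    # Reject stray characters, extra or dangling "x", and leading zeros.
    *[!0-9x]*|*x*x*|x*|*x|0*|*x0*)
      printf 'error: size must look like 512x512, got "%s"\n' "$size" >&2
      return 1 ;;
    *x*) printf '%s\n' "$size" ;;
    *)
      printf 'error: size must look like 512x512, got "%s"\n' "$size" >&2
      return 1 ;;
  esac
}

# Usage:
#   size="$(normalize_size "" 768 512)" || exit 1
```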
1. Run curl -s http://localhost:11434/api/tags to list available models. Note any quantized variants (e.g. :q4_K_M) and use the exact ID.
2. Default to x/z-image-turbo, but prefer x/flux2-klein for text-heavy images, signage, UI, or other typography-sensitive work.
3. Use --backend auto unless the user explicitly requests a backend (use the exact model ID from step 1, including any quantization suffix).
4. For requests that need --negative-prompt, --steps, or --seed, use --backend cli (or rely on auto when CLI is available).