tools/image/ai-image-generation/SKILL.md
Generate AI images with GPT-Image-2, FLUX, Gemini, Grok, Seedream, Reve and 50+ models via inference.sh CLI. Models: GPT-Image-2, FLUX Dev LoRA, FLUX.2 Klein LoRA, Gemini 3 Pro Image, Grok Imagine, Seedream 4.5, Reve, ImagineArt. Capabilities: text-to-image, image-to-image, inpainting, LoRA, image editing, upscaling, text rendering. Use for: AI art, product mockups, concept art, social media graphics, marketing visuals, illustrations. Triggers: flux, image generation, ai image, text to image, stable diffusion, generate image, ai art, midjourney alternative, dall-e alternative, text2img, t2i, image generator, ai picture, create image with ai, generative ai, ai illustration, grok image, gemini image, gpt image, openai image, chatgpt image
npx skillsauth add inference-sh-7/skills ai-image-generation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Install the belt CLI skill:
npx skills add belt-sh/cli
Generate images with 50+ AI models via inference.sh CLI.

Requires the inference.sh CLI (belt). See the install instructions.
belt login
# Generate an image with FLUX
belt app run falai/flux-dev-lora --input '{"prompt": "a cat astronaut in space"}'
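Prompts that contain double quotes or backslashes break the inline JSON passed to --input. The following is a minimal sketch of one way to handle that, assuming a POSIX shell with sed available. build_input is a hypothetical helper written for this example, not part of the belt CLI, and it only covers single-line prompts.

```shell
#!/bin/sh
# Hypothetical helper (not part of the belt CLI): wrap a prompt in the
# {"prompt": "..."} JSON object used by the examples in this document.
build_input() {
  # Escape backslashes first, then double quotes, so the result is valid JSON.
  # Assumes a single-line prompt; newlines and control characters are not handled.
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"prompt": "%s"}' "$escaped"
}

# Example call, left commented out because it needs a logged-in belt CLI:
# belt app run falai/flux-dev-lora --input "$(build_input 'a sign that says "OPEN"')"
```

If jq is installed, building the object with it is likely the more robust choice; the sed version is shown only because it has no extra dependencies.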
| Model | App ID | Best For |
|-------|--------|----------|
| GPT-Image-2 | openai/gpt-image-2 | Text-to-image, editing, inpainting |
| FLUX Dev LoRA | falai/flux-dev-lora | High quality with custom styles |
| FLUX.2 Klein LoRA | falai/flux-2-klein-lora | Fast with LoRA support (4B/9B) |
| P-Image | pruna/p-image | Fast, economical, multiple aspect ratios |
| P-Image-LoRA | pruna/p-image-lora | Fast with preset LoRA styles |
| P-Image-Edit | pruna/p-image-edit | Fast image editing |
| Gemini 3 Pro | google/gemini-3-pro-image-preview | Google's latest |
| Gemini 2.5 Flash | google/gemini-2-5-flash-image | Fast Google model |
| Grok Imagine | xai/grok-imagine-image | xAI's model, multiple aspect ratios |
| Seedream 4.5 | bytedance/seedream-4-5 | 2K-4K cinematic quality |
| Seedream 4.0 | bytedance/seedream-4-0 | High quality 2K-4K |
| Seedream 3.0 | bytedance/seedream-3-0-t2i | Accurate text rendering |
| Reve | falai/reve | Natural language editing, text rendering |
| ImagineArt 1.5 Pro | falai/imagine-art-1-5-pro-preview | Ultra-high-fidelity 4K |
| FLUX Klein 4B | pruna/flux-klein-4b | Ultra-cheap ($0.0001/image) |
| Topaz Upscaler | falai/topaz-image-upscaler | Professional upscaling |
belt app store --category image
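To compare how several models from the table handle the same prompt, a small loop over App IDs can help. This is a sketch under stated assumptions: DRY_RUN is a hypothetical variable of this script (not a belt option) and defaults to 1, so the commands are printed rather than executed. It also assumes each listed app accepts an input with only a prompt field, as the single-model examples in this document suggest.

```shell
#!/bin/sh
# Sketch: run one prompt through several models from the table.
# DRY_RUN is a hypothetical variable of this script, not a belt option;
# it defaults to 1 so the commands are only printed.
PROMPT_JSON='{"prompt": "sunset over mountains"}'
DRY_RUN=${DRY_RUN:-1}

run_models() {
  for app in "$@"; do
    if [ "$DRY_RUN" = 1 ]; then
      # Print the command that would be executed.
      printf "belt app run %s --input '%s'\n" "$app" "$PROMPT_JSON"
    else
      belt app run "$app" --input "$PROMPT_JSON"
    fi
  done
}

run_models falai/flux-dev-lora falai/flux-2-klein-lora bytedance/seedream-4-5
```

Setting DRY_RUN=0 in the environment before running the script switches it from printing to calling belt, which requires a prior belt login.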
# Text-to-image with GPT-Image-2 (high quality)
belt app run openai/gpt-image-2 --input '{
  "prompt": "professional product photo of sneakers, studio lighting",
  "quality": "high"
}'

# Image editing with GPT-Image-2 (pass the source image URL)
belt app run openai/gpt-image-2 --input '{
  "prompt": "change the background to a beach at sunset",
  "images": ["https://your-image.jpg"]
}'

# FLUX Dev LoRA
belt app run falai/flux-dev-lora --input '{
  "prompt": "professional product photo of a coffee mug, studio lighting"
}'

# FLUX.2 Klein LoRA (fast)
belt app run falai/flux-2-klein-lora --input '{"prompt": "sunset over mountains"}'

# Gemini 3 Pro Image
belt app run google/gemini-3-pro-image-preview --input '{
  "prompt": "photorealistic landscape with mountains and lake"
}'

# Grok Imagine with a 16:9 aspect ratio
belt app run xai/grok-imagine-image --input '{
  "prompt": "cyberpunk city at night",
  "aspect_ratio": "16:9"
}'

# Reve (text rendering)
belt app run falai/reve --input '{
  "prompt": "A poster that says HELLO WORLD in bold letters"
}'

# Seedream 4.5
belt app run bytedance/seedream-4-5 --input '{
  "prompt": "cinematic portrait of a woman, golden hour lighting"
}'

# Upscale an image with Topaz
belt app run falai/topaz-image-upscaler --input '{"image_url": "https://..."}'

# Stitch several images side by side
belt app run infsh/stitch-images --input '{
  "images": ["https://img1.jpg", "https://img2.jpg"],
  "direction": "horizontal"
}'
# Full platform skill (all 250+ apps)
npx skills add inference-sh/skills@infsh-cli
# Pruna P-Image (fast & economical)
npx skills add inference-sh/skills@p-image
# GPT-Image-2 (OpenAI)
npx skills add inference-sh/skills@gpt-image
# FLUX-specific skill
npx skills add inference-sh/skills@flux-image
# Upscaling & enhancement
npx skills add inference-sh/skills@image-upscaling
# Background removal
npx skills add inference-sh/skills@background-removal
# Video generation
npx skills add inference-sh/skills@ai-video-generation
# AI avatars from images
npx skills add inference-sh/skills@ai-avatar-video
Browse all apps: belt app store