tools/image/qwen-image-2-pro/SKILL.md
Generate images with Alibaba Qwen-Image-2.0-Pro via inference.sh CLI. Professional text rendering, fine-grained realism, enhanced semantic adherence. Ideal for posters, banners, and text-heavy designs. Triggers: qwen image pro, qwen-image-pro, qwen 2 pro, alibaba image pro, dashscope pro, professional text rendering
npx skillsauth add inference-sh/agent-skills qwen-image-2-pro
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Generate images with Alibaba Qwen-Image-2.0-Pro via inference.sh CLI. Best for professional text rendering and complex designs.

Requires the inference.sh CLI (belt); see its install instructions.
belt login
belt app run alibaba/qwen-image-2-pro --input '{"prompt": "Poster with title \"Welcome!\" in bold blue text"}'
belt app run alibaba/qwen-image-2-pro --input '{
  "prompt": "A futuristic cityscape at sunset with flying cars"
}'
belt app run alibaba/qwen-image-2-pro --input '{
  "prompt": "Healing-style hand-drawn poster featuring three puppies playing with a ball. The main title \"Come Play Ball!\" is prominently displayed at the top in bold, blue cartoon font. Below, the subtitle \"Join the Fun!\" appears in green font.",
  "width": 1024,
  "height": 1536,
  "prompt_extend": false
}'
belt app run alibaba/qwen-image-2-pro --input '{
  "prompt": "Professional marketing banner for summer sale. Large text \"SUMMER SALE\" in white on gradient sunset background. \"50% OFF\" in yellow below. Clean, modern design.",
  "width": 1920,
  "height": 1080,
  "prompt_extend": false,
  "negative_prompt": "blurry text, distorted text, low quality"
}'
belt app run alibaba/qwen-image-2-pro --input '{
  "prompt": "Minimalist logo design for a coffee shop called \"Bean & Brew\"",
  "num_images": 4
}'
belt app run alibaba/qwen-image-2-pro --input '{
  "prompt": "Make the person from Image 1 wear the outfit from Image 2",
  "reference_images": [
    {"uri": "https://example.com/person.jpg"},
    {"uri": "https://example.com/outfit.jpg"}
  ],
  "num_images": 2
}'
belt app run alibaba/qwen-image-2-pro --input '{
  "prompt": "Abstract geometric art in blue and gold",
  "seed": 12345
}'
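Prompts that contain quoted text are easy to mis-escape when typed by hand. The following is a minimal sketch, not part of the inference.sh tooling, that builds the same `belt app run ... --input` invocation shown above from a Python dict. The helper names `build_run_args` and `build_run_command` are hypothetical.

```python
import json
import shlex

APP_ID = "alibaba/qwen-image-2-pro"  # app name as used in the CLI examples


def build_run_args(payload: dict, app: str = APP_ID) -> list[str]:
    """Return the argv list for `belt app run <app> --input <json>`."""
    # json.dumps handles escaping of quotes inside the prompt;
    # ensure_ascii=False keeps non-ASCII prompt text readable.
    return ["belt", "app", "run", app, "--input",
            json.dumps(payload, ensure_ascii=False)]


def build_run_command(payload: dict, app: str = APP_ID) -> str:
    """Return one shell-quoted command line, e.g. for logging or copy/paste."""
    return shlex.join(build_run_args(payload, app))


if __name__ == "__main__":
    payload = {"prompt": 'Poster with title "Welcome!" in bold blue text',
               "seed": 12345}
    print(build_run_command(payload))
```

The argv list can be passed to `subprocess.run` directly, which avoids shell quoting altogether; the string form is mainly useful for display.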
| Parameter | Type | Description |
|-----------|------|-------------|
| prompt | string | Required. What to generate or edit (max 800 chars) |
| reference_images | array | Input images for editing (1-3 images) |
| num_images | integer | Number of images to generate (1-6) |
| width | integer | Output width in pixels (512-2048) |
| height | integer | Output height in pixels (512-2048) |
| watermark | boolean | Add "Qwen-Image" watermark |
| negative_prompt | string | Content to avoid (max 500 chars) |
| prompt_extend | boolean | Enable prompt rewriting (default: true) |
| seed | integer | Random seed for reproducibility (0-2147483647) |
Size constraint: Total pixels must be between 512×512 and 2048×2048.
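The limits in the table can be checked locally before a request is sent. This is a rough sketch under the assumption that the documented ranges are inclusive and that "chars" means Python string length; `validate_input` is a hypothetical helper, not part of the SDK or CLI.

```python
MIN_SIDE, MAX_SIDE = 512, 2048
MIN_PIXELS, MAX_PIXELS = 512 * 512, 2048 * 2048


def validate_input(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []

    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 800:
        problems.append("prompt exceeds 800 characters")

    if len(payload.get("negative_prompt", "")) > 500:
        problems.append("negative_prompt exceeds 500 characters")

    if not 1 <= payload.get("num_images", 1) <= 6:
        problems.append("num_images must be 1-6")

    refs = payload.get("reference_images")
    if refs is not None and not 1 <= len(refs) <= 3:
        problems.append("reference_images must contain 1-3 images")

    seed = payload.get("seed")
    if seed is not None and not 0 <= seed <= 2147483647:
        problems.append("seed must be 0-2147483647")

    width, height = payload.get("width"), payload.get("height")
    for name, side in (("width", width), ("height", height)):
        if side is not None and not MIN_SIDE <= side <= MAX_SIDE:
            problems.append(f"{name} must be 512-2048")
    # Implied by the per-side checks, but kept to mirror the stated constraint.
    if width is not None and height is not None:
        if not MIN_PIXELS <= width * height <= MAX_PIXELS:
            problems.append("total pixels out of range")

    return problems
```

For example, `validate_input({"prompt": "Poster", "width": 1024, "height": 1536})` returns an empty list, while a payload with `"num_images": 7` reports the range violation.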
| Field | Type | Description |
|-------|------|-------------|
| images | array | The generated or edited images (PNG format) |
| output_meta | object | Metadata with dimensions and count |
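The table does not say what each entry of `images` looks like. The sketch below assumes an entry is either a plain string or an object with a `uri` key, mirroring the `reference_images` input shape; both that shape and the helper name `image_refs` are assumptions to verify against a real response.

```python
def image_refs(output: dict) -> list[str]:
    """Collect one string reference per entry in output["images"]."""
    refs = []
    for item in output.get("images", []):
        if isinstance(item, str):
            refs.append(item)
        elif isinstance(item, dict) and "uri" in item:  # assumed entry shape
            refs.append(item["uri"])
        # Entries in any other shape are skipped rather than guessed at.
    return refs
```

Printing the raw `output` once, as in the SDK example, is the safest way to confirm the actual structure before relying on a helper like this.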
For best text results with the Pro model:
- Quote the exact text to render: "Title: \"Hello World!\""
- Set prompt_extend: false for precise control
- Add a negative prompt: "blurry text, distorted text, low quality"

Example prompt structure:
Poster with the title "GRAND OPENING" in large red serif font at the top center.
Below, the date "March 15, 2024" in smaller black text.
Background: elegant gold and white gradient.
Style: professional, clean, modern.
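The structure above (quoted title, quoted secondary text, background, style) can be assembled programmatically. This is an illustrative sketch only: `build_poster_prompt` and its parameters are hypothetical, and the 800-character check assumes the documented prompt limit counts Python string length.

```python
def build_poster_prompt(title, title_style, subtitle=None, subtitle_style=None,
                        background=None, style=None, max_chars=800):
    """Compose a text-heavy poster prompt with the rendered text in quotes."""
    parts = [f'Poster with the title "{title}" in {title_style}.']
    if subtitle:
        placement = f" in {subtitle_style}" if subtitle_style else ""
        parts.append(f'Below, the text "{subtitle}"{placement}.')
    if background:
        parts.append(f"Background: {background}.")
    if style:
        parts.append(f"Style: {style}.")

    prompt = " ".join(parts)
    if len(prompt) > max_chars:
        raise ValueError(f"prompt is {len(prompt)} chars; limit is {max_chars}")
    return prompt


if __name__ == "__main__":
    print(build_poster_prompt(
        "GRAND OPENING", "large red serif font at the top center",
        subtitle="March 15, 2024", subtitle_style="smaller black text",
        background="elegant gold and white gradient",
        style="professional, clean, modern",
    ))
```

The returned string goes in the `prompt` field, typically together with `"prompt_extend": false` so the quoted text is not rewritten.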
{
  "negative_prompt": "low resolution, low quality, deformed limbs, deformed fingers, oversaturated, waxy, no facial details, overly smooth, AI-like, chaotic composition, blurry text, distorted text"
}
# 1. Generate sample input to see all options
belt app sample alibaba/qwen-image-2-pro --save input.json
# 2. Edit the prompt
# 3. Run
belt app run alibaba/qwen-image-2-pro --input input.json
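Step 2 can also be scripted instead of done by hand. The sketch below assumes the file written by `belt app sample` is a flat JSON object whose top-level keys are the input parameters (`prompt`, `width`, and so on); check that against the generated file. `set_prompt` is a hypothetical helper.

```python
import json
from pathlib import Path


def set_prompt(path, prompt, **overrides):
    """Rewrite a sample input file with a new prompt and optional overrides."""
    file = Path(path)
    data = json.loads(file.read_text(encoding="utf-8"))
    data["prompt"] = prompt
    data.update(overrides)  # e.g. width=1024, prompt_extend=False
    file.write_text(json.dumps(data, indent=2, ensure_ascii=False) + "\n",
                    encoding="utf-8")
    return data
```

After calling something like `set_prompt("input.json", 'Poster with title "Welcome!"', prompt_extend=False)`, the run command in step 3 picks up the edited file unchanged.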
from inferencesh import inference

client = inference()

# Text-heavy poster
result = client.run({
    "app": "alibaba/qwen-image-2-pro",
    "input": {
        "prompt": "Poster with title \"Welcome!\" in bold blue text at top",
        "width": 1024,
        "height": 1536,
        "prompt_extend": False
    }
})
print(result["output"])

# Stream live updates
for update in client.run({
    "app": "alibaba/qwen-image-2-pro",
    "input": {
        "prompt": "Professional product photography of a watch"
    }
}, stream=True):
    if update.get("progress"):
        print(f"progress: {update['progress']}%")
    if update.get("output"):
        print(f"output: {update['output']}")
# Standard Qwen-Image (faster, general use)
npx skills add inference-sh/skills@qwen-image
# Full platform skill (all 250+ apps)
npx skills add inference-sh/skills@infsh-cli
# All image generation models
npx skills add inference-sh/skills@ai-image-generation
Browse all image apps: belt app list --category image