skills/gemini-image/SKILL.md
Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and high-resolution output up to 4K. Triggers on "generate image", "create image", "imagen", "text to image", "AI art", "nano banana".
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add akrindev/google-studio-skills gemini-image
Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.
Use this skill when you need to:
Purpose: Generate images using Gemini image models (default: gemini-3.1-flash-image-preview) or Imagen 4 models
When to use:
Key parameters:
| Parameter | Description | Example |
|-----------|-------------|---------|
| prompt | Text description (required) | "A futuristic city at sunset" |
| --model, -m | Model to use | gemini-3.1-flash-image-preview |
| --output-dir, -o | Output directory for images | images/ |
| --name, -n | Base name for output files | artwork |
| --no-timestamp | Disable auto timestamp | Flag |
| --aspect, -a | Aspect ratio | 16:9 |
| --size, -s | Resolution | 2K or 4K |
| --num | Number of images (1-4) | 4 |
| --person | Person generation policy | allow_adult |
Output: List of saved PNG file paths
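When calling the script from other Node code, the parameter table above can be turned into an argument list programmatically. The following is a minimal sketch, not part of the skill itself: `buildGenerateArgs` and its option names (`outputDir`, `noTimestamp`, and so on) are hypothetical, and only the flags and the 1-4 range for `--num` come from the table.

```javascript
// Hypothetical helper: turns an options object into an argv array for
// scripts/generate_image.js, using only the flags listed in the table above.
function buildGenerateArgs(prompt, options = {}) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  const args = ["scripts/generate_image.js", prompt];

  // Flags that take a value (option key -> CLI flag).
  const valueFlags = [
    ["model", "--model"],
    ["outputDir", "--output-dir"],
    ["name", "--name"],
    ["aspect", "--aspect"],
    ["size", "--size"],
    ["person", "--person"],
  ];
  for (const [key, flag] of valueFlags) {
    if (options[key] !== undefined) args.push(flag, String(options[key]));
  }

  if (options.num !== undefined) {
    // The table documents a range of 1-4 images per call.
    if (!Number.isInteger(options.num) || options.num < 1 || options.num > 4) {
      throw new Error("num must be an integer from 1 to 4");
    }
    args.push("--num", String(options.num));
  }
  if (options.noTimestamp) args.push("--no-timestamp");
  return args;
}

const exampleArgs = buildGenerateArgs("A futuristic city at sunset", {
  aspect: "16:9",
  size: "2K",
  num: 2,
  name: "city",
});
// Joined only for display; the prompt is not shell-quoted here.
console.log(["node", ...exampleArgs].join(" "));
```

Keeping the arguments as an array means they can be handed to something like `child_process.execFile("node", exampleArgs, callback)` without any shell quoting of the prompt.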
node scripts/generate_image.js "A futuristic city at sunset with flying cars"
- Model: gemini-3.1-flash-image-preview (default, Nano Banana 2)
- Output: images/generated_image_YYYYMMDD_HHMMSS.png

node scripts/generate_image.js "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop

- Output: images/coffee-shop_YYYYMMDD_HHMMSS.png

node scripts/generate_image.js "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail

- Output: images/thumbnail_YYYYMMDD_HHMMSS.png

node scripts/generate_image.js "Abstract geometric patterns in blue and gold" --num 4 --name abstract

- Output: images/abstract_YYYYMMDD_HHMMSS_0.png, images/abstract_YYYYMMDD_HHMMSS_1.png, etc.

node scripts/generate_image.js "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum

- Model: gemini-3.1-flash-image-preview or gemini-3-pro-image-preview (for 4K)

node scripts/generate_image.js "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate

- Model: imagen-4.0-generate-001 (photorealistic)

node scripts/generate_image.js "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image

# 1. Generate content (gemini-text skill)
node skills/gemini-text/scripts/generate.js "Write a product description for smart home device"
# 2. Generate product image (this skill)
node scripts/generate_image.js "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product
# 3. Create social media post
node scripts/generate_image.js "Fixed filename image" --name my-image --no-timestamp
- Output: images/my-image.png (no timestamp)

| Model | Nickname | Quality | Max Size | Best For |
|-------|----------|---------|----------|----------|
| gemini-3.1-flash-image-preview | Nano Banana 2 | Pro-level | 4K | New default, fast + strong quality |
| gemini-3-pro-image-preview | Nano Banana Pro | Highest | 4K | Maximum quality and complex text rendering |
| gemini-2.5-flash-image | Nano Banana | Good | 2K | High-volume, low-latency |
| imagen-4.0-generate-001 | Imagen 4 | Photorealistic | 2K | Realistic photos, product shots |
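
The Max Size column above can be checked before a request is sent, so that a `--size 4K` call is not made against a 2K-only model. This is a rough sketch: the data is copied from the model table, while the helper names `supportsSize` and `modelsForSize` are hypothetical and not part of the skill's scripts.

```javascript
// Max sizes copied from the model table; helper names are hypothetical.
const MODEL_MAX_SIZE = new Map([
  ["gemini-3.1-flash-image-preview", "4K"],
  ["gemini-3-pro-image-preview", "4K"],
  ["gemini-2.5-flash-image", "2K"],
  ["imagen-4.0-generate-001", "2K"],
]);

// Sizes ordered from smallest to largest so they can be compared by index.
const SIZE_ORDER = ["1K", "2K", "4K"];

// Returns true when the model's documented max size covers the requested size.
function supportsSize(model, size) {
  if (!MODEL_MAX_SIZE.has(model)) throw new Error(`unknown model: ${model}`);
  const wanted = SIZE_ORDER.indexOf(size);
  if (wanted === -1) throw new Error(`unknown size: ${size}`);
  return wanted <= SIZE_ORDER.indexOf(MODEL_MAX_SIZE.get(model));
}

// Lists every model in the table that can produce the requested size.
function modelsForSize(size) {
  return [...MODEL_MAX_SIZE.keys()].filter((model) => supportsSize(model, size));
}

console.log(modelsForSize("4K"));
console.log(supportsSize("imagen-4.0-generate-001", "4K")); // prints false
```
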
| Ratio | Use Case | 1K Size | 2K Size |
|-------|----------|----------|----------|
| 1:1 | Instagram, avatars | 1024x1024 | 2048x2048 |
| 16:9 | YouTube, presentations | 1376x768 | 2752x1536 |
| 9:16 | Instagram Stories, TikTok | 768x1376 | 1536x2752 |
| 4:3 | Traditional displays | 1024x768 | 2048x1536 |
| 3:4 | Portrait orientation | 768x1024 | 1536x2048 |
| 21:9 | Ultrawide | - | 5504x2400 |

Note: 4K resolution is available with gemini-3.1-flash-image-preview and gemini-3-pro-image-preview
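
For code that needs to know the pixel size of an image before it is generated (for example, to reserve layout space), the aspect ratio table can be expressed as a lookup. This is a small sketch with values copied from that table; the `dimensionsFor` helper is hypothetical, and combinations the table does not list return `null` instead of a guess.

```javascript
// Pixel dimensions [width, height] copied from the aspect ratio table.
const DIMENSIONS = {
  "1:1": { "1K": [1024, 1024], "2K": [2048, 2048] },
  "16:9": { "1K": [1376, 768], "2K": [2752, 1536] },
  "9:16": { "1K": [768, 1376], "2K": [1536, 2752] },
  "4:3": { "1K": [1024, 768], "2K": [2048, 1536] },
  "3:4": { "1K": [768, 1024], "2K": [1536, 2048] },
  "21:9": { "2K": [5504, 2400] }, // the table lists no 1K value for this ratio
};

// Hypothetical helper: returns { width, height } or null when the table
// has no entry for the requested aspect/size pair.
function dimensionsFor(aspect, size) {
  const row = DIMENSIONS[aspect];
  const dims = row && row[size];
  if (!dims) return null;
  const [width, height] = dims;
  return { width, height };
}

console.log(dimensionsFor("16:9", "2K")); // prints { width: 2752, height: 1536 }
console.log(dimensionsFor("21:9", "1K")); // prints null
```

Some listed sizes only approximate the nominal ratio (1376x768 is about 1.79:1 rather than exactly 16:9), so layout code should rely on the listed pixel values rather than on the ratio string.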
| Size | Use Case | Best Model |
|------|----------|-------------|
| 1K (1024px) | Web thumbnails, previews | Any model |
| 2K (2048px) | Standard web, social media | Any model |
| 4K (4096px) | Print, high-end assets | gemini-3.1-flash-image-preview or gemini-3-pro-image-preview |

| Policy | Description | Restrictions |
|---------|-------------|----------------|
| dont_allow | No people in images | None |
| allow_adult | Adults only | Recommended default |
| allow_all | All ages | Restricted in EU, UK, CH, MENA |
- Default: {name}_YYYYMMDD_HHMMSS.png (auto timestamp)
- Example: artwork_20260130_031643.png
- Multiple images: {name}_YYYYMMDD_HHMMSS_0.png, {name}_YYYYMMDD_HHMMSS_1.png, etc.
- Without timestamp (--no-timestamp): {name}.png

cd scripts && npm install
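
The output file naming patterns described above ({name}_YYYYMMDD_HHMMSS.png, an index suffix for multiple images, and {name}.png with --no-timestamp) can be reproduced in a few lines, which helps when a later step needs to predict or match the saved paths. This sketch is not the script's own code. Two points are assumptions: that the timestamp uses local time, and that multiple images combined with --no-timestamp are named {name}_0.png, {name}_1.png, and so on, a case the source does not describe.

```javascript
// Zero-pads a number to two digits.
function pad(n) {
  return String(n).padStart(2, "0");
}

// Formats a Date as YYYYMMDD_HHMMSS.
// Assumption: local time; the real script may use UTC instead.
function timestamp(date) {
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${day}_${time}`;
}

// Hypothetical helper mirroring the documented naming patterns.
function expectedFilenames(name, date, count = 1, noTimestamp = false) {
  const base = noTimestamp ? name : `${name}_${timestamp(date)}`;
  if (count === 1) return [`${base}.png`];
  // Assumption: the index suffix is also used when noTimestamp is set.
  return Array.from({ length: count }, (_, i) => `${base}_${i}.png`);
}

console.log(expectedFilenames("artwork", new Date(2026, 0, 30, 3, 16, 43)));
// prints [ 'artwork_20260130_031643.png' ]
```
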
- `gemini-3.1-flash-image-preview` or `gemini-3-pro-image-preview`
- `--size 2K` for older models
- `--model gemini-3.1-flash-image-preview --size 4K`
- `gemini-3.1-flash-image-preview` for other languages
- `--size 1K` for smaller files
- `gemini-3.1-flash-image-preview` for: Best default balance, quality, speed, 4K
- `gemini-3-pro-image-preview` for: Maximum quality, complex text rendering
- `gemini-2.5-flash-image` for: Speed, high volume
- `imagen-4.0-generate-001` for: Photorealism, product shots
- `--num`
- `--num 1` before generating batches

# Basic
node scripts/generate_image.js "Your prompt"
# Social media (1:1)
node scripts/generate_image.js "Prompt" --aspect 1:1 --size 2K --name social-post
# YouTube thumbnail (16:9)
node scripts/generate_image.js "Prompt" --aspect 16:9 --size 2K --name thumbnail
# 4K high quality
node scripts/generate_image.js "Prompt" --aspect 16:9 --size 4K --name high-res
# Multiple variations
node scripts/generate_image.js "Prompt" --num 4 --name variations
# Custom directory
node scripts/generate_image.js "Prompt" --output-dir ./my-images/ --name custom
# Photorealistic
node scripts/generate_image.js "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo
# No timestamp
node scripts/generate_image.js "Prompt" --name fixed-name --no-timestamp
references/ for model documentation (if available)