.claude/skills/gemini-3-image-generation/SKILL.md
Generate images with Gemini 3 Pro Image (Nano Banana Pro). Covers 4K generation, text rendering, grounded generation with Google Search, conversational editing, and cost optimization. Use when creating images, generating 4K images, editing images conversationally, fact-verified image generation, or image output tasks.
npx skillsauth add adaptationio/skrillz gemini-3-image-generation
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Comprehensive guide for generating images with Gemini 3 Pro Image (gemini-3-pro-image-preview), also known as Nano Banana Pro. This skill focuses on IMAGE OUTPUT (generating images) - see gemini-3-multimodal for INPUT (analyzing images).
Gemini 3 Pro Image (Nano Banana Pro 🍌) is Google's image generation model featuring native 4K support, text rendering within images, grounded generation with Google Search, and conversational editing capabilities.
gemini-3-pro-api skill)
gemini-3-pro-image-preview

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Use the image generation model
model = genai.GenerativeModel("gemini-3-pro-image-preview")

# Generate image
response = model.generate_content("A serene mountain landscape at sunset")

# Save image
if response.parts:
    with open("generated_image.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)
    print("Image saved!")
```
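```javascript
import { GoogleGenerativeAI } from "@google/generative-ai";
import fs from "fs";

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro-image-preview" });

const result = await model.generateContent("A serene mountain landscape at sunset");

// In this SDK the parts live under candidates[0].content, not directly on the response
const parts = result.response.candidates?.[0]?.content?.parts ?? [];
const imagePart = parts.find((part) => part.inlineData);

if (imagePart) {
  // inlineData.data is base64-encoded, so decode before writing
  fs.writeFileSync("generated_image.png", Buffer.from(imagePart.inlineData.data, "base64"));
  console.log("Image saved!");
} else {
  console.log("No image generated");
}
```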
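The quick-start snippets read the image from the first response part. A response can also carry text parts, so a slightly more defensive approach is to scan for the first part that actually holds bytes. The sketch below assumes each part exposes `inline_data.data` as in the examples above; `first_image_part` is a hypothetical helper name, and the `SimpleNamespace` objects are stand-ins for real response parts.

```python
from types import SimpleNamespace


def first_image_part(parts):
    """Return the first part carrying non-empty inline bytes, or None."""
    for part in parts:
        inline = getattr(part, "inline_data", None)
        if inline is not None and getattr(inline, "data", None):
            return part
    return None


# Stand-in parts for demonstration; a real call would pass response.parts.
text_part = SimpleNamespace(text="Here is your image", inline_data=None)
image_part = SimpleNamespace(
    inline_data=SimpleNamespace(data=b"\x89PNG...", mime_type="image/png")
)

found = first_image_part([text_part, image_part])
print(found.inline_data.mime_type)
```

With a real response, `first_image_part(response.parts)` would replace the direct `response.parts[0]` lookup.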
Goal: Create high-quality images from text descriptions.
Python Example:
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel(
    "gemini-3-pro-image-preview",
    generation_config={
        "thinking_level": "high",  # Best quality
        "temperature": 1.0
    }
)

# Generate image
prompt = """A futuristic cityscape at night with:
- Neon lights and holographic advertisements
- Flying vehicles
- Tall skyscrapers with unique architecture
- Rain-slicked streets reflecting the lights
- Cinematic, detailed, 4K quality"""

response = model.generate_content(prompt)

# Save image
# Check the bytes themselves; hasattr() also passes for text-only parts
if response.parts and response.parts[0].inline_data.data:
    image_data = response.parts[0].inline_data.data
    with open("futuristic_city.png", "wb") as f:
        f.write(image_data)
    print("Image generated successfully!")
else:
    print("No image generated")
```
Tips for Better Prompts:
See: references/generation-guide.md for comprehensive prompting techniques
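The example prompt above follows a simple shape: a subject line followed by a bulleted list of details and a closing style line. A minimal sketch of a helper that assembles prompts in that shape is shown below; `build_prompt` and its parameters are hypothetical names used for illustration, not part of any SDK.

```python
def build_prompt(subject, details=(), style=None):
    """Assemble a subject line plus '- detail' bullets into one prompt string."""
    has_bullets = bool(details) or bool(style)
    lines = [f"{subject} with:" if has_bullets else subject]
    lines.extend(f"- {detail}" for detail in details)
    if style:
        lines.append(f"- {style}")
    return "\n".join(lines)


prompt = build_prompt(
    "A futuristic cityscape at night",
    details=["Neon lights and holographic advertisements", "Flying vehicles"],
    style="Cinematic, detailed, 4K quality",
)
print(prompt)
```

Keeping the details in a list makes it easy to add or drop one element at a time while comparing results.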
Goal: Create high-resolution 4K images with upscaling.
Python Example:
```python
# Generate with 4K quality specification
prompt = """A photorealistic portrait of a scientist in a modern lab:
- 4K ultra-high definition
- Sharp focus on subject
- Soft bokeh background
- Professional studio lighting
- Fine detail in textures
- Cinema-grade quality"""

response = model.generate_content(prompt)

# 4K image will be generated
if response.parts:
    with open("scientist_4k.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)
```
4K Features:
See: references/resolution-guide.md for resolution control
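Because the example above requests 4K through the prompt text, it can be worth checking the resolution of the bytes that come back rather than assuming it. The sketch below reads width and height from a PNG header using only the standard library. It assumes the returned image is a PNG (it returns `None` for other formats) and treats a long edge of at least 3840 pixels as "4K", which is an assumption of this example; `png_dimensions` and `is_4k` are hypothetical helper names.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    # IHDR stores width and height as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def is_4k(data, min_long_edge=3840):
    """True when the PNG's longer side is at least min_long_edge pixels."""
    dims = png_dimensions(data)
    return dims is not None and max(dims) >= min_long_edge


# Synthetic header for demonstration; real use would pass the saved image bytes.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 3840, 2160)
print(png_dimensions(header), is_4k(header))
```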
Goal: Generate images with readable, high-quality text.
Python Example:
```python
prompt = """Create a professional business card design with:
- Company name: "TechVision AI"
- Text: "Dr. Sarah Chen"
- Text: "Chief AI Officer"
- Text: "[email protected]"
- Text: "+1 (555) 123-4567"
- Modern, clean design
- Professional fonts
- Blue and white color scheme
- All text clearly readable"""

response = model.generate_content(prompt)

if response.parts:
    with open("business_card.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)
```
Text Rendering Best Practices:
See: references/generation-guide.md for text rendering techniques
Goal: Generate factually accurate images using Google Search grounding.
Python Example:
```python
# Enable Google Search grounding for factual accuracy
model_grounded = genai.GenerativeModel(
    "gemini-3-pro-image-preview",
    tools=[{"google_search_retrieval": {}}]  # Enable grounding
)

prompt = """Generate an accurate image of the International Space Station
with Earth in the background. Use current ISS configuration."""

response = model_grounded.generate_content(prompt)

if response.parts:
    with open("iss_grounded.png", "wb") as f:
        f.write(response.parts[0].inline_data.data)

# Check if grounding was used (the metadata is attached to the candidate)
candidate = response.candidates[0] if response.candidates else None
metadata = getattr(candidate, "grounding_metadata", None)
if metadata:
    print(f"Grounding sources used: {len(metadata.grounding_chunks)}")
```
Grounded Generation Use Cases:
Benefits:
Note: Uses free Google Search quota (1,500 queries/day)
See: references/grounded-generation.md for comprehensive guide
Goal: Iteratively refine images through multi-turn conversation.
Python Example:
```python
model = genai.GenerativeModel("gemini-3-pro-image-preview")

# Start a chat session for conversational editing
chat = model.start_chat()

# First generation
response1 = chat.send_message("Create a cozy coffee shop interior")
if response1.parts:
    with open("coffee_shop_v1.png", "wb") as f:
        f.write(response1.parts[0].inline_data.data)

# Refine the image
response2 = chat.send_message("Add more plants and warm lighting")
if response2.parts:
    with open("coffee_shop_v2.png", "wb") as f:
        f.write(response2.parts[0].inline_data.data)

# Further refinement
response3 = chat.send_message("Make it more minimalist, remove some decorations")
if response3.parts:
    with open("coffee_shop_v3.png", "wb") as f:
        f.write(response3.parts[0].inline_data.data)
```
Conversational Editing Features:
Example Editing Commands:
See: references/conversational-editing.md for advanced patterns
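The three-step refinement above repeats the same send-then-save pattern, so it can be folded into a loop. The sketch below assumes only that the chat object has a `send_message` method whose result exposes `parts` with `inline_data.data`, as in the example; `refine_in_steps` and its parameters are hypothetical names, and files are written with a `.png` suffix on the assumption that the model returns PNG bytes.

```python
from pathlib import Path


def _image_bytes(response):
    """Return the bytes of the first image part in a response, or None."""
    for part in response.parts or []:
        inline = getattr(part, "inline_data", None)
        data = getattr(inline, "data", None)
        if data:
            return data
    return None


def refine_in_steps(chat, instructions, stem, out_dir="."):
    """Send each instruction in order and save every image that comes back.

    Files are named <stem>_v<N>.png, where N is the 1-based turn number.
    Turns that return no image are skipped without renumbering later ones.
    """
    saved = []
    for version, instruction in enumerate(instructions, start=1):
        data = _image_bytes(chat.send_message(instruction))
        if data is None:
            continue
        path = Path(out_dir) / f"{stem}_v{version}.png"
        path.write_bytes(data)
        saved.append(path)
    return saved
```

With the session from the example, a call might look like `refine_in_steps(chat, ["Create a cozy coffee shop interior", "Add more plants and warm lighting"], "coffee_shop")`.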
Goal: Generate images in specific aspect ratios.
Python Example:
```python
# 16:9 aspect ratio (4K supported)
prompt_169 = "A cinematic landscape in 16:9 aspect ratio, 4K quality"

# Square aspect ratio
prompt_square = "A square logo design for a tech company"

# Portrait orientation
prompt_portrait = "A portrait-oriented movie poster"

response = model.generate_content(prompt_169)
# Image will be generated in specified ratio
```
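When planning layouts around a requested ratio, it helps to know the pixel dimensions that ratio implies. The arithmetic is sketched below: the longer side is fixed and the shorter side is scaled by the ratio. The `dimensions_for_ratio` helper and the 3840-pixel long edge are assumptions for illustration; the ratios shown are examples, not a statement of which ratios the model supports.

```python
def dimensions_for_ratio(ratio, long_edge):
    """Return (width, height) for a 'W:H' ratio with the longer side fixed."""
    width_part, height_part = (int(value) for value in ratio.split(":"))
    if width_part <= 0 or height_part <= 0:
        raise ValueError(f"ratio parts must be positive: {ratio!r}")
    if width_part >= height_part:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * height_part / width_part)
    # Portrait: height is the long edge.
    return round(long_edge * width_part / height_part), long_edge


for ratio in ("16:9", "1:1", "9:16"):
    print(ratio, dimensions_for_ratio(ratio, 3840))
```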
Supported Ratios:
Goal: Balance quality and cost for image generation.
Pricing:
Python Cost Optimization:
```python
def generate_with_cost_tracking(prompt):
    """Generate image and track costs"""
    response = model.generate_content(prompt)

    # Calculate cost
    usage = response.usage_metadata
    input_cost = (usage.prompt_token_count / 1_000_000) * 2.00
    output_cost = (usage.candidates_token_count / 1_000_000) * 9.00
    image_cost = 0.134  # Per image

    total_cost = input_cost + output_cost + image_cost

    print(f"Input tokens: {usage.prompt_token_count} (${input_cost:.6f})")
    print(f"Output tokens: {usage.candidates_token_count} (${output_cost:.6f})")
    print(f"Image cost: ${image_cost:.6f}")
    print(f"Total: ${total_cost:.6f}")

    return response

response = generate_with_cost_tracking("A beautiful sunset over mountains")
```
Cost Optimization Strategies:
See: references/pricing-optimization.md for detailed strategies
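The same arithmetic can be used to estimate spend before sending any request. The sketch below separates the rates from the calculation so they can be updated in one place. The default rates are the figures used in the cost-tracking example above and may not match current pricing; `Rates` and `estimate_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD rates, defaulting to the figures from the example above."""
    input_per_million: float = 2.00
    output_per_million: float = 9.00
    per_image: float = 0.134


def estimate_cost(prompt_tokens, output_tokens, images=1, rates=Rates()):
    """Estimate total USD cost for the given token counts and image count."""
    input_cost = prompt_tokens / 1_000_000 * rates.input_per_million
    output_cost = output_tokens / 1_000_000 * rates.output_per_million
    return input_cost + output_cost + images * rates.per_image


# One image from a 1,000-token prompt with 500 output tokens
single = estimate_cost(1_000, 500)
# A batch of 100 such requests
batch = estimate_cost(100 * 1_000, 100 * 500, images=100)
print(f"single: ${single:.4f}  batch of 100: ${batch:.2f}")
```

With these rates the per-image charge is by far the largest term, so reducing the number of generated images lowers cost much more than shortening prompts.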
```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-3-pro-image-preview")

prompts = [
    "A serene mountain lake at dawn",
    "A bustling market in Morocco",
    "A futuristic robot assistant",
    "An abstract geometric pattern"
]

for i, prompt in enumerate(prompts):
    print(f"Generating image {i+1}/{len(prompts)}: {prompt}")
    response = model.generate_content(prompt)

    if response.parts:
        with open(f"generated_{i+1}.png", "wb") as f:
            f.write(response.parts[0].inline_data.data)
        print(f"  Saved: generated_{i+1}.png")
```
```python
from google.api_core import exceptions

def safe_image_generation(prompt):
    """Generate image with error handling"""
    try:
        response = model.generate_content(prompt)

        if not response.parts:
            return {"success": False, "error": "No image generated"}

        # Check the bytes themselves; hasattr() also passes for text-only parts
        if not response.parts[0].inline_data.data:
            return {"success": False, "error": "Invalid response format"}

        return {
            "success": True,
            "image_data": response.parts[0].inline_data.data,
            "mime_type": response.parts[0].inline_data.mime_type
        }

    except exceptions.InvalidArgument as e:
        return {"success": False, "error": f"Invalid prompt: {e}"}
    except exceptions.ResourceExhausted as e:
        return {"success": False, "error": f"Rate limit exceeded: {e}"}
    except Exception as e:
        return {"success": False, "error": f"Error: {e}"}
```
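Rate-limit errors such as the `ResourceExhausted` case above are often transient, so one common option is to retry with exponential backoff instead of failing immediately. The sketch below is a generic wrapper rather than anything provided by the SDK; `retry_with_backoff`, its parameters, and the simulated `flaky` function are all hypothetical. The `sleep` argument is injectable so the demonstration runs without waiting.

```python
import time


def retry_with_backoff(fn, retryable, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable exception wait base_delay * 2**attempt and retry.

    The final failure is re-raised so callers can still handle it.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demonstration with a simulated transient failure and recorded delays.
delays = []
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated transient failure")
    return "ok"


result = retry_with_backoff(flaky, retryable=(TimeoutError,), sleep=delays.append)
print(result, delays)  # prints: ok [1.0, 2.0]
```

Against the API, the call might look like `retry_with_backoff(lambda: model.generate_content(prompt), retryable=(exceptions.ResourceExhausted,))`.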
Core Guides
Optimization
Scripts
Official Resources
Solution: Check response.parts exists and has inline_data attribute
Solution: Add "4K", "high quality", "detailed" to prompt
Solution: Specify text explicitly in quotes, request "readable text"
Solution: Enable grounded generation with Google Search
Solution: Optimize prompts, batch requests, monitor usage
This skill provides complete image generation capabilities:
✅ Text-to-image generation ✅ Native 4K support ✅ Text rendering in images ✅ Grounded generation (fact-verified) ✅ Conversational editing ✅ Custom aspect ratios ✅ Cost optimization ✅ Production-ready examples
Ready to generate images? Start with Task 1: Generate Image from Text Prompt above!