claude/skills/gemini-image-gen/SKILL.md
Guide for implementing Google Gemini API image generation: create high-quality images from text prompts using the gemini-2.5-flash-image model. Use when generating images, creating visual content, or implementing text-to-image features. Supports text-to-image, image editing, multi-image composition, and iterative refinement.
npx skillsauth add einverne/dotfiles gemini-image-gen
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Generate high-quality images using Google's Gemini 2.5 Flash Image model with text prompts, image editing, and multi-image composition capabilities.
Use this skill when you need to generate images, create visual content, or implement text-to-image features.
The skill automatically detects your GEMINI_API_KEY in this order:
1. export GEMINI_API_KEY="your-key"
2. .claude/skills/gemini-image-gen/.env
3. ./.env (project root)

Get your API key: Visit Google AI Studio
Create .env file with:
GEMINI_API_KEY=your_api_key_here
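The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the documented order, not the helper script's actual code: the function names `read_env_file` and `find_api_key` are hypothetical, and the `.env` parsing assumes plain `KEY=value` lines.

```python
import os
from pathlib import Path


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(project_root: Path = Path("."), environ=None):
    """Return the first GEMINI_API_KEY found in the documented order, or None."""
    environ = os.environ if environ is None else environ
    # 1. Process environment
    if environ.get("GEMINI_API_KEY"):
        return environ["GEMINI_API_KEY"]
    # 2. Skill-level .env, then 3. project-root .env
    candidates = [
        project_root / ".claude" / "skills" / "gemini-image-gen" / ".env",
        project_root / ".env",
    ]
    for env_path in candidates:
        key = read_env_file(env_path).get("GEMINI_API_KEY")
        if key:
            return key
    return None
```

Passing `environ` explicitly makes the order easy to check in isolation; in normal use the default (`os.environ`) applies.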
Install required package:
pip install google-genai
from google import genai
from google.genai import types
import os
# API key detection handled automatically by helper script
client = genai.Client(api_key=os.getenv('GEMINI_API_KEY'))
response = client.models.generate_content(
    model='gemini-2.5-flash-image',
    contents='A serene mountain landscape at sunset with snow-capped peaks',
    config=types.GenerateContentConfig(
        response_modalities=['image'],
        # Aspect ratio is passed through image_config in current google-genai releases
        image_config=types.ImageConfig(aspect_ratio='16:9')
    )
)
# Save to ./docs/assets/ (create the directory first so open() does not fail)
os.makedirs('./docs/assets', exist_ok=True)
for i, part in enumerate(response.candidates[0].content.parts):
    if part.inline_data:
        with open(f'./docs/assets/generated-{i}.png', 'wb') as f:
            f.write(part.inline_data.data)
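The loop above only looks at image parts. When a response also carries text, it can help to separate the two kinds of part before saving. The sketch below is a hypothetical helper (`split_parts` is not part of the SDK) that relies only on the `text` and `inline_data` attributes, and uses stand-in objects so it runs without an API call.

```python
from types import SimpleNamespace


def split_parts(parts):
    """Separate text strings from raw image bytes in a list of response parts."""
    texts, images = [], []
    for part in parts:
        inline = getattr(part, "inline_data", None)
        if inline is not None and getattr(inline, "data", None):
            images.append(inline.data)
        elif getattr(part, "text", None):
            texts.append(part.text)
    return texts, images


# Stand-in parts shaped like the response objects used above (illustrative only)
demo_parts = [
    SimpleNamespace(text="A short caption", inline_data=None),
    SimpleNamespace(text=None, inline_data=SimpleNamespace(data=b"\x89PNG...")),
]
texts, images = split_parts(demo_parts)
```

With a real response, `response.candidates[0].content.parts` would be passed in place of `demo_parts`.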
For convenience, use the provided helper script that handles API key detection and file saving:
# Generate single image
python .claude/skills/gemini-image-gen/scripts/generate.py \
  "A futuristic city with flying cars" \
  --aspect-ratio 16:9 \
  --output ./docs/assets/city.png

# Generate with specific modalities
python .claude/skills/gemini-image-gen/scripts/generate.py \
  "Modern architecture design" \
  --response-modalities image text \
  --aspect-ratio 1:1
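The source of `generate.py` is not shown here, so the following is only a guess at how a command line with these flags could be parsed using `argparse`. The flag names come from the commands above; the defaults and the `choices` lists are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Generate an image from a text prompt")
    parser.add_argument("prompt", help="Text description of the image")
    parser.add_argument("--aspect-ratio", default="1:1",
                        choices=["1:1", "16:9", "9:16", "4:3", "3:4"])
    # nargs="+" lets the flag take several values, e.g. "image text"
    parser.add_argument("--response-modalities", nargs="+", default=["image"],
                        choices=["image", "text"])
    parser.add_argument("--output", default=None, help="Destination file path")
    return parser


args = build_parser().parse_args(
    ["Modern architecture design",
     "--response-modalities", "image", "text",
     "--aspect-ratio", "1:1"]
)
```

Here `args.response_modalities` ends up as a list, which matches the list form the API configuration expects.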
| Ratio | Resolution | Use Case | Token Cost |
|-------|------------|----------|------------|
| 1:1 | 1024×1024 | Social media, avatars | 1290 |
| 16:9 | 1344×768 | Landscapes, banners | 1290 |
| 9:16 | 768×1344 | Mobile, portraits | 1290 |
| 4:3 | 1152×896 | Traditional media | 1290 |
| 3:4 | 896×1152 | Vertical posters | 1290 |
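If the target dimensions are known but do not match one of the listed ratios, one option is to pick the nearest supported ratio. The helper below is a hypothetical sketch (`closest_ratio` is not part of the skill); it copies the resolutions from the table and compares ratios on a log scale so that landscape and portrait are treated symmetrically.

```python
import math

# Resolutions copied from the aspect ratio table
SUPPORTED = {
    "1:1": (1024, 1024),
    "16:9": (1344, 768),
    "9:16": (768, 1344),
    "4:3": (1152, 896),
    "3:4": (896, 1152),
}


def closest_ratio(width: int, height: int) -> str:
    """Return the supported ratio label nearest to width/height."""
    target = math.log(width / height)

    def distance(label: str) -> float:
        w, h = SUPPORTED[label]
        return abs(math.log(w / h) - target)

    return min(SUPPORTED, key=distance)
```

For example, a 1920×1080 target maps to `16:9` and an 800×600 target maps to `4:3`.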
- ['image']: Generate only images
- ['text']: Generate only text descriptions
- ['image', 'text']: Generate both images and descriptions

Provide existing image + text instructions to modify:
import PIL.Image
img = PIL.Image.open('original.png')
response = client.models.generate_content(
    model='gemini-2.5-flash-image',
    contents=[
        'Add a red balloon floating in the sky',
        img
    ]
)
Combine up to 3 source images (recommended):
img1 = PIL.Image.open('background.png')
img2 = PIL.Image.open('foreground.png')
response = client.models.generate_content(
    model='gemini-2.5-flash-image',
    contents=[
        'Combine these images into a cohesive scene',
        img1,
        img2
    ]
)
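To keep composition requests within the recommended three source images, the `contents` list can be built through a small guard. This is a hypothetical helper, not part of the skill; raising an error above the limit is a design choice here, since the text only calls three a recommendation.

```python
MAX_SOURCE_IMAGES = 3  # recommended limit stated above


def build_composition_contents(instruction: str, images: list) -> list:
    """Return [instruction, *images], rejecting empty input or too many images."""
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    if not images:
        raise ValueError("at least one source image is required")
    if len(images) > MAX_SOURCE_IMAGES:
        raise ValueError(
            f"got {len(images)} images; recommended maximum is {MAX_SOURCE_IMAGES}"
        )
    return [instruction, *images]
```

The result can be passed directly as `contents=` in a `generate_content` call like the one above, with PIL images in place of the list items.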
Structure effective prompts with three elements:
Example: "A robot in a futuristic city, cyberpunk style with neon lighting and rain-slicked streets"
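The example reads as a subject ("A robot"), a setting ("in a futuristic city"), and a style description. Assuming those are the three elements meant, which the text here does not state outright, a prompt can be assembled from its parts as in this hypothetical sketch:

```python
def build_prompt(subject: str, context: str, style: str) -> str:
    """Join the three parts as '<subject> <context>, <style>'."""
    return f"{subject.strip()} {context.strip()}, {style.strip()}"


prompt = build_prompt(
    "A robot",
    "in a futuristic city",
    "cyberpunk style with neon lighting and rain-slicked streets",
)
```

Keeping the parts separate makes it easy to vary one element, for instance the style, while holding the others fixed across iterations.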
Quality modifiers:
Text in images:
See references/prompting-guide.md for comprehensive prompt engineering strategies.
The model includes adjustable safety filters. Configure per-request:
config = types.GenerateContentConfig(
    response_modalities=['image'],
    safety_settings=[
        types.SafetySetting(
            category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
            threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
        )
    ]
)
See references/safety-settings.md for detailed configuration options.
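When a filter does block a prompt, the response exposes the reason at `response.prompt_feedback.block_reason`. A minimal sketch of a check, using a hypothetical helper name and stand-in objects so it runs without calling the API:

```python
from types import SimpleNamespace


def blocked_reason(response):
    """Return the prompt-level block reason, or None if nothing was blocked."""
    feedback = getattr(response, "prompt_feedback", None)
    return getattr(feedback, "block_reason", None) or None


# Stand-in responses for illustration; a real response comes from generate_content
blocked = SimpleNamespace(prompt_feedback=SimpleNamespace(block_reason="SAFETY"))
allowed = SimpleNamespace(prompt_feedback=None)
```

Checking this before reading `response.candidates` avoids indexing into an empty candidate list when the prompt was rejected.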
All generated images should be saved to ./docs/assets/ directory:
# Create directory if needed
mkdir -p ./docs/assets
The helper script automatically saves to this location with timestamped filenames.
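The exact filename pattern used by the helper script is not given here. As an illustration only, a timestamped path under `./docs/assets/` could be built like this; the `generated-` prefix and the `YYYYMMDD-HHMMSS` format are assumptions.

```python
from datetime import datetime
from pathlib import Path


def timestamped_path(directory="./docs/assets", prefix="generated",
                     ext="png", now=None) -> Path:
    """Build <directory>/<prefix>-<YYYYMMDD-HHMMSS>.<ext>."""
    now = now or datetime.now()
    return Path(directory) / f"{prefix}-{now:%Y%m%d-%H%M%S}.{ext}"


# A fixed time makes the result predictable for demonstration
example = timestamped_path(now=datetime(2025, 1, 2, 3, 4, 5))
```

The optional `now` argument exists only to make the function deterministic when testing; callers would normally omit it.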
Model: gemini-2.5-flash-image
Common issues and solutions:
API key not found:
# Check environment variables
echo $GEMINI_API_KEY
# Verify .env file exists
cat .claude/skills/gemini-image-gen/.env
# or
cat .env
Safety filter blocking:
Check response.prompt_feedback.block_reason
Token limit exceeded:
For detailed information, see:
- references/api-reference.md - Complete API specifications
- references/prompting-guide.md - Advanced prompt engineering
- references/safety-settings.md - Safety configuration details
- references/code-examples.md - Additional implementation examples