skills/nano-banana-builder/SKILL.md
Build Next.js App Router image-generation apps using Gemini Nano Banana / Nano Banana Pro with AI SDK. Covers exact model names, Server Actions/API routes, conversational multi-turn image editing, storage, rate limiting, safety, and cost controls. Trigger: nano banana, Gemini image, AI 生图, 图片生成, text-to-image, image generation app, iterative image editor, multi-turn image editing
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

```bash
npx skillsauth add shiqkuangsan/oh-my-daily-skills tooyoung:nano-banana-builder
```
Build production-ready web applications powered by Google's Nano Banana image generation APIs—creating everything from simple text-to-image generators to sophisticated iterative editors with multi-turn conversation.
Use ONLY these exact model strings. Do not invent, guess, or add date suffixes.
| Model String (use exactly) | Alias | Use Case |
| ---------------------------- | --------------- | ------------------------------------ |
| gemini-2.5-flash-image | Nano Banana | Fast iterations, drafts, high volume |
| gemini-3-pro-image-preview | Nano Banana Pro | Quality output, text rendering, 2K |
Common mistakes to avoid:
- `gemini-2.5-flash-preview-05-20`: wrong, date suffixes are for text models
- `gemini-2.5-pro-image`: wrong, 2.5 Pro doesn't do image generation
- `gemini-3-flash-image`: wrong, doesn't exist
- `gemini-pro-vision`: wrong, that's for image input, not generation

The only valid image generation models are `gemini-2.5-flash-image` and `gemini-3-pro-image-preview`.
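One way to enforce the exact-string rule in application code is a small type guard. This is a minimal sketch, not part of the skill itself: the names `IMAGE_MODELS`, `isImageModel`, and `assertImageModel` are hypothetical, and the only model strings used are the two listed above.

```typescript
// The two model strings from the table above, copied verbatim.
const IMAGE_MODELS = ["gemini-2.5-flash-image", "gemini-3-pro-image-preview"] as const;

type ImageModel = (typeof IMAGE_MODELS)[number];

// Type guard: true only for an exact match, so near-misses are rejected.
function isImageModel(value: string): value is ImageModel {
  return (IMAGE_MODELS as readonly string[]).includes(value);
}

// Hypothetical helper: fail early instead of sending an invalid model ID to the API.
function assertImageModel(value: string): ImageModel {
  if (!isImageModel(value)) {
    throw new Error(
      `Unknown image model "${value}". Expected one of: ${IMAGE_MODELS.join(", ")}`
    );
  }
  return value;
}
```

Calling `assertImageModel` at the boundary where a model name enters the app (a form field, a config file) keeps invented names from reaching the provider.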
Examples were tested against the versions below; verify the latest AI SDK and Google provider docs before upgrading:
| Package | Minimum Version | Recommended |
| ---------------- | --------------- | ----------- |
| ai | 3.4.0+ | ^4.0.0 |
| @ai-sdk/google | 0.0.52+ | ^1.0.0 |
| @ai-sdk/react | 0.0.62+ | ^1.0.0 |
| next | 14.0.0+ | ^15.0.0 |
| react | 18.2.0+ | ^19.0.0 |
Important notes:

- 'use server' directive

```bash
# Check your versions
npm list ai @ai-sdk/google @ai-sdk/react next

# Update to latest
npm update ai @ai-sdk/google @ai-sdk/react
```
Breaking changes to watch:

- `result.files[0]` structure may change between major versions
- `providerOptions.google` namespace for Gemini-specific configs
- `useChat` hook API from @ai-sdk/react

Nano Banana isn't just another image API: it's conversational by design. The core insight is that image generation works best as a dialogue, not a one-shot prompt.
Think of it as working with an AI art director:
- `gemini-2.5-flash-image` for speed/iterations, `gemini-3-pro-image-preview` for quality/complexity

Choose based on use case:
| Use Case | Model | Why |
| ------------------------ | ---------------------------- | ------------------------------------------ |
| Rapid iterations, drafts | gemini-2.5-flash-image | Fast (2-5s), lower cost per image |
| Final output, quality | gemini-3-pro-image-preview | Superior quality, thinking, text rendering |
| Text-heavy images | gemini-3-pro-image-preview | Best typography, 2K resolution |
| Multi-turn editing | Either | Both support conversational editing |
| High volume | gemini-2.5-flash-image | Lower cost, faster throughput |
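The table can be encoded as a lookup so the choice lives in one place. This is only a sketch: the `UseCase` labels and the `pickModel` name are hypothetical, and defaulting multi-turn editing to the faster model is an assumption, since the table says either model works.

```typescript
// Hypothetical labels for the rows of the table above.
type UseCase = "draft" | "final" | "text-heavy" | "multi-turn" | "high-volume";

type ImageModelId = "gemini-2.5-flash-image" | "gemini-3-pro-image-preview";

function pickModel(useCase: UseCase): ImageModelId {
  switch (useCase) {
    case "final":
    case "text-heavy":
      // Quality output and typography: the table points to Nano Banana Pro.
      return "gemini-3-pro-image-preview";
    case "draft":
    case "high-volume":
      // Speed and cost: the table points to Nano Banana.
      return "gemini-2.5-flash-image";
    case "multi-turn":
      // Either model supports conversational editing; the cheaper default is an assumption.
      return "gemini-2.5-flash-image";
  }
}
```

A common pattern is to iterate with `pickModel("draft")` and switch to `pickModel("final")` only for the render the user decides to keep.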
```typescript
// app/actions/generate.ts
"use server";

import { google } from "@ai-sdk/google";
import { generateText } from "ai";

export async function generateImage(prompt: string) {
  const result = await generateText({
    model: google("gemini-2.5-flash-image"),
    prompt,
    providerOptions: {
      google: {
        responseModalities: ["IMAGE"],
        imageConfig: { aspectRatio: "16:9" },
      },
    },
  });

  return result.files[0]; // { base64, uint8Array, mediaType }
}
```
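The file object returned by `generateImage` still has to be turned into something an `<img>` tag can display. A minimal sketch of that step follows, assuming the `{ base64, mediaType }` shape noted in the comment above. `toImageDataUrl` and `GeneratedFileLike` are hypothetical names, and accepting `mimeType` as a fallback is an assumption for older AI SDK releases.

```typescript
// Shape based on the comment in the server action above. The `mimeType`
// fallback is an assumption for SDK versions that used the older field name.
interface GeneratedFileLike {
  base64?: string;
  mediaType?: string;
  mimeType?: string;
}

// Returns a data URL suitable for <img src>, or null when no image came back
// (for example, when the response contained no files).
function toImageDataUrl(file: GeneratedFileLike | undefined): string | null {
  if (!file || !file.base64) return null;
  const type = file.mediaType ?? file.mimeType ?? "";
  if (!type.startsWith("image/")) return null;
  return `data:${type};base64,${file.base64}`;
}
```

Data URLs are convenient for previews; for anything persisted, upload the bytes to object storage and keep only the URL.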
```tsx
// app/components/ImageGenerator.tsx
'use client'

import { useChat } from '@ai-sdk/react'

export function ImageGenerator() {
  const { append, messages, isLoading } = useChat({
    api: '/api/generate'
  })

  return (
    <div>
      {messages.map(m => (
        <div key={m.id}>
          {/* Part shape depends on the AI SDK version: some releases deliver
              images as `file` parts with a media type rather than `image`
              parts. Check the installed version before relying on this. */}
          {m.parts?.map((part, i) =>
            part.type === 'image' && (
              <img key={i} src={part.url} alt="Generated" />
            )
          )}
        </div>
      ))}
      <button
        disabled={isLoading}
        onClick={() => append({
          role: 'user',
          content: 'A futuristic cityscape at dusk'
        })}
      >
        Generate
      </button>
    </div>
  )
}
```
For prompt structure, quality boosters, enhancer utility, negative prompts, and use-case templates, see references/prompt-engineering.md.
For complete implementations, see references/advanced-patterns.md.
For Gemini safety settings, pre-generation prompt filtering, safety block handling, and production best practices, see references/safety-settings.md.
For detailed configuration and operational concerns, see references/configuration.md.
❌ Inventing model names or adding date suffixes:
Why wrong: Image generation models have specific names; date suffixes like -preview-05-20 are for text models only
Better: Use exactly gemini-2.5-flash-image or gemini-3-pro-image-preview — no variations
❌ Using Gemini 2.5 Pro for images:
Why wrong: Gemini 2.5 Pro doesn't generate images directly
Better: Use gemini-2.5-flash-image or gemini-3-pro-image-preview
❌ Storing only base64 in database:
Why wrong: Bloats the database, expensive storage, slow retrieval
Better: Store in object storage (Vercel Blob/S3), save URL only
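
The "save URL only" approach could look roughly like the sketch below. The `BlobStore` interface is hypothetical. It stands in for whichever object-storage client the app uses (Vercel Blob or S3, per the item above) and is not the API of either. `persistImage` and `ImageRecord` are illustrative names as well.

```typescript
// Hypothetical storage abstraction; adapt it to your actual storage client.
interface BlobStore {
  put(key: string, data: Uint8Array, contentType: string): Promise<{ url: string }>;
}

// What gets written to the database: a URL and metadata, never the pixels.
interface ImageRecord {
  url: string;
  mediaType: string;
  sizeBytes: number;
}

// Decode a base64 string into raw bytes.
function decodeBase64(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Upload the image bytes and return the small record to store in the database.
async function persistImage(
  store: BlobStore,
  id: string,
  base64: string,
  mediaType: string
): Promise<ImageRecord> {
  const bytes = decodeBase64(base64);
  const extension = mediaType.split("/")[1] ?? "bin";
  const { url } = await store.put(`images/${id}.${extension}`, bytes, mediaType);
  return { url, mediaType, sizeBytes: bytes.length };
}
```

Keeping the store behind an interface also makes it easy to swap in an in-memory fake for tests.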
❌ No rate limit handling:
Why wrong: Will hit 429 errors in production, poor UX
Better: Implement rate limiting with user-friendly error messages
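
As an illustration of per-user limiting with a friendly message, here is a fixed-window limiter kept in memory. It is a sketch under assumptions: the class and function names are hypothetical, and in-memory state only works for a single long-lived process. A serverless deployment would need shared storage such as the Upstash Redis instance mentioned in the environment variables.

```typescript
interface RateLimitResult {
  allowed: boolean;
  retryAfterSeconds: number;
}

// Fixed-window counter: at most `limit` requests per `windowMs` per user.
class FixedWindowLimiter {
  private limit: number;
  private windowMs: number;
  private windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is injectable so the limiter can be tested without real timers.
  check(userId: string, now: number = Date.now()): RateLimitResult {
    const current = this.windows.get(userId);
    if (!current || now - current.start >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(userId, { start: now, count: 1 });
      return { allowed: true, retryAfterSeconds: 0 };
    }
    if (current.count < this.limit) {
      current.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    const retryAfterSeconds = Math.ceil((current.start + this.windowMs - now) / 1000);
    return { allowed: false, retryAfterSeconds };
  }
}

// Turn a blocked result into a message a user can act on.
function rateLimitMessage(result: RateLimitResult): string | null {
  if (result.allowed) return null;
  return `You've reached the generation limit. Try again in ${result.retryAfterSeconds}s.`;
}
```

Checking the limiter before calling the model avoids spending quota on requests that would be rejected anyway.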
❌ Ignoring multi-turn context:
Why wrong: Wastes Nano Banana's conversational editing strength
Better: Track chat history for iterative refinement
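
Tracking history can be as simple as an append-only list of turns. The sketch below shows one possible shape. The `Turn` and `EditSession` types and all function names are hypothetical, and how the turns are mapped onto AI SDK messages is left out because it depends on the SDK version in use.

```typescript
// One entry per exchange: what the user asked for, or the image that came back.
type Turn =
  | { role: "user"; text: string }
  | { role: "assistant"; imageUrl: string };

interface EditSession {
  turns: Turn[];
}

// Both helpers return a new session rather than mutating, which suits React state.
function addUserTurn(session: EditSession, text: string): EditSession {
  return { turns: [...session.turns, { role: "user", text }] };
}

function addImageTurn(session: EditSession, imageUrl: string): EditSession {
  return { turns: [...session.turns, { role: "assistant", imageUrl }] };
}

// Send only the most recent turns to bound request size and cost.
function recentTurns(session: EditSession, maxTurns: number): Turn[] {
  return maxTurns <= 0 ? [] : session.turns.slice(-maxTurns);
}

// The image the next edit instruction should apply to, if any.
function latestImage(session: EditSession): string | null {
  for (const turn of [...session.turns].reverse()) {
    if (turn.role === "assistant") return turn.imageUrl;
  }
  return null;
}
```

With this in place, a follow-up such as "make the sky more purple" can be sent together with `latestImage(session)` instead of starting from scratch.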
❌ Hardcoding API keys client-side:
Why wrong: Exposes credentials, security risk
Better: Use server actions / API routes with environment variables
❌ Using wrong aspect ratio:
Why wrong: Generating 21:9 for a 1:1 slot wastes tokens and forces an unexpected crop
Better: Match aspect ratio to intended use case
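
One way to match the ratio to the use case is to pick the candidate closest to the target display size. This is a sketch only: `closestAspectRatio` is a hypothetical helper, and the default candidate list contains just the three ratios mentioned in this document. Check the provider documentation for the full set your model accepts.

```typescript
// Parse "16:9" into the number 16 / 9.
function parseRatio(ratio: string): number {
  const [w, h] = ratio.split(":").map(Number);
  if (!w || !h) throw new Error(`Invalid aspect ratio "${ratio}"`);
  return w / h;
}

// Pick the candidate whose shape is closest to the target slot.
// Distances are compared in log space so that "twice as wide" and
// "twice as tall" count as equally far from square.
function closestAspectRatio(
  width: number,
  height: number,
  candidates: string[] = ["1:1", "16:9", "21:9"]
): string {
  const target = Math.log(width / height);
  let best: string | null = null;
  let bestDistance = Infinity;
  for (const candidate of candidates) {
    const distance = Math.abs(Math.log(parseRatio(candidate)) - target);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  if (best === null) throw new Error("No candidate aspect ratios supplied");
  return best;
}
```

For example, a 1200×630 social card maps to `16:9` from the default list, while a square avatar slot maps to `1:1`.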
❌ No loading states:
Why wrong: Image generation takes 5-30s, users think it's broken
Better: Show progress indicators and estimated wait time
❌ Generating on every keystroke:
Why wrong: Wastes quota, slow response
Better: Debounce prompts, require explicit action
IMPORTANT: Every app should feel uniquely designed for its specific purpose.
Vary across dimensions:
Avoid overused patterns:
Context should drive design:
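```bash
# .env.local
# Note: the default `google` export from @ai-sdk/google reads
# GOOGLE_GENERATIVE_AI_API_KEY. To use GEMINI_API_KEY instead, create the
# provider with createGoogleGenerativeAI({ apiKey: process.env.GEMINI_API_KEY }).
GEMINI_API_KEY=your_api_key_here

# For Vercel Blob storage
BLOB_READ_WRITE_TOKEN=your_vercel_token

# For S3 (optional)
S3_BUCKET=your-bucket
S3_ENDPOINT=https://your-endpoint.r2.cloudflarestorage.com
S3_ACCESS_KEY_ID=your_key
S3_SECRET_ACCESS_KEY=your_secret

# For Upstash rate limiting (optional)
UPSTASH_REDIS_REST_URL=your_url
UPSTASH_REDIS_REST_TOKEN=your_token
```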
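Missing variables are easier to diagnose at startup than in the middle of a generation request. The following sketch validates the variables listed above. `readConfig` and `AppConfig` are hypothetical names, and treating the Upstash pair as all-or-nothing is an assumption about how the optional settings are meant to be used.

```typescript
interface AppConfig {
  geminiApiKey: string;
  blobToken?: string;
  upstash?: { url: string; token: string };
}

// Takes the environment as a plain object so it can be tested without process.env.
function readConfig(env: Record<string, string | undefined>): AppConfig {
  const geminiApiKey = env.GEMINI_API_KEY;
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is not set");
  }

  const config: AppConfig = { geminiApiKey };

  // Optional: object storage token.
  if (env.BLOB_READ_WRITE_TOKEN) {
    config.blobToken = env.BLOB_READ_WRITE_TOKEN;
  }

  // Optional: rate limiting, but both values must be present together.
  const url = env.UPSTASH_REDIS_REST_URL;
  const token = env.UPSTASH_REDIS_REST_TOKEN;
  if ((url && !token) || (!url && token)) {
    throw new Error("Set both UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN, or neither");
  }
  if (url && token) {
    config.upstash = { url, token };
  }

  return config;
}
```

In a Next.js app this would typically be called once on the server with `process.env` and never imported into client components.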
```bash
# Install dependencies
npm install @ai-sdk/google ai @ai-sdk/react @vercel/blob

# Or, if using Google's Gen AI SDK directly instead of AI SDK
npm install @google/genai
```
Nano Banana enables conversational image generation that feels like working with a creative partner, not a tool.
The best apps:
You're building more than an image generator—you're creating a creative experience. Design it thoughtfully.