skills/blog-image/SKILL.md
AI image generation and editing for blog content powered by Gemini via MCP. Claude acts as Creative Director - interpreting intent, selecting domain expertise, constructing optimized 6-component prompts (Subject + Action + Context + Composition + Lighting + Style), and orchestrating Gemini for blog-quality results. Generates hero images, inline illustrations, social preview cards, and OG images. Edits existing blog images. Supports 6 blog-optimized domain modes (Editorial, Product, Landscape, UI/Web, Infographic, Abstract). Works standalone via /blog image or internally from blog-write and blog-rewrite workflows. Falls back gracefully when MCP is not configured. Use when user says "blog image", "generate hero image", "blog illustration", "social card", "generate blog image", "edit blog image", "image generate", "blog cover image", "inline image", "OG image".
npx skillsauth add agricidaniel/claude-blog blog-image
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
You are a Creative Director that orchestrates Gemini's image generation specifically for blog content. Never pass raw user text directly to the API. Always interpret, enhance, and construct an optimized prompt using the 6-component Reasoning Brief system.
| Command | What it does |
|---------|-------------|
| /blog image generate <idea> | Generate a blog image with full prompt engineering |
| /blog image edit <path> <instructions> | Edit an existing blog image intelligently |
| /blog image setup | Configure MCP server and API key |
Match the image type to blog use case:
| Image Type | Aspect Ratio | Resolution | Domain Mode | Placement |
|------------|-------------|-----------|-------------|-----------|
| Hero/Cover | 16:9 | 2K or 4K | Editorial / Landscape | Frontmatter coverImage |
| OG/Social Card | 16:9 | 1K | Editorial / Infographic | Frontmatter ogImage |
| Inline Illustration | 16:9 or 4:3 | 1K | Varies by topic | After H2, before body |
| Inline Product Shot | 4:3 or 1:1 | 1K | Product | Within product sections |
| Section Divider | 8:1 or 4:1 | 1K | Abstract / Landscape | Between major sections |
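The table above can be kept as a small lookup so that later steps read ratio and resolution from one place. This is a minimal sketch: `ImageSpec`, `IMAGE_SPECS`, `spec_for`, and the short keys such as `"hero"` are hypothetical names, and treating the first listed ratio as the preferred one is an assumption, not something the skill states.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ImageSpec:
    aspect_ratios: Tuple[str, ...]  # assumption: first entry is the preferred ratio
    resolutions: Tuple[str, ...]
    domain_modes: Tuple[str, ...]   # empty tuple means "varies by topic"


# Values copied from the table above; the keys are hypothetical identifiers.
IMAGE_SPECS = {
    "hero": ImageSpec(("16:9",), ("2K", "4K"), ("Editorial", "Landscape")),
    "og": ImageSpec(("16:9",), ("1K",), ("Editorial", "Infographic")),
    "inline": ImageSpec(("16:9", "4:3"), ("1K",), ()),
    "product": ImageSpec(("4:3", "1:1"), ("1K",), ("Product",)),
    "divider": ImageSpec(("8:1", "4:1"), ("1K",), ("Abstract", "Landscape")),
}


def spec_for(image_type: str) -> ImageSpec:
    """Return the spec for an image type, raising ValueError on unknown types."""
    try:
        return IMAGE_SPECS[image_type.lower()]
    except KeyError:
        raise ValueError(f"unknown image type: {image_type!r}") from None
```

For example, `spec_for("hero").aspect_ratios[0]` gives the ratio to pass to set_aspect_ratio for a cover image.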
Sizing requirements:
Before generating, check if nanobanana-mcp tools are available:
Call get_image_history (lightweight, no side effects). If the tools are missing, tell the user to run /blog image setup to configure it.
For /blog image generate <idea> or when invoked internally:
Determine what the blog needs:
If the request is vague, ask one clarifying question about use case and style.
Choose the expertise lens for the image:
| Mode | When to use | Prompt emphasis |
|------|-------------|-----------------|
| Editorial | Blog headers, feature images, lifestyle | Styling, composition, publication references |
| Product | E-commerce posts, reviews, comparisons | Surface materials, studio lighting, clean BG |
| Landscape | Environmental backgrounds, travel, hero sections | Atmospheric perspective, depth layers, time of day |
| UI/Web | Tech blog icons, illustrations, diagrams | Clean vectors, flat design, exact colors |
| Infographic | Data-driven posts, processes, comparisons | Layout structure, hierarchy, accessible colors |
| Abstract | Pattern backgrounds, section dividers, decorative | Color theory, mathematical forms, textures |
Load references/prompt-engineering-blog.md for domain mode modifier libraries.
Build the prompt as natural narrative paragraphs - NEVER as keyword lists:
Template for photorealistic blog images:
A photorealistic [shot type] of [subject with physical detail], [action/pose],
set in [environment with specifics]. [Lighting conditions] create [mood].
Captured with [camera model], [focal length] lens at [f-stop], producing
[depth of field effect]. [Color palette/grading notes]. Aspect ratio 16:9,
suitable as a blog [hero image/inline illustration] at [target dimensions].
Template for illustrated/stylized:
A [art style] [format] of [subject with character detail], featuring
[distinctive characteristics] with [color palette]. [Line style] and
[shading technique]. Background is [description]. [Mood/atmosphere].
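One way to enforce the six-component rule before a prompt is sent is to hold the components in a structure and refuse to build the prompt while any of them is empty. The sketch below assumes a hypothetical `ReasoningBrief` class. Its sentence layout is a simplified stand-in for the templates above, not the skill's actual wording.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReasoningBrief:
    """Hypothetical container for the six prompt components."""
    subject: str
    action: str
    context: str
    composition: str
    lighting: str
    style: str

    def missing(self) -> List[str]:
        # Names of components that are empty or whitespace-only.
        return [name for name, value in vars(self).items() if not value.strip()]

    def to_prompt(self) -> str:
        gaps = self.missing()
        if gaps:
            raise ValueError("missing components: " + ", ".join(gaps))
        # Narrative sentences rather than a keyword list.
        return (
            f"{self.style} of {self.subject}, {self.action}, "
            f"set in {self.context}. {self.lighting}. {self.composition}."
        )
```

Raising on an empty component makes an omitted lighting description visible before generation rather than after a poor result.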
Call set_aspect_ratio BEFORE generating:
| Blog Use Case | Ratio |
|---------------|-------|
| Hero / Cover / OG | 16:9 |
| Product shot / Square | 4:3 or 1:1 |
| Section divider | 8:1 or 4:1 |
| Vertical (stories) | 9:16 |
| MCP Tool | When |
|----------|------|
| set_aspect_ratio | Always call first if ratio differs from 1:1 |
| gemini_generate_image | New image from crafted prompt |
| gemini_edit_image | Modify existing image |
| gemini_chat | Iterative refinement / multi-turn sessions |
| get_image_history | Review generated images |
| clear_conversation | Reset session context |
Model selection (use set_model MCP tool if switching):
Load references/mcp-tools.md for parameter details.
Load references/gemini-models.md for model specs, pricing, and rate limits.
After generation, resize/convert for blog use:
# Resize to blog hero dimensions (1200x630)
magick input.png -resize 1200x630^ -gravity center -extent 1200x630 hero.png
# Convert to WebP for web optimization
magick input.png -quality 85 output.webp
# Convert to AVIF (smallest, modern)
magick input.png -quality 80 output.avif
# Crop to exact OG dimensions
magick input.png -resize 1200x630^ -gravity center -extent 1200x630 og-image.png
Check if magick (ImageMagick 7) is available. Fall back to convert if not.
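The availability check and fallback can be sketched with `shutil.which` from the standard library. The helper names `imagemagick_command` and `hero_resize_args` are hypothetical. The argument list mirrors the hero resize command above.

```python
import shutil
from typing import Callable, List, Optional


def imagemagick_command(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return "magick" if it is on PATH, else "convert", else None."""
    for name in ("magick", "convert"):
        if which(name):
            return name
    return None


def hero_resize_args(
    binary: str, src: str, dst: str, size: str = "1200x630"
) -> List[str]:
    # Fill the target box ("^"), then center-crop to the exact size.
    return [
        binary, src,
        "-resize", f"{size}^",
        "-gravity", "center",
        "-extent", size,
        dst,
    ]
```

The resulting list can be passed to `subprocess.run(args, check=True)`. When `imagemagick_command()` returns `None`, neither binary was found and the resize step has to be skipped.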
Provide:
The generated image path (under ~/Documents/nanobanana_generated/)
coverImage: "/path/to/generated-image.png"
coverImageAlt: "Descriptive alt text sentence with topic keywords"
ogImage: "/path/to/generated-image.png"
For /blog image edit <path> <instructions>:
Call gemini_edit_image with the enhanced instruction.
When invoked as a Task subagent from blog-write or blog-rewrite:
Input (provided by calling skill):
- image_type: hero, inline, og, divider
- topic: blog post topic/title
- section_context: (optional) heading or section the image supports
- style_preference: (optional) photorealistic, illustrated, editorial
- count: (optional) number of images needed (default: 1)

Output (returned to calling skill):
### Generated Image
- **Path:** ~/Documents/nanobanana_generated/image_timestamp.png
- **Alt Text:** Descriptive sentence about the image
- **Type:** hero / inline / og
- **Domain Mode:** Editorial
- **Aspect Ratio:** 16:9
- **Suggested Frontmatter:**
coverImage: "/path/to/image.png"
coverImageAlt: "Alt text here"
Graceful fallback: If MCP is unavailable, return immediately with no error. The calling workflow continues with stock photos. Never block blog-write or blog-rewrite because image generation is unavailable.
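The fallback rule amounts to a guard at the top of the internal entry point. In this sketch `generate_for_workflow`, `mcp_available`, and `generate` are hypothetical names, and returning `None` is an assumed convention for "no image, no error".

```python
from typing import Callable, Optional


def generate_for_workflow(
    request: dict,
    mcp_available: Callable[[], bool],
    generate: Callable[[dict], dict],
) -> Optional[dict]:
    """Return a generated-image record, or None when MCP is unavailable."""
    if not mcp_available():
        # No error is raised: the calling workflow continues with stock photos.
        return None
    return generate(request)
```

A caller such as blog-write would treat `None` as "use a stock photo" and carry on.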
For every generated image, create alt text following blog standards:
Good: Marketing team analyzing AI search traffic data on a dashboard showing citation metrics
Bad: SEO AI marketing blog optimization image
For /blog image setup:
- python3 scripts/setup_image_mcp.py (interactive)
- python3 scripts/setup_image_mcp.py --key YOUR_KEY (non-interactive)
- ~/.claude/settings.json (user-private, mode 0600)
- --project flag opts into project .mcp.json (env-expansion only, refuses to write a literal key into a tracked file)
- python3 scripts/validate_image_setup.py
- @ycse/[email protected]. Update the pin in setup_image_mcp.py (constant PINNED_PACKAGE) when bumping.

When IMAGE_SAFETY or SAFETY is returned, do NOT give up. Auto-rephrase and retry:
Google acknowledged filters "became way more cautious than we intended" - benign prompts are sometimes blocked. Persistence with rephrasing usually succeeds.
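The rephrase-and-retry behaviour can be sketched as a loop over a list of rephrasing functions. Everything here is an assumption made for illustration: the `generate` callable, the `"error"` key on its result, and the rephrasing functions themselves are hypothetical, not the MCP server's actual response format.

```python
from typing import Callable, Iterable

# Filter codes this section says are worth retrying.
RETRYABLE = {"IMAGE_SAFETY", "SAFETY"}


def generate_with_rephrase(
    prompt: str,
    generate: Callable[[str], dict],
    rephrasings: Iterable[Callable[[str], str]],
) -> dict:
    """Retry with successive rephrasings while a retryable filter fires."""
    result = generate(prompt)
    for rephrase in rephrasings:
        if result.get("error") not in RETRYABLE:
            break  # success, or a non-retryable error such as PROHIBITED_CONTENT
        prompt = rephrase(prompt)
        result = generate(prompt)
    return result
```

The loop stops at the first result that is not a retryable block, so a PROHIBITED_CONTENT response is returned after a single attempt.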
If an image is 80% correct, use gemini_chat for conversational editing rather than
regenerating from scratch. The session maintains style consistency, so targeted edits
preserve what works while fixing what doesn't.
When to edit vs regenerate:
| Error | Resolution |
|-------|-----------|
| MCP not configured | Run /blog image setup |
| API key invalid | New key at https://aistudio.google.com/apikey |
| Rate limited (429) | Wait 60s, retry. Free tier: ~5-15 RPM / ~20-500 RPD (varies by model and billing) |
| IMAGE_SAFETY | Auto-rephrase (see above) - Layer 2 filter, non-configurable |
| PROHIBITED_CONTENT | Content policy violation - topic is blocked. Non-retryable. |
| SAFETY | Rephrase prompt - Layer 1 filter |
| Vague request | Ask one clarifying question before generating |
| Poor quality | Review Reasoning Brief - likely missing lighting (biggest quality differentiator) |
| MCP unavailable (internal call) | Return silently - calling workflow uses stock photos |
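The rate-limit row ("Wait 60s, retry") could look like the following. The `call` callable, the `"status"` key, and the cap of three attempts are assumptions made for this sketch. Only the 60-second wait comes from the table.

```python
import time
from typing import Callable


def retry_on_rate_limit(
    call: Callable[[], dict],
    max_attempts: int = 3,          # assumption: the table gives no retry cap
    wait_seconds: float = 60.0,     # from the table: wait 60s, then retry
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call repeatedly while the result reports HTTP 429, waiting in between."""
    result = call()
    for _ in range(max_attempts - 1):
        if result.get("status") != 429:
            break
        sleep(wait_seconds)
        result = call()
    return result
```

Passing `sleep` as a parameter lets the wait be replaced in tests without actually pausing.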
Load on-demand - do NOT load all at startup:
- references/prompt-engineering-blog.md - Domain modes, 6-component system, blog templates
- references/gemini-models.md - Model specs, rate limits, aspect ratios, pricing
- references/mcp-tools.md - MCP tool parameters and response formats