skills/thumbnail-creator/SKILL.md
Generate article or newsletter thumbnail candidates using the Gemini API from inside Claude Code. Claude reads article copy, proposes composition concepts, writes image generation prompts incorporating brand specs, calls Gemini to generate the images, evaluates the results via computer vision, and returns ranked candidates with rationale. Use when asked to create thumbnails, generate cover images, or produce visual candidates for an article or newsletter.
npx skillsauth add mohitagw15856/pm-claude-skills thumbnail-creator
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Generates article and newsletter thumbnail candidates by acting as an image-generation agent inside Claude Code. Instead of switching between tools and prompting Gemini's web UI one image at a time, this skill makes Claude do the full loop: read the copy, propose compositions, write tailored prompts, call the Gemini API, evaluate the outputs, and return ranked results with brief rationale.
The output is production-ready thumbnail candidates you can drop directly into your CMS, newsletter tool, or social scheduler.
Both of these must be in place before the skill can generate images:
Get a free key from Google AI Studio.
Set it as an environment variable:
export GEMINI_API_KEY="your-key-here"
To persist it across sessions, add to your shell profile (~/.zshrc or ~/.bashrc):
echo 'export GEMINI_API_KEY="your-key-here"' >> ~/.zshrc
source ~/.zshrc
Verify it is set:
echo $GEMINI_API_KEY
The generate_image.py script must exist at ./generate_image.py in the project root. The full template is provided in the Script Template section below. Claude will check for it and offer to create it if missing.
Python dependencies:
pip install google-generativeai Pillow requests
Or with uv:
uv pip install google-generativeai Pillow requests
Claude will ask for these if not provided:
| Input | Required | Notes |
|---|---|---|
| Article copy or URL | Yes | Paste the full article text, or provide a URL to fetch. Used to extract themes, hooks, and key claims for composition. |
| Brand colours | Recommended | Hex codes or descriptive names. E.g. #1A1A2E (navy), #E94560 (coral). If not provided, Claude uses clean neutral defaults. |
| Fonts / type style | Recommended | E.g. "bold sans-serif", "editorial serif", "Neue Haas Grotesk". Used in prompt to guide text treatment. |
| Style reference description | Recommended | E.g. "flat illustration, minimal, like Stripe's marketing site" or "photorealistic, dark background, high contrast". A style image URL can also be provided. |
| Output dimensions | No | Defaults to 1792x1024 (landscape, standard article thumbnail). Options: 1024x1024 (square), 1024x1792 (portrait/mobile). |
| Number of candidates | No | Defaults to 4. Min 1, max 8 (API limits and cost). |
| Article title (if different from H1) | No | Used as the primary text element in image prompts. |
| Candidate selection | No | After proposing compositions, Claude asks which to generate. User can say "all" or pick by number. |
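The defaults and limits in the table above can be checked before any API call is made. The following is a minimal sketch, not part of the skill itself: `ThumbnailRequest` and its field names are hypothetical, while the allowed dimensions and the 1 to 8 candidate range come from the table.

```python
from dataclasses import dataclass

# Dimension options and candidate limits are taken from the inputs table.
ALLOWED_DIMENSIONS = {(1792, 1024), (1024, 1024), (1024, 1792)}
MIN_CANDIDATES, MAX_CANDIDATES = 1, 8


@dataclass
class ThumbnailRequest:
    """Hypothetical container for the skill's inputs."""

    article: str          # pasted article copy or a URL (required)
    width: int = 1792     # default: landscape article thumbnail
    height: int = 1024
    candidates: int = 4   # default number of candidates

    def __post_init__(self) -> None:
        if not self.article.strip():
            raise ValueError("Article copy or URL is required")
        if (self.width, self.height) not in ALLOWED_DIMENSIONS:
            raise ValueError(f"Unsupported dimensions: {self.width}x{self.height}")
        if not MIN_CANDIDATES <= self.candidates <= MAX_CANDIDATES:
            raise ValueError("Number of candidates must be between 1 and 8")
```

Constructing `ThumbnailRequest(article="...")` with no other arguments yields the documented defaults; an out-of-range value raises `ValueError` so Claude can ask the user again.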
Claude presents 3-4 composition concepts for user approval. Format:
Composition Concepts for: "[Article Title]"
1. BOLD CLAIM
Layout: Full-bleed dark background, large white headline centred,
single accent data point (e.g. "3x faster") in brand colour below
Mood: High authority, newsletter-style
Best for: LinkedIn, Substack headers
Rationale: The article's central claim ("X outperforms Y by 3x") is specific
enough to anchor the visual — readers stop on data.
2. CONCEPTUAL OBJECT
Layout: Central object illustration (e.g. a broken clock for a time-waste article),
title in upper third, minimal texture background
Mood: Editorial, Medium-style
Best for: Blog header, Medium cover, email preheader
Rationale: Gives art directors visual metaphor flexibility; works across sizes.
3. CONTRAST SPLIT
Layout: Left half brand colour, right half white or image,
title on colour side, supporting subtext on white side
Mood: Clean, professional, startup-brand feel
Best for: Newsletter, LinkedIn carousel first slide
Rationale: Split layout performs consistently in newsletter A/B tests;
text is readable at small sizes.
4. TYPOGRAPHIC ONLY
Layout: No illustration, oversized title treatment,
author name in small caps at bottom, thin rule separator
Mood: Premium, confident, editorial
Best for: Substack, Ghost, high-density email lists
Rationale: Works when the brand has strong type identity. Fastest to produce.
Which compositions do you want generated? (Reply with numbers, e.g. "1, 3" or "all")
After generation, Claude saves files to ./thumbnails/[article-slug]/:
thumbnails/
└── article-slug-from-title/
    ├── candidate_01_bold_claim.png
    ├── candidate_02_conceptual_object.png
    ├── candidate_03_contrast_split.png
    ├── candidate_04_typographic.png
    └── evaluation_report.md
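The folder layout above could be produced with a small path helper. This is a sketch under assumptions: the skill does not specify how the slug is derived, so the rule here (lowercase, alphanumeric runs joined by hyphens) and the function names `slugify` and `candidate_path` are illustrative only.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens (assumed rule)."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def candidate_path(title: str, index: int, composition: str) -> Path:
    """Build thumbnails/<slug>/candidate_NN_<composition>.png."""
    name = "_".join(re.findall(r"[a-z0-9]+", composition.lower()))
    return Path("thumbnails") / slugify(title) / f"candidate_{index:02d}_{name}.png"
```

For example, `candidate_path("My Post", 1, "Bold Claim")` gives `thumbnails/my-post/candidate_01_bold_claim.png`. The composition suffix is whatever short name is passed in, so "Typographic" yields the `candidate_04_typographic.png` form shown in the tree.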
Claude evaluates each returned image via computer vision and produces:
Thumbnail Evaluation — "[Article Title]"
Generated: 2026-05-27 | Model: Gemini Imagen | Dimensions: 1792x1024
| # | Candidate | Composition | Brand Fit /10 | Text Legibility /10 | Recommendation |
|---|---|---|---|---|---|
| 1 | candidate_01_bold_claim.png | Bold Claim | 9 | 8 | ★ Top pick — strong data anchor, brand colours correct, title readable at 200px width |
| 2 | candidate_02_conceptual_object.png | Conceptual Object | 7 | 9 | Good fallback — legible, clean, but illustration style drifted slightly from brand |
| 3 | candidate_03_contrast_split.png | Contrast Split | 8 | 7 | Works well at full size; test at thumbnail size before publishing — right side text tightens |
| 4 | candidate_04_typographic.png | Typographic | 9 | 10 | Strongest for email — zero brand drift risk, completely text-based |
Recommended for web: candidate_01_bold_claim.png
Recommended for email/mobile: candidate_04_typographic.png
Recommended for social: candidate_03_contrast_split.png
Files saved to: ./thumbnails/article-slug-from-title/
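The evaluation table shown above can be rendered from plain score records. A minimal sketch, assuming each record is a dict with the hypothetical keys `file`, `composition`, `brand_fit`, `legibility`, and `note`; the column headers are copied from the example report.

```python
def render_table(rows: list) -> str:
    """Render evaluation rows as a Markdown table, keeping the caller's order."""
    lines = [
        "| # | Candidate | Composition | Brand Fit /10 | Text Legibility /10 | Recommendation |",
        "|---|---|---|---|---|---|",
    ]
    for i, row in enumerate(rows, start=1):
        lines.append(
            f"| {i} | {row['file']} | {row['composition']} | "
            f"{row['brand_fit']} | {row['legibility']} | {row['note']} |"
        )
    return "\n".join(lines) + "\n"
```

The helper deliberately does not sort or pick a winner: in the example report the top pick is a judgment call recorded in the Recommendation column, not the highest score total.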
Accept article copy as pasted text or a URL.
If a URL is provided, fetch the page and extract:
If text is pasted, read it directly. Focus on:
Summarise these findings internally before proposing compositions — the proposals should feel tailored to this specific article, not generic.
Ask the user for brand specs if not provided:
To generate on-brand thumbnails, I need a few details:
1. Brand colours (hex codes or descriptions) — e.g. #1A1A2E, #E94560
2. Font style preference — e.g. "bold sans-serif", "editorial serif", "geometric"
3. Visual style — e.g. "flat minimal", "photorealistic", "illustrated", "typographic only"
4. Any style references — describe a brand or publication whose aesthetic you want to match,
or share an image URL
If you don't have brand specs yet, say "use clean defaults" and I'll use a professional
dark-on-white editorial style.
If the user says "use clean defaults", apply:
- #FFFFFF or #0F0F0F (dark mode default)
- #2563EB (blue)
Write 3-4 composition concepts tailored to the article's tone and content. Each concept must:
After presenting the concepts, ask which to generate. Wait for user confirmation before making any API calls.
For each selected composition, write a detailed image generation prompt. Image generation prompts follow a different grammar than text prompts — they are descriptive, not instructional.
Prompt structure:
[Subject/composition] + [Style] + [Colour palette] + [Mood/lighting] +
[Text treatment if any] + [What to avoid]
Example prompt for Bold Claim composition:
Article thumbnail image. Large bold white sans-serif headline text reading "3x Faster Than
Traditional Methods" centred on a deep navy blue background (#1A1A2E). Small coral accent
text (#E94560) below reading the subtitle. Minimal flat design, no gradients, no stock photo
elements, no people. Clean professional editorial style, high contrast, newsletter header
format, 16:9 landscape orientation. The composition is typographic — text is the hero,
no illustration required. Avoid: clip art, drop shadows, low contrast, crowded layout.
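The prompt structure above can be assembled mechanically from its parts. This is a sketch only: `build_prompt` and its parameter names are hypothetical, and the sentence-per-part layout is one reasonable reading of the structure, not a format the Gemini API requires.

```python
def build_prompt(subject, style, palette, mood, text_treatment="", avoid=()):
    """Join the prompt parts in the documented order, one sentence per part."""
    parts = [subject, style, palette, mood]
    if text_treatment:
        parts.append(text_treatment)
    # Normalise each part to end with exactly one full stop.
    prompt = " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())
    if avoid:
        prompt += " Avoid: " + ", ".join(avoid) + "."
    return prompt
```

Keeping the parts separate makes it easy to vary one element (for instance the palette) across candidates while holding the rest constant.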
Prompt rules:
Before calling the API, verify:
# Check API key is set
echo $GEMINI_API_KEY
# Check script exists
ls -la ./generate_image.py
# Check dependencies
python3 -c "import google.generativeai, PIL, requests; print('Dependencies OK')"
If the script is missing, offer to create it using the template in the Script Template section below.
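The same three checks can be run from Python, which avoids printing the key to the terminal. A minimal sketch: the function name `preflight` is hypothetical, and the module list mirrors the dependency check above.

```python
import importlib.util
import os
from pathlib import Path


def preflight(script: str = "./generate_image.py") -> list:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not os.environ.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    if not Path(script).is_file():
        problems.append(f"{script} not found")
    for module in ("google.generativeai", "PIL", "requests"):
        try:
            found = importlib.util.find_spec(module) is not None
        except (ImportError, ValueError):
            # find_spec raises if a parent package (e.g. "google") is absent.
            found = False
        if not found:
            problems.append(f"Missing Python module: {module}")
    return problems
```

An empty return value means generation can proceed; otherwise each entry maps to a row in the troubleshooting table.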
Run the generation script for each prompt:
python3 generate_image.py \
--prompt "your full prompt here" \
--output "./thumbnails/article-slug/candidate_01_bold_claim.png" \
--width 1792 \
--height 1024
Or pass all prompts in a batch config file:
python3 generate_image.py --config ./thumbnails/article-slug/prompts.json
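For batch mode, the config file is a JSON list of objects with `prompt`, `output`, `width`, and `height` keys, as described in the script's docstring. The sketch below writes such a file; the helper name `write_batch_config` and the choice to key prompts by file stem are assumptions for illustration.

```python
import json
from pathlib import Path


def write_batch_config(out_dir, prompts, width=1792, height=1024):
    """Write prompts.json into out_dir; `prompts` maps a file stem to its prompt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    items = [
        {
            "prompt": prompt,
            "output": str(out / f"{stem}.png"),
            "width": width,
            "height": height,
        }
        for stem, prompt in prompts.items()
    ]
    config_path = out / "prompts.json"
    config_path.write_text(json.dumps(items, indent=2), encoding="utf-8")
    return config_path
```

The returned path can be passed straight to `--config`.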
After each image is saved, examine it using computer vision. Evaluate on two dimensions:
Brand Fit (score /10):
Text Legibility (score /10):
Note: Gemini Imagen sometimes renders text with spelling errors or distorted letterforms. If this happens, note it in the evaluation and suggest the user add the text overlay manually in Canva or Figma.
Write the evaluation summary table (format shown in Output Structure section) and save it as evaluation_report.md in the output folder.
Include:
After delivering the candidates, offer one iteration pass:
Want me to iterate on any of these?
Options:
- Adjust colours or style on a specific candidate
- Try a different composition concept
- Change the headline text
- Rerun with different Gemini parameters (different temperature/seed)
- Generate additional variants of the top pick
Just tell me what to change.
Claude should offer to write this file if generate_image.py is not present. This is the canonical template to use.
#!/usr/bin/env python3
"""
generate_image.py: Gemini Imagen wrapper for Thumbnail Creator skill.

Usage:
    python3 generate_image.py --prompt "..." --output "./out.png" [--width 1792] [--height 1024]
    python3 generate_image.py --config ./prompts.json

Config JSON format:
    [
      {
        "prompt": "...",
        "output": "./thumbnails/slug/candidate_01.png",
        "width": 1792,
        "height": 1024
      }
    ]

Requirements:
    pip install google-generativeai Pillow
"""
import os
import sys
import json
import argparse
from pathlib import Path

try:
    import google.generativeai as genai
except ImportError:
    print("ERROR: google-generativeai not installed. Run: pip install google-generativeai")
    sys.exit(1)

try:
    from PIL import Image
    import io
except ImportError:
    print("ERROR: Pillow not installed. Run: pip install Pillow")
    sys.exit(1)


def get_api_key() -> str:
    """Read GEMINI_API_KEY from the environment, exiting with guidance if unset."""
    key = os.environ.get("GEMINI_API_KEY", "")
    if not key:
        print("ERROR: GEMINI_API_KEY environment variable is not set.")
        print("Get a key at: https://aistudio.google.com/app/apikey")
        print("Then run: export GEMINI_API_KEY='your-key-here'")
        sys.exit(1)
    return key


def generate_image(
    prompt: str,
    output_path: str,
    width: int = 1792,
    height: int = 1024,
) -> bool:
    """
    Call Gemini Imagen to generate a single image and save it to output_path.
    Returns True on success, False on failure.
    """
    api_key = get_api_key()
    genai.configure(api_key=api_key)

    # Determine aspect ratio from dimensions
    ratio = width / height
    if abs(ratio - 16/9) < 0.1:
        aspect_ratio = "16:9"
    elif abs(ratio - 1.0) < 0.1:
        aspect_ratio = "1:1"
    elif abs(ratio - 9/16) < 0.1:
        aspect_ratio = "9:16"
    else:
        aspect_ratio = "16:9"  # default fallback

    try:
        imagen_model = genai.ImageGenerationModel("imagen-3.0-generate-002")
        result = imagen_model.generate_images(
            prompt=prompt,
            number_of_images=1,
            aspect_ratio=aspect_ratio,
            safety_filter_level="block_only_high",
            person_generation="allow_adult",
        )
        if not result.images:
            print(f" No images returned for: {output_path}")
            return False
        image_data = result.images[0]

        # Ensure output directory exists
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)

        # Extract the raw image bytes from whichever attribute the SDK exposes
        if hasattr(image_data, '_image_bytes'):
            img_bytes = image_data._image_bytes
        elif hasattr(image_data, 'image'):
            img_bytes = image_data.image
        else:
            # Fallback: try to access raw data
            img_bytes = bytes(image_data)
        img = Image.open(io.BytesIO(img_bytes))

        # Resize to exact dimensions if needed
        if img.size != (width, height):
            img = img.resize((width, height), Image.LANCZOS)
        img.save(output_path, format="PNG", optimize=True)
        print(f" Saved: {output_path} ({img.size[0]}x{img.size[1]})")
        return True
    except Exception as e:
        print(f" ERROR generating image: {e}")
        return False


def run_from_args():
    """Parse CLI arguments and run in batch (--config) or single-image mode."""
    parser = argparse.ArgumentParser(description="Gemini Imagen wrapper for thumbnail generation")
    parser.add_argument("--prompt", type=str, help="Image generation prompt")
    parser.add_argument("--output", type=str, help="Output file path (.png)")
    parser.add_argument("--width", type=int, default=1792, help="Image width in pixels")
    parser.add_argument("--height", type=int, default=1024, help="Image height in pixels")
    parser.add_argument("--config", type=str, help="JSON config file with batch of prompts")
    args = parser.parse_args()

    if args.config:
        # Batch mode
        with open(args.config, "r") as f:
            items = json.load(f)
        print(f"Batch mode: {len(items)} image(s) to generate")
        results = []
        for i, item in enumerate(items, start=1):
            print(f"\n[{i}/{len(items)}] Generating: {item['output']}")
            ok = generate_image(
                prompt=item["prompt"],
                output_path=item["output"],
                width=item.get("width", 1792),
                height=item.get("height", 1024),
            )
            results.append({"output": item["output"], "ok": ok})
        print(f"\nBatch complete: {sum(r['ok'] for r in results)}/{len(results)} succeeded")
        for r in results:
            status = "OK " if r["ok"] else "ERR"
            print(f" {status} {r['output']}")
    elif args.prompt and args.output:
        # Single image mode
        print(f"Generating: {args.output}")
        ok = generate_image(
            prompt=args.prompt,
            output_path=args.output,
            width=args.width,
            height=args.height,
        )
        if ok:
            print("Done.")
        else:
            print("Failed.")
            sys.exit(1)
    else:
        parser.print_help()
        sys.exit(1)


if __name__ == "__main__":
    run_from_args()
To create this file from inside Claude Code:
# Claude will write this file if it doesn't exist:
ls ./generate_image.py || echo "Script missing — Claude will create it"
Claude should use this reference when writing image generation prompts. These patterns produce the most consistent results with Gemini Imagen.
| Composition type | Prompt anchor phrase |
|---|---|
| Text-led, dark background | "Bold white sans-serif headline text on deep [colour] background, minimal flat design" |
| Text-led, light background | "High-contrast black headline text on clean white background, editorial layout" |
| Object/illustration centred | "Centred [object] illustration, [style], [colour] background, title text in upper third" |
| Split layout | "Vertical split: left half [colour], right half white. Headline on left side, supporting text on right" |
| Photography style | "Photorealistic [scene description], [mood] lighting, [colour] colour grade, text overlay area at [position]" |
- "flat design, no gradients": clean vector-style outputs
- "editorial magazine style": sophisticated, typographic
- "minimal, lots of whitespace": reduces visual noise
- "high contrast, bold typography": strong thumbnail legibility
- "Bauhaus-inspired": geometric, structured
- "dark mode aesthetic": dark backgrounds with light text
- "startup marketing style": clean, optimistic, sans-serif
Append to every prompt:
Avoid: stock photography clichés, clipart, excessive gradients, drop shadows,
cluttered layout, lens flares, watermarks, low contrast text, AI artefacts.
Gemini Imagen sometimes renders short text phrases accurately and longer headlines poorly. If the article headline is longer than 6 words, consider splitting it in the prompt:
Primary headline: "[First 4-5 words]"
Secondary text: "[Remaining words]"
Or instruct the user to add text overlay manually in Canva after generation if legibility is critical.
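The headline split described above is simple to automate. A minimal sketch, assuming a whitespace word count; the function name `split_headline` and the choice of five words for the primary line (within the suggested 4-5) are illustrative.

```python
def split_headline(headline: str, max_words: int = 6, primary_words: int = 5):
    """Return (primary, secondary); secondary is empty for short headlines."""
    words = headline.split()
    if len(words) <= max_words:
        return " ".join(words), ""
    return " ".join(words[:primary_words]), " ".join(words[primary_words:])
```

For a nine-word headline such as "Why Your Weekly Status Meeting Is Wasting Everyone's Time", this gives "Why Your Weekly Status Meeting" as the primary headline and "Is Wasting Everyone's Time" as the secondary text.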
| Issue | Cause | Fix |
|---|---|---|
| GEMINI_API_KEY not set | Environment variable missing | Run export GEMINI_API_KEY="your-key" and retry |
| ModuleNotFoundError: google.generativeai | Dependency missing | Run pip install google-generativeai |
| No images returned | Safety filter triggered | Revise prompt to remove any ambiguous language; check that the prompt doesn't describe faces, violence, or brand logos |
| Generated image has garbled text | Imagen text rendering limitation | Use shorter headline in prompt, or plan to add text overlay in Canva/Figma post-generation |
| Image is the wrong size | Aspect ratio mismatch | Confirm --width and --height args match one of the supported ratios (16:9, 1:1, 9:16) |
| generate_image.py not found | Script not created yet | Ask Claude to create it using the template above |
| API quota exceeded | Free tier limit | Wait or upgrade to Gemini API paid tier |
| Style drift from brand | Prompt not specific enough | Add exact hex codes and specific style descriptors; add stronger negative prompt |
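For the "wrong size" row above, it can help to check dimensions before generating, because the script template silently falls back to 16:9 when no supported ratio matches. The sketch below mirrors the template's 0.1 tolerance; the function name `supported_ratio` is hypothetical.

```python
def supported_ratio(width: int, height: int, tolerance: float = 0.1):
    """Return "16:9", "1:1" or "9:16" if the dimensions match, else None."""
    ratio = width / height
    for label, target in (("16:9", 16 / 9), ("1:1", 1.0), ("9:16", 9 / 16)):
        if abs(ratio - target) < tolerance:
            return label
    return None
```

A `None` result (for example with 1200x800) signals that the output would be generated at 16:9 and then stretched to the requested size.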
Before marking the task complete, verify each item:
- GEMINI_API_KEY environment variable confirmed set before any API calls
- generate_image.py script exists in project root (created from template if missing)
- Python dependencies installed (google-generativeai, Pillow)
- Images saved to ./thumbnails/[article-slug]/ with correct slug derived from article title
- Files named by composition (e.g. candidate_01_bold_claim.png)
- Evaluation report saved as evaluation_report.md in the output folder
Gemini AI Studio free tier (as of early 2026):
Paid tier:
Recommendation:
Originally created by Karen Spinner (Wondering About AI) — adapted and extended for this library.