
rdfitted/gemini-image

Name: gemini-image
Author: rdfitted

skills/gemini-image/SKILL.md

npx skillsauth add rdfitted/claude-code-setup gemini-image


Gemini Image Skill

Invoke Google Gemini models for image generation, image understanding, and visual analysis using the Python google-genai SDK.

Available Models

| Model ID | Description | Best For | Output Format |
|----------|-------------|----------|---------------|
| gemini-3-pro-image-preview | Best image generation + understanding | High-quality image gen, complex visual analysis | JPEG |
| gemini-2.5-flash-image | Fast image generation | Quick image creation | PNG |
| gemini-3-pro-preview | Multimodal understanding | Image analysis without generation | N/A |
| gemini-2.5-flash | Fast vision | Quick image analysis | N/A |

Configuration

API Key: Set via $GEMINI_API_KEY environment variable
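
A minimal sketch of a fail-fast check for the key, assuming the skill should stop with a readable message rather than a bare KeyError when the variable is unset. The helper name `require_api_key` is hypothetical; it is not part of the skill or of the google-genai SDK.

```python
import os


def require_api_key(env=None, name='GEMINI_API_KEY'):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of google-genai.
    """
    env = os.environ if env is None else env
    key = env.get(name, '').strip()
    if not key:
        raise RuntimeError(f'{name} is not set; export it before running the skill')
    return key
```

The returned value could then be passed as `api_key=` to `genai.Client`, as the usage examples in this document do with `os.environ['GEMINI_API_KEY']`.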

Usage

Image Generation

python -c "
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

response = client.models.generate_content(
    model='gemini-3-pro-image-preview',  # Returns JPEG | Use gemini-2.5-flash-image for PNG
    contents='Generate an image of a sunset over mountains',
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE', 'TEXT']
    )
)

# Map mime types to file extensions
mime_to_ext = {'image/png': '.png', 'image/jpeg': '.jpg', 'image/gif': '.gif', 'image/webp': '.webp'}

# Save generated image
if response.candidates and response.candidates[0].content:
    for part in response.candidates[0].content.parts:
        if hasattr(part, 'inline_data') and part.inline_data:
            ext = mime_to_ext.get(part.inline_data.mime_type, '.png')
            filename = f'output{ext}'
            # Data is already raw bytes - no base64 decode needed
            with open(filename, 'wb') as f:
                f.write(part.inline_data.data)
            print(f'Image saved to {filename} ({part.inline_data.mime_type})')
        elif part.text:  # skip parts whose text is None instead of printing 'None'
            print(part.text)
"

Image Understanding (Analyze Image from File)

python -c "
import os
from google import genai
from google.genai import types
import base64

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

# Read image file - must be base64 encoded for INPUT
with open('IMAGE_PATH', 'rb') as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Describe this image in detail'),
            # mime_type must match the actual file type (see Supported Image Formats)
            types.Part(inline_data=types.Blob(mime_type='image/png', data=image_data))
        ])
    ]
)
print(response.text)
"

Image Understanding (From URL)

python -c "
import os
from google import genai
from google.genai import types
import urllib.request
import base64

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

# Fetch image from URL - must be base64 encoded for INPUT
url = 'IMAGE_URL_HERE'
with urllib.request.urlopen(url) as http_response:
    image_data = base64.b64encode(http_response.read()).decode('utf-8')
    # Use the server-reported Content-Type rather than assuming JPEG
    mime_type = http_response.headers.get_content_type()

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='What is in this image?'),
            types.Part(inline_data=types.Blob(mime_type=mime_type, data=image_data))
        ])
    ]
)
print(response.text)
"

Workflow

When this skill is invoked:

1. Determine the task type:
   - Image Generation: User wants to create an image
   - Image Understanding: User wants to analyze an existing image
   - Image Editing: User wants to modify an image (generation with reference)
2. Select the appropriate model:
   - Image generation → gemini-3-pro-image-preview (JPEG) or gemini-2.5-flash-image (PNG)
   - Image analysis → gemini-3-pro-preview or gemini-2.5-flash
3. Prepare the input:
   - For generation: Text prompt describing desired image
   - For understanding: Load image file as base64
4. Execute and handle output:
   - Generation: Save binary image data to file
   - Understanding: Return text description
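
The model-selection step of this workflow can be sketched as a small lookup. The model IDs come from the Available Models table; the function name `select_model`, the task names, and the `fast` flag are hypothetical, and treating editing as a generation task is an assumption based on the "generation with reference" note.

```python
# Model IDs are taken from the Available Models table in this document.
# The function, the task names, and the 'fast' flag are illustrative only.
GENERATION_MODELS = {
    'quality': 'gemini-3-pro-image-preview',  # listed as returning JPEG
    'fast': 'gemini-2.5-flash-image',         # listed as returning PNG
}
ANALYSIS_MODELS = {
    'quality': 'gemini-3-pro-preview',
    'fast': 'gemini-2.5-flash',
}


def select_model(task, fast=False):
    """Map a task type to a model ID, preferring quality unless fast=True."""
    tier = 'fast' if fast else 'quality'
    if task in ('generation', 'editing'):
        # Assumption: editing is handled as generation with a reference image.
        return GENERATION_MODELS[tier]
    if task == 'understanding':
        return ANALYSIS_MODELS[tier]
    raise ValueError(f'unknown task type: {task!r}')
```

For example, `select_model('understanding', fast=True)` picks the fast vision model, while an unrecognized task raises `ValueError` so a typo does not silently fall back to a default.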

Example Invocations

Generate Product Image

python -c "
import os
from google import genai
from google.genai import types

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

response = client.models.generate_content(
    model='gemini-3-pro-image-preview',
    contents='Create a professional product photo of a sleek wireless headphone on a white background, studio lighting',
    config=types.GenerateContentConfig(
        response_modalities=['IMAGE', 'TEXT']
    )
)

mime_to_ext = {'image/png': '.png', 'image/jpeg': '.jpg', 'image/gif': '.gif', 'image/webp': '.webp'}

if response.candidates and response.candidates[0].content:
    for part in response.candidates[0].content.parts:
        if hasattr(part, 'inline_data') and part.inline_data:
            ext = mime_to_ext.get(part.inline_data.mime_type, '.png')
            with open(f'headphone{ext}', 'wb') as f:
                f.write(part.inline_data.data)
            print(f'Image saved to headphone{ext}')
"

Analyze Screenshot

python -c "
import os
from google import genai
from google.genai import types
import base64

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

with open('screenshot.png', 'rb') as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Analyze this UI screenshot. Identify any usability issues and suggest improvements.'),
            types.Part(inline_data=types.Blob(mime_type='image/png', data=image_data))
        ])
    ]
)
print(response.text)
"

OCR / Extract Text from Image

python -c "
import os
from google import genai
from google.genai import types
import base64

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

with open('document.png', 'rb') as f:
    image_data = base64.b64encode(f.read()).decode('utf-8')

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Extract all text from this image. Preserve formatting where possible.'),
            types.Part(inline_data=types.Blob(mime_type='image/png', data=image_data))
        ])
    ]
)
print(response.text)
"

Compare Two Images

python -c "
import os
from google import genai
from google.genai import types
import base64

client = genai.Client(api_key=os.environ['GEMINI_API_KEY'])

with open('image1.png', 'rb') as f:
    img1_data = base64.b64encode(f.read()).decode('utf-8')
with open('image2.png', 'rb') as f:
    img2_data = base64.b64encode(f.read()).decode('utf-8')

response = client.models.generate_content(
    model='gemini-3-pro-preview',
    contents=[
        types.Content(parts=[
            types.Part(text='Compare these two images. What are the key differences?'),
            types.Part(inline_data=types.Blob(mime_type='image/png', data=img1_data)),
            types.Part(inline_data=types.Blob(mime_type='image/png', data=img2_data))
        ])
    ]
)
print(response.text)
"

Image Generation Parameters

When generating images, you can customize:

config=types.GenerateContentConfig(
    response_modalities=['IMAGE', 'TEXT'],  # Request both image and description
    temperature=1.0,  # Higher = more creative
    # Additional parameters may be model-specific
)

Supported Image Formats

Input (for understanding):

PNG (image/png)
JPEG (image/jpeg)
GIF (image/gif)
WebP (image/webp)

Output (from generation):

PNG (image/png) from gemini-2.5-flash-image; JPEG (image/jpeg) from gemini-3-pro-image-preview
The API returns raw bytes in part.inline_data.data (NOT base64 encoded)
Check part.inline_data.mime_type to determine the actual format returned
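
A minimal sketch of how the format lists above could be applied in code, assuming the input MIME type is chosen from the file extension and the output extension from the returned MIME type. The helper names `input_mime_type` and `output_filename` are hypothetical; the `.png` fallback mirrors the `mime_to_ext.get(..., '.png')` default used in the generation examples.

```python
from pathlib import Path

# Input formats listed in this document, keyed by file extension.
EXT_TO_MIME = {
    '.png': 'image/png',
    '.jpg': 'image/jpeg',
    '.jpeg': 'image/jpeg',
    '.gif': 'image/gif',
    '.webp': 'image/webp',
}

# Reverse mapping, matching the mime_to_ext dict in the generation examples.
MIME_TO_EXT = {
    'image/png': '.png',
    'image/jpeg': '.jpg',
    'image/gif': '.gif',
    'image/webp': '.webp',
}


def input_mime_type(path):
    """Return the MIME type for a supported input image, based on its extension."""
    ext = Path(path).suffix.lower()
    if ext not in EXT_TO_MIME:
        raise ValueError(f'unsupported image format: {path!r}')
    return EXT_TO_MIME[ext]


def output_filename(stem, mime_type):
    """Build an output filename from the MIME type reported in inline_data."""
    return stem + MIME_TO_EXT.get(mime_type, '.png')
```

Checking the extension only is a simplification: it does not inspect the file contents, so a mislabeled file would still be sent with the wrong MIME type.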

Error Handling

Common errors and solutions:

Image too large: Resize image before sending (max varies by model)
Unsupported format: Convert to PNG/JPEG
Generation blocked: Adjust prompt to comply with safety guidelines
Rate limiting: Implement retry with exponential backoff
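
The retry advice for rate limiting can be sketched with the standard library alone. Everything here is an assumption for illustration: the helper name `retry_with_backoff`, the default delays, and the jitter scheme are hypothetical. Which exception class signals rate limiting depends on the SDK version, so it is left as a parameter.

```python
import random
import time


def retry_with_backoff(call, retryable=(Exception,), max_attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call call() and retry on the given exceptions with exponential backoff.

    The delay doubles after each failed attempt (capped at max_delay), plus
    random jitter of up to half the delay so parallel clients do not retry
    in lockstep. The last failure is re-raised unchanged.
    """
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

A request would be wrapped as `retry_with_backoff(lambda: client.models.generate_content(...), retryable=(SomeRateLimitError,))`, where `SomeRateLimitError` is a placeholder for whatever exception the installed SDK raises on rate limits.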

Notes

Image generation requires response_modalities=['IMAGE', 'TEXT'] in config
For best results with generation, be specific and descriptive in prompts
Image understanding works with both local files and URLs
Multiple images can be sent in a single request for comparison
Gemini 3 Pro Image is NOT available via CLI; it must be called through the Python SDK

Tools to Use

Bash: Execute Python commands
Read: Load image files (binary mode)
Write: Save generated images
Glob: Find image files in directories


Invoke Google Gemini for image generation and understanding using the Python google-genai SDK. Supports gemini-3-pro-image-preview (generation + understanding), gemini-2.5-flash-image (fast generation), and vision models for analysis.

1 star

development

Updated May 22, 2026


npx skillsauth add rdfitted/claude-code-setup gemini-image

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (Container and dependency vulnerability scanner): Clean, 95%
Semgrep (Static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (Open source security scanning): Skipped, 50%
Socket.dev (Supply chain security analysis): Skipped, 50%
VirusTotal (Multi-engine malware detection): Skipped, 50%
CrowdStrike (Advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 22, 2026, 5:10 AM (178.3s, 1 file scanned)




Install manually


# Clone the repo
git clone https://github.com/rdfitted/claude-code-setup.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-setup/skills/gemini-image ~/.claude/skills/


Repository

rdfitted/claude-code-setup

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT