.claude/skills/imagen-generation/SKILL.md
Google Imagen image generation via Vertex AI — text-to-image, image editing, inpainting, and upscaling using ImageGenerationModel
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add oimiragieo/agent-studio imagen-generation
Skill({ skill: 'imagen-generation' });
Use when:
pip install google-cloud-aiplatform pillow
gcloud auth application-default login
gcloud config set project YOUR_PROJECT_ID
gcloud services enable aiplatform.googleapis.com
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel
vertexai.init(project='YOUR_PROJECT_ID', location='us-central1')
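The hard-coded project and region above can instead be read from the environment. The following is a minimal sketch, assuming the conventional `GOOGLE_CLOUD_PROJECT` variable and a hypothetical `VERTEX_LOCATION` variable (neither appears in the original skill); `resolve_vertex_config` is an illustrative helper, not part of the SDK.

```python
import os


def resolve_vertex_config(env=None, default_location='us-central1'):
    """Build keyword arguments for vertexai.init() from environment variables.

    GOOGLE_CLOUD_PROJECT and VERTEX_LOCATION are assumed variable names.
    """
    env = os.environ if env is None else env
    project = env.get('GOOGLE_CLOUD_PROJECT')
    if not project:
        raise ValueError('Set GOOGLE_CLOUD_PROJECT or pass the project explicitly')
    return {
        'project': project,
        'location': env.get('VERTEX_LOCATION', default_location),
    }


# Usage (requires the SDK installed above):
#   import vertexai
#   vertexai.init(**resolve_vertex_config())
```

Failing early on a missing project ID tends to give a clearer error than letting the first API call fail.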
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel
from PIL import Image
import io
vertexai.init(project='YOUR_PROJECT_ID', location='us-central1')
model = ImageGenerationModel.from_pretrained('imagegeneration@006')
# Basic generation
response = model.generate_images(
    prompt='A futuristic city skyline at sunset, photorealistic, 4K',
    number_of_images=1,
    aspect_ratio='1:1',  # '1:1', '9:16', '16:9', '3:4', '4:3'
    guidance_scale=7.5,  # 1-20; higher = closer to prompt
    seed=42,  # Optional: for reproducibility (the API may require add_watermark=False when a seed is set)
)
# Save the image
image = response.images[0]
image.save('output.png')
# Or convert to PIL
pil_image = Image.open(io.BytesIO(image._image_bytes))
pil_image.show()
prompts = [
'A serene mountain lake at dawn',
'Abstract digital art with geometric shapes',
'A cozy coffee shop interior',
]
# Generate and save one image per prompt
for i, prompt in enumerate(prompts):
    response = model.generate_images(
        prompt=prompt,
        number_of_images=1,
    )
    response.images[0].save(f'image_{i}.png')
    print(f'Saved image_{i}.png for: {prompt[:50]}')
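In the loop above, one failing prompt (an API error, or an empty result that makes `response.images[0]` raise) stops the remaining prompts from running. A hedged sketch of isolating failures follows; `run_batch` is a hypothetical helper, and `generate_fn` stands in for a call such as `lambda p: model.generate_images(prompt=p, number_of_images=1)`.

```python
def run_batch(prompts, generate_fn):
    """Call generate_fn once per prompt, keeping successes and failures apart.

    Returns (results, failures), both keyed by the prompt's index.
    """
    results, failures = {}, {}
    for i, prompt in enumerate(prompts):
        try:
            results[i] = generate_fn(prompt)
        except Exception as exc:  # deliberately broad so the batch keeps going
            failures[i] = f'{type(exc).__name__}: {exc}'
    return results, failures
```

Keying both dictionaries by index keeps output file names such as `image_{i}.png` aligned with the original prompt list even when some prompts fail.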
from vertexai.preview.vision_models import ImageGenerationModel, Image as VertexImage
model = ImageGenerationModel.from_pretrained('imagegeneration@006')
# Load source image
source_image = VertexImage.load_from_file('source.png')
response = model.edit_image(
    base_image=source_image,
    prompt='Make the sky more dramatic with storm clouds',
    edit_mode='inpainting-insert',  # 'inpainting-insert' | 'inpainting-remove' | 'outpainting'
    mask_mode='background',  # 'background' | 'foreground' | 'semantic'
    number_of_images=1,
    guidance_scale=8.0,
)
response.images[0].save('edited.png')
from PIL import Image, ImageDraw
# Create a mask (white = area to inpaint, black = keep)
source_pil = Image.open('source.png')
mask = Image.new('L', source_pil.size, 0) # Black background
draw = ImageDraw.Draw(mask)
draw.rectangle([100, 100, 300, 300], fill=255) # White region to replace
mask.save('mask.png')
# Load for Vertex AI
source_image = VertexImage.load_from_file('source.png')
mask_image = VertexImage.load_from_file('mask.png')
response = model.edit_image(
    base_image=source_image,
    mask=mask_image,
    prompt='A beautiful garden fountain',
    edit_mode='inpainting-insert',
    number_of_images=1,
)
response.images[0].save('inpainted.png')
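If the source image is smaller than expected, the hard-coded rectangle `[100, 100, 300, 300]` above may fall partly or wholly outside it, leaving a mask with little or no white area. A small sketch of clamping the box to the image bounds before drawing follows; `clamp_box` is a hypothetical helper, not part of Pillow or the Vertex AI SDK.

```python
def clamp_box(box, size):
    """Clamp (left, top, right, bottom) to an image of (width, height).

    Coordinates are treated as inclusive pixel indices.
    Raises ValueError if no part of the box lies inside the image.
    """
    left, top, right, bottom = box
    width, height = size
    left, right = max(0, left), min(width - 1, right)
    top, bottom = max(0, top), min(height - 1, bottom)
    if left > right or top > bottom:
        raise ValueError(f'box {box} lies outside a {width}x{height} image')
    return [left, top, right, bottom]


# Usage with the mask code above:
#   draw.rectangle(clamp_box([100, 100, 300, 300], source_pil.size), fill=255)
```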
from vertexai.preview.vision_models import ImageGenerationModel, Image as VertexImage
model = ImageGenerationModel.from_pretrained('imagegeneration@006')
source_image = VertexImage.load_from_file('low_res.png')
response = model.upscale_image(
    image=source_image,
    upscale_factor='x2',  # 'x2' or 'x4'
)
# upscale_image returns a single image rather than a response with .images
response.save('upscaled.png')
Use negative prompts to exclude unwanted elements:
response = model.generate_images(
    prompt='Portrait of a professional business person in an office',
    negative_prompt='blurry, low quality, cartoon, anime, watermark, text, logo',
    number_of_images=2,
    guidance_scale=9.0,
)
# Save every returned variant
for i, img in enumerate(response.images):
    img.save(f'portrait_{i}.png')
# Imagen 3: highest quality, best prompt adherence
model = ImageGenerationModel.from_pretrained('imagen-3.0-generate-001')
response = model.generate_images(
    prompt='A photorealistic macro photograph of a dewdrop on a spider web at sunrise',
    number_of_images=1,
    aspect_ratio='3:4',
    safety_filter_level='block_some',  # 'block_most' | 'block_some' | 'block_few'
    person_generation='allow_adult',  # 'dont_allow' | 'allow_adult'
)
response.images[0].save('imagen3_output.png')
| Model ID                       | Use Case                       | Notes                         |
| ------------------------------ | ------------------------------ | ----------------------------- |
| imagen-3.0-generate-001        | Highest quality generation     | Latest, best prompt adherence |
| imagen-3.0-fast-generate-001   | Fast/cost-effective generation | Lower latency                 |
| imagegeneration@006            | Stable production model        | Well-tested                   |
| imagegeneration@005            | Previous generation            | Legacy                        |
| imagen-3.0-capability-001      | Editing and transformations    | Inpaint, outpaint             |
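The table can be folded into a small lookup so that call sites name a purpose rather than a model ID. This is a sketch using only the IDs listed above; the purpose keys (`'final'`, `'draft'`, and so on) and `pick_model` are hypothetical names chosen for illustration.

```python
# Model IDs copied from the table; the purpose keys are illustrative only.
MODEL_BY_PURPOSE = {
    'final': 'imagen-3.0-generate-001',
    'draft': 'imagen-3.0-fast-generate-001',
    'stable': 'imagegeneration@006',
    'legacy': 'imagegeneration@005',
    'edit': 'imagen-3.0-capability-001',
}


def pick_model(purpose):
    """Return the model ID for a purpose, with a readable error for typos."""
    try:
        return MODEL_BY_PURPOSE[purpose]
    except KeyError:
        known = ', '.join(sorted(MODEL_BY_PURPOSE))
        raise ValueError(f'unknown purpose {purpose!r}; expected one of: {known}') from None


# Usage:
#   model = ImageGenerationModel.from_pretrained(pick_model('draft'))
```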
For local or non-GCP environments, use Stable Diffusion via diffusers:
pip install diffusers transformers accelerate torch
from diffusers import StableDiffusionPipeline
import torch
pipe = StableDiffusionPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5',
    torch_dtype=torch.float16,
)
pipe = pipe.to('cuda')  # or 'cpu' (slow)
image = pipe(
    prompt='A futuristic city at sunset',
    negative_prompt='blurry, low quality',
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save('output.png')
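The snippet above hard-codes `'cuda'` and `torch.float16`. Half precision is generally intended for GPUs, so a CPU fallback is usually paired with `float32`. A hedged sketch of that choice follows; `pick_device_and_dtype` and `detect_cuda` are hypothetical helpers, and the dtype is returned as a name so the selection logic can be exercised without a GPU.

```python
def pick_device_and_dtype(cuda_available):
    """Return (device, dtype_name) for a diffusers pipeline."""
    if cuda_available:
        return 'cuda', 'float16'
    return 'cpu', 'float32'


def detect_cuda():
    """Report whether torch sees a CUDA device; False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


# Usage with the pipeline above:
#   import torch
#   device, dtype_name = pick_device_and_dtype(detect_cuda())
#   pipe = StableDiffusionPipeline.from_pretrained(
#       'runwayml/stable-diffusion-v1-5',
#       torch_dtype=getattr(torch, dtype_name),
#   ).to(device)
```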
- Use imagen-3.0-fast-generate-001 for iteration/drafts; switch to imagen-3.0-generate-001 for final output.
- Set seed for reproducibility, to avoid regenerating identical images.
- Pricing: see cloud.google.com/vertex-ai/pricing (billed per image).
- safety_filter_level controls strictness: block_most (safest) → block_few (permissive).
- person_generation='dont_allow' disables human face generation for child-safety compliance.
- Log generation_parameters from the response for audit/reproducibility requirements.
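The audit tip can be sketched as an append-only JSON Lines log. This assumes the parameters are available as a plain dict (in the SDK they would presumably come from each generated image's `generation_parameters`); `log_generation` and the default file name are hypothetical.

```python
import json
import time


def log_generation(params, output_path, log_path='generation_log.jsonl'):
    """Append one JSON record describing a generated image and return it."""
    record = {
        'timestamp': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
        'output_path': output_path,
        'parameters': dict(params),
    }
    with open(log_path, 'a', encoding='utf-8') as fh:
        # default=str keeps the log writable if a value is not JSON-native
        fh.write(json.dumps(record, sort_keys=True, default=str) + '\n')
    return record
```

One record per line keeps the log easy to append to from several runs and to filter later by prompt or seed.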