Gemini Batch API Skill

Large-scale asynchronous document processing using Google's Gemini models.

When to Use

Process thousands of documents with the same prompt
Cost-effective bulk extraction (50% cheaper than synchronous API)
Jobs that can tolerate 24-hour completion windows

IRON LAW: Use Examples First, Never Guess API

READ EXAMPLES BEFORE WRITING ANY CODE. NO EXCEPTIONS.

The Rule

User asks for batch API work
    ↓
MANDATORY: Read examples/batch_processor.py or examples/icon_batch_vision.py
    ↓
Copy the pattern exactly
    ↓
DO NOT guess parameter names
DO NOT try wrapper types
DO NOT improvise API calls

Why This Matters

The Batch API has non-obvious requirements. Violating them produces cryptic errors or, in some cases, silent failures:

Metadata must be flat primitives - Nested objects cause cryptic errors
dest is a config field, not a kwarg - Pass via config={"dest": "gs://..."}. Older SDKs accepted dest= directly; newer ones raise TypeError.
Config is a plain dict - Not a wrapper type
Examples are authoritative - Working code beats assumptions

Rationale: Previous agents wasted hours debugging API errors that the examples would have prevented. The patterns in examples/ are battle-tested production code.

Red Flags

About to pass dest= as a kwarg → STOP. That works on older SDKs only; the current SDK puts dest inside config={}. Read the examples.
About to instantiate a CreateBatchJobConfig object → STOP. The config is a plain dict, not a wrapper type.
About to nest metadata like a normal API → STOP. Nested objects trigger BigQuery type errors; flatten the data.
About to assume this works like other Google APIs → STOP. This API is different; the examples are authoritative.
About to improvise the JSONL format → STOP. Copy the structure from the examples instead.
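The flat-metadata rule can be sketched as a small helper. This is only an illustration, not code from examples/: the function name flatten_metadata, the underscore separator, and the choice to serialize lists with json.dumps() are assumptions made here. The examples remain the authority for the actual metadata structure.

```python
import json

# Value types treated here as "flat primitives".
PRIMITIVES = (str, int, float, bool)


def flatten_metadata(metadata: dict, sep: str = "_") -> dict:
    """Return a copy of `metadata` with no nested objects.

    Nested dicts are expanded into prefixed keys; any other
    non-primitive value (for example a list) becomes a JSON string.
    """
    flat = {}
    for key, value in metadata.items():
        if isinstance(value, dict):
            # Recurse, then prefix the child keys with the parent key.
            for sub_key, sub_value in flatten_metadata(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        elif isinstance(value, PRIMITIVES):
            flat[key] = value
        else:
            # Hypothetical policy: keep complex data as a JSON string.
            flat[key] = json.dumps(value)
    return flat


nested = {"doc_id": "a1", "source": {"page": 3, "tags": ["x", "y"]}}
print(flatten_metadata(nested))
```

Key collisions (for example an existing "source_page" key next to a nested source.page) are not handled in this sketch.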

MANDATORY Checklist Before ANY Batch API Code

[ ] Read examples/batch_processor.py OR examples/icon_batch_vision.py
[ ] Identify which example matches the use case (Standard API vs Vertex AI)
[ ] Copy the example's API call pattern exactly
[ ] Copy the example's JSONL structure exactly
[ ] Copy the example's metadata structure exactly
[ ] Adapt for specific needs only after copying base pattern

Enforcement: Writing batch API code without reading examples first violates this IRON LAW and will result in preventable errors.

Prerequisites

Install gcloud SDK

# macOS: Install via nix-darwin (add to ~/nix/ configuration)
# Or if already available: gcloud --version

# Linux: Install Google Cloud SDK from official sources
curl https://sdk.cloud.google.com | bash

Authentication Setup

# Authenticate with Google Cloud Platform
gcloud auth login

# Set up Application Default Credentials for Python libraries
gcloud auth application-default login

# Enable Vertex AI API in your project
gcloud services enable aiplatform.googleapis.com

Why both auth methods?

gcloud auth login: For gsutil and gcloud CLI commands
gcloud auth application-default login: For the google-genai Python library (the package behind from google import genai)
CRITICAL: Vertex AI requires ADC (step 2), not just an API key

Create GCS Bucket

# Create bucket in us-central1 (required region)
gsutil mb -l us-central1 gs://your-batch-bucket

# Verify bucket location is us-central1
gsutil ls -L -b gs://your-batch-bucket | grep "Location"

See references/gcs-setup.md for complete setup guide.

Quick Start

Standard Gemini API (API Key)

Uses the Gemini File API for input. Results are returned via job.dest.file_name.

from google import genai

client = genai.Client()  # Uses GOOGLE_API_KEY env var

# Upload JSONL to File API
uploaded = client.files.upload(
    file="requests.jsonl",
    config={"mime_type": "application/jsonl"}
)

# Submit batch job
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src=uploaded.name,  # "files/..." URI
    config={"display_name": "my-batch-job"}
)

# Results available at job.dest.file_name after completion

Vertex AI (Recommended for GCS workflows)

Uses GCS URIs directly. dest is a field of the config dict in the current SDK (older SDKs accepted dest= as a kwarg — that now raises TypeError: Batches.create() got an unexpected keyword argument 'dest').

from google import genai

# Use Vertex AI with ADC (not API key)
client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)

# Submit batch job with GCS paths.
# Current SDK signature: create(*, model, src, config)
job = client.batches.create(
    model="gemini-2.5-flash-lite",
    src="gs://bucket/requests.jsonl",     # GCS input
    config={
        "display_name": "my-job",
        "dest": "gs://bucket/outputs/",   # GCS output (Vertex AI only!)
    },
)

Verify your SDK before changing anything: run inspect.signature(client.batches.create). If dest appears among the parameters, the kwarg form works; otherwise pass dest inside config.
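The signature check can be wrapped in a helper so the call site adapts to whichever SDK is installed. This is a minimal sketch: has_parameter and build_create_kwargs are names invented here, and create_old / create_new are hypothetical stand-ins for the two signatures described in this section, not real SDK functions. In practice you would pass client.batches.create instead.

```python
import inspect


def has_parameter(func, name: str) -> bool:
    """Return True if `func` declares a parameter called `name`."""
    return name in inspect.signature(func).parameters


def build_create_kwargs(create, model: str, src: str, dest: str) -> dict:
    """Place `dest` wherever the given `create` callable expects it."""
    if has_parameter(create, "dest"):
        # Older style: dest is a top-level keyword argument.
        return {"model": model, "src": src, "dest": dest}
    # Current style: dest lives inside the plain config dict.
    return {"model": model, "src": src, "config": {"dest": dest}}


# Hypothetical stand-ins for the older and current signatures.
def create_old(*, model, src, dest=None, config=None):
    return "old"


def create_new(*, model, src, config=None):
    return "new"


print(build_create_kwargs(create_new, "gemini-2.5-flash-lite",
                          "gs://bucket/requests.jsonl", "gs://bucket/outputs/"))
```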

Key difference: Standard API uses File API (files/...), Vertex AI uses GCS (gs://...) with dest (now a config field).

Core Workflow

Standard API:

Create JSONL request file with prompts
Upload JSONL to File API via client.files.upload()
Submit batch job via client.batches.create(src=uploaded.name)
Monitor for completion — use Monitor tool (jobs expire after 24 hours)
Download results from job.dest.file_name

Vertex AI:

Upload files to GCS bucket (us-central1 region required)
Create JSONL request file with document URIs and prompts
Submit batch job via client.batches.create(src=..., config={"dest": ...})
Monitor for completion — use Monitor tool (jobs expire after 24 hours)
Download and parse results from GCS output URI
Handle failures gracefully (partial failures are common)
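For the parse step, the Key Gotchas table notes that batch responses are raw JSON and that the text sits at candidates[0].content.parts[0].text. A defensive reader for that path might look like the sketch below. The function name extract_text is an assumption, and how the response object is nested inside each output line is deliberately left out; take that from the examples.

```python
from typing import Any, Optional


def extract_text(response: dict) -> Optional[str]:
    """Return the first text part of a raw response dict, or None.

    Returning None instead of raising lets the caller count and
    retry failed rows rather than abort the whole result file.
    """
    try:
        parts: Any = response["candidates"][0]["content"]["parts"]
        for part in parts:
            if "text" in part:
                return part["text"]
    except (KeyError, IndexError, TypeError):
        # Missing candidates, empty content, or an unexpected shape.
        pass
    return None


ok = {"candidates": [{"content": {"parts": [{"text": "hello"}]}}]}
empty = {"candidates": [{"finishReason": "MAX_TOKENS"}]}
print(extract_text(ok), extract_text(empty))
```

Rows for which this returns None are the partial failures to log and resubmit.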

Monitoring Batch Jobs with Monitor Tool

After submitting a batch job, use Monitor instead of sleep-polling in Python:

Monitor(
  description="Gemini batch job progress",
  persistent=true,
  timeout_ms=3600000,
  command="while true; do uv run python3 -c \"from google import genai; j=genai.Client().batches.get(name='$JOB_NAME'); print(f'{j.state} | {j.name}'); exit(0 if j.state in ('JOB_STATE_SUCCEEDED','JOB_STATE_FAILED','JOB_STATE_CANCELLED') else 1)\" && break; sleep 60; done"
)

This frees the conversation to continue working while the batch runs. You get notified when the job completes or fails — no polling loop blocking your context.
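The exit-code logic packed into the one-liner above can be unpacked for readability. The sketch below is an illustration only: normalize_state and monitor_exit_code are names chosen here, FakeState is a hypothetical stand-in for whatever enum the SDK returns, and the three terminal states are the ones listed in the command.

```python
from enum import Enum

# Terminal states taken from the Monitor command above.
TERMINAL_STATES = frozenset(
    {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}
)


def normalize_state(state) -> str:
    """Reduce an enum member or a plain string to its bare state name."""
    # str() of an enum member may look like "JobState.JOB_STATE_RUNNING".
    return str(state).rsplit(".", 1)[-1]


def monitor_exit_code(state) -> int:
    """0 ends the shell loop (job finished); 1 keeps it polling."""
    return 0 if normalize_state(state) in TERMINAL_STATES else 1


class FakeState(str, Enum):
    """Hypothetical stand-in for an SDK job-state enum."""
    RUNNING = "JOB_STATE_RUNNING"
    SUCCEEDED = "JOB_STATE_SUCCEEDED"


print(monitor_exit_code("JOB_STATE_RUNNING"), monitor_exit_code("JOB_STATE_FAILED"))
```

Exit code 0 covers failure and cancellation as well as success, so the caller still has to inspect the printed state.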

Key Gotchas (API Structure)

Metadata must be flat primitives (no nested objects — BigQuery-backed storage). dest is a config field, not a top-level kwarg in the current SDK (Vertex AI only). Config is a plain dict (not a wrapper type).

See the Red Flags in the first Iron Law section above — the same gotchas apply here. The Key Gotchas table below summarizes all critical issues.

Key Gotchas

| Issue | Solution |
|-------|----------|
| **Nested metadata fails** | Use flat primitives or json.dumps() for complex data |
| **TypeError: unexpected keyword dest** | Move dest inside config={} (Vertex AI; current SDK) |
| **Mixing API patterns** | Standard API: File API + no dest. Vertex AI: GCS + dest |
| Auth errors with Vertex AI | Run gcloud auth application-default login |
| vertexai=True requires ADC | API key is ignored with vertexai=True |
| Missing aiplatform API | Run gcloud services enable aiplatform.googleapis.com |
| Region mismatch (Vertex) | Use us-central1 bucket only |
| Wrong URI format (Vertex) | Use gs:// not https:// |
| Invalid JSONL | Use scripts/validate_jsonl.py |
| Image batch: inline data | Use fileData.fileUri for batch, not inline |
| Duplicate IDs | Hash file content + prompt for unique IDs |
| Large PDFs fail | Split at 50 pages / 50MB max |
| JSON parsing fails | Use robust extraction (see gotchas.md) |
| Output not found (Vertex) | Output URI is prefix, not file path |
| uploadToFileSearchStore 503 for files >10KB | Use two-step: files.upload() then fileSearchStores.importFile() |
| File stuck in PROCESSING state | Poll files.get() until state is ACTIVE before importing |
| SDK Pager stops after first page | Use pager.hasNextPage() + pager.nextPage(), NOT for await |
| Batch inlinedResponse.response.text is undefined | Response is raw JSON, not hydrated class. Use candidates[0].content.parts[0].text |
| Store document displayName is random ID after importFile | Read bibkey from customMetadata, not displayName |
| responseMimeType + tools in batch = error code 3 | Omit responseMimeType when using tools; use prompt-based JSON instructions |

Top 3 mistakes (bolded above):

Using nested objects in metadata instead of flat primitives
Mixing Standard API and Vertex AI patterns
Passing dest= as a kwarg instead of inside config={} (Vertex AI; current SDK)

See references/gotchas.md for detailed solutions (now with Gotchas 10-16).
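The "hash file content + prompt" fix for duplicate IDs could look like the following. It is a sketch under assumptions: the function name request_id, the choice of SHA-256, and the 16-character truncation are all picked here for illustration and are not taken from references/gotchas.md.

```python
import hashlib


def request_id(file_bytes: bytes, prompt: str, length: int = 16) -> str:
    """Derive a stable ID from the document bytes and the prompt.

    The same (file, prompt) pair always maps to the same ID, so a
    re-run can detect work that was already submitted.
    """
    digest = hashlib.sha256()
    digest.update(file_bytes)
    # Separator keeps (b"ab", "c") distinct from (b"a", "bc").
    digest.update(b"\x00")
    digest.update(prompt.encode("utf-8"))
    return digest.hexdigest()[:length]


print(request_id(b"%PDF-1.7 example bytes", "Extract the title."))
```

Truncating the digest shortens the ID at the cost of a higher collision probability; keep the full hex digest if the corpus is very large.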

Rate Limits

| Limit | Value |
|-------|-------|
| Max requests per JSONL | 10,000 |
| Max concurrent jobs | 10 |
| Max job size | 100MB |
| Job expiration | 24 hours |
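To stay under the per-file request limit, a request list can be split before the JSONL files are written. The sketch below assumes the 10,000 figure from the table; the names MAX_REQUESTS_PER_JSONL and chunk_requests are illustrative, and the 100MB size limit is not checked here.

```python
from typing import Iterator, List

# Per-file request limit taken from the Rate Limits table.
MAX_REQUESTS_PER_JSONL = 10_000


def chunk_requests(
    requests: List[dict], size: int = MAX_REQUESTS_PER_JSONL
) -> Iterator[List[dict]]:
    """Yield consecutive slices of `requests`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(requests), size):
        yield requests[start:start + size]


# 25,000 placeholder rows split into files of at most 10,000 rows.
rows = [{"n": i} for i in range(25_000)]
print([len(chunk) for chunk in chunk_requests(rows)])
```

Each chunk would become one JSONL file and one batch job, so the number of chunks submitted at once also has to respect the concurrent-job limit.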

Recommended Models

| Model | Use Case | Cost | Location | Thinking default |
|-------|----------|------|----------|------------------|
| gemini-2.5-flash-lite | Most batch jobs | Lowest | us-central1 | OFF |
| gemini-2.5-flash | Complex extraction | Medium | us-central1 | OFF |
| gemini-2.5-pro | Highest accuracy | Highest | us-central1 | ON (cannot disable) |
| gemini-3-flash-preview | New gen, larger context | 5× flash-lite | global | HIGH (set MINIMAL!) |
| gemini-3.1-flash-lite-preview | Cheapest gen-3 | ~2× 2.5 flash-lite | global | HIGH (set MINIMAL!) |
| gemini-embedding-001 | Default for text-only (short titles, classification, retrieval over text) | Low | Standard API | n/a |
| gemini-embedding-2 | Multimodal (text+image) inputs | Low | Standard API | n/a |
| text-embedding-005 | Need Vertex Batch console visibility (legacy) | Low | us-central1 | n/a |

Critical for Gemini 3.x: Always set thinkingConfig: {thinkingLevel: "MINIMAL"} in generationConfig or batch responses will silently fail with MAX_TOKENS and empty content. See references/gotchas.md Gotcha 12.
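One way to make sure the setting is never forgotten is to route every generationConfig through a small helper before it goes into a request. This is a sketch: the helper name with_minimal_thinking is an assumption, and only the thinkingConfig / thinkingLevel keys and the "MINIMAL" value come from the note above. Where generationConfig sits inside each JSONL row should be copied from the examples.

```python
import copy
from typing import Optional


def with_minimal_thinking(generation_config: Optional[dict] = None) -> dict:
    """Return a copy of `generation_config` with thinkingLevel forced to MINIMAL."""
    # Deep copy so the caller's dict is never mutated.
    config = copy.deepcopy(generation_config) if generation_config else {}
    config.setdefault("thinkingConfig", {})["thinkingLevel"] = "MINIMAL"
    return config


print(with_minimal_thinking({"temperature": 0}))
```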

Critical for embedding batches: Embedding work has its own rules and failure modes — use file-based JSONL with per-row key on the Standard API; never inlined_requests (scrambles order at scale). Default to gemini-embedding-001 for text-only tasks. See references/embeddings.md and examples/embeddings_batch.py.
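The point of the per-row key is that results can be matched back to inputs without relying on output order. The sketch below shows that join in isolation. Only the field name key comes from the text above; the function name join_by_key and the other fields in the sample rows are hypothetical, so see references/embeddings.md for the real row shapes.

```python
from typing import List, Optional, Tuple


def join_by_key(
    requests: List[dict], results: List[dict]
) -> List[Tuple[dict, Optional[dict]]]:
    """Pair each request with the result that carries the same key.

    Requests with no matching result are paired with None so that
    missing rows are visible instead of silently shifting the order.
    """
    by_key = {row["key"]: row for row in results}
    return [(req, by_key.get(req["key"])) for req in requests]


# Hypothetical rows: results arrive in a different order, one is missing.
requests = [{"key": "a", "text": "first"}, {"key": "b", "text": "second"},
            {"key": "c", "text": "third"}]
results = [{"key": "c", "value": 3}, {"key": "a", "value": 1}]
for req, res in join_by_key(requests, results):
    print(req["key"], res)
```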

Additional Resources

References

references/embeddings.md - NEW: Dedicated reference for embedding batches (model choice, file-based + keyed pattern, sentinel verification)
references/gcs-setup.md - Complete GCS and Vertex AI setup guide
references/gotchas.md - 14 critical production gotchas (Gemini 3.x thinking, location='global'; embedding gotcha now lives in embeddings.md)
references/best-practices.md - Idempotent IDs, state tracking, validation
references/scale-up-testing.md - Incremental scale-up testing (LangExtract prototyping, LLM-as-judge, Vertex AI batch)
references/troubleshooting.md - Common errors and debugging
references/vertex-ai.md - Enterprise alternative with comparison
references/cli-reference.md - gsutil and gcloud commands
references/files-api.md - Files API: upload, poll-until-ACTIVE, 48h expiry, size limits
references/file-search.md - File Search (managed RAG): store creation, metadata filtering, grounding metadata
references/structured-output.md - responseJsonSchema / responseSchema: the supported schema subset, enums

Examples

examples/icon_batch_vision.py - NEW: Batch vision analysis with Vertex AI
examples/batch_processor.py - Complete GeminiBatchProcessor class
examples/embeddings_batch.py - NEW: gemini-embedding-2 via client.batches.create_embeddings() (the only supported production path; Vertex Batch rejects this model)
examples/pipeline_template.py - Customizable pipeline template

Scripts

scripts/validate_jsonl.py - Validate JSONL before submission
scripts/test_single.py - Test single request before batch

External Documentation

Gemini Batch API Guide
Google Cloud Storage
Vertex AI Batch Prediction

Date Awareness

The Gemini API evolves rapidly. If you are unsure about an API feature or a model name, verify it against the current documentation.
