skills/gemini-batch/SKILL.md
This skill should be used when the user asks to "use Gemini Batch API", "process documents at scale", "submit a batch job", "upload files to Gemini", or needs large-scale LLM processing.
npx skillsauth add edwinhu/workflows gemini-batchInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Large-scale asynchronous document processing using Google's Gemini models.
READ EXAMPLES BEFORE WRITING ANY CODE. NO EXCEPTIONS.
User asks for batch API work
↓
MANDATORY: Read examples/batch_processor.py or examples/icon_batch_vision.py
↓
Copy the pattern exactly
↓
DO NOT guess parameter names
DO NOT try wrapper types
DO NOT improvise API calls
The Batch API has non-obvious requirements that will fail silently:
dest is a config field, not a kwarg - Pass via config={"dest": "gs://..."}. Older SDKs accepted dest= directly; newer ones raise TypeError.Rationale: Previous agents wasted hours debugging API errors that the examples would have prevented. The patterns in examples/ are battle-tested production code.
dest= as a kwarg → STOP. That works on older SDKs only; the current SDK puts dest inside config={}. Read the examples.CreateBatchJobConfig object → STOP. The config is a plain dict, not a wrapper type.examples/batch_processor.py OR examples/icon_batch_vision.pyEnforcement: Writing batch API code without reading examples first violates this IRON LAW and will result in preventable errors.
# macOS: Install via nix-darwin (add to ~/nix/ configuration)
# Or if already available: gcloud --version
# Linux: Install Google Cloud SDK from official sources
curl https://sdk.cloud.google.com | bash
# Authenticate with Google Cloud Platform
gcloud auth login
# Set up Application Default Credentials for Python libraries
gcloud auth application-default login
# Enable Vertex AI API in your project
gcloud services enable aiplatform.googleapis.com
Why both auth methods?
gcloud auth login: For gsutil and gcloud CLI commandsgcloud auth application-default login: For google-generativeai Python library# Create bucket in us-central1 (required region)
gsutil mb -l us-central1 gs://your-batch-bucket
# Verify bucket location is us-central1
gsutil ls -L -b gs://your-batch-bucket | grep "Location"
See references/gcs-setup.md for complete setup guide.
Uses the Gemini File API for input. Results returned via batch_job.dest.file_name.
from google import genai
client = genai.Client() # Uses GOOGLE_API_KEY env var
# Upload JSONL to File API
uploaded = client.files.upload(
file="requests.jsonl",
config={"mime_type": "application/jsonl"}
)
# Submit batch job
job = client.batches.create(
model="gemini-2.5-flash-lite",
src=uploaded.name, # "files/..." URI
config={"display_name": "my-batch-job"}
)
# Results available at job.dest.file_name after completion
Uses GCS URIs directly. dest is a field of the config dict in the
current SDK (older SDKs accepted dest= as a kwarg — that now raises
TypeError: Batches.create() got an unexpected keyword argument 'dest').
from google import genai
# Use Vertex AI with ADC (not API key)
client = genai.Client(
vertexai=True,
project="your-project-id",
location="us-central1"
)
# Submit batch job with GCS paths.
# Current SDK signature: create(*, model, src, config)
job = client.batches.create(
model="gemini-2.5-flash-lite",
src="gs://bucket/requests.jsonl", # GCS input
config={
"display_name": "my-job",
"dest": "gs://bucket/outputs/", # GCS output (Vertex AI only!)
},
)
Verify your SDK before changing: inspect.signature(client.batches.create).
If dest is in the kwargs, the kwarg form works; otherwise use config.
Key difference: Standard API uses File API (files/...), Vertex AI uses GCS (gs://...) with dest (now a config field).
Standard API:
client.files.upload()client.batches.create(src=uploaded.name)job.dest.file_nameVertex AI:
client.batches.create(src=..., config={"dest": ...})After submitting a batch job, use Monitor instead of sleep-polling in Python:
Monitor(
description="Gemini batch job progress",
persistent=true,
timeout_ms=3600000,
command="while true; do uv run python3 -c \"import google.genai as genai; j=genai.batches.get(name='$JOB_NAME'); print(f'{j.state} | {j.name}'); exit(0 if j.state in ('JOB_STATE_SUCCEEDED','JOB_STATE_FAILED','JOB_STATE_CANCELLED') else 1)\" && break; sleep 60; done"
)
This frees the conversation to continue working while the batch runs. You get notified when the job completes or fails — no polling loop blocking your context.
Metadata must be flat primitives (no nested objects — BigQuery-backed storage). dest is a config field, not a top-level kwarg in the current SDK (Vertex AI only). Config is a plain dict (not a wrapper type).
See the Red Flags in the first Iron Law section above — the same gotchas apply here. The Key Gotchas table below summarizes all critical issues.
| Issue | Solution |
|-------|----------|
| Nested metadata fails | Use flat primitives or json.dumps() for complex data |
| TypeError: unexpected keyword dest | Move dest inside config={} (Vertex AI; current SDK) |
| Mixing API patterns | Standard API: File API + no dest. Vertex AI: GCS + dest |
| Auth errors with Vertex AI | Run gcloud auth application-default login |
| vertexai=True requires ADC | API key is ignored with vertexai=True |
| Missing aiplatform API | Run gcloud services enable aiplatform.googleapis.com |
| Region mismatch (Vertex) | Use us-central1 bucket only |
| Wrong URI format (Vertex) | Use gs:// not https:// |
| Invalid JSONL | Use scripts/validate_jsonl.py |
| Image batch: inline data | Use fileData.fileUri for batch, not inline |
| Duplicate IDs | Hash file content + prompt for unique IDs |
| Large PDFs fail | Split at 50 pages / 50MB max |
| JSON parsing fails | Use robust extraction (see gotchas.md) |
| Output not found (Vertex) | Output URI is prefix, not file path |
| uploadToFileSearchStore 503 for files >10KB | Use two-step: files.upload() then fileSearchStores.importFile() |
| File stuck in PROCESSING state | Poll files.get() until state is ACTIVE before importing |
| SDK Pager stops after first page | Use pager.hasNextPage() + pager.nextPage(), NOT for await |
| Batch inlinedResponse.response.text is undefined | Response is raw JSON, not hydrated class. Use candidates[0].content.parts[0].text |
| Store document displayName is random ID after importFile | Read bibkey from customMetadata, not displayName |
| responseMimeType + tools in batch = error code 3 | Omit responseMimeType when using tools; use prompt-based JSON instructions |
Top 3 mistakes (bolded above):
dest= as a kwarg instead of inside config={} (Vertex AI; current SDK)See references/gotchas.md for detailed solutions (now with Gotchas 10-16).
| Limit | Value | |-------|-------| | Max requests per JSONL | 10,000 | | Max concurrent jobs | 10 | | Max job size | 100MB | | Job expiration | 24 hours |
| Model | Use Case | Cost | Location | Thinking default |
|-------|----------|------|----------|------------------|
| gemini-2.5-flash-lite | Most batch jobs | Lowest | us-central1 | OFF |
| gemini-2.5-flash | Complex extraction | Medium | us-central1 | OFF |
| gemini-2.5-pro | Highest accuracy | Highest | us-central1 | ON (cannot disable) |
| gemini-3-flash-preview | New gen, larger context | 5× flash-lite | global | HIGH (set MINIMAL!) |
| gemini-3.1-flash-lite-preview | Cheapest gen-3 | ~2× 2.5 flash-lite | global | HIGH (set MINIMAL!) |
| gemini-embedding-001 | Default for text-only (short titles, classification, retrieval over text) | Low | Standard API | n/a |
| gemini-embedding-2 | Multimodal (text+image) inputs | Low | Standard API | n/a |
| text-embedding-005 | Need Vertex Batch console visibility (legacy) | Low | us-central1 | n/a |
Critical for Gemini 3.x: Always set thinkingConfig: {thinkingLevel: "MINIMAL"} in generationConfig or batch responses will silently fail with MAX_TOKENS and empty content. See references/gotchas.md Gotcha 12.
Critical for embedding batches: Embedding work has its own rules and failure modes — use file-based JSONL with per-row key on the Standard API; never inlined_requests (scrambles order at scale). Default to gemini-embedding-001 for text-only tasks. See references/embeddings.md and examples/embeddings_batch.py.
references/embeddings.md - NEW: Dedicated reference for embedding batches (model choice, file-based + keyed pattern, sentinel verification)references/gcs-setup.md - Complete GCS and Vertex AI setup guidereferences/gotchas.md - 14 critical production gotchas (Gemini 3.x thinking, location='global'; embedding gotcha now lives in embeddings.md)references/best-practices.md - Idempotent IDs, state tracking, validationreferences/scale-up-testing.md - Incremental scale-up testing (LangExtract prototyping, LLM-as-judge, Vertex AI batch)references/troubleshooting.md - Common errors and debuggingreferences/vertex-ai.md - Enterprise alternative with comparisonreferences/cli-reference.md - gsutil and gcloud commandsexamples/icon_batch_vision.py - NEW: Batch vision analysis with Vertex AIexamples/batch_processor.py - Complete GeminiBatchProcessor classexamples/embeddings_batch.py - NEW: gemini-embedding-2 via client.batches.create_embeddings() (the only supported production path; Vertex Batch rejects this model)examples/pipeline_template.py - Customizable pipeline templatescripts/validate_jsonl.py - Validate JSONL before submissionscripts/test_single.py - Test single request before batchGemini API evolves rapidly. For API features or model names with uncertainty, verify against current documentation.
tools
Use when "query Dewey Data", "deweydata.io", "SafeGraph places/patterns/spend", "Advan foot traffic", "POI / points of interest", "mobility data", "dataplor", "Veraset", "PassBy", "crypto/Bitcoin ATM locations", or any pull from the Dewey Data academic marketplace (UVA/NYU Platform Subscription) via the deweypy/deweydatapy client, DuckDB, or the Dewey MCP server.
development
Use when submitting jobs to UVA HPC (Rivanna/Afton), writing Slurm scripts (sbatch/srun/squeue), converting SGE to Slurm, running compute on any Slurm-managed cluster, or building WRDS data pipelines with polars on HPC. Triggers: 'submit to HPC', 'sbatch', 'squeue', 'slurm job', 'run on Rivanna', 'run on Afton', 'HPC array job', 'convert SGE to Slurm', 'polars on HPC', 'WRDS from HPC'.
testing
Internal skill for literature review and source materialization. Called after brainstorm, before setup. NOT user-facing.
development
This skill should be used when the user asks to "add paper", "paperpile add", "fetch PDF for", "find and add", "search paperpile", "find in paperpile", "paperpile search", "label paper", "trash paper", "download paper", "paperpile index", "edit paper metadata", "update paper title", "fix paper author", "paperpile edit", "find PDF online", "search google for PDF", "resolve PDF", "fetch PDF for citation", "get full-text for DOI", "resolve cite to PDF", or any request to manage their Paperpile library or resolve a citation to a local PDF.