akrindev/gemini-files

Name: gemini-files
Author: akrindev

skills/gemini-files/SKILL.md

npx skillsauth add akrindev/google-studio-skills gemini-files

Gemini File API

Upload and manage files for use with Gemini models through executable scripts, supporting images, audio, video, PDFs, and other file types.

When to Use This Skill

Use this skill when you need to:

Upload images for multimodal analysis
Upload videos for content processing
Upload PDFs for document analysis
Upload audio for transcription or processing
Pre-upload files for batch operations
Check file processing status
List and manage uploaded files
Use files with other Gemini skills (text, image, etc.)

Available Scripts

scripts/upload.js

Purpose: Upload files to Gemini File API

When to use:

Uploading any file for Gemini processing
Preparing files for multimodal generation
Uploading documents for analysis
Batch file preparation

Key parameters:

| Parameter | Description | Example |
|-----------|-------------|---------|
| path | File path (required) | image.jpg |
| --name, -n | Display name | "my-document" |
| --wait, -w | Wait for processing | Flag |

Output: File name, URI, and status information
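
To make the parameter list concrete, here is a minimal sketch of how a command line with those three parameters could be parsed. `parseUploadArgs` is a hypothetical helper written for illustration; the real scripts/upload.js may parse its arguments differently.

```javascript
// Hypothetical re-creation of the documented CLI surface (path, --name/-n, --wait/-w).
// This is not the code of scripts/upload.js, only an illustration of the parameters.
function parseUploadArgs(argv) {
  const opts = { path: null, name: null, wait: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--wait" || arg === "-w") {
      opts.wait = true; // boolean flag, takes no value
    } else if (arg === "--name" || arg === "-n") {
      // The display name is the token that follows the flag.
      if (i + 1 >= argv.length) throw new Error(`${arg} requires a value`);
      opts.name = argv[++i];
    } else if (arg.startsWith("-")) {
      throw new Error(`Unknown option: ${arg}`);
    } else if (opts.path === null) {
      opts.path = arg; // first positional argument is the file path
    } else {
      throw new Error(`Unexpected argument: ${arg}`);
    }
  }
  if (opts.path === null) throw new Error("path is required");
  return opts;
}

console.log(parseUploadArgs(["document.pdf", "--name", "my-document", "-w"]));
```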

Workflows

Workflow 1: Basic File Upload

node scripts/upload.js image.jpg

Best for: Quick uploads, simple files
Output: File name and URI for API use
State: PROCESSING or ACTIVE

Workflow 2: Upload with Custom Name

node scripts/upload.js document.pdf --name "Quarterly Report Q4 2026"

Best for: Organizing files, tracking uploads
Use when: Original filename not descriptive enough
Display name appears in file listings

Workflow 3: Upload and Wait for Processing

node scripts/upload.js video.mp4 --wait

Best for: Large files, videos, audio
Waits until the file reaches the ACTIVE state
Use when: You need to use file immediately after upload

Workflow 4: Upload Image for Analysis

# 1. Upload image
node scripts/upload.js photo.png --name "product-shot"

# 2. Use with gemini-text for analysis
node skills/gemini-text/scripts/generate.js "Describe this image" --image photo.png

Best for: Image analysis, captioning, visual Q&A
Combines with: gemini-text for multimodal processing

Workflow 5: Upload PDF for Content Extraction

# 1. Upload PDF
node scripts/upload.js research-paper.pdf --name "AI-Research-Paper" --wait

# 2. Extract content with gemini-text
node skills/gemini-text/scripts/generate.js "Extract key findings from this document" --image research-paper.pdf

Best for: Document processing, content extraction
Combines with: gemini-text for analysis

Workflow 6: Upload Multiple Files for Batch

# 1. Upload multiple files
for file in *.jpg; do
    node scripts/upload.js "$file"
done

# 2. Create batch job using uploaded files (gemini-batch skill)

Best for: Preparing files for batch processing
Combines with: gemini-batch for bulk operations

Workflow 7: Upload Audio for Transcription

# 1. Upload audio
node scripts/upload.js interview.mp3 --name "interview-001" --wait

# 2. Process with gemini-text (if transcription available)
node skills/gemini-text/scripts/generate.js "Transcribe and summarize this audio" --image interview.mp3

Best for: Audio processing, transcription, podcast analysis
Combines with: gemini-text for audio analysis

Workflow 8: Upload Video for Content Analysis

# 1. Upload video (may take time)
node scripts/upload.js product-demo.mp4 --name "demo-video" --wait

# 2. Analyze with gemini-text
node skills/gemini-text/scripts/generate.js "Analyze this product demo video" --image product-demo.mp4

Best for: Video analysis, content summarization
Note: Videos may require significant processing time

Parameters Reference

Supported File Types

| Type | Extensions | Max Size | Processing Time |
|------|------------|----------|-----------------|
| Images | jpg, jpeg, png, gif, webp | 20MB | Seconds |
| Audio | mp3, wav, aac, flac | 25MB | Seconds-minutes |
| Video | mp4, mov, avi, webm | 2GB | Minutes-hours |
| Documents | pdf, txt | 50MB | Seconds-minutes |
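
A pre-flight size check can catch oversized files before an upload is attempted. The sketch below copies the limits from the table; `checkSize` and the binary-megabyte conversion are assumptions for illustration, and the script itself may count sizes differently.

```javascript
// Limits taken from the Supported File Types table.
// Assumption: 1 MB = 1024 * 1024 bytes; the script may use decimal megabytes instead.
const MB = 1024 * 1024;
const MAX_BYTES = new Map([
  ["image", 20 * MB],
  ["audio", 25 * MB],
  ["video", 2 * 1024 * MB],
  ["document", 50 * MB],
]);

// Hypothetical helper: reports whether a file of the given kind fits its limit.
function checkSize(kind, sizeInBytes) {
  const limit = MAX_BYTES.get(kind);
  if (limit === undefined) throw new Error(`Unknown file kind: ${kind}`);
  return { ok: sizeInBytes <= limit, limit };
}

console.log(checkSize("image", 5 * MB).ok);
```

In a real script the byte count would typically come from `fs.statSync(path).size`.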

MIME Types

The script auto-detects the MIME type from the file extension:

Images: image/jpeg, image/png, image/gif, image/webp
Audio: audio/mpeg, audio/wav
Video: video/mp4, video/quicktime, video/webm
Documents: application/pdf, text/plain
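
Extension-based detection of this kind usually comes down to a lookup table. The following is a minimal sketch that covers only the MIME types listed above; `detectMimeType` is a hypothetical name, and the script's actual mapping may include more entries.

```javascript
// Extension -> MIME type, restricted to the types named in this section.
const MIME_BY_EXTENSION = new Map([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["gif", "image/gif"],
  ["webp", "image/webp"],
  ["mp3", "audio/mpeg"],
  ["wav", "audio/wav"],
  ["mp4", "video/mp4"],
  ["mov", "video/quicktime"],
  ["webm", "video/webm"],
  ["pdf", "application/pdf"],
  ["txt", "text/plain"],
]);

// Returns the MIME type for a path, or null when the extension is missing or unlisted.
function detectMimeType(filePath) {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return null;
  const ext = filePath.slice(dot + 1).toLowerCase(); // case-insensitive match
  return MIME_BY_EXTENSION.get(ext) ?? null;
}

console.log(detectMimeType("photo.PNG"));
```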

File States

| State | Description | Ready for Use |
|-------|-------------|---------------|
| PROCESSING | File is being analyzed | No |
| ACTIVE | File is ready | Yes |
| FAILED | Processing failed | No |

Output Interpretation

Upload Response

Uploading photo.png...
Uploaded: files/abc123...
URI: gs://generation-tmp/abc123...
State: PROCESSING

File name: Use in API calls
URI: Internal Google Cloud Storage reference
State: PROCESSING = wait, ACTIVE = ready
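
When the script is called from another program, the three fields can be pulled out of its printed output. This sketch assumes the output keeps the `Uploaded:`, `URI:`, and `State:` line labels shown in the sample above; `parseUploadOutput` is a hypothetical helper, not part of the skill.

```javascript
// Extracts file name, URI, and state from the script's printed output.
// Assumption: lines are labelled exactly as in the sample upload response.
function parseUploadOutput(stdout) {
  const result = { name: null, uri: null, state: null };
  for (const line of stdout.split(/\r?\n/)) {
    const match = line.match(/^(Uploaded|URI|State):\s*(.+)$/);
    if (!match) continue; // progress lines such as "Uploading ..." are ignored
    const [, label, value] = match;
    if (label === "Uploaded") result.name = value.trim();
    else if (label === "URI") result.uri = value.trim();
    else result.state = value.trim(); // if State appears more than once, the last one wins
  }
  return result;
}

// Sample text copied from the upload response shown in this section.
const sample = [
  "Uploading photo.png...",
  "Uploaded: files/abc123...",
  "URI: gs://generation-tmp/abc123...",
  "State: PROCESSING",
].join("\n");

console.log(parseUploadOutput(sample));
```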

With --wait Flag

Uploading video.mp4...
Uploaded: files/xyz789...
URI: gs://generation-tmp/xyz789...
State: PROCESSING
Waiting for processing...
Still processing...
File ready!

Script polls until state is ACTIVE
Use for large files requiring processing
May take minutes for videos
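
The polling behaviour described above can be sketched as a small loop. This is an illustration under stated assumptions, not the code of scripts/upload.js: `waitUntilActive`, the two-second interval, and the attempt cap are all hypothetical, and `getFile` stands for any async function that resolves to an object with a `state` field (for example a wrapper around `client.files.get({ name })`).

```javascript
// Polls until the file is ACTIVE, fails fast on FAILED, and gives up after maxAttempts.
// getFile is injected so the loop can be exercised without network access.
async function waitUntilActive(getFile, name, { intervalMs = 2000, maxAttempts = 150 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const file = await getFile(name);
    if (file.state === "ACTIVE") return file;
    if (file.state === "FAILED") throw new Error(`Processing failed for ${name}`);
    // Still PROCESSING: pause before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Timed out waiting for ${name}`);
}
```

Capping the number of attempts keeps a stuck upload from blocking a pipeline indefinitely; the right cap depends on how long your largest videos take to process.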

Using Uploaded Files

Once uploaded, reference file by name:

# With gemini-text
node skills/gemini-text/scripts/generate.js "Analyze" --image <uploaded-file-path>

Common Issues

"google-genai not installed"

npm install @google/genai@latest dotenv@latest

"File not found"

Verify file path is correct
Use absolute paths if relative paths fail
Check file extension matches supported types

"File too large"

Check size limits for file type
Compress images/videos if possible
Split large files into smaller parts

"Unsupported file type"

Check supported extensions
Convert to supported format if possible
Images: jpg, png, gif, webp
Videos: mp4, mov, avi, webm

"Processing failed"

Check file is not corrupted
Try re-uploading the file
Verify file format is valid
Check API quota limits

"File still processing" (without --wait)

File state is PROCESSING, not ACTIVE
Use --wait flag or check status later
Large files (especially videos) take time
Processing can take minutes to hours

Best Practices

Upload Strategy

Use --wait for files you'll use immediately
Skip --wait for batch uploads to save time
Use descriptive --name for organization
Keep track of file names for later use

File Organization

Use consistent naming conventions
Include dates or versions in names
Group related files together
Document file names in your code

Performance Tips

Upload multiple files in parallel (separate processes)
Pre-upload files for batch operations
Check file state before using in API calls
Delete old files to manage storage
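
The tip above suggests separate processes; an alternative is to run several uploads concurrently from one Node.js process while capping how many are in flight. The sketch below shows that swapped-in approach. `mapWithLimit` is a hypothetical helper, and the `worker` callback is whatever performs a single upload (for instance a function that spawns scripts/upload.js or calls the SDK).

```javascript
// Runs worker(item) for every item with at most `limit` calls in flight at once.
// Results are returned in the same order as the input items.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been claimed yet

  async function run() {
    while (next < items.length) {
      const index = next++; // claim an item; safe because JavaScript runs this synchronously
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` runners that pull from the shared index until it is exhausted.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

A call such as `mapWithLimit(paths, 3, uploadOne)` would keep three uploads running until the list is done, where `uploadOne` is your own upload function.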

Error Handling

Check return state after upload
Retry failed uploads
Verify file integrity before upload
Log file names for audit trails
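
Retrying failed uploads is commonly done with exponential backoff so that transient errors get time to clear. This is a generic sketch, not something the skill ships: `withRetry`, the retry count, and the base delay are illustrative assumptions, and `task` is any async function that performs one upload attempt.

```javascript
// Calls task() until it succeeds or the retry budget is spent.
// The wait doubles after each failure: baseDelayMs, 2x, 4x, ...
async function withRetry(task, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // no retries left, rethrow below
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Errors that will never succeed on retry, such as an unsupported file type, are better detected up front than retried.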

Integration with Other Skills

gemini-text: Multimodal analysis, document processing
gemini-image: Generate images based on uploaded reference
gemini-batch: Use uploaded files in batch jobs
gemini-embeddings: Create embeddings from file content

File Lifecycle

Upload → PROCESSING → ACTIVE → Use in API
Delete old files to free storage
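Files expire automatically; the Gemini File API documentation gives a retention period of 48 hours
Keep local copies of important files, since uploaded files cannot be downloaded back through the File API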

Related Skills

gemini-text: Analyze uploaded files with text generation
gemini-image: Create images based on uploaded references
gemini-batch: Use uploaded files in batch processing
gemini-embeddings: Generate embeddings from file content

Quick Reference

# Basic upload
node scripts/upload.js image.jpg

# With custom name
node scripts/upload.js document.pdf --name "My Document"

# Wait for processing
node scripts/upload.js video.mp4 --wait

# Multiple files
for file in *.jpg; do node scripts/upload.js "$file"; done

File Management API

The skill's scripts do not cover listing, inspecting, or deleting files, but you can manage files directly from JavaScript:

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// List all files (list() resolves to a pager, so await it before iterating)
const pager = await client.files.list();
for await (const file of pager) {
  console.log(`${file.name}: ${file.displayName} (${file.state})`);
}

// Get file info
const file = await client.files.get({ name: "files/abc123..." });
console.log(`State: ${file.state}`);

// Delete file
await client.files.delete({ name: "files/abc123..." });
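
For cleaning up old uploads, it helps to first decide which files count as old. The sketch below filters a list of file objects by age. It assumes each object carries an RFC 3339 `createTime` string, as the File resource does in the API; `selectStaleFiles` and the age threshold are hypothetical, so verify the field name against your SDK version.

```javascript
// Returns the files created more than maxAgeHours before `now`.
// Entries with a missing or unparseable createTime are skipped rather than deleted.
function selectStaleFiles(files, maxAgeHours, now = new Date()) {
  const cutoff = now.getTime() - maxAgeHours * 60 * 60 * 1000;
  return files.filter((file) => {
    const created = Date.parse(file.createTime);
    return !Number.isNaN(created) && created < cutoff;
  });
}
```

The names of the returned files can then be passed one by one to `client.files.delete({ name })`.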

Reference

Get API key: https://aistudio.google.com/apikey
Documentation: https://ai.google.dev/gemini-api/docs/file-upload
File API: https://ai.google.dev/gemini-api/docs/files
Supported formats: Images, audio, video, documents (see table above)

akrindev/gemini-files (skills/gemini-files/SKILL.md)

1 star · development · Updated Apr 1, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add akrindev/google-studio-skills gemini-files

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 1, 2026, 11:50 PM · 57.2s · 3 files scanned

SKILL.md

name: gemini-files
description: Upload and manage files using Google Gemini File API via scripts/. Use for uploading images, audio, video, PDFs, and other files for use with Gemini models. Supports file upload, status checking, and file management. Triggers on "upload file", "file API", "upload image", "upload PDF", "upload video", "file management".
license: MIT
version: 1.0.0
keywords: file upload, image upload, video upload, PDF upload, audio upload, file API, multimodal, document processing

Related Skills

akrindev/gemini-tts
development
Verified · Trusted · Community
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports multiple voices and streaming. Triggers on "text to speech", "TTS", "generate audio", "voice synthesis", "speak this text".
1 · SKILL.md · Updated Apr 1, 2026

akrindev/gemini-text
development
Verified · Trusted · Community
Generate text content using Google Gemini models via scripts/. Use for text generation, multimodal prompts with images, thinking mode for complex reasoning, JSON-formatted outputs, and Google Search grounding for real-time information. Triggers on "generate with gemini", "use gemini for text", "AI text generation", "multimodal prompt", "gemini thinking mode", "grounded response".
1 · SKILL.md · Updated Apr 1, 2026

akrindev/gemini-image
development
Verified · Trusted · Community
Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and high-resolution output up to 4K. Triggers on "generate image", "create image", "imagen", "text to image", "AI art", "nano banana".
1 · SKILL.md · Updated Apr 1, 2026

akrindev/gemini-embeddings
development
Verified · Trusted · Community
Generate text embeddings using Gemini Embedding API via scripts/. Use for creating vector representations of text, semantic search, similarity matching, clustering, and RAG applications. Triggers on "embeddings", "semantic search", "vector search", "text similarity", "RAG", "retrieval".
1 · SKILL.md · Updated Apr 1, 2026

Install manually

# Clone the repo
git clone https://github.com/akrindev/google-studio-skills.git

# Copy into Claude Code skills folder (global)
cp -r google-studio-skills/skills/gemini-files ~/.claude/skills/

Repository

akrindev/google-studio-skills

1 star

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT