sanjay3290/google-tts

Name: google-tts
Author: sanjay3290

skills/google-tts/SKILL.md

npx skillsauth add sanjay3290/ai-skills google-tts

Google Cloud Text-to-Speech

Converts text and documents into audio using Google Cloud TTS API. Supports Neural2, WaveNet, Studio, and Standard voices across 40+ languages.

Setup

Provide the API key through the GOOGLE_TTS_API_KEY environment variable or in skills/google-tts/config.json as {"api_key": "..."}. ffmpeg is required for multi-chunk documents. Optional: pip install PyPDF2 python-docx for PDF and DOCX input.
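
As a rough illustration of the two key sources described above, the lookup could be sketched as follows. The helper name resolve_api_key and the lookup order (environment variable first, then config.json) are assumptions made for this example; the skill's own script may resolve the key differently.

```python
import json
import os
from pathlib import Path

# Config location named in the setup notes above.
CONFIG_PATH = Path("skills/google-tts/config.json")


def resolve_api_key(env=None, config_path=CONFIG_PATH):
    """Return the API key from the environment or config file, else None.

    Hypothetical helper: checking the environment before the config file
    is an assumption, not confirmed behavior of google_tts.py.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_TTS_API_KEY")
    if key:
        return key
    try:
        with open(config_path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    if isinstance(data, dict):
        return data.get("api_key") or None
    return None
```

Calling resolve_api_key() before running any command gives an early, readable failure when neither source is configured.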

Commands

List Voices

python skills/google-tts/scripts/google_tts.py voices --language en-US --type Neural2
python skills/google-tts/scripts/google_tts.py voices --json

Text-to-Speech

# From text or document (PDF, DOCX, MD, TXT)
python skills/google-tts/scripts/google_tts.py tts --text "Hello world" --output ~/Downloads/hello.mp3
python skills/google-tts/scripts/google_tts.py tts --file /path/to/doc.pdf --output ~/Downloads/narration.mp3

# With voice, rate, pitch, encoding options
python skills/google-tts/scripts/google_tts.py tts --file doc.md --voice en-US-Neural2-F --rate 0.9 --encoding MP3 --output ~/Downloads/out.mp3

Podcast Generation

Takes a JSON script with alternating speakers and synthesizes each speaker with a different voice.

[
  {"speaker": "host1", "text": "Welcome to our podcast!"},
  {"speaker": "host2", "text": "Thanks for having me..."}
]

python skills/google-tts/scripts/google_tts.py podcast --script /tmp/script.json --output ~/Downloads/podcast.mp3
python skills/google-tts/scripts/google_tts.py podcast --script /tmp/script.json --voice1 en-US-Neural2-J --voice2 en-US-Neural2-H --rate 0.9 --output ~/Downloads/podcast.mp3
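
A script file in the format shown above can be produced programmatically. The sketch below is one possible approach; build_script and write_script are hypothetical helper names, and the speaker labels are the ones from the sample JSON.

```python
import json


def build_script(turns, speakers=("host1", "host2")):
    """Assign alternating speaker labels to a list of turn texts."""
    return [
        {"speaker": speakers[i % len(speakers)], "text": text}
        for i, text in enumerate(turns)
    ]


def write_script(script, path):
    """Write the script as a JSON array, the shape the podcast command reads."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(script, fh, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    script = build_script(["Welcome to our podcast!", "Thanks for having me..."])
    write_script(script, "/tmp/script.json")
```

The resulting /tmp/script.json can then be passed to the podcast command through --script.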

Workflow

Single-Voice Narration

1. If the user provides a file path, use --file. For generated content, write clean prose to /tmp/tts_input.md first.
2. Default voice: en-US-Neural2-D (male) or en-US-Neural2-F (female). Use Neural2 for the best quality/cost balance.
3. Generate: python skills/google-tts/scripts/google_tts.py tts --file /tmp/tts_input.md --output ~/Downloads/recording.mp3
4. Report the file location and size. Default output to ~/Downloads/.
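
The narration steps above can be sketched as a small wrapper around the command line. This is only an illustration: build_tts_command, describe_output, and narrate are hypothetical names, and the wrapper uses only the flags that appear in the commands above (--file, --output, --voice, --rate).

```python
import os
import subprocess

SCRIPT = "skills/google-tts/scripts/google_tts.py"


def build_tts_command(input_file, output_file, voice=None, rate=None):
    """Build the argv list for the tts subcommand from the documented flags."""
    cmd = ["python", SCRIPT, "tts", "--file", input_file, "--output", output_file]
    if voice is not None:
        cmd += ["--voice", voice]
    if rate is not None:
        cmd += ["--rate", str(rate)]
    return cmd


def describe_output(path):
    """Return a one-line report with the file's location and size."""
    size_kb = os.path.getsize(path) / 1024
    return f"{path} ({size_kb:.1f} KB)"


def narrate(input_file, output_file, **options):
    """Run the tts command and report where the audio was written."""
    output_file = os.path.expanduser(output_file)  # resolve ~ ourselves; no shell is used
    subprocess.run(build_tts_command(input_file, output_file, **options), check=True)
    return describe_output(output_file)
```

For example, narrate("/tmp/tts_input.md", "~/Downloads/recording.mp3", voice="en-US-Neural2-F", rate=0.9) would run the same command as step 3 with an explicit voice and rate, raising an error if the script exits with a non-zero status.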

Podcast from Document

1. Extract text: python skills/google-tts/scripts/extract.py /path/to/document.pdf
2. Generate a two-host conversation script as JSON:
   - Natural discussion, not verbatim reading. Host 1 leads, Host 2 reacts/analyzes.
   - Include intro and outro. Vary turn lengths. Keep turns under 4000 chars.
3. Write script to /tmp/podcast_script.json
4. Generate: python skills/google-tts/scripts/google_tts.py podcast --script /tmp/podcast_script.json --output ~/Downloads/podcast.mp3
5. Clean up temp files.
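
Before synthesizing, the generated script can be checked against the constraints listed above. The following is a minimal sketch, assuming the script is a JSON array of objects with "speaker" and "text" keys as in the earlier sample; validate_script is a hypothetical helper and is not part of the skill.

```python
MAX_TURN_CHARS = 4000  # turns should stay under this length per the workflow above


def validate_script(script, max_chars=MAX_TURN_CHARS):
    """Return a list of problems found in a podcast script (empty if none)."""
    if not isinstance(script, list) or not script:
        return ["script must be a non-empty JSON array"]
    problems = []
    for i, turn in enumerate(script):
        if not isinstance(turn, dict) or not {"speaker", "text"} <= turn.keys():
            problems.append(f"turn {i}: needs 'speaker' and 'text' keys")
            continue
        length = len(turn["text"])
        if length >= max_chars:
            problems.append(f"turn {i}: {length} chars, must be under {max_chars}")
    return problems
```

An empty result means the script satisfies the structural and length checks; anything else is worth fixing before spending API calls on synthesis.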

Reference

Recommended voice type: Neural2 (~$4/1M chars, high quality)
Speaking rate: 0.25-4.0 (0.85-0.95 good for technical content)
Pitch: -20.0 to 20.0 semitones
Encodings: MP3 (default), LINEAR16 (.wav), OGG_OPUS (.ogg)
API limit: 5000 bytes/request. Script auto-chunks at sentence boundaries.
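
To make the chunking note concrete, here is one way sentence-boundary chunking under a byte limit could work. This is a sketch of the general technique, not the script's actual implementation; chunk_text is a hypothetical name, and the sentence splitter is a simple regular expression that will not handle abbreviations well.

```python
import re

MAX_BYTES = 5000  # per-request limit from the reference above


def chunk_text(text, max_bytes=MAX_BYTES):
    """Greedily pack whole sentences into chunks of at most max_bytes (UTF-8)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_bytes is kept whole here;
            # a real implementation would need to split it further.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

The limit is measured in encoded bytes rather than characters, so text with non-ASCII characters fills a chunk faster than its character count suggests.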

Convert documents and text to audio using Google Cloud Text-to-Speech. Use this skill when the user wants to: narrate a document, read aloud text, generate audio from a file, convert text to speech, create a recording of documentation or analysis, create a podcast from a document, or use Google TTS/text-to-speech. Trigger phrases: "read this aloud", "narrate this", "create a recording", "text to speech", "TTS", "convert to audio", "audio from document", "listen to this", "generate audio", "google tts", "create a podcast".

160 stars

development

Updated Mar 20, 2026

npx skillsauth add sanjay3290/ai-skills google-tts

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Status | Scanner | Description | Percentage
Clean | Trivy | Container and dependency vulnerability scanner | 95%
Clean | Semgrep | Static code analysis for vulnerabilities | 95%
Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95%
Error | VirusTotal | Multi-engine malware detection | 70%
Skipped | Snyk (dep) | Open source security scanning | 50%
Skipped | Socket.dev | Supply chain security analysis | 50%
Skipped | CrowdStrike | Advanced threat intelligence | 50%
Skipped | OSV-Scanner | Open Source Vulnerability database check | 50%
Skipped | OWASP Dep-Check | | 50%

Last scanned: Mar 20, 2026, 2:16 PM | 130.8s | 3 files scanned

Related Skills

sanjay3290/outline

devops

Verified | Trusted | Community

Search, read, and manage Outline wiki documents. Use when: (1) searching wiki for documentation, (2) reading wiki pages or articles, (3) listing wiki collections or documents, (4) creating or updating wiki content, (5) exporting documents as markdown. Works with any Outline wiki instance (self-hosted or cloud).

238 | SKILL.md | Updated Mar 20, 2026

sanjay3290/jules

development

Verified | Trusted | Community

Delegate coding tasks to Google Jules AI agent for asynchronous execution. Use when user says: 'have Jules fix', 'delegate to Jules', 'send to Jules', 'ask Jules to', 'check Jules sessions', 'pull Jules results', 'jules add tests', 'jules add docs', 'jules review pr'. Handles: bug fixes, documentation, features, tests, refactoring, code reviews. Works with GitHub repos, creates PRs.

238 | SKILL.md | Updated Mar 20, 2026

sanjay3290/imagen

development

Verified | Trusted | Community

Generate images using Google Gemini's image generation capabilities. Use this skill when the user needs to create, generate, or produce images for any purpose including UI mockups, icons, illustrations, diagrams, concept art, placeholder images, or visual representations.

238 | SKILL.md | Updated Mar 20, 2026

sanjay3290/deep-research

development

Verified | Trusted | Community

Execute autonomous multi-step research using Google Gemini Deep Research Agent. Use for: market analysis, competitive landscaping, literature reviews, technical research, due diligence. Takes 2-10 minutes but produces detailed, cited reports. Costs $2-5 per task.

238 | SKILL.md | Updated Mar 20, 2026

Install manually

# Clone the repo
git clone https://github.com/sanjay3290/ai-skills.git

# Copy into Claude Code skills folder (global)
cp -r ai-skills/skills/google-tts ~/.claude/skills/

Repository

sanjay3290/ai-skills

160 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT