skills/sarvam-ai/SKILL.md
Indian AI toolkit powered by Sarvam AI — text-to-speech, speech-to-text, document intelligence, translation, transliteration, language detection, and chat completion across 23 Indian languages. Use when working with Indian languages, Hindi/Tamil/Bengali text, Sarvam AI, or when the user needs translation, transcription, or TTS for South Asian languages.
npx skillsauth add ankitjh4/indic-ai-skills sarvam-ai
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Comprehensive AI toolkit for 23 Indian languages: TTS, STT, Document Intelligence, Translation, Transliteration, Language Detection, and Chat.
export SARVAM_API_KEY="your-api-key"
Languages: hi-IN Hindi, en-IN English, bn-IN Bengali, gu-IN Gujarati, kn-IN Kannada, ml-IN Malayalam, mr-IN Marathi, or-IN/od-IN Odia, pa-IN Punjabi, ta-IN Tamil, te-IN Telugu, ur-IN Urdu, as-IN Assamese, bodo-IN/brx-IN Bodo, doi-IN Dogri, ks-IN Kashmiri, kok-IN Konkani, mai-IN Maithili, mni-IN Manipuri, ne-IN Nepali, sa-IN Sanskrit, sat-IN Santali, sd-IN Sindhi
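The language codes above can be checked before calling any script. The following is a minimal sketch, not part of the skill itself: `LANGUAGES`, `ALIASES`, and `normalize_language` are hypothetical names, and treating the first-listed code of each alias pair (`or-IN`, `bodo-IN`) as canonical is an assumption.

```python
# Codes and names copied from the language list above.
LANGUAGES = {
    "hi-IN": "Hindi", "en-IN": "English", "bn-IN": "Bengali",
    "gu-IN": "Gujarati", "kn-IN": "Kannada", "ml-IN": "Malayalam",
    "mr-IN": "Marathi", "or-IN": "Odia", "pa-IN": "Punjabi",
    "ta-IN": "Tamil", "te-IN": "Telugu", "ur-IN": "Urdu",
    "as-IN": "Assamese", "bodo-IN": "Bodo", "doi-IN": "Dogri",
    "ks-IN": "Kashmiri", "kok-IN": "Konkani", "mai-IN": "Maithili",
    "mni-IN": "Manipuri", "ne-IN": "Nepali", "sa-IN": "Sanskrit",
    "sat-IN": "Santali", "sd-IN": "Sindhi",
}

# Assumption: the first-listed code in each pair is treated as canonical.
ALIASES = {"od-IN": "or-IN", "brx-IN": "bodo-IN"}


def normalize_language(code: str) -> str:
    """Return a canonical code from the list above, or raise ValueError."""
    code = code.strip()
    code = ALIASES.get(code, code)
    if code not in LANGUAGES:
        raise ValueError(f"Unsupported language code: {code!r}")
    return code
```

Failing early on an unknown code avoids spending an API call on a request that cannot succeed.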
python3 scripts/tts.py "नमस्ते, आप कैसे हैं?" --language hi-IN --speaker meera
| Parameter | Default | Description |
|-----------|---------|-------------|
| text | — | Text to convert (max 2500 chars) |
| --language | hi-IN | Language code |
| --speaker | meera | Voice name |
| --output | output.wav | Output file |
| --sample-rate | 24000 | Audio sample rate |
Speakers — Female: Meera, Priya, Neha, Simran, Kavya, Ishita, Shreya, and more. Male: Shubh, Aditya, Rahul, Amit, Dev, Arjun, and more.
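Because the parameter table caps input at 2500 characters, longer text has to be split before synthesis. One possible approach, sketched here under assumptions, is to break on sentence boundaries (including the Devanagari danda) and hard-split only when a single sentence exceeds the limit. `chunk_text` is a hypothetical helper, not something `scripts/tts.py` is known to provide.

```python
import re

MAX_CHARS = 2500  # limit stated in the parameter table


def chunk_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters."""
    # Split after sentence-ending punctuation: ., !, ?, or the danda "।".
    sentences = re.split(r"(?<=[.!?।])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to `scripts/tts.py` with a distinct `--output` file and the resulting audio concatenated afterwards.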
Three modes: REST (quick, <30s), WebSocket (real-time streaming), Batch (long audio, diarization).
# REST — quick transcription
python3 scripts/speech_to_text.py rest audio.mp3
# WebSocket — real-time streaming
python3 scripts/speech_to_text.py websocket audio.wav
# Batch — multiple files with speaker diarization
python3 scripts/speech_to_text.py batch audio1.mp3 audio2.mp3 --diarization --num-speakers 3 --output-dir ./transcripts/
Batch workflow: create job → upload files → start → poll status (Accepted → Pending → Running → Completed) → download results.
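The polling step of the workflow above can be sketched as a small loop. This is an illustration only: `get_status` is a hypothetical callable standing in for whatever call returns the job status, the interval and timeout defaults are arbitrary, and any status outside the four named above is treated as an error because the source does not list failure states.

```python
import time
from typing import Callable

# In-progress states named in the batch workflow above.
IN_PROGRESS = {"Accepted", "Pending", "Running"}


def wait_for_completion(
    get_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reports "Completed"; raise on timeout or unknown status."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "Completed":
            return status
        if status not in IN_PROGRESS:
            # Failure states are not documented here, so stop on anything unexpected.
            raise RuntimeError(f"Unexpected job status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.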
Formats: WAV, MP3, AAC, AIFF, OGG, OPUS, FLAC, MP4/M4A, AMR, WMA, WebM, PCM
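Choosing between the three modes can be reduced to a few rules. The sketch below assumes that the "<30s" figure refers to audio duration and that diarization takes priority over streaming when both are requested. Neither point is confirmed by the text above, and `pick_mode` is a hypothetical helper.

```python
def pick_mode(
    duration_seconds: float,
    streaming: bool = False,
    diarization: bool = False,
) -> str:
    """Return "rest", "websocket", or "batch" for a transcription request."""
    if diarization:
        # Diarization is listed only under the batch mode.
        return "batch"
    if streaming:
        return "websocket"
    # Assumption: the 30-second figure is a limit on audio length.
    return "rest" if duration_seconds < 30 else "batch"
```

The returned string matches the subcommand names used by `scripts/speech_to_text.py`.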
Extract text from PDFs and images (JPEG/PNG).
python3 scripts/document_intelligence.py document.pdf --language hi-IN --format md
python3 scripts/document_intelligence.py --job-id <id> --download -o ./output/
Formats: md (default), html, json. Max 200 MB, 500 pages.
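A pre-flight check can catch oversized or unsupported files before upload. This is a rough sketch: `check_document` is a hypothetical helper, "200 MB" is assumed to mean 200 × 1024 × 1024 bytes, and the 500-page limit is not checked because counting pages needs a PDF library.

```python
from pathlib import Path

MAX_BYTES = 200 * 1024 * 1024  # assumption: "200 MB" read as 200 MiB
ALLOWED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}  # PDFs and JPEG/PNG images


def check_document(path: str) -> Path:
    """Validate extension and size; return the path or raise ValueError."""
    p = Path(path)
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {p.suffix!r}")
    size = p.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"File is {size} bytes; the limit is {MAX_BYTES}")
    return p
```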
# Auto-detect source, translate to Hindi
python3 scripts/text_processing.py translate "Hello, how are you?" --target hi-IN
# Mayura model with colloquial mode
python3 scripts/text_processing.py translate "What's up?" --target hi-IN --model mayura:v1 --mode modern-colloquial
Models: sarvam-translate:v1 (23 languages), mayura:v1 (12 languages, supports modes and transliteration)
Modes (mayura only): formal, modern-colloquial, classic-colloquial, code-mixed
python3 scripts/text_processing.py transliterate "नमस्ते" --source hi-IN --target en-IN
python3 scripts/text_processing.py transliterate "namaste" --source en-IN --target hi-IN --spoken-form
python3 scripts/text_processing.py detect "नमस्ते दुनिया"
# Output: Language: hi-IN, Script: Deva
Two models: sarvam-105b (flagship, complex reasoning) and sarvam-m (efficient, general chat).
python3 scripts/text_processing.py chat "Explain quantum computing" --model sarvam-105b
python3 scripts/text_processing.py chat "What is the capital of India?" --model sarvam-m --temperature 0.8