skills/audio-format-conversion/SKILL.md
# Audio Format Conversion
## Capability

Handles audio encoding and decoding between various formats (mulaw, linear16, PCM) and sample rates, enabling interoperability between Twilio (mulaw 8 kHz) and STT/TTS providers that may use different audio formats.

## MCP Tools
| Tool | Input Schema | Output | Rate Limit |
|------|-------------|--------|------------|
| `audio.decode` | `z.object({ data: z.instanceof(Buffer), fromEncoding: z.enum(['mulaw', 'linear16', 'pcm']), toEncoding: z.enum(['mulaw', 'linear16', 'pcm']), sampleRate: z.number() })` | `{ audio: Buffer, encoding: string, sampleRate: number }` | 1000 RPM |
| `audio.resample` | `z.object({ data: z.instanceof(Buffer), fromSampleRate: z.number(), toSampleRate: z.number() })` | `{ audio: Buffer, sampleRate: number }` | 1000 RPM |
| `audio.encode` | `z.object({ data: z.instanceof(Buffer), encoding: z.enum(['mulaw', 'linear16', 'pcm']), sampleRate: z.number() })` | `{ data: Buffer }` | 1000 RPM |
| `audio.validate` | `z.object({ data: z.instanceof(Buffer), expectedEncoding: z.string(), expectedSampleRate: z.number() })` | `{ valid: boolean, detected?: { encoding: string, sampleRate: number } }` | 100 RPM |
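The table lists per-tool limits in requests per minute but does not say how they are enforced. As a rough illustration only, a fixed-window counter is one way a caller could stay under those limits. The class name `FixedWindowLimiter` and the `TOOL_LIMITS_RPM` map are hypothetical and not part of the skill.

```typescript
// Hypothetical client-side limiter; the skill's actual enforcement is not documented.
const TOOL_LIMITS_RPM: Record<string, number> = {
  "audio.decode": 1000,
  "audio.resample": 1000,
  "audio.encode": 1000,
  "audio.validate": 100,
};

class FixedWindowLimiter {
  private readonly limitPerMinute: number;
  private windowStartMs = 0;
  private count = 0;

  constructor(limitPerMinute: number) {
    this.limitPerMinute = limitPerMinute;
  }

  // Returns true if a call at time `nowMs` fits in the current one-minute window.
  tryAcquire(nowMs: number): boolean {
    if (nowMs - this.windowStartMs >= 60000) {
      // A full minute has passed: start a new window.
      this.windowStartMs = nowMs;
      this.count = 0;
    }
    if (this.count >= this.limitPerMinute) return false;
    this.count += 1;
    return true;
  }
}
```

A caller might create one limiter per tool, for example `new FixedWindowLimiter(TOOL_LIMITS_RPM["audio.validate"])`, and call `tryAcquire(Date.now())` before each request.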
## Usage Examples

Decode mulaw 8 kHz audio to linear16 with `audio.decode`:

```json
{
  "name": "audio.decode",
  "arguments": {
    "data": "<Buffer mulaw 8kHz>",
    "fromEncoding": "mulaw",
    "toEncoding": "linear16",
    "sampleRate": 8000
  }
}
```

Response:

```json
{
  "audio": "<Buffer linear16 8kHz>",
  "encoding": "linear16",
  "sampleRate": 8000
}
```
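For readers who want to see what a mulaw to linear16 decode involves, here is a minimal sketch that assumes standard G.711 mu-law expansion. The function names are hypothetical and this is not the skill's own implementation. It works on typed arrays; a Node `Buffer` is a `Uint8Array` subclass, so it can be passed in directly.

```typescript
// Expand one G.711 mu-law byte to a signed 16-bit linear sample.
function muLawToLinearSample(byte: number): number {
  const u = ~byte & 0xff; // mu-law bytes are stored bit-inverted
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  // Rebuild the biased magnitude, then remove the bias of 0x84 (132).
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) !== 0 ? -magnitude : magnitude;
}

// Decode a mulaw chunk: one input byte becomes one 16-bit sample.
function decodeMuLaw(input: Uint8Array): Int16Array {
  const out = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    out[i] = muLawToLinearSample(input[i]);
  }
  return out;
}
```

The decoded chunk is twice the size in bytes, since each 8-bit mulaw value expands to a 16-bit sample. `Int16Array` uses the host byte order, so a caller targeting little-endian linear16 on a big-endian host would need to swap bytes.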
Downsample 16 kHz audio to 8 kHz with `audio.resample`:

```json
{
  "name": "audio.resample",
  "arguments": {
    "data": "<Buffer 16kHz audio>",
    "fromSampleRate": 16000,
    "toSampleRate": 8000
  }
}
```

Response:

```json
{
  "audio": "<Buffer 8kHz audio>",
  "sampleRate": 8000
}
```
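The source does not say which resampling algorithm the tool uses. The sketch below assumes plain linear interpolation over 16-bit samples, which is simple but applies no anti-aliasing filter, so downsampling content with energy above the new Nyquist frequency would alias. `resampleLinear` is a hypothetical name.

```typescript
// Resample 16-bit PCM by linear interpolation (no low-pass filtering).
function resampleLinear(input: Int16Array, fromRate: number, toRate: number): Int16Array {
  if (fromRate <= 0 || toRate <= 0) {
    throw new RangeError("sample rates must be positive");
  }
  if (fromRate === toRate || input.length === 0) return input.slice();

  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Int16Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample

  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1); // hold the last sample at the edge
    const frac = pos - left;
    out[i] = Math.round(input[left] * (1 - frac) + input[right] * frac);
  }
  return out;
}
```

For a 16 kHz to 8 kHz call like the one above, this reduces to keeping every second sample. A production resampler would normally low-pass filter first.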
Encode provider audio as mulaw 8 kHz with `audio.encode`:

```json
{
  "name": "audio.encode",
  "arguments": {
    "data": "<Buffer PCM 24kHz from Deepgram>",
    "encoding": "mulaw",
    "sampleRate": 8000
  }
}
```

Response:

```json
{
  "data": "<Buffer mulaw 8kHz>"
}
```
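As an illustration of the encoding direction, the following sketch compresses 16-bit linear samples to mulaw, assuming standard G.711 mu-law companding with the usual bias of 132 and clip level of 32635. The function names are hypothetical, and the sketch does not resample: it expects input already at the target sample rate.

```typescript
const MULAW_BIAS = 0x84; // 132, added before finding the segment
const MULAW_CLIP = 32635; // largest magnitude that survives adding the bias

// Compress one signed 16-bit linear sample to a G.711 mu-law byte.
function linearToMuLawSample(sample: number): number {
  const sign = sample < 0 ? 0x80 : 0x00;
  const biased = Math.min(Math.abs(sample), MULAW_CLIP) + MULAW_BIAS;

  // Find the segment (exponent): position of the highest set bit above bit 7.
  let exponent = 7;
  for (let mask = 0x4000; exponent > 0 && (biased & mask) === 0; mask >>= 1) {
    exponent--;
  }
  const mantissa = (biased >> (exponent + 3)) & 0x0f;
  return ~(sign | (exponent << 4) | mantissa) & 0xff; // stored bit-inverted
}

// Encode a linear16 chunk: one 16-bit sample becomes one byte.
function encodeMuLaw(samples: Int16Array): Uint8Array {
  const out = new Uint8Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = linearToMuLawSample(samples[i]);
  }
  return out;
}
```

Silence (sample value 0) encodes to `0xFF`, which is why a mulaw buffer of all `0xFF` bytes is silent while a buffer of zero bytes is not.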
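Check a buffer against the expected format with `audio.validate`:

```json
{
  "name": "audio.validate",
  "arguments": {
    "data": "<Buffer audio>",
    "expectedEncoding": "mulaw",
    "expectedSampleRate": 8000
  }
}
```

Response:

```json
{
  "valid": true
}
```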
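Raw audio buffers carry no header, so how `audio.validate` detects encoding and sample rate is not something the source explains. The sketch below shows only the cheap structural checks a validator could start with. It assumes `pcm` means 16-bit samples, which the source does not state, and `checkChunkStructure` is a hypothetical helper.

```typescript
type AudioEncoding = "mulaw" | "linear16" | "pcm";

interface StructuralCheck {
  valid: boolean;
  reason?: string;
}

// Structural sanity checks only; this does not detect the actual encoding or rate.
function checkChunkStructure(data: Uint8Array, encoding: AudioEncoding): StructuralCheck {
  if (data.length === 0) {
    return { valid: false, reason: "empty buffer" };
  }
  // Assumption: mulaw is 1 byte per sample, linear16 and pcm are 2 bytes per sample.
  const bytesPerSample = encoding === "mulaw" ? 1 : 2;
  if (data.length % bytesPerSample !== 0) {
    return { valid: false, reason: "length is not a whole number of samples" };
  }
  return { valid: true };
}
```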
## Failure Modes

| Failure | Cause | Recovery |
|---------|-------|----------|
| Invalid encoding | Unknown format | Return error with supported formats |
| Sample rate mismatch | Unexpected sample rate | Auto-detect or return error |
| Buffer too small | Incomplete audio chunk | Pad with silence or return error |
| Conversion overflow | Values out of range | Clamp values, log warning |
| Memory exhaustion | Very large buffer | Process in chunks, return error if too large |
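Two of the recoveries in the table, clamping out-of-range values and padding short chunks with silence, are small enough to sketch. The helper names are hypothetical. The sketch relies on the fact that silence is `0xFF` in mulaw and `0x00` in linear16.

```typescript
// Clamp a computed sample into the signed 16-bit range (conversion overflow recovery).
function clampToInt16(value: number): number {
  return Math.max(-32768, Math.min(32767, Math.round(value)));
}

// Pad an incomplete chunk with silence up to `targetBytes` (buffer too small recovery).
function padWithSilence(
  chunk: Uint8Array,
  targetBytes: number,
  encoding: "mulaw" | "linear16",
): Uint8Array {
  if (chunk.length >= targetBytes) return chunk;
  const padded = new Uint8Array(targetBytes);
  padded.fill(encoding === "mulaw" ? 0xff : 0x00); // encoding-specific silence byte
  padded.set(chunk, 0); // original audio stays at the start
  return padded;
}
```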
## Conversion Matrix

| From → To | Mulaw 8k | Linear16 8k | Linear16 16k | Linear16 24k | Linear16 48k |
|-----------|----------|-------------|--------------|--------------|--------------|
| Mulaw 8k | - | Decode | Decode+Resample | Decode+Resample | Decode+Resample |
| Linear16 8k | Encode | - | Resample | Resample | Resample |
| Linear16 16k | Resample+Encode | Resample | - | Resample | Resample |
| Linear16 24k | Resample+Encode | Resample | Resample | - | Resample |
| Linear16 48k | Resample+Encode | Resample | Resample | Resample | - |

Operations are listed in the order they are applied: mulaw is decoded to linear16 before resampling, and linear audio is resampled to 8 kHz before mulaw encoding.
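The matrix can be read as a small planning rule with linear16 as the intermediate format: decode first if the source is mulaw (companded samples should not be interpolated directly), resample if the rates differ, and encode last if the target is mulaw. A sketch under that reading, with hypothetical type and function names:

```typescript
type MatrixEncoding = "mulaw" | "linear16";
type ConversionStep = "decode" | "resample" | "encode";

interface AudioFormat {
  encoding: MatrixEncoding;
  sampleRate: number;
}

// Return the ordered steps needed to convert `from` into `to`.
function planConversion(from: AudioFormat, to: AudioFormat): ConversionStep[] {
  if (from.encoding === to.encoding && from.sampleRate === to.sampleRate) {
    return []; // diagonal of the matrix: nothing to do
  }
  const steps: ConversionStep[] = [];
  if (from.encoding === "mulaw") steps.push("decode"); // expand to linear16 first
  if (from.sampleRate !== to.sampleRate) steps.push("resample"); // always on linear samples
  if (to.encoding === "mulaw") steps.push("encode"); // compress last
  return steps;
}
```

For example, TTS output at linear16 24 kHz headed to Twilio plans as `["resample", "encode"]`.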
## Performance

| Operation | Typical Latency (per 20 ms chunk) |
|-----------|-----------------------------------|
| Mulaw ↔ Linear16 | < 1 ms |
| Resample (2x) | < 2 ms |
| Resample (3x, 6x) | < 5 ms |
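To put the 20 ms chunk in concrete terms, the size of a chunk in bytes follows from the sample rate and sample width. The helper below is a hypothetical illustration of that arithmetic, assuming mono audio.

```typescript
// Bytes in one mono chunk of `chunkMs` milliseconds.
function bytesPerChunk(sampleRate: number, bytesPerSample: number, chunkMs: number = 20): number {
  const samples = Math.round((sampleRate * chunkMs) / 1000);
  return samples * bytesPerSample;
}

// Chunk sizes for the formats in this document (mulaw = 1 byte, linear16 = 2 bytes per sample).
const CHUNK_BYTES = {
  mulaw8k: bytesPerChunk(8000, 1), // 160
  linear16_8k: bytesPerChunk(8000, 2), // 320
  linear16_16k: bytesPerChunk(16000, 2), // 640
  linear16_24k: bytesPerChunk(24000, 2), // 960
  linear16_48k: bytesPerChunk(48000, 2), // 1920
};
```

Each chunk represents 20 ms of audio, so the latencies in the table leave most of the real-time budget for network and provider processing.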
## Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `audio.conversions.total` | Counter | Total conversions performed |
| `audio.conversions.errors` | Counter | Conversion errors |
| `audio.conversions.latency_ms` | Histogram | Conversion latency |
| `audio.resampling.total` | Counter | Resampling operations |

## Spans

| Span | Attributes |
|------|------------|
| `audio.decode` | `from_encoding`, `to_encoding`, `sample_rate`, `buffer_size` |
| `audio.resample` | `from_rate`, `to_rate`, `buffer_size` |
| `audio.encode` | `encoding`, `sample_rate`, `buffer_size` |
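The source names the metrics but not the telemetry library behind them. As a dependency-free illustration, the sketch below records the same metric names in memory around any conversion call. `InMemoryAudioMetrics` is hypothetical, and treating a resample as also counting toward `audio.conversions.total` is an assumption.

```typescript
type ConversionKind = "decode" | "resample" | "encode";

class InMemoryAudioMetrics {
  readonly counters = new Map<string, number>();
  readonly latencyMs: number[] = []; // stand-in for audio.conversions.latency_ms

  private increment(name: string): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + 1);
  }

  // Run `operation`, counting it and recording its latency even if it throws.
  observe<T>(kind: ConversionKind, operation: () => T): T {
    const startedAt = Date.now();
    this.increment("audio.conversions.total"); // assumption: resamples count here too
    if (kind === "resample") this.increment("audio.resampling.total");
    try {
      return operation();
    } catch (error) {
      this.increment("audio.conversions.errors");
      throw error;
    } finally {
      this.latencyMs.push(Date.now() - startedAt);
    }
  }
}
```

A real deployment would more likely forward these values to its telemetry backend and attach the span attributes listed above.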