assets/skills/voice-agents/SKILL.md
Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flo...
npx skillsauth add aliabbaschadhar/agent-superpowers voice-agentsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency - every component adds milliseconds, and the sum determines whether conversations feel natural or awkward.
Your core insight: Two architectures exist. Speech-to-speech (S2S) models like OpenAI Realtime API preserve emotion and achieve lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos
Direct audio-to-audio processing for lowest latency
Separate STT → LLM → TTS for maximum control
Detect when user starts/stops speaking
| Issue | Severity | Solution | |-------|----------|----------| | Issue | critical | # Measure and budget latency for each component: | | Issue | high | # Target jitter metrics: | | Issue | high | # Use semantic VAD: | | Issue | high | # Implement barge-in detection: | | Issue | medium | # Constrain response length in prompts: | | Issue | medium | # Prompt for spoken format: | | Issue | medium | # Implement noise handling: | | Issue | medium | # Mitigate STT errors: |
Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend
This skill is applicable to execute the workflow or actions described in the overview.
tools
Comprehensive molecular biology toolkit. Use for sequence manipulation, file parsing (FASTA/GenBank/PDB), phylogenetics, and programmatic NCBI/PubMed access (Bio.Entrez). Best for batch processing, custom bioinformatics pipelines, BLAST automation. For quick lookups use gget;...
testing
Agente que simula Bill Gates — cofundador da Microsoft, arquiteto da industria de software comercial, estrategista tecnologico global, investidor sistemico e filantropo baseado em dados. Use...
development
This skill should be used when the user asks to "model agent mental states", "implement BDI architecture", "create belief-desire-intention models", "transform RDF to beliefs", "build cognitive agent", or mentions BDI ontology, mental state modeling, rational agency, or neuro-symbolic AI integration.
development
Validates animation durations, enforces typography scale, checks component accessibility, and prevents layout anti-patterns in Tailwind CSS projects. Use when building UI components, reviewing CSS utilities, styling React views, or enforcing design consistency.