Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

agent-skills-hub/voice-agents

Name: voice-agents
Author: agent-skills-hub

skills/voice-agents/SKILL.md

npx skillsauth add agent-skills-hub/agent-skills-hub voice-agents

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Voice Agents

You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency - every component adds milliseconds, and the sum determines whether conversations feel natural or awkward.

Your core insight: Two architectures exist. Speech-to-speech (S2S) models like OpenAI Realtime API preserve emotion and achieve lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos

Capabilities

voice-agents
speech-to-speech
speech-to-text
text-to-speech
conversational-ai
voice-activity-detection
turn-taking
barge-in-detection
voice-interfaces

Patterns

Speech-to-Speech Architecture

Direct audio-to-audio processing for lowest latency

Pipeline Architecture

Separate STT → LLM → TTS for maximum control

Voice Activity Detection Pattern

Detect when user starts/stops speaking

Anti-Patterns

❌ Ignoring Latency Budget

❌ Silence-Only Turn Detection

❌ Long Responses

⚠️ Sharp Edges

| Issue | Severity | Solution | |-------|----------|----------| | Issue | critical | # Measure and budget latency for each component: | | Issue | high | # Target jitter metrics: | | Issue | high | # Use semantic VAD: | | Issue | high | # Implement barge-in detection: | | Issue | medium | # Constrain response length in prompts: | | Issue | medium | # Prompt for spoken format: | | Issue | medium | # Implement noise handling: | | Issue | medium | # Mitigate STT errors: |

Related Skills

Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend

agent-skills-hub/voice-agents

skills/voice-agents/SKILL.md

Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flow with sub-800ms latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API, lowest latency, most natural) and pipeline (STT→LLM→TTS, more control, easier to debug). Key insight: latency is the constraint. Hu

12 stars

development

Updated Mar 27, 2026

$ install --global

skillsauth

npx skillsauth add agent-skills-hub/agent-skills-hub voice-agents

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Mar 29, 2026, 9:14 PM36.1s1 file scanned

SKILL.md

name:: voice-agents
description:: Voice agents represent the frontier of AI interaction - humans speaking naturally with AI systems. The challenge isn't just speech recognition and synthesis, it's achieving natural conversation flow with sub-800ms latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API, lowest latency, most natural) and pipeline (STT→LLM→TTS, more control, easier to debug). Key insight: latency is the constraint. Hu
source:: vibeship-spawner-skills (Apache 2.0)

Voice Agents

Capabilities

voice-agents
speech-to-speech
speech-to-text
text-to-speech
conversational-ai
voice-activity-detection
turn-taking
barge-in-detection
voice-interfaces

Patterns

Speech-to-Speech Architecture

Direct audio-to-audio processing for lowest latency

Pipeline Architecture

Separate STT → LLM → TTS for maximum control

Voice Activity Detection Pattern

Detect when user starts/stops speaking

Anti-Patterns

❌ Ignoring Latency Budget

❌ Silence-Only Turn Detection

❌ Long Responses

⚠️ Sharp Edges

Related Skills

Works well with: agent-tool-builder, multi-agent-orchestration, llm-architect, backend

Related Skills

agent-skills-hub/loki-mode

tools

VerifiedTrustedCommunity

Multi-agent autonomous startup system for Claude Code. Triggers on "Loki Mode". Orchestrates 100+ specialized agents across engineering, QA, DevOps, security, data/ML, business operations, marketing, HR, and customer success. Takes PRD to fully deployed, revenue-generating product with zero human intervention. Features Task tool for subagent dispatch, parallel code review with 3 specialized reviewers, severity-based issue triage, distributed task queue with dead letter handling, automatic deployment to cloud providers, A/B testing, customer feedback loops, incident response, circuit breakers, and self-healing. Handles rate limits via distributed state checkpoints and auto-resume with exponential backoff. Requires --dangerously-skip-permissions flag.

63SKILL.mdUpdated May 21, 2026

agent-skills-hub/loki-mode

agent-skills-hub/bilig-workpaper

tools

VerifiedTrustedCommunity

Formula WorkPaper runtime and MCP server for AI agents and Node.js services. Use when an agent needs spreadsheet-style formulas, cell edits, recalculation, readback verification, or persisted WorkPaper JSON without driving Excel UI.

39SKILL.mdUpdated May 21, 2026

agent-skills-hub/bilig-workpaper

agent-skills-hub/templates

data-ai

VerifiedTrustedCommunity

Project scaffolding templates for new applications. Use when creating new projects from scratch. Contains 12 templates for various tech stacks.

39SKILL.mdUpdated May 21, 2026

agent-skills-hub/templates

agent-skills-hub/app-builder

development

VerifiedTrustedCommunity

Main application building orchestrator. Creates full-stack applications from natural language requests. Determines project type, selects tech stack, coordinates agents.

39SKILL.mdUpdated May 21, 2026

agent-skills-hub/app-builder

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/agent-skills-hub/agent-skills-hub.git

# Copy into Claude Code skills folder (global)
cp -r agent-skills-hub/skills/voice-agents ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

agent-skills-hub/agent-skills-hub

12 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT