Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

reaatech/skills/barge-in-handling

Name: skills/barge-in-handling
Author: reaatech

skills/barge-in-handling/SKILL.md

npx skillsauth add reaatech/voice-agent-kit skills/barge-in-handling


Barge-In Handling

Capability

Detects user speech during TTS playback and immediately interrupts audio output to handle the new utterance, providing a natural conversational experience where users can interrupt the agent mid-sentence.

MCP Tools

| Tool | Input Schema | Output | Rate Limit |
|------|-------------|--------|------------|
| `bargeIn.enable` | `z.object({ sessionId: z.string(), config: BargeInConfig })` | `{ enabled: boolean }` | 10 RPM |
| `bargeIn.detect` | `z.object({ sessionId: z.string(), interimTranscript: z.string(), confidence: z.number() })` | `{ interrupted: boolean, action: 'continue' \| 'interrupt' }` | 1000 RPM |
| `bargeIn.trigger` | `z.object({ sessionId: z.string(), reason: z.string() })` | `{ cancelled: boolean, ttsStopped: boolean }` | 100 RPM |
| `bargeIn.disable` | `z.object({ sessionId: z.string() })` | `{ disabled: boolean }` | 10 RPM |
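For callers that cannot depend on zod, the schemas above can be mirrored as plain TypeScript types. The following is a minimal sketch, not the skill's own code: the fields of `BargeInConfig` are assumed from the usage example in this document, and `isDetectInput` and `toDetectOutput` are hypothetical helpers. Rejecting non-finite numbers is an extra assumption that the zod schema may not share.

```typescript
// Plain-TypeScript mirror of two tool schemas (the skill declares them with zod).
// BargeInConfig fields are assumed from the usage example; the real type may differ.
interface BargeInConfig {
  minSpeechDuration: number;   // ms
  confidenceThreshold: number; // 0..1
  silenceThreshold: number;    // ms
}

interface EnableInput {
  sessionId: string;
  config: BargeInConfig;
}

interface DetectInput {
  sessionId: string;
  interimTranscript: string;
  confidence: number;
}

interface DetectOutput {
  interrupted: boolean;
  action: "continue" | "interrupt";
}

// Hypothetical runtime guard for bargeIn.detect arguments.
function isDetectInput(value: unknown): value is DetectInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.sessionId === "string" &&
    typeof v.interimTranscript === "string" &&
    typeof v.confidence === "number" &&
    Number.isFinite(v.confidence)
  );
}

// Hypothetical helper that keeps `interrupted` and `action` consistent.
function toDetectOutput(interrupted: boolean): DetectOutput {
  return { interrupted, action: interrupted ? "interrupt" : "continue" };
}
```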

Usage Examples

Example 1: Enable barge-in for a session

User intent: Allow user to interrupt TTS playback

Tool call:

```
{
  "name": "bargeIn.enable",
  "arguments": {
    "sessionId": "sess-abc123",
    "config": {
      "minSpeechDuration": 300,
      "confidenceThreshold": 0.7,
      "silenceThreshold": 200
    }
  }
}
```

Expected response:
```
{
  "enabled": true
}
```

Example 2: Detect interruption from STT interim results

User intent: Check if interim transcript indicates user is speaking

Tool call:

```
{
  "name": "bargeIn.detect",
  "arguments": {
    "sessionId": "sess-abc123",
    "interimTranscript": "wait I didn't mean",
    "confidence": 0.85
  }
}
```

Expected response:
```
{
  "interrupted": true,
  "action": "interrupt"
}
```

Example 3: Trigger barge-in (stop TTS and handle new utterance)

User intent: Stop TTS and process user interruption

Tool call:

```
{
  "name": "bargeIn.trigger",
  "arguments": {
    "sessionId": "sess-abc123",
    "reason": "user_interrupted"
  }
}
```

Expected response:
```
{
  "cancelled": true,
  "ttsStopped": true
}
```
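Examples 2 and 3 are usually chained: a detect result decides whether a trigger call follows. The sketch below shows one way to build those payloads as plain objects. `detectCall` and `followUp` are hypothetical helper names, and the `"user_interrupted"` reason is simply the value used in Example 3, not a documented enum.

```typescript
// Hypothetical helpers that build the tool-call payloads shown in the examples.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface DetectResult {
  interrupted: boolean;
  action: "continue" | "interrupt";
}

// Payload for bargeIn.detect, built from one interim STT result.
function detectCall(sessionId: string, interimTranscript: string, confidence: number): ToolCall {
  return { name: "bargeIn.detect", arguments: { sessionId, interimTranscript, confidence } };
}

// Decide the follow-up call: trigger only when detect asked for an interrupt.
function followUp(sessionId: string, result: DetectResult): ToolCall | null {
  if (result.action !== "interrupt") return null;
  return { name: "bargeIn.trigger", arguments: { sessionId, reason: "user_interrupted" } };
}
```

Keeping the payload construction pure makes the decision easy to unit test without a live MCP connection.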

Error Handling

Known Failure Modes

| Failure | Cause | Recovery |
|---------|-------|----------|
| TTS already stopped | Race condition | Log warning, continue with new utterance |
| False positive detection | Background noise | Tune confidence threshold, add min duration |
| Missed detection | Confidence threshold set too high | Lower threshold, increase STT sensitivity |
| WebSocket send failure | Connection closed | Clean up session, end call |

Recovery Strategies

TTS cancel failure: Force close WebSocket connection
Detection errors: Default to allowing interruption (better UX)
Race conditions: Use atomic operations for state changes
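The race-condition strategy can be illustrated with a small per-session state holder. This is a sketch under assumptions: `SessionState` and `cancelOnce` are hypothetical names, and the approach relies on the check-and-update happening synchronously on a single JavaScript thread, with no `await` between the check and the state change.

```typescript
// Per-session playback state with a compare-and-set style transition.
type PlaybackState = "idle" | "speaking" | "cancelling";

class SessionState {
  private state: PlaybackState = "idle";

  startSpeaking(): void {
    this.state = "speaking";
  }

  // Returns true for exactly one caller per playback; later callers see false.
  tryBeginCancel(): boolean {
    if (this.state !== "speaking") return false;
    this.state = "cancelling";
    return true;
  }

  finishCancel(): void {
    this.state = "idle";
  }
}

// Stops TTS at most once; a losing caller logs a warning and carries on,
// matching the "TTS already stopped" recovery in the table above.
function cancelOnce(
  session: SessionState,
  stopTts: () => void,
  warn: (msg: string) => void
): boolean {
  if (!session.tryBeginCancel()) {
    warn("TTS already stopped; continuing with new utterance");
    return false;
  }
  try {
    stopTts();
  } finally {
    session.finishCancel();
  }
  return true;
}
```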

Security Considerations

PII Handling

Never log full interim transcripts
Redact potential PII in interruption logs
Hash session IDs in non-operational logs
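One way to follow these three rules is to log only derived values. The sketch below assumes Node.js and its built-in `node:crypto` module; `hashSessionId`, `toLogEntry`, the salt parameter, and the 16-character truncation are illustrative choices, not part of the skill.

```typescript
import { createHash } from "node:crypto";

// Hash a session ID so log lines can be correlated without exposing the raw ID.
// The salt is an assumption: it makes guessing short IDs harder.
function hashSessionId(sessionId: string, salt: string): string {
  return createHash("sha256").update(salt).update(sessionId).digest("hex").slice(0, 16);
}

interface BargeInLogEntry {
  session: string;          // hashed, never the raw session ID
  transcriptLength: number; // length only, never the transcript text
  confidence: number;
  timestamp: string;
}

// Build a log entry that carries no transcript text and no raw session ID.
function toLogEntry(
  sessionId: string,
  interimTranscript: string,
  confidence: number,
  salt: string,
  now: Date = new Date()
): BargeInLogEntry {
  return {
    session: hashSessionId(sessionId, salt),
    transcriptLength: interimTranscript.length,
    confidence,
    timestamp: now.toISOString(),
  };
}
```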

Permissions

Barge-in requires active TTS playback
Configuration changes require valid session
Disable requires admin privileges in production
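These rules could be enforced by a single guard in front of the tool handlers. The sketch below is an interpretation, not the skill's implementation: `CallerContext` and `denyReason` are hypothetical, the session check is applied to every tool, and the active-playback rule is applied only to `bargeIn.trigger`.

```typescript
type BargeInTool = "bargeIn.enable" | "bargeIn.detect" | "bargeIn.trigger" | "bargeIn.disable";

// Hypothetical description of the caller and session at call time.
interface CallerContext {
  sessionValid: boolean;
  ttsPlaying: boolean;
  isAdmin: boolean;
  production: boolean;
}

// Returns null when the call is allowed, otherwise a human-readable denial reason.
function denyReason(tool: BargeInTool, ctx: CallerContext): string | null {
  if (!ctx.sessionValid) return "invalid session";
  if (tool === "bargeIn.trigger" && !ctx.ttsPlaying) return "no active TTS playback";
  if (tool === "bargeIn.disable" && ctx.production && !ctx.isAdmin) {
    return "admin privileges required";
  }
  return null;
}
```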

Audit Logging

Log all barge-in events with timestamps
Track false positive/negative rates
Record configuration changes

Barge-In Configuration

Configuration Options

```
# voice-agent-kit.config.ts
bargeIn:
  # Detection settings
  enabled: true
  minSpeechDuration: 300    # ms of speech before triggering
  confidenceThreshold: 0.7  # STT confidence required
  silenceThreshold: 200     # ms silence before considering complete

  # Response settings
  immediateCancel: true     # Cancel TTS immediately on detect
  drainQueue: false         # Don't send remaining TTS chunks

  # Tuning
  ignoreShortUtterances: true  # Ignore < 2 words
  minWords: 2
```
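Since the listing names a `.ts` file, the same options could be expressed as a typed object. The sketch below assumes the values in the listing are the defaults and that partial overrides are merged on top of them; `BargeInSettings`, `DEFAULTS`, and `resolveSettings` are hypothetical names, and the range checks are an added assumption.

```typescript
// Typed view of the options listed above (names taken from the listing).
interface BargeInSettings {
  enabled: boolean;
  minSpeechDuration: number;   // ms of speech before triggering
  confidenceThreshold: number; // STT confidence required, 0..1
  silenceThreshold: number;    // ms of silence before considering complete
  immediateCancel: boolean;
  drainQueue: boolean;
  ignoreShortUtterances: boolean;
  minWords: number;
}

// Assumed defaults: the values shown in the configuration listing.
const DEFAULTS: BargeInSettings = {
  enabled: true,
  minSpeechDuration: 300,
  confidenceThreshold: 0.7,
  silenceThreshold: 200,
  immediateCancel: true,
  drainQueue: false,
  ignoreShortUtterances: true,
  minWords: 2,
};

// Merge overrides onto the defaults and reject obviously invalid values.
function resolveSettings(overrides: Partial<BargeInSettings> = {}): BargeInSettings {
  const merged: BargeInSettings = { ...DEFAULTS, ...overrides };
  if (merged.confidenceThreshold < 0 || merged.confidenceThreshold > 1) {
    throw new RangeError("confidenceThreshold must be between 0 and 1");
  }
  if (merged.minSpeechDuration < 0 || merged.silenceThreshold < 0) {
    throw new RangeError("durations must be non-negative");
  }
  return merged;
}
```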

Detection Logic

```
// Defaults mirror confidenceThreshold and minWords from the configuration above
function shouldInterrupt(
  interimTranscript: string,
  confidence: number,
  confidenceThreshold = 0.7,
  minWords = 2
): boolean {
  // Don't interrupt if STT confidence is below the threshold
  if (confidence < confidenceThreshold) return false;

  // Don't interrupt for very short utterances (likely noise)
  const words = interimTranscript.split(/\s+/).filter(w => w.length > 0);
  if (words.length < minWords) return false;

  // Interrupt for common interruption patterns. The trailing \b keeps
  // "no" from matching inside longer words such as "nothing" or "notice".
  const interruptionPatterns = [
    /\b(wait|stop|hold on|never mind|actually|no|that's not)\b/i,
    /\b(let me|I want|I need|can I)\b/i
  ];

  return interruptionPatterns.some(pattern => pattern.test(interimTranscript));
}
```
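The function above checks confidence and wording but not how long the user has been speaking, which the `minSpeechDuration` option implies. One possible way to add that check is a small gate fed by voice-activity or STT updates. `SpeechDurationGate` is a hypothetical helper, and the assumption is that the caller supplies a monotonic timestamp in milliseconds.

```typescript
// Hypothetical gate: reports true only once speech has lasted minSpeechDurationMs.
class SpeechDurationGate {
  private speechStartedAt: number | null = null;
  private readonly minSpeechDurationMs: number;

  constructor(minSpeechDurationMs: number) {
    this.minSpeechDurationMs = minSpeechDurationMs;
  }

  // Call for every update; any non-speech update resets the timer.
  update(isSpeech: boolean, nowMs: number): boolean {
    if (!isSpeech) {
      this.speechStartedAt = null;
      return false;
    }
    if (this.speechStartedAt === null) this.speechStartedAt = nowMs;
    return nowMs - this.speechStartedAt >= this.minSpeechDurationMs;
  }
}
```

A caller could require both the gate and the transcript check to pass before interrupting, which trades a few hundred milliseconds of latency for fewer false positives from short noises.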

TTS Cancellation Flow

```
1. STT emits interim transcript during TTS playback
   │
2. Barge-in detector evaluates transcript + confidence
   │
3. If interruption detected:
   │  a. Send 'clear' message to Twilio (stops playback)
   │  b. Cancel in-flight TTS synthesis
   │  c. Emit 'barge_in' event
   │  d. Feed new utterance to pipeline
   │
4. Previous turn abandoned (no history update)
   │
5. New turn begins with user's interruption
```
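Step 3 of the flow can be sketched as a single function with its side effects injected. The `clear` payload follows the Twilio Media Streams message shape (`event` plus `streamSid`); everything else here, including `BargeInDeps` and `handleBargeIn`, is a hypothetical structure chosen for illustration.

```typescript
// Side effects are injected so the ordering can be tested without a live call.
interface BargeInDeps {
  sendToTwilio: (message: string) => void; // e.g. a WebSocket send on the media stream
  cancelSynthesis: () => void;
  emit: (event: "barge_in", payload: { sessionId: string }) => void;
  feedUtterance: (text: string) => void;
}

function handleBargeIn(
  sessionId: string,
  streamSid: string,
  utterance: string,
  deps: BargeInDeps
): void {
  // 3a. Ask Twilio to drop audio it has buffered for playback.
  deps.sendToTwilio(JSON.stringify({ event: "clear", streamSid }));
  // 3b. Stop producing further audio for the abandoned turn.
  deps.cancelSynthesis();
  // 3c. Notify listeners (metrics, logging).
  deps.emit("barge_in", { sessionId });
  // 3d. Start the new turn with what the user said.
  deps.feedUtterance(utterance);
}
```

Sending `clear` first matters because audio already buffered on Twilio's side keeps playing even after synthesis stops.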

Metrics and Observability

Key Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `voice.barge_in.count` | Counter | Total barge-in events |
| `voice.barge_in.false_positives` | Counter | Incorrectly triggered interruptions |
| `voice.barge_in.missed` | Counter | Missed interruptions (user repeated) |
| `voice.barge_in.latency_ms` | Histogram | Time from speech detect to TTS cancel |
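To make the metric definitions concrete, here is a minimal in-memory recorder. It is a stand-in for whatever metrics backend the kit actually uses, which this document does not specify; `BargeInMetrics` and its methods are hypothetical, and the false positive rate is assumed to be false positives divided by total barge-in events.

```typescript
type CounterName =
  | "voice.barge_in.count"
  | "voice.barge_in.false_positives"
  | "voice.barge_in.missed";

// In-memory stand-in for a metrics backend.
class BargeInMetrics {
  private counters = new Map<CounterName, number>();
  private latenciesMs: number[] = [];

  increment(name: CounterName): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + 1);
  }

  // voice.barge_in.latency_ms: time from speech detect to TTS cancel.
  recordLatency(speechDetectedAtMs: number, ttsCancelledAtMs: number): void {
    this.latenciesMs.push(ttsCancelledAtMs - speechDetectedAtMs);
  }

  count(name: CounterName): number {
    return this.counters.get(name) ?? 0;
  }

  // Assumed definition: false positives divided by total barge-in events.
  falsePositiveRate(): number {
    const total = this.count("voice.barge_in.count");
    return total === 0 ? 0 : this.count("voice.barge_in.false_positives") / total;
  }

  maxLatencyMs(): number {
    return this.latenciesMs.length === 0 ? 0 : Math.max(...this.latenciesMs);
  }
}
```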

Tracing

| Span | Attributes |
|------|------------|
| `voice.barge_in.detect` | `session_id`, `transcript_length`, `confidence` |
| `voice.barge_in.trigger` | `session_id`, `reason`, `tts_position_ms` |
| `voice.barge_in.cancel_tts` | `session_id`, `chunks_cancelled` |

Related Skills

Pipeline Orchestration
STT Provider Interface
TTS Provider Interface
Twilio Media Streams


Updated Apr 25, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add reaatech/voice-agent-kit skills/barge-in-handling

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 25, 2026, 2:58 PM (56.8s, 1 file scanned)


Related Skills

reaatech/skills/twilio-media-streams (tools; SKILL.md updated Apr 25, 2026)
Handles Twilio Media Streams WebSocket connections for real-time bidirectional audio communication, parsing inbound messages, encoding outbound audio, and managing call lifecycle events.

reaatech/skills/tts-provider-interface (tools; SKILL.md updated Apr 25, 2026)
Provides a unified interface for text-to-speech (TTS) providers, enabling streaming audio synthesis with first-byte latency tracking, voice selection, and output format conversion.

reaatech/skills/telephony-lifecycle (tools; SKILL.md updated Apr 25, 2026)
Manages the complete lifecycle of voice calls from TwiML webhook initiation through call completion, including call connect, transfer, conference, and disconnect handling with proper session cleanup.

reaatech/skills/stt-provider-interface (tools; SKILL.md updated Apr 25, 2026)
Provides a unified interface for speech-to-text (STT) providers, enabling real-time streaming transcription with interim results, endpoint detection, and automatic reconnection handling.


Install manually

```
# Clone the repo
git clone https://github.com/reaatech/voice-agent-kit.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r voice-agent-kit/skills/barge-in-handling ~/.claude/skills/
```

See the official Claude Code Skills documentation for the skills path.

Repository

reaatech/voice-agent-kit

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT