
swigerb/.copilot/skills/echo-suppression

Name: .copilot/skills/echo-suppression
Author: swigerb

.copilot/skills/echo-suppression/SKILL.md

npx skillsauth add swigerb/sonicaidrivethru .copilot/skills/echo-suppression


Skill: WebSocket Echo Suppression for Realtime Voice AI

Problem

In a WebSocket middleware that bridges client audio and an AI model (e.g., the OpenAI Realtime API), the AI's audio response can leak from the client's speakers back into the microphone. That echo is forwarded to the model and transcribed as phantom user input, which creates a self-sustaining conversation loop.

Pattern

Multi-layered suppression combining client-side early muting and server-side audio gating to eliminate the timing gap where echo leaks through.

Client-Side (Frontend)

- Early mute on response.created: mute the mic gain node at the earliest possible server event, before audio deltas arrive. Muting on response.audio.delta is too late, because audio samples have already been sent.
- input_audio_buffer.clear on response.created: flush any already-buffered echo from the server's audio pipeline.
- Gain node mute/unmute: set gain to 0 (muted) or 1 (unmuted). This keeps the media stream alive, so there is no permission re-prompt and hardware echo cancellation stays active.
- Unmute on response.done: re-open the mic when the AI finishes speaking.
- Barge-in handling: on input_audio_buffer.speech_started, stop playback, unmute the mic, and reset state.

Server-Side (Middleware)

State tracking (in server→client message path):
- response.audio.delta → set ai_speaking = True
- response.audio.done → clear flag, start cooldown timer, send input_audio_buffer.clear
- input_audio_buffer.speech_started → clear flag + cooldown (barge-in)
Audio gating (in client→server message path):
- If ai_speaking or within cooldown window → drop input_audio_buffer.append
Performance: Use fast substring markers ('"response.audio.delta"' in data) instead of JSON parse on the hot path.
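
The state tracking and gating rules above can be sketched as a small Python class. This is a minimal illustration, not the skill's actual middleware: the class name EchoGate, its method names, and the injectable clock are assumptions introduced here, and frames are assumed to arrive as JSON text.

```python
import time

ECHO_COOLDOWN_SEC = 0.3  # mirrors the _ECHO_COOLDOWN_SEC tunable


class EchoGate:
    """Hypothetical helper: tracks AI speech and decides which client frames to drop."""

    def __init__(self, cooldown_sec=ECHO_COOLDOWN_SEC, clock=time.monotonic):
        self.cooldown_sec = cooldown_sec
        self.clock = clock  # injectable so the gate can be tested without sleeping
        self.ai_speaking = False
        self.cooldown_until = 0.0

    def on_server_message(self, data: str) -> bool:
        """Update state from a server->client frame.

        Returns True when the caller should send input_audio_buffer.clear upstream.
        """
        # Substring markers avoid a full JSON parse on the hot path.
        if '"response.audio.delta"' in data:
            self.ai_speaking = True
            return False
        if '"response.audio.done"' in data:
            self.ai_speaking = False
            self.cooldown_until = self.clock() + self.cooldown_sec
            return True
        if '"input_audio_buffer.speech_started"' in data:
            # Barge-in: the user is talking, so stop suppressing immediately.
            self.ai_speaking = False
            self.cooldown_until = 0.0
        return False

    def should_forward(self, data: str) -> bool:
        """Decide whether a client->server frame may be forwarded to the model."""
        if '"input_audio_buffer.append"' not in data:
            return True  # only microphone audio is gated
        return not (self.ai_speaking or self.clock() < self.cooldown_until)
```

In a relay loop, the middleware would call on_server_message for every frame coming from the model and should_forward for every frame coming from the client. The quoted markers match the event name as a complete JSON string value, which is what keeps the substring check cheap while avoiding most false positives.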

Key Insight: Event Ordering

The OpenAI Realtime API sends events in this order:

response.created ← mute here (earliest signal)
response.output_item.added
response.content_part.added
response.audio.delta (repeated) ← too late to mute
response.audio_transcript.delta (interleaved)
response.done ← unmute here
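
To make the ordering concrete, the sketch below models the client-side mute policy as a pure function over server event types. It is written in Python only to show the decision logic; a browser client would apply the same decisions to its gain node. The MicState class, apply_event, and the action strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class MicState:
    muted: bool = False


def apply_event(state: MicState, event_type: str) -> list:
    """Update the mic state for one server event and return follow-up actions."""
    if event_type == "response.created":
        # Earliest signal: mute and flush buffered echo before any audio delta.
        state.muted = True
        return ["set_gain:0", "send:input_audio_buffer.clear"]
    if event_type == "response.done":
        state.muted = False
        return ["set_gain:1"]
    if event_type == "input_audio_buffer.speech_started":
        # Barge-in: stop playback and re-open the mic.
        state.muted = False
        return ["stop_playback", "set_gain:1"]
    # All other events, including response.audio.delta, leave the mic untouched.
    return []
```

Feeding the events in the order listed above mutes the mic on the first event and leaves it muted through every audio delta until response.done arrives.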

Circular Dependency Pattern (React)

When a useCallback inside a hook needs to call sendJsonMessage from useWebSocket, but useWebSocket takes that callback as a parameter, use a useRef to break the cycle:

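// The ref holds the latest sender, so onMessage never captures sendJsonMessage directly.
const sendRef = useRef<(msg: object) => void>(() => {});
const onMessage = useCallback(() => { sendRef.current({...}); }, []);
const { sendJsonMessage } = useWebSocket(url, { onMessage });
// Point the ref at the real sender once the hook has returned it.
useEffect(() => { sendRef.current = sendJsonMessage; }, [sendJsonMessage]);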

When to Use

- Any WebSocket middleware sitting between a voice client and an AI realtime API.
- Client-side mic muting alone isn't enough: you also need buffer clearing.
- Server-side gating alone isn't enough: there's a timing gap before the gate activates.

Tunables

_ECHO_COOLDOWN_SEC (default 0.3s): Post-response suppression window. Increase if echo persists on high-latency audio hardware.
VAD threshold (default 0.8): Higher rejects weak echo; too high may miss soft-spoken users.
VAD silence_duration_ms (default 500): Buffer before committing detected speech.
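
As a rough illustration of where the two VAD tunables go, the sketch below builds a session.update message with server-side voice activity detection. The field layout follows the Realtime API's server_vad turn-detection settings as an assumption; it may differ between API versions, so check it against the version you target. The function name build_session_update is hypothetical.

```python
import json


def build_session_update(threshold: float = 0.8, silence_duration_ms: int = 500) -> str:
    """Serialize a session.update message carrying the VAD tunables."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if silence_duration_ms < 0:
        raise ValueError("silence_duration_ms must be non-negative")
    return json.dumps({
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",  # assumed turn-detection mode
                "threshold": threshold,
                "silence_duration_ms": silence_duration_ms,
            }
        },
    })
```

The middleware would send this message once after the upstream connection opens. The cooldown window is separate from these settings and stays a constant in the middleware itself.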


Category: tools

Updated: Apr 15, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add swigerb/sonicaidrivethru .copilot/skills/echo-suppression

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

- Trivy (container and dependency vulnerability scanner): Clean, 95%
- Semgrep (static code analysis for vulnerabilities): Clean, 95%
- mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
- Snyk (dep) (open source security scanning): Skipped, 50%
- Socket.dev (supply chain security analysis): Skipped, 50%
- VirusTotal (multi-engine malware detection): Skipped, 50%
- CrowdStrike (advanced threat intelligence): Skipped, 50%
- OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 15, 2026, 4:22 AM (4.1s, 1 file scanned)


Related Skills

- swigerb/windows-compatibility (tools): Cross-platform path handling and command patterns. Updated Apr 15, 2026.
- swigerb/test-discipline (development): Update tests when changing APIs, no exceptions. Updated Apr 15, 2026.


Install manually


# Clone the repo
git clone https://github.com/swigerb/sonicaidrivethru.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r sonicaidrivethru/.copilot/skills/echo-suppression ~/.claude/skills/


Repository: swigerb/sonicaidrivethru

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT