
eliferjunior/deepgram

Name: deepgram
Author: eliferjunior

.claude/skills/ts-deepgram/SKILL.md

npx skillsauth add eliferjunior/Claude deepgram


Deepgram — Real-Time Speech-to-Text API

Overview

You are an expert in Deepgram, the speech-to-text platform optimized for real-time transcription. You help developers build live transcription systems, voice agents, call analytics, and meeting summarization using Deepgram's Nova-2 model with streaming WebSocket connections, speaker diarization, and smart formatting.

Instructions

Streaming Transcription (Real-Time)

// Real-time transcription via WebSocket
import { createClient, LiveTranscriptionEvents } from "@deepgram/sdk";

const deepgram = createClient(process.env.DEEPGRAM_API_KEY);

async function transcribeLive(audioStream: ReadableStream) {
  const connection = deepgram.listen.live({
    model: "nova-2",                    // Model used throughout this skill
    language: "en",
    smart_format: true,                 // Auto-punctuation, casing, numbers
    interim_results: true,              // Get partial results as user speaks
    utterance_end_ms: 1000,             // 1s silence = end of utterance (needs interim_results)
    vad_events: true,                   // Voice activity detection
    diarize: true,                      // Speaker identification
    endpointing: 500,                   // 500ms endpointing for responsiveness
  });

  // Resolve once the WebSocket is open so no audio is sent too early
  const opened = new Promise<void>((resolve) => {
    connection.on(LiveTranscriptionEvents.Open, () => resolve());
  });

  connection.on(LiveTranscriptionEvents.Transcript, (data) => {
    const alternative = data.channel.alternatives[0];
    if (!alternative.transcript) return; // Skip empty results (silence)
    if (data.is_final) {
      // With diarize enabled, each word carries a speaker index
      const speaker = alternative.words?.[0]?.speaker ?? "unknown";
      console.log(`[Final] Speaker ${speaker}: ${alternative.transcript}`);
      // Send to LLM for response generation
    } else {
      console.log(`[Interim] ${alternative.transcript}`);
      // Show real-time text as user speaks
    }
  });

  connection.on(LiveTranscriptionEvents.UtteranceEnd, () => {
    console.log("[Utterance complete: user stopped speaking]");
  });

  connection.on(LiveTranscriptionEvents.Error, (err) => {
    console.error("[Deepgram error]", err);
  });

  // Pipe audio to Deepgram (16kHz, 16-bit PCM or any supported format)
  await opened;
  for await (const chunk of audioStream) {
    connection.send(chunk);
  }
  // Close the connection here once the audio ends; the method name depends on the SDK version
}
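
The handler above logs each `is_final` result on its own, but one spoken sentence can arrive as several finalized segments. A minimal sketch of stitching them together follows, written in Python over plain dictionaries shaped like the streaming messages. It assumes each message carries `is_final`, `speech_final`, and `channel.alternatives[0].transcript`; the `UtteranceAssembler` class and its method names are hypothetical and not part of any Deepgram SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UtteranceAssembler:
    """Hypothetical helper: buffers finalized segments until the speaker pauses."""

    _segments: list[str] = field(default_factory=list)
    interim: str = ""  # latest partial text, suitable for live display

    def feed(self, message: dict) -> Optional[str]:
        """Consume one transcript message; return a full utterance when one completes."""
        text = message["channel"]["alternatives"][0]["transcript"]
        if not message.get("is_final"):
            self.interim = text  # each interim result replaces the previous one
            return None
        self.interim = ""
        if text:
            self._segments.append(text)
        # speech_final marks the endpoint detected after a pause in speech
        if message.get("speech_final") and self._segments:
            utterance = " ".join(self._segments)
            self._segments.clear()
            return utterance
        return None
```

Feeding every parsed message into `feed()` gives a display string in `interim` and a return value that is non-empty only when a whole utterance is ready, which is a natural point to hand text to an LLM.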

Pre-Recorded Transcription

# Batch transcription for recorded audio/video
import os

from deepgram import DeepgramClient, PrerecordedOptions

# Read the API key from the environment rather than hard-coding it
dg = DeepgramClient(os.environ["DEEPGRAM_API_KEY"])

options = PrerecordedOptions(
    model="nova-2",
    smart_format=True,
    diarize=True,                       # Identify speakers
    summarize="v2",                     # Auto-generate summary
    topics=True,                        # Extract topics
    intents=True,                       # Detect intent (question, command, statement)
    sentiment=True,                     # Sentiment per utterance
    paragraphs=True,                    # Auto-paragraph formatting
    utterances=True,                    # Split by speaker turns
)

# From URL
response = dg.listen.rest.v("1").transcribe_url(
    {"url": "https://example.com/meeting.mp3"}, options
)

# From file
with open("recording.wav", "rb") as f:
    response = dg.listen.rest.v("1").transcribe_file(
        {"buffer": f.read(), "mimetype": "audio/wav"}, options
    )

# Access results
transcript = response.results.channels[0].alternatives[0]
print(f"Transcript: {transcript.transcript}")
print(f"Confidence: {transcript.confidence}")
print(f"Summary: {response.results.summary.short}")
for utterance in response.results.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.transcript}")
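
The `utterances` list already groups text by speaker for pre-recorded audio. When only word-level output is available (for example, the `words` array of a streaming result), the same grouping can be rebuilt by hand. The sketch below assumes each word is a dictionary with `word`, an optional `punctuated_word`, and a `speaker` index; `speaker_turns` is a hypothetical helper name.

```python
from itertools import groupby


def speaker_turns(words: list[dict]) -> list[tuple[int, str]]:
    """Merge consecutive words from the same speaker into (speaker, text) turns."""
    turns = []
    # groupby only merges adjacent items, which preserves the conversation order
    for speaker, group in groupby(words, key=lambda w: w.get("speaker", 0)):
        text = " ".join(w.get("punctuated_word") or w["word"] for w in group)
        turns.append((speaker, text))
    return turns


words = [
    {"word": "hi", "punctuated_word": "Hi,", "speaker": 0},
    {"word": "there", "punctuated_word": "there.", "speaker": 0},
    {"word": "hello", "punctuated_word": "Hello.", "speaker": 1},
    {"word": "thanks", "speaker": 0},
]
print(speaker_turns(words))
# [(0, 'Hi, there.'), (1, 'Hello.'), (0, 'thanks')]
```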

Text-to-Speech (Aura)

# Deepgram Aura TTS: low-latency voice synthesis
# `dg` is the DeepgramClient created in the pre-recorded example above
response = dg.speak.rest.v("1").stream_raw(
    {"text": "Thanks for calling. How can I help you today?"},
    options={
        "model": "aura-asteria-en",     # Female, warm tone
        "encoding": "linear16",          # 16-bit PCM
        "sample_rate": 24000,
    },
)

# Stream audio chunks to speaker/WebRTC
# `audio_output` is a placeholder for your own writable sink (file, socket, audio device)
for chunk in response.iter_bytes():
    audio_output.write(chunk)
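
If the chunks need to end up in a playable file rather than a live sink, Python's standard `wave` module can wrap them. This is a sketch under the assumption that the stream is headerless, mono, 16-bit PCM at the requested sample rate; if the response already includes a WAV container, this step is unnecessary. `pcm_chunks_to_wav` is a hypothetical helper name.

```python
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks: Iterable[bytes], sample_rate: int = 24000) -> bytes:
    """Wrap raw 16-bit mono PCM chunks in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # assumption: mono audio
        wav.setsampwidth(2)            # linear16 means 2 bytes per sample
        wav.setframerate(sample_rate)  # must match the requested sample_rate
        for chunk in chunks:
            wav.writeframes(chunk)
    return buffer.getvalue()
```

The function accepts any iterable of byte chunks, so the same loop that reads the streamed response could be passed in directly.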

Installation

npm install @deepgram/sdk                # Node.js
pip install deepgram-sdk                  # Python

Examples

Example 1: User asks to set up Deepgram

User: "Help me set up deepgram for my project"

The agent should:

1. Check system requirements and prerequisites
2. Install or configure the Deepgram SDK
3. Set up initial project structure
4. Verify the setup works correctly

Example 2: User asks to build a feature with Deepgram

User: "Create a dashboard using deepgram"

The agent should:

1. Scaffold the component or configuration
2. Connect to the appropriate data source
3. Implement the requested feature
4. Test and validate the output

Guidelines

Nova-2 by default: Nova-2 offers strong accuracy and speed and is the model used throughout these examples; switch only when a language or feature requires a different model, and check Deepgram's model list for newer releases such as Nova-3
Streaming for real-time: Use WebSocket connections for live audio and the batch API for pre-recorded files
Endpointing tuning: Set endpointing to 300-500 ms for voice agents (responsive) or 1000 ms for transcription (accurate)
Smart formatting: Always enable smart_format for proper capitalization, punctuation, and number formatting
Diarization for meetings: Enable diarize when multiple speakers are present; Deepgram identifies up to 10 speakers
Interim results for UX: Enable interim_results for real-time text display; show partial transcripts as users speak
Multichannel for calls: Use multichannel: true for phone calls where each speaker is on a separate audio channel
Callback for async: Use the callback parameter (a URL) for large file transcription; Deepgram POSTs the results to that URL when processing finishes
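
The guidelines above can be condensed into a small options builder. This is an illustrative sketch only: `live_options` and its `use_case` values are hypothetical names, and the numbers are the ones suggested in the guidelines rather than values required by Deepgram.

```python
def live_options(use_case: str, *, multi_speaker: bool = False,
                 separate_channels: bool = False) -> dict:
    """Build a streaming options dict that follows the guidelines above."""
    if use_case not in {"voice_agent", "transcription"}:
        raise ValueError(f"unknown use case: {use_case!r}")
    options = {
        "model": "nova-2",
        "smart_format": True,       # capitalization, punctuation, numbers
        "interim_results": True,    # partial text for live display
        # Shorter endpointing responds faster; longer waits out natural pauses
        "endpointing": 300 if use_case == "voice_agent" else 1000,
    }
    if separate_channels:
        options["multichannel"] = True  # one speaker per audio channel
    elif multi_speaker:
        options["diarize"] = True       # several speakers share one channel
    return options
```

For example, `live_options("transcription", multi_speaker=True)` yields a 1000 ms endpointing value with diarization on, while a phone call recorded on separate channels would pass `separate_channels=True` and skip diarization.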


Transcribe and analyze audio with the Deepgram API. Use when a user asks to convert speech to text, implement real-time transcription, analyze audio intelligence, detect languages, or build voice-enabled applications with Deepgram SDKs.

development

Updated Apr 16, 2026

Install globally with skillsauth:

npx skillsauth add eliferjunior/Claude deepgram

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 17, 2026, 1:34 AM (306.5s, 1 file scanned)

SKILL.md

name: deepgram
description: >-
license: Apache-2.0
compatibility: No special requirements
author: terminal-skills
version: 1.0.0
category: data-ai
tags: ["speech-to-text", "transcription", "realtime", "voice", "audio"]




Install manually


# Clone the repo
git clone https://github.com/eliferjunior/Claude.git

# Copy into Claude Code skills folder (global)
cp -r Claude/.claude/skills/ts-deepgram ~/.claude/skills/


Repository

eliferjunior/Claude

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT