Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

hasna/transcript

Name: transcript
Author: hasna

skills/skill-transcript/SKILL.md

npx skillsauth add hasna/skills transcript


Audio Transcription Skill

This skill provides high-quality speech-to-text transcription using multiple AI providers. It automatically handles large files through compression and chunking.

This CLI is API-backed. Set SKILL_API_KEY when routing through the hosted skills/connectors runtime; provider-specific keys are managed by that runtime.

Supported Providers

ElevenLabs Scribe

Accuracy: 96.7% for English (industry-leading)
Max file size: 3GB / 10 hours
Features: Speaker diarization (up to 32 speakers), word-level timestamps
Cost: $0.40/hour
Best for: Multi-speaker recordings, highest accuracy needs

OpenAI Whisper

Accuracy: Excellent
Max file size: 25MB (automatic chunking for larger files)
Features: Segment timestamps, language detection
Cost: $0.006/min ($0.003/min with GPT-4o Mini Transcribe)
Best for: Standard transcription, good balance of cost and quality

Google Gemini

Accuracy: Very good
Max file size: 2GB
Features: Multimodal analysis, summarization capabilities
Cost: ~$0.09-0.23/hour (generous free tier available)
Best for: Cost-sensitive projects, multimodal needs
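To compare the listed prices on equal terms, the sketch below converts each rate to dollars per hour and multiplies by the recording length. It is a minimal illustration, not part of the skill: `estimateCost`, `HOURLY_RATE_USD`, and `CostRange` are hypothetical names, the rates are copied from the list above, and free tiers or provider price changes are ignored.

```typescript
// Hourly rates in USD, copied from the provider list above.
// Gemini is quoted as an approximate range, so every estimate is a low/high pair.
type Provider = "elevenlabs" | "openai" | "gemini";

interface CostRange {
  low: number;
  high: number;
}

const HOURLY_RATE_USD: Record<Provider, CostRange> = {
  elevenlabs: { low: 0.4, high: 0.4 },
  openai: { low: 0.006 * 60, high: 0.006 * 60 }, // listed as $0.006/min
  gemini: { low: 0.09, high: 0.23 },
};

// Hypothetical helper: estimated cost of transcribing `minutes` of audio.
function estimateCost(provider: Provider, minutes: number): CostRange {
  if (minutes < 0) throw new RangeError("minutes must be non-negative");
  const hours = minutes / 60;
  const rate = HOURLY_RATE_USD[provider];
  return { low: rate.low * hours, high: rate.high * hours };
}

// A 90-minute meeting on each provider.
const meeting = {
  elevenlabs: estimateCost("elevenlabs", 90),
  openai: estimateCost("openai", 90),
  gemini: estimateCost("gemini", 90),
};
```

At these rates a 90-minute recording comes to about $0.60 on ElevenLabs, $0.54 on OpenAI Whisper, and roughly $0.14 to $0.35 on Gemini.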

Usage

Basic Transcription

bun run src/index.ts transcribe \
  --provider openai \
  --input ./recording.mp3

With Speaker Diarization

bun run src/index.ts transcribe \
  --provider elevenlabs \
  --input ./meeting.mp3 \
  --diarize \
  --timestamps \
  --format srt

Export to Subtitles

bun run src/index.ts transcribe \
  --provider gemini \
  --input ./video.mp4 \
  --format vtt \
  --output ./captions.vtt

View Provider Info

bun run src/index.ts providers

Output Formats

| Format | Extension | Description |
|--------|-----------|-------------|
| text | .txt | Plain text transcript |
| srt | .srt | SubRip subtitle format |
| vtt | .vtt | WebVTT subtitle format |
| json | .json | Full structured data with metadata |
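The two subtitle formats differ mainly in their timestamp separator and header: SRT numbers each cue and writes `HH:MM:SS,mmm`, while WebVTT starts with a `WEBVTT` line and writes `HH:MM:SS.mmm`. The sketch below shows that difference. The `Segment` shape and the function names are assumptions made for illustration; the skill's actual JSON schema and exporter are not shown in this document.

```typescript
// Hypothetical segment shape: times in seconds plus the transcribed text.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros.
function pad(n: number, width: number): string {
  let out = String(n);
  while (out.length < width) out = "0" + out;
  return out;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses "," and WebVTT uses ".".
function formatTimestamp(seconds: number, sep: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const h = Math.floor(totalMs / 3600000);
  const m = Math.floor((totalMs % 3600000) / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${sep}${pad(ms, 3)}`;
}

// SRT: numbered cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}\n${seg.text}\n`)
    .join("\n");
}

// WebVTT: "WEBVTT" header, blank line, then unnumbered cues.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) =>
    `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}\n${seg.text}\n`);
  return ["WEBVTT\n", ...cues].join("\n");
}
```

Rounding to whole milliseconds before splitting into fields avoids carry errors such as a `1000` in the millisecond position.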

Large File Handling

The skill automatically handles files larger than provider limits:

Compression: For OpenAI, files are first compressed with the Opus codec to reduce upload size
Chunking: Files are split into 10-minute segments, with a short overlap between consecutive segments
Merging: Segment transcripts are joined in order, dropping text duplicated in the overlaps
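The chunk-and-merge steps above can be sketched as follows. This is an illustration under stated assumptions, not the skill's implementation: the 5-second overlap is an assumed value (the document only says "with overlap"), and `planChunks`, `mergeChunks`, `Chunk`, and `Segment` are hypothetical names. The merge rule used here is a simple heuristic: a segment from a later chunk is dropped if it starts before the text already kept has ended.

```typescript
interface Chunk {
  index: number;
  start: number; // seconds from the beginning of the file
  end: number;
}

interface Segment {
  start: number; // seconds, relative to the chunk it came from
  end: number;
  text: string;
}

// Split [0, durationSec) into windows of chunkSec that overlap by overlapSec.
// chunkSec = 600 matches the 10-minute segments; overlapSec = 5 is an assumption.
function planChunks(durationSec: number, chunkSec = 600, overlapSec = 5): Chunk[] {
  if (durationSec <= 0) return [];
  if (overlapSec < 0 || overlapSec >= chunkSec) {
    throw new RangeError("overlapSec must be in [0, chunkSec)");
  }
  const step = chunkSec - overlapSec;
  const chunks: Chunk[] = [];
  for (let start = 0; start < durationSec; start += step) {
    const end = Math.min(start + chunkSec, durationSec);
    chunks.push({ index: chunks.length, start, end });
    if (end === durationSec) break; // this window already reaches the end of the file
  }
  return chunks;
}

// Shift each chunk's segments onto the file's timeline, then skip any segment
// that starts before the previously kept segment ended (overlap duplicate).
function mergeChunks(chunks: Chunk[], perChunk: Segment[][]): Segment[] {
  const merged: Segment[] = [];
  let coveredUntil = 0;
  chunks.forEach((chunk, i) => {
    for (const seg of perChunk[i] ?? []) {
      const start = seg.start + chunk.start;
      const end = seg.end + chunk.start;
      if (start < coveredUntil) continue;
      merged.push({ start, end, text: seg.text });
      coveredUntil = end;
    }
  });
  return merged;
}
```

With these defaults a 25-minute file (1500 s) yields three windows: 0 to 600, 595 to 1195, and 1190 to 1500. The time-based rule can discard a segment that straddles the overlap boundary, so a production merge would likely also compare the text itself.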

Configuration

# ElevenLabs
export ELEVENLABS_API_KEY=your_key

# OpenAI
export OPENAI_API_KEY=your_key

# Google Gemini
export GOOGLE_API_KEY=your_key
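A pre-flight check can catch a missing key before any audio is uploaded. The sketch below maps each `--provider` value to the variable listed above. It is hypothetical: `credentialSource` is not a function of the skill, and the precedence given to `SKILL_API_KEY` is an assumption based on the note that the hosted runtime manages provider keys.

```typescript
type Provider = "elevenlabs" | "openai" | "gemini";

// Environment variable names from the Configuration section above.
const PROVIDER_KEY_VAR: Record<Provider, string> = {
  elevenlabs: "ELEVENLABS_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GOOGLE_API_KEY",
};

type Env = Record<string, string | undefined>;

// Hypothetical check: returns the name of the variable that supplies the
// credential, or throws with a message naming what to export.
// Assumption: a non-empty SKILL_API_KEY means the hosted runtime holds the provider keys.
function credentialSource(provider: Provider, env: Env): string {
  if (env["SKILL_API_KEY"]) return "SKILL_API_KEY";
  const name = PROVIDER_KEY_VAR[provider];
  if (env[name]) return name;
  throw new Error(
    `Set ${name} (or SKILL_API_KEY for the hosted runtime) before using --provider ${provider}`,
  );
}
```

Under Bun or Node the second argument would typically be `process.env`; passing the environment in explicitly keeps the function easy to test.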

Dependencies

For chunking support (OpenAI with large files):

ffmpeg - Audio processing
ffprobe - Duration detection

Install on macOS:

brew install ffmpeg
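For reference, duration detection with ffprobe usually comes down to one invocation that prints the container duration in seconds. The sketch below builds that argument list, parses the output, and estimates whether a recording would fit under a size limit at a given audio bitrate. The helper names are hypothetical, and the 32 kbps bitrate in the usage note is an assumed value; the document does not say how the skill calls ffprobe or which bitrate it encodes at.

```typescript
// Arguments for `ffprobe` that print only the duration, e.g. "1834.562000".
function ffprobeDurationArgs(inputPath: string): string[] {
  return [
    "-v", "error",
    "-show_entries", "format=duration",
    "-of", "default=noprint_wrappers=1:nokey=1",
    inputPath,
  ];
}

// Parse ffprobe's stdout into seconds, rejecting values such as "N/A".
function parseDuration(stdout: string): number {
  const seconds = parseFloat(stdout.trim());
  if (!isFinite(seconds) || seconds < 0) {
    throw new Error(`Unexpected ffprobe output: ${JSON.stringify(stdout)}`);
  }
  return seconds;
}

// Approximate encoded size in bytes at a constant audio bitrate (kilobits per second).
function estimatedBytes(durationSec: number, bitrateKbps: number): number {
  return (durationSec * bitrateKbps * 1000) / 8;
}

// True when the estimate fits under limitMB (taken here as 1024 * 1024 bytes per MB).
function fitsUnderLimit(durationSec: number, bitrateKbps: number, limitMB = 25): boolean {
  return estimatedBytes(durationSec, bitrateKbps) <= limitMB * 1024 * 1024;
}
```

At an assumed 32 kbps, one hour of audio is about 14.4 million bytes and fits under 25MB, while a three-hour recording (about 43.2 million bytes) does not and would still need chunking.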


Transcribe audio and video files using ElevenLabs Scribe, OpenAI Whisper, or Google Gemini. Supports automatic chunking for large files, speaker diarization, timestamps, and multiple output formats (text, SRT, VTT, JSON).

Category: development
Updated: Apr 25, 2026

Install globally

npx skillsauth add hasna/skills transcript

This installs the skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report

Trivy: Container and dependency vulnerability scanner. Status: Clean (95%)
Semgrep: Static code analysis for vulnerabilities. Status: Clean (95%)
mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
Snyk (dep): Open source security scanning. Status: Skipped (50%)
Socket.dev: Supply chain security analysis. Status: Skipped (50%)
VirusTotal: Multi-engine malware detection. Status: Skipped (50%)
CrowdStrike: Advanced threat intelligence. Status: Skipped (50%)
OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
OWASP Dep-Check: Status: Skipped (50%)

Last scanned: Apr 25, 2026, 3:34 AM (187.1s, 13 files scanned)


Related Skills

hasna/merge-pr
Category: testing
Merge a GitHub pull request, merge when green, use a merge queue, or decide whether a pull request is mergeable. Use only for explicit merge intent, not ordinary review.
Updated Jul 24, 2026

hasna/performance-audit-report
Category: development
Generate premium performance audit reports for web apps, APIs, or SaaS surfaces with metrics, findings, budgets, remediation plans, and manifest metadata.
Updated May 15, 2026

hasna/customer-feedback-report
Category: data-ai
Generate premium customer feedback reports from reviews, support tickets, surveys, call notes, or raw feedback with clusters, sentiment, root causes, roadmap recommendations, evidence, and manifest metadata.
Updated May 15, 2026

hasna/pdf-generate
Category: development
Generate high-quality PDF documents from markdown, HTML, or templates.
Updated May 12, 2026

Install manually

# Clone the repo
git clone https://github.com/hasna/skills.git

# Copy into Claude Code skills folder (global), creating it if needed
mkdir -p ~/.claude/skills
cp -r skills/skills/skill-transcript ~/.claude/skills/


Repository: hasna/skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT