psycho-baller/speech-video-transcriber

Name: speech-video-transcriber
Author: psycho-baller

skills/speech-video-transcriber/SKILL.md

npx skillsauth add psycho-baller/ai-agents-config speech-video-transcriber

speech-video-transcriber

turn a local media file into a markdown transcript with minimal local compute.

the intended path is:

extract compact mono audio with ffmpeg
send that audio to OpenAI transcription (or transcribe it locally with whisper when --local is passed)
write a markdown transcript into ../transcriptions/ relative to the skills/ directory

for this repo, that means outputs land in:

/Users/rami/Documents/life-os/ai-agents-config/transcriptions/

when to use it

use this skill when the user wants any of the following from a local media file:

a transcript from a speaking video
a transcript from an audio file
a markdown transcript saved to disk for later analysis
a first pass before communication coaching, speaking feedback, vocabulary analysis, or camera-performance review

if the user asks for speaking feedback but no transcript exists yet, use this skill first so later steps can work from a clean markdown source.

backends

local (default when user mentions whisper or offline):

flag: --local
model: small (uses ~/.cache/whisper/small.pt — no download needed if already cached)
no API key required
whisper handles audio internally, no chunking needed

cloud (default otherwise):

model: gpt-4o-transcribe
requires OPENAI_API_KEY
auto-chunks audio if file exceeds 24 MB upload limit
only switch to gpt-4o-mini-transcribe if the user explicitly wants the cheaper model
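
a rough sketch of how the 24 MB threshold could turn into a chunk plan. the helper name plan_chunks, the binary-megabyte reading of "24 MB", and the even split by duration are all assumptions for illustration; only the 24 MB figure comes from the text above, and the real script may chunk differently.

```python
import math

# 24 MB threshold from the text above; treating MB as 1024 * 1024 bytes is an assumption.
MAX_UPLOAD_BYTES = 24 * 1024 * 1024


def plan_chunks(file_bytes: int, duration_s: float, limit: int = MAX_UPLOAD_BYTES):
    """Return (chunk_count, seconds_per_chunk) for a constant-bitrate audio file.

    Hypothetical helper: assumes file size grows linearly with duration, which
    is roughly true for the constant-bitrate mp3 produced by preprocessing.
    """
    if file_bytes <= 0 or duration_s <= 0:
        raise ValueError("file size and duration must be positive")
    if file_bytes <= limit:
        return 1, duration_s  # small enough to upload in one request
    chunk_count = math.ceil(file_bytes / limit)
    return chunk_count, duration_s / chunk_count
```

at 48 kbps, 24 MB holds roughly 65 to 70 minutes of audio, so under these assumptions only recordings longer than about an hour would be split.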

defaults

default output folder: ai-agents-config/transcriptions/
default preprocessing: mono mp3 at 16 kHz and 48 kbps
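
the preprocessing default maps onto standard ffmpeg audio options. the sketch below builds such a command as an argument list; the function name build_ffmpeg_cmd is hypothetical, and the script's actual invocation may use different flags.

```python
from pathlib import Path


def build_ffmpeg_cmd(media: Path, audio_out: Path) -> list[str]:
    """Build an ffmpeg argv that extracts mono, 16 kHz, 48 kbps audio."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", str(media),
        "-vn",           # drop any video stream
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        "-b:a", "48k",   # 48 kbps audio bitrate
        str(audio_out),  # the .mp3 extension selects the mp3 format
    ]
```

the list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.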

required setup

from /Users/rami/Documents/life-os/ai-agents-config/skills:

uv pip install -r speech-video-transcriber/scripts/requirements.txt

the machine also needs:

ffmpeg
OPENAI_API_KEY (cloud backend only)

workflow

resolve the media path to an absolute path before running anything
if the user gives a language hint, pass --language
if the file contains names, jargon, brand terms, or unusual words, pass a short --prompt with those tokens to improve recognition
run the script
return the markdown path you wrote
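
the workflow steps above can be sketched as a small command builder. build_transcribe_cmd is a hypothetical helper, not part of the skill; the script path and the --local, --language, and --prompt flags are the ones documented here.

```python
from pathlib import Path
from typing import Optional

# Script path relative to the skills/ directory, as used in the commands in this document.
SCRIPT = "speech-video-transcriber/scripts/transcribe_video.py"


def build_transcribe_cmd(
    media: str,
    local: bool = False,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
) -> list[str]:
    """Assemble the uv command following the workflow steps."""
    # Resolve to an absolute path before running anything.
    media_path = Path(media).expanduser().resolve()
    cmd = ["uv", "run", "python", SCRIPT, str(media_path)]
    if local:
        cmd.append("--local")
    if language:  # only when the user gave a language hint
        cmd += ["--language", language]
    if prompt:  # short list of names, jargon, or brand terms
        cmd += ["--prompt", prompt]
    return cmd
```

running it with subprocess.run(cmd, cwd=skills_dir, check=True, capture_output=True, text=True), where skills_dir is the skills/ directory, would capture the script's printed output for the final step.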

command

local (no API key):

cd /Users/rami/Documents/life-os/ai-agents-config/skills
uv run python speech-video-transcriber/scripts/transcribe_video.py "/absolute/path/to/video.mov" --local

cloud (OpenAI API):

cd /Users/rami/Documents/life-os/ai-agents-config/skills
uv run python speech-video-transcriber/scripts/transcribe_video.py "/absolute/path/to/video.mov"

common options:

# local with language hint
uv run python speech-video-transcriber/scripts/transcribe_video.py \
  "/absolute/path/to/video.mov" \
  --local \
  --language en

# cloud with jargon hint
uv run python speech-video-transcriber/scripts/transcribe_video.py \
  "/absolute/path/to/video.mov" \
  --language en \
  --prompt "rami, chalant, purpose os, posthog" \
  --model gpt-4o-transcribe

# save to specific path
uv run python speech-video-transcriber/scripts/transcribe_video.py \
  "/absolute/path/to/video.mov" \
  --local \
  --output "/path/to/output.md"

output contract

the script writes one markdown file to the shared transcriptions directory (or to the --output path when one is given) and prints the final path.

the markdown includes:

source media path
generation timestamp
model used
optional language hint
chunk count
full transcript text
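
for illustration, the fields listed above could be rendered like this. the heading, the field labels, and the function name render_transcript are assumptions; the script's real layout is not shown in this document.

```python
from datetime import datetime, timezone
from typing import Optional


def render_transcript(
    source: str,
    model: str,
    chunk_count: int,
    text: str,
    language: Optional[str] = None,
    generated_at: Optional[datetime] = None,
) -> str:
    """Render the listed fields as markdown; layout and labels are assumptions."""
    generated_at = generated_at or datetime.now(timezone.utc)
    lines = [
        "# transcript",
        "",
        f"- source: {source}",
        f"- generated: {generated_at.isoformat(timespec='seconds')}",
        f"- model: {model}",
    ]
    if language:  # the language hint is optional
        lines.append(f"- language: {language}")
    lines += [f"- chunks: {chunk_count}", "", text.strip(), ""]
    return "\n".join(lines)
```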

if a file with the same name already exists, the script appends a timestamp suffix instead of overwriting it.
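
a minimal sketch of that collision rule, assuming a .md extension and a YYYYMMDD-HHMMSS suffix. the suffix format and the name unique_output_path are hypothetical; the script may format its timestamp differently.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def unique_output_path(out_dir: Path, stem: str, now: Optional[datetime] = None) -> Path:
    """Return out_dir/stem.md, or a timestamp-suffixed name if that file exists."""
    candidate = out_dir / f"{stem}.md"
    if not candidate.exists():
        return candidate
    # Suffix format is an assumption; any sortable timestamp would do.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return out_dir / f"{stem}-{stamp}.md"
```

a second-resolution suffix can still collide if two runs for the same file finish within the same second, so a stricter version might loop until the name is free.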

examples

example 1

user request: transcribe /Users/rami/Documents/life-os/speech/founder-story-take-01.mov

run:

cd /Users/rami/Documents/life-os/ai-agents-config/skills
uv run python speech-video-transcriber/scripts/transcribe_video.py \
  "/Users/rami/Documents/life-os/speech/founder-story-take-01.mov"

example 2

user request: make a transcript of /Users/rami/Documents/life-os/speech/camera-practice/clarity.mp4 and keep the names right. the language is english.

run:

cd /Users/rami/Documents/life-os/ai-agents-config/skills
uv run python speech-video-transcriber/scripts/transcribe_video.py \
  "/Users/rami/Documents/life-os/speech/camera-practice/clarity.mp4" \
  --language en \
  --prompt "rami, chalant, purpose os"

failure handling

if OPENAI_API_KEY is missing for a cloud run, try running source .env to load it; if it is still missing, stop and ask for it
if ffmpeg is missing, stop and report that dependency clearly
if transcription fails on a chunk, surface the chunk number and the upstream error
do not paraphrase the transcript in place of the output file
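
the first two checks can run before any transcription starts. the sketch below is one way to do that; the function name preflight and the message wording are assumptions, and the env and which parameters exist only so the check can be exercised without touching the real machine.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def preflight(
    local: bool,
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return human-readable problems; an empty list means ready to run."""
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg is not installed or not on PATH")
    # The API key is only needed for the cloud backend.
    if not local and not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

an agent could call preflight(local=False) and, if the list is non-empty, stop and report each entry instead of starting the run.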

Transcribes a local video or audio file into a markdown transcript using Whisper or OpenAI cloud. Use when the user wants a transcript from a video, audio, or voice note.

devops

Updated May 21, 2026

Install this skill globally with one command:

npx skillsauth add psycho-baller/ai-agents-config speech-video-transcriber

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 21, 2026, 5:41 AM (177.9s, 4 files scanned)

SKILL.md

name: speech-video-transcriber
version: 0.2.0
description: Transcribes a local video or audio file into a markdown transcript using Whisper or OpenAI cloud. Use when the user wants a transcript from a video, audio, or voice note.

Install manually

# Clone the repo
git clone https://github.com/psycho-baller/ai-agents-config.git

# Copy into Claude Code skills folder (global)
cp -r ai-agents-config/skills/speech-video-transcriber ~/.claude/skills/

Repository: psycho-baller/ai-agents-config

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT