transcribe/SKILL.md
Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings.
npx skillsauth add lidge-jun/cli-jaw-skills transcribeInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Transcribe audio using OpenAI, with optional speaker diarization when requested. Prefer the bundled CLI for deterministic, repeatable runs.
OPENAI_API_KEY is set. If missing, ask the user to set it locally (do not ask them to paste the key).transcribe_diarize.py CLI with sensible defaults (fast text transcription).output/transcribe/ when working in this repo.gpt-4o-mini-transcribe with --response-format text for fast transcription.--model gpt-4o-transcribe-diarize --response-format diarized_json.--chunking-strategy auto.gpt-4o-transcribe-diarize.output/transcribe/<job-id>/ for evaluation runs.--out-dir for multiple files to avoid overwriting.uv pip install openai # preferred
python3 -m pip install openai # fallback
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export TRANSCRIBE_CLI="$CODEX_HOME/skills/transcribe/scripts/transcribe_diarize.py"
User-scoped skills install under $CODEX_HOME/skills (default: ~/.codex/skills).
Single file (fast text default):
python3 "$TRANSCRIBE_CLI" \
path/to/audio.wav \
--out transcript.txt
Diarization with known speakers (up to 4):
python3 "$TRANSCRIBE_CLI" \
meeting.m4a \
--model gpt-4o-transcribe-diarize \
--known-speaker "Alice=refs/alice.wav" \
--known-speaker "Bob=refs/bob.wav" \
--response-format diarized_json \
--out-dir output/transcribe/meeting
Plain text output (explicit):
python3 "$TRANSCRIBE_CLI" \
interview.mp3 \
--response-format text \
--out interview.txt
references/api.md: supported formats, limits, response formats, and known-speaker notes.development
Native Web UI structured renderer schemas for compose-block drafts, search-results cards, dataframe tables, chart-json charts, and diff output
tools
Unified search hub. Route any web/real-time/X lookup through a 4-tier escalation: built-in web search → cli-jaw browser CDP → progrok Grok OAuth → web-ai (Grok Expert / GPT Pro). Use for: search, 검색, web search, latest news, real-time info, X/Twitter, fact lookup, deep research.
development
UI/UX intent discovery, design vocabulary, product personalities, UX state patterns, typography line break judgment, favicon/product logo design, and logo trust section design. Use when user design direction is vague, when building onboarding/empty/error states, when setting up favicons or product logos, or when referencing a product aesthetic.
development
Canonical owner of module boundary rules, circular dependency detection/prevention, implicit coupling taxonomy, barrel/re-export discipline, and boundary-only defensive programming. Referenced by dev, dev-code-reviewer, dev-backend, dev-frontend stubs.