skills/process-audio/SKILL.md
Transcribes audio files (voice memos, recordings, meetings) into text using a local ASR model (qwen3_asr_rs). Processes all audio in the configured input directory and saves transcripts as text files. Use this skill whenever the user wants to transcribe audio, convert speech to text, process voice memos, or get spoken content into written form — even if they don't use the word "transcribe".
This skill uses qwen3_asr_rs for transcription and ffmpeg for audio format conversion.
If the required variables are not already set in the shell, load them from .env in the current working directory:
set -a; source .env; set +a
The required variables are:
- ASR_CLI: path to the directory containing the asr executable (e.g. ~/.local/bin)
- MODEL_PATH: full path to the Qwen3 ASR model file used for transcription
- AUDIO_TEMP_DIR: base directory for audio processing; inputs, converted WAVs, and transcripts all live under here
After loading, check each of the following. Treat any failure the same as a missing variable (stop and tell the user exactly what is wrong):
- ASR_CLI is set: the variable must have a value.
- $ASR_CLI/asr exists: even if ASR_CLI is set, verify the executable is actually present:
[ -x "$ASR_CLI/asr" ] || echo "MISSING: $ASR_CLI/asr not found or not executable"
- MODEL_PATH is set: the variable must have a value.
- AUDIO_TEMP_DIR is set: the variable must have a value.
If any check fails, stop here and tell the user what is missing. Provide these pointers:
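- Create a directory for the tool: mkdir path_to_/qwen3_asr_rs
- Run the qwen3_asr_rs installer: curl -sSf https://raw.githubusercontent.com/second-state/qwen3_asr_rs/main/install.sh | bash
- Set MODEL_PATH to the full path of the model file.
- Put the variables in a .env file in the current directory.
Then confirm that ffmpeg is available:
command -v ffmpeg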
If ffmpeg is not found, stop and tell the user it is required. Install instructions by platform:
- macOS (Homebrew): brew install ffmpeg
- Debian/Ubuntu: sudo apt install ffmpeg
- Fedora: sudo dnf install ffmpeg
- Windows (winget): winget install Gyan.FFmpeg
Do not continue to Step 2 until all variables, paths, and tools are confirmed.
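The checks in this step can be collected into one function. The following is a minimal sketch, not part of the skill itself: the function name `preflight` and the exact wording of the messages are assumptions chosen for illustration. It loads `.env` only when a required variable is missing, then reports every failed check instead of stopping at the first one.

```shell
# Preflight sketch (hypothetical helper, not shipped with the skill).
# Prints one "MISSING: ..." line per failed check; returns non-zero if any failed.
preflight() {
    local rc=0
    # Load .env only if a required variable is unset and the file exists.
    if [ -z "${ASR_CLI:-}" ] || [ -z "${MODEL_PATH:-}" ] || [ -z "${AUDIO_TEMP_DIR:-}" ]; then
        if [ -f ./.env ]; then
            set -a; . ./.env; set +a
        fi
    fi
    [ -n "${ASR_CLI:-}" ] || { echo "MISSING: ASR_CLI is not set"; rc=1; }
    [ -n "${MODEL_PATH:-}" ] || { echo "MISSING: MODEL_PATH is not set"; rc=1; }
    [ -n "${AUDIO_TEMP_DIR:-}" ] || { echo "MISSING: AUDIO_TEMP_DIR is not set"; rc=1; }
    # Only test the executable when ASR_CLI has a value to test against.
    if [ -n "${ASR_CLI:-}" ] && [ ! -x "$ASR_CLI/asr" ]; then
        echo "MISSING: $ASR_CLI/asr not found or not executable"; rc=1
    fi
    command -v ffmpeg >/dev/null 2>&1 || { echo "MISSING: ffmpeg not found on PATH"; rc=1; }
    return "$rc"
}
```

Calling `preflight || exit 1` before any other step would enforce the rule that nothing continues until all variables, paths, and tools are confirmed.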
Set up the working directories, creating them if they don't exist:
mkdir -p "$AUDIO_TEMP_DIR/outputs" "$AUDIO_TEMP_DIR/transcripts"
- $AUDIO_TEMP_DIR/inputs: must already exist and contain audio files
- $AUDIO_TEMP_DIR/outputs: holds the converted WAV files
- $AUDIO_TEMP_DIR/transcripts: holds the transcripts
Look for files in $AUDIO_TEMP_DIR/inputs with these extensions: .m4a, .mp3, .wav, .flac, .aac, .ogg, .opus. Skip any other file types.
If no matching audio files are found, tell the user and stop.
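File discovery can be sketched as a small filter over the inputs directory. The function name `list_audio_files` is hypothetical, and matching extensions case-insensitively (so that `MEMO.M4A` is also picked up) is an assumption; the skill text only lists lowercase extensions.

```shell
# Sketch: print each matching audio file in directory $1, one path per line.
# Assumption: extensions are compared case-insensitively.
list_audio_files() {
    local f ext
    for f in "$1"/*; do
        [ -f "$f" ] || continue          # skip directories and unmatched globs
        ext=$(printf '%s' "${f##*.}" | tr 'A-Z' 'a-z')
        case "$ext" in
            m4a|mp3|wav|flac|aac|ogg|opus) printf '%s\n' "$f" ;;
        esac
    done
}
```

If `list_audio_files "$AUDIO_TEMP_DIR/inputs"` prints nothing, that is the case where the user should be told no audio was found. The sketch assumes file names do not contain newlines.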
For each audio file found:
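1. Convert to WAV (skip this step if the file is already .wav; in that case no copy exists under outputs, so pass the original file in $AUDIO_TEMP_DIR/inputs to step 2):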
ffmpeg -i "$AUDIO_TEMP_DIR/inputs/<filename>" -ar 16000 -ac 1 "$AUDIO_TEMP_DIR/outputs/<basename>.wav"
2. Transcribe using the ASR CLI:
"$ASR_CLI/asr" "$MODEL_PATH" "$AUDIO_TEMP_DIR/outputs/<basename>.wav" \
> "$AUDIO_TEMP_DIR/transcripts/<basename>.txt" \
2>"$AUDIO_TEMP_DIR/transcripts/<basename>.err"
Stderr is redirected to a separate .err file so errors don't pollute the transcript. After each transcription, check whether the .err file is non-empty and warn the user if it contains anything.
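The per-file steps above can be sketched as one function. The name `process_one` is hypothetical. Two choices here go beyond the skill text and are assumptions: the ffmpeg flags `-nostdin -y -loglevel error` (no prompts, overwrite an existing WAV, quiet output) are added so an unattended run cannot stall on a prompt, and a file that is already `.wav` is passed to the ASR CLI directly from the inputs directory.

```shell
# Sketch: convert (unless already .wav), transcribe, and flag a non-empty .err file.
# Expects ASR_CLI, MODEL_PATH, and AUDIO_TEMP_DIR to be set.
process_one() {
    local src="$1" name base wav txt err rc=0
    name=$(basename "$src")
    base="${name%.*}"
    wav="$AUDIO_TEMP_DIR/outputs/$base.wav"
    case "$name" in
        *.wav) wav="$src" ;;   # assumption: transcribe the original WAV as-is
        *) ffmpeg -nostdin -y -loglevel error -i "$src" -ar 16000 -ac 1 "$wav" || return 1 ;;
    esac
    txt="$AUDIO_TEMP_DIR/transcripts/$base.txt"
    err="$AUDIO_TEMP_DIR/transcripts/$base.err"
    "$ASR_CLI/asr" "$MODEL_PATH" "$wav" >"$txt" 2>"$err" || rc=$?
    # -s is true only when the file exists and is larger than zero bytes.
    if [ -s "$err" ]; then
        echo "WARNING: $err is not empty; review it before relying on $txt"
    fi
    return "$rc"
}
```

A caller would run this once per discovered file and count the successes for the final report. Because `-y` overwrites silently, rerunning the sketch replaces earlier WAVs and transcripts with the same base name.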
When all files are processed, tell the user how many were transcribed and where the transcripts are saved: $AUDIO_TEMP_DIR/transcripts/.
The converted WAV files in $AUDIO_TEMP_DIR/outputs are intermediate artifacts — the original audio and the transcripts are what matter. Ask the user whether to delete the WAV files now.
The original files in $AUDIO_TEMP_DIR/inputs and the transcripts in $AUDIO_TEMP_DIR/transcripts are left in place.
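Once the user has answered, the cleanup can be sketched as below. The function name `cleanup_wavs` and the accepted answers are assumptions for illustration. The `${AUDIO_TEMP_DIR:?}` expansion aborts if the variable is unset or empty, so the `rm` can never expand to a path at the filesystem root.

```shell
# Sketch: delete converted WAVs only when the user's answer ($1) is affirmative.
# Touches nothing under inputs/ or transcripts/.
cleanup_wavs() {
    local answer="$1"
    case "$answer" in
        y|Y|yes|YES)
            rm -f -- "${AUDIO_TEMP_DIR:?}/outputs"/*.wav
            ;;
        *)
            echo "Keeping WAV files in $AUDIO_TEMP_DIR/outputs"
            ;;
    esac
}
```

Any answer other than an explicit yes keeps the files, which matches the instruction to ask before deleting.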