Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

lazygophers/cortex-audio-understand

Name: cortex-audio-understand
Author: lazygophers

plugins/tools/cortex/skills/cortex-audio-understand/SKILL.md

npx skillsauth add lazygophers/ccplugin cortex-audio-understand

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

cortex-audio-understand

把音频喂给 ASR / 音频 LLM 拿文本结果。镜像 cortex-image-understand, 双模式适配转录与问答两类需求。

调用优先级 (P1)

优先 CLI: bash ~/.cortex/scripts/audio_understand.sh <subcommand> ...
- transcribe <audio> [--config NAME] [--language LANG] — ASR 纯转录
- describe <audio> [--config NAME] [--prompt TEXT] — 概述音频
- ask <audio> "<question>" [--config NAME] — 音频问答
- probe [--config NAME] [--all]
- list [--all]
输入: 本地文件路径 (mp3/wav/m4a/webm/flac/ogg/opus)
JSON 输出: {ok, text, provider, model, mode, usage}

两种模式

| 模式 | 子命令 | provider 例 | 原理 | |---|---|---|---| | asr | transcribe | openai whisper-1, zhipu glm-asr | multipart upload /v1/audio/transcriptions | | chat | describe / ask | openai gpt-4o-audio-preview, qwen-audio, zhipu glm-4-voice | chat completions + input_audio content |

provider yaml 里写 mode: asr|chat。transcribe 强制 asr, describe/ask 强制 chat — 不需要手 override。

触发场景

录音转文字 (会议 / 访谈 / 语音笔记) → transcribe
听完总结 ("讲了啥") → describe
带问题听 ("说了几个产品名") → ask

不触发: TTS (本 skill 不合成) / 实时流 / 说话人分离

决策树

用户给音频文件 + ?
  │
  ├─ "转成文字" / "转录" / "字幕"        → transcribe (asr 模式)
  ├─ 想要内容概述, 无具体问题            → describe (chat 模式)
  ├─ 带具体问题 ("说了什么/几个人/几次")  → ask (chat 模式)

Provider 速查

| name | endpoint | model | mode | 备注 | |---|---|---|---|---| | openai-whisper | api.openai.com/v1/audio/transcriptions | whisper-1 | asr | 业界标杆, 多语言强 | | zhipu-glm-asr | bigmodel.cn/api/paas/v4/audio/transcriptions | glm-asr | asr | 中文场景默认推荐 | | openai-gpt4o-audio | api.openai.com/v1/chat/completions | gpt-4o-audio-preview | chat | 支持问答 + 推理 | | qwen-audio | dashscope.aliyuncs.com/compatible-mode/v1/chat/completions | qwen-audio-turbo | chat | 中文 + 多任务 |

完整模板见 references/providers.md。

文件格式

支持 mp3 / wav / m4a / webm / flac / ogg / opus。MIME 按后缀自动判定。

whisper 上限 25MB / 文件; 超出建议先切片 (ffmpeg -i in.wav -t 600 -ss 0 out.wav)
chat 模式走 base64, 上限通常更紧 (~10MB), 注意 timeout

AUTO_MODE

不询问 mode (子命令决定)
不询问 provider, 用 default_provider 或第一个 active
transcribe 无 --language 时由 provider 自动检测

输出格式

✓ 音频转录完成
  provider: openai-whisper (whisper-1) mode=asr
  text:
  <transcript>

References

| 文件 | 内容 | |---|---| | references/providers.md | 4 provider 配置模板 + asr/chat 模式字段 + language | | references/prompts.md | describe / 摘要 / 说话人区分 / 时间戳标注 prompt | | references/modes.md | asr vs chat 决策 + 子命令路由 + 文件格式坑 |

不做

不真跑 API 当用户只问 "能不能"
不流式 ASR (实时转录需 websocket, 本 skill 走完整文件上传)
不 TTS (语音合成不属本 skill)
不说话人分离 (diarization)
不 git commit

lazygophers/cortex-audio-understand

plugins/tools/cortex/skills/cortex-audio-understand/SKILL.md

音频理解 — ASR 转录 + 音频问答。多 provider (openai whisper / zhipu glm-asr / openai gpt-4o-audio / qwen-audio); 两种模式 asr (Whisper 风格 multipart 转录) 与 chat (OpenAI gpt-4o-audio / 通义 qwen-audio 问答)。从 vault/.cortex/config/audio-understand.yaml 选 provider。Triggers on "转录", "转写", "听音频", "audio transcription", "ASR", "音频问答", "音频理解", "听这段录音", "/cortex:audio-understand".

3 stars

data-ai

Updated May 23, 2026

$ install --global

skillsauth

npx skillsauth add lazygophers/ccplugin cortex-audio-understand

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 23, 2026, 7:55 AM200.9s4 files scanned

SKILL.md

name:: cortex-audio-understand
description:: 音频理解 — ASR 转录 + 音频问答。多 provider (openai whisper / zhipu glm-asr / openai gpt-4o-audio / qwen-audio); 两种模式 asr (Whisper 风格 multipart 转录) 与 chat (OpenAI gpt-4o-audio / 通义 qwen-audio 问答)。从 vault/.cortex/config/audio-understand.yaml 选 provider。Triggers on "转录", "转写", "听音频", "audio transcription", "ASR", "音频问答", "音频理解", "听这段录音", "/cortex:audio-understand".
disable-model-invocation:: false
allowed-tools:: Bash Read Write

cortex-audio-understand

把音频喂给 ASR / 音频 LLM 拿文本结果。镜像 cortex-image-understand, 双模式适配转录与问答两类需求。

调用优先级 (P1)

优先 CLI: bash ~/.cortex/scripts/audio_understand.sh <subcommand> ...
- transcribe <audio> [--config NAME] [--language LANG] — ASR 纯转录
- describe <audio> [--config NAME] [--prompt TEXT] — 概述音频
- ask <audio> "<question>" [--config NAME] — 音频问答
- probe [--config NAME] [--all]
- list [--all]
输入: 本地文件路径 (mp3/wav/m4a/webm/flac/ogg/opus)
JSON 输出: {ok, text, provider, model, mode, usage}

两种模式

provider yaml 里写 mode: asr|chat。transcribe 强制 asr, describe/ask 强制 chat — 不需要手 override。

触发场景

录音转文字 (会议 / 访谈 / 语音笔记) → transcribe
听完总结 ("讲了啥") → describe
带问题听 ("说了几个产品名") → ask

不触发: TTS (本 skill 不合成) / 实时流 / 说话人分离

决策树

用户给音频文件 + ?
  │
  ├─ "转成文字" / "转录" / "字幕"        → transcribe (asr 模式)
  ├─ 想要内容概述, 无具体问题            → describe (chat 模式)
  ├─ 带具体问题 ("说了什么/几个人/几次")  → ask (chat 模式)

Provider 速查

完整模板见 references/providers.md。

文件格式

支持 mp3 / wav / m4a / webm / flac / ogg / opus。MIME 按后缀自动判定。

whisper 上限 25MB / 文件; 超出建议先切片 (ffmpeg -i in.wav -t 600 -ss 0 out.wav)
chat 模式走 base64, 上限通常更紧 (~10MB), 注意 timeout

AUTO_MODE

不询问 mode (子命令决定)
不询问 provider, 用 default_provider 或第一个 active
transcribe 无 --language 时由 provider 自动检测

输出格式

✓ 音频转录完成
  provider: openai-whisper (whisper-1) mode=asr
  text:
  <transcript>

References

不做

不真跑 API 当用户只问 "能不能"
不流式 ASR (实时转录需 websocket, 本 skill 走完整文件上传)
不 TTS (语音合成不属本 skill)
不说话人分离 (diarization)
不 git commit

Related Skills

lazygophers/design-uiux

tools

VerifiedTrustedCommunity

UI/UX 与布局设计——做界面布局/结构/导航/组件/交互的设计决策。触发：做UI/UX/布局/排版/导航/组件/交互/栅格/响应式/图表选型/字体配对。按媒介路由 HTML/Web、原生 App(iOS/Android/桌面)、CLI、TUI。需后端动态系统不适用；配色/主题/色板走姊妹 skill design-color。

4SKILL.mdUpdated Jul 22, 2026

lazygophers/design-uiux

lazygophers/design-color

tools

VerifiedTrustedCommunity

主题与配色设计——做颜色搭配/调色板/主题/品牌色阶/暗模式的设计决策。触发：选配色/调色/主题/色板/品牌色/暗模式/对比度/色盲/UI风格。按媒介路由 HTML/Web(CSS变量)、原生App(平台token)、CLI(ANSI)、TUI(真彩/256/16降级)。保证可访问性（对比度/色盲安全）。需后端动态系统不适用；UI/UX 布局/组件/交互走姊妹 skill design-uiux。

4SKILL.mdUpdated Jul 22, 2026

lazygophers/design-color

lazygophers/optimize-any

tools

VerifiedTrustedCommunity

跨任意组件（plugin/skill/agent/command）的验证驱动优化循环纪律 skill。当用户要优化某个已有组件却无明确方向、或要防止改了反而更差（自评乐观偏差 / 多维同改归因失效 / 为凑分加废话膨胀）、或要把一套通用「评分→单变量改→改后验证严格更好才留否则回滚→触顶停」的纪律套到任意组件上时使用。管优化过程本身的纪律（validation gate / ratchet / 独立验证 / 触顶停），不评单组件深度（交 skill-dev），不查插件接线（交 plugin-dev）。仅手动 /optimize-any 触发。

4SKILL.mdUpdated Jul 18, 2026

lazygophers/optimize-any

lazygophers/skein-spec

data-ai

VerifiedTrustedCommunity

两层规则记忆 (基于 .skein/spec)。planning 时 recall 召回相关规则、task finish 后 sediment 沉淀学习 + prune 自动精简过期/重复/断链规则。core 常驻硬规 + recall 按需召回, 经判定门自动写盘 (不逐次问用户)。产出 .skein/spec 下 core/recall 规则文件 + index。另支持空仓 bootstrap 播种规则基线、记忆大面积失效 (大重构/换栈) 时 reconstruct 可逆归档后按项目类型分型重建、maintain 手动体检 (超预算/stale/断链/重复/废弃, --apply 自动修复)、auto-fix (Stop hook 写 .pending-fix 标记 → main 派 skein-specer bg 跑 maintain --apply 全自动修, 断链只报告)。

4SKILL.mdUpdated Jul 18, 2026

lazygophers/skein-spec

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/lazygophers/ccplugin.git

# Copy into Claude Code skills folder (global)
cp -r ccplugin/plugins/tools/cortex/skills/cortex-audio-understand ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

lazygophers/ccplugin

3 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT