skills/docs-to-voice/SKILL.md
Convert text and document content into audio files and sentence-level subtitle timelines under project_dir/audio/{project_name}/. Supports both macOS say and Alibaba Cloud Model Studio API modes.
`npx skillsauth add laitszkin/apollo-toolkit docs-to-voice`

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
- Confirm `project_dir`, input source, mode, and environment-backed settings before generation.
- Run `apltk docs-to-voice` to write audio plus matching timeline and subtitle files under `project_dir/audio/{project_name}/`.
- `ffmpeg` is needed for speed changes.
- Each audio file gets `.timeline.json` and `.srt` companions.

1. Collect inputs.
   - `project_dir`.
   - `project_name`; defaults to the basename of `project_dir`.
2. Select mode.
   - `--mode say` for local generation.
   - `--mode api` for Model Studio API generation.
   - Otherwise, read `DOCS_TO_VOICE_MODE` from `.env`, then shell environment variables; fall back to `say`.
3. Prepare output path.
   - `project_dir/audio/{project_name}/`.
4. Generate audio.
   - `say` mode supports `--voice`, `--rate`, and punctuation-pause enhancement.
   - `api` mode supports `--api-endpoint`, `--api-model`, `--api-voice`, and reads `DASHSCOPE_API_KEY`.
   - `api` mode sends one request per sentence and concatenates all sentence audio into one final file.
   - `api` mode auto-discovers the model's maximum input length; only oversized sentences are split by that limit.
   - `--max-chars` (or `DOCS_TO_VOICE_MAX_CHARS`) can override the sentence split limit; `0` disables chunking.
   - `--speech-rate` (or `DOCS_TO_VOICE_SPEECH_RATE`) applies optional post-process speed adjustment and requires `ffmpeg` when the value is not `1`.
   - For `qwen3-tts`, CJK characters count as 2 units.
5. Generate sentence-level timeline files.
   - In `api` mode, timeline start/end uses per-sentence audio durations whenever available.
6. Return completion details.
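To illustrate how a sentence-level timeline can follow from per-sentence audio durations, here is a minimal sketch. It assumes sentences are played back to back with no gaps. The names `Cue`, `build_timeline`, `srt_timestamp`, and `to_srt` are hypothetical, and the real `.timeline.json` and `.srt` layout written by `apltk docs-to-voice` may differ.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    index: int    # 1-based cue number, as used in SRT files
    start: float  # seconds from the beginning of the final audio
    end: float    # seconds from the beginning of the final audio
    text: str


def build_timeline(sentences, durations):
    """Place sentences back to back using their measured durations (assumption: no gaps)."""
    if len(sentences) != len(durations):
        raise ValueError("need exactly one duration per sentence")
    cues, cursor = [], 0.0
    for i, (text, dur) in enumerate(zip(sentences, durations), start=1):
        cues.append(Cue(i, cursor, cursor + dur, text))
        cursor += dur
    return cues


def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues):
    """Render cues as SRT text: number, time range, text, blank line between cues."""
    blocks = [
        f"{c.index}\n{srt_timestamp(c.start)} --> {srt_timestamp(c.end)}\n{c.text}\n"
        for c in cues
    ]
    return "\n".join(blocks)
```

For example, `build_timeline(["Hello.", "World."], [1.5, 2.25])` yields a second cue running from 1.5 s to 3.75 s, because each cue starts where the previous sentence's audio ends.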
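references/docs-to-voice.md: complete parameter reference for the apltk docs-to-voice tool. Read it before selecting a mode in step 2. Before running generation, read references/docs-to-voice.md to learn how each mode's parameters and environment variables are configured.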
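The mode lookup described under "Select mode" can be sketched as below. This is an illustration only: it assumes an explicit `--mode` value wins over everything else, that `.env` uses plain `KEY=VALUE` lines, and that `.env` is consulted before the shell environment. `read_dotenv` and `resolve_mode` are hypothetical helpers, not part of `apltk`.

```python
import os
from pathlib import Path

VALID_MODES = ("say", "api")


def read_dotenv(path):
    """Parse simple KEY=VALUE lines; skip blanks and # comments (assumed format)."""
    values = {}
    p = Path(path)
    if not p.is_file():
        return values
    for line in p.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_mode(cli_mode=None, dotenv_path=".env", environ=None):
    """Return 'say' or 'api': CLI flag, then .env, then shell environment, then 'say'."""
    environ = os.environ if environ is None else environ
    candidates = (
        cli_mode,  # assumption: an explicit --mode takes precedence
        read_dotenv(dotenv_path).get("DOCS_TO_VOICE_MODE"),
        environ.get("DOCS_TO_VOICE_MODE"),
    )
    for candidate in candidates:
        if candidate:
            mode = candidate.strip().lower()
            if mode not in VALID_MODES:
                raise ValueError(f"unsupported mode: {candidate!r}")
            return mode
    return "say"  # documented fallback
```

Passing `environ` explicitly keeps the function easy to exercise without touching the real shell environment.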
- `say` mode: confirm `command -v say` and `command -v python3`.
- `api` mode: confirm `command -v python3` and a valid `DASHSCOPE_API_KEY`.
- Speed changes: confirm `command -v ffmpeg`.
- CLI options: `apltk docs-to-voice --help`.
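The checks above could be scripted along these lines. This is a sketch, not the tool's own behavior: `missing_prerequisites` is a hypothetical helper, `shutil.which` stands in for `command -v`, and the function only checks that `DASHSCOPE_API_KEY` is set, not that the key is valid.

```python
import os
import shutil


def missing_prerequisites(mode, speech_rate=1.0, environ=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready to run."""
    environ = os.environ if environ is None else environ
    required = ["python3"]          # both modes need python3
    if mode == "say":
        required.append("say")      # macOS speech synthesizer
    if speech_rate != 1.0:
        required.append("ffmpeg")   # only needed for speed changes
    missing = [f"command not found: {tool}" for tool in required if which(tool) is None]
    if mode == "api" and not environ.get("DASHSCOPE_API_KEY"):
        missing.append("DASHSCOPE_API_KEY is not set")
    return missing
```

Calling `missing_prerequisites("api", speech_rate=1.25)` before generation reports every gap at once instead of failing on the first missing tool.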