docs-to-voice/SKILL.md
Convert text and document content into audio files and sentence-level subtitle timelines under project_dir/audio/{project_name}/. Supports both macOS say and Alibaba Cloud Model Studio API modes.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

`npx skillsauth add laitszkin/apollo-toolkit docs-to-voice`
- `project_dir`, input source, mode, and environment-backed settings before generation.
- `apltk docs-to-voice` to write audio plus matching timeline and subtitle files under `project_dir/audio/{project_name}/`.
- `ffmpeg` for speed changes.
- `.timeline.json` and `.srt` companions.

Collect inputs.
- `project_dir`.
- `project_name`; default to basename of `project_dir`.

Select mode.
- `--mode say` for local generation.
- `--mode api` for Model Studio API generation.
- `DOCS_TO_VOICE_MODE` from `.env`, then shell environment variables; fallback `say`.

Prepare output path.
- `project_dir/audio/{project_name}/`.

Generate audio.
- `say` mode supports `--voice`, `--rate`, and punctuation-pause enhancement.
- `api` mode supports `--api-endpoint`, `--api-model`, `--api-voice`, and reads `DASHSCOPE_API_KEY`.
- `api` mode sends one request per sentence and concatenates all sentence audio into one final file.
- `api` mode auto-discovers the model's maximum input length; only oversized sentences are split by that limit.
- `--max-chars` (or `DOCS_TO_VOICE_MAX_CHARS`) can override the sentence split limit; `0` disables chunking.
- `--speech-rate` (or `DOCS_TO_VOICE_SPEECH_RATE`) applies an optional post-process speed adjustment and requires `ffmpeg` when the value is not `1`.
- For `qwen3-tts`, CJK characters count as 2 units.

Generate sentence-level timeline files.
- In `api` mode, timeline start/end uses per-sentence audio durations whenever available.

Return completion details.
Use apltk docs-to-voice --help as the live command reference for required inputs, mode-specific flags, environment variables, examples, and expected output paths.
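To make the workflow rules above more concrete (mode resolution, splitting only oversized sentences, and the sentence-level timeline), here is a minimal Python sketch. It is an illustration under stated assumptions, not the tool's implementation: every function name, the timeline field names (`text`, `start`, `end`), the CJK range check, the character-level splitting, and the idea that an explicit `--mode` value takes priority over `DOCS_TO_VOICE_MODE` are hypothetical.

```python
from typing import Dict, List, Mapping, Optional


def resolve_mode(cli_mode: Optional[str],
                 dotenv: Mapping[str, str],
                 shell_env: Mapping[str, str]) -> str:
    """Pick a mode: explicit flag (assumed to win), then .env, then shell, else "say"."""
    return (cli_mode
            or dotenv.get("DOCS_TO_VOICE_MODE")
            or shell_env.get("DOCS_TO_VOICE_MODE")
            or "say")


def is_cjk(ch: str) -> bool:
    """Assumption: only CJK Unified Ideographs and Extension A count as CJK."""
    return "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf"


def count_units(text: str) -> int:
    """Count length units: 2 per CJK character, 1 for anything else."""
    return sum(2 if is_cjk(ch) else 1 for ch in text)


def split_oversized(sentence: str, max_units: int) -> List[str]:
    """Split a sentence only when it exceeds max_units; 0 disables chunking."""
    if max_units <= 0 or count_units(sentence) <= max_units:
        return [sentence]
    chunks: List[str] = []
    current: List[str] = []
    used = 0
    for ch in sentence:
        cost = 2 if is_cjk(ch) else 1
        # Start a new chunk when adding this character would pass the limit.
        if current and used + cost > max_units:
            chunks.append("".join(current))
            current, used = [], 0
        current.append(ch)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks


def build_timeline(sentences: List[str], durations: List[float]) -> List[Dict]:
    """Lay sentences end to end, using each sentence's audio duration in seconds."""
    timeline: List[Dict] = []
    cursor = 0.0
    for text, duration in zip(sentences, durations):
        timeline.append({"text": text,
                         "start": round(cursor, 3),
                         "end": round(cursor + duration, 3)})
        cursor += duration
    return timeline
```

For example, `split_oversized("你好世界", 4)` yields two chunks because each of the four characters costs 2 units, while any sentence passed with a limit of `0` is returned unchanged.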
- `say` mode: confirm `command -v say` and `command -v python3`.
- `api` mode: confirm `command -v python3` and valid `DASHSCOPE_API_KEY`.
- `command -v ffmpeg`.
- `apltk docs-to-voice --help`.
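The checks above could be scripted before a run. The following is a rough Python sketch, assuming a hypothetical helper named `missing_prerequisites`; it is not part of `apltk`. It only tests that `DASHSCOPE_API_KEY` is set (it cannot tell whether the key is valid), and it treats `ffmpeg` as required only when the speech rate is not 1.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional


def missing_prerequisites(mode: str,
                          speech_rate: float = 1.0,
                          env: Optional[Mapping[str, str]] = None,
                          which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the names of tools or variables that appear to be missing."""
    env = os.environ if env is None else env
    missing: List[str] = []
    # Both modes list python3 as a prerequisite.
    if which("python3") is None:
        missing.append("python3")
    # The macOS `say` binary is only needed for local generation.
    if mode == "say" and which("say") is None:
        missing.append("say")
    # Presence check only; key validity cannot be confirmed locally.
    if mode == "api" and not env.get("DASHSCOPE_API_KEY"):
        missing.append("DASHSCOPE_API_KEY")
    # ffmpeg is only needed for post-process speed changes.
    if speech_rate != 1 and which("ffmpeg") is None:
        missing.append("ffmpeg")
    return missing
```

Calling `missing_prerequisites("api", speech_rate=1.2)` on a machine without `ffmpeg` or an API key would report both, which makes it easy to fail early with a clear message.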