skills/giggle-generation-speech/SKILL.md
Use when the user wants to generate speech, voiceover, or text-to-audio. Converts text to AI voice via Giggle.pro TTS API. Keep the user informed until audio is ready: message before long waits, use Cron/sync poll so the user need not ask for progress. Triggers: generate speech, text-to-speech, TTS, voiceover, read this text aloud, synthesize speech.
npx skillsauth add giggle-official/skills giggle-generation-speech
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Synthesizes text into AI voice/voiceover via giggle.pro. Supports multiple voice tones, emotions, and speaking rates.
Please review the following before installing. This skill will:
~/.openclaw/skills/giggle-generation-speech/logs/ – Task state files for Cron deduplication
Requirements: python3, GIGGLE_API_KEY (system environment variable), pip packages: requests
API Key: Set system environment variable GIGGLE_API_KEY. Obtain it at giggle.pro while logged in: left sidebar → API Key (API 密钥). The script will prompt if not configured.
No inline Python: All commands must be executed via the exec tool. Never use heredoc inline code.
No Retry on Error: If script execution encounters an error, do not retry. Report the error to the user directly and stop.
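The API key rule above can be illustrated with a small sketch. It is not the skill's actual code: the function name `load_api_key` is hypothetical, and raising an error (rather than prompting, as the script is said to do) is an assumption made to keep the example short.

```python
import os


def load_api_key(env=None):
    """Return GIGGLE_API_KEY from the environment (hypothetical helper).

    Assumption: a missing key is reported as an error here; the real
    script is described as prompting the user instead.
    """
    env = os.environ if env is None else env
    key = env.get("GIGGLE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GIGGLE_API_KEY is not set; create one at giggle.pro and "
            "export it as a system environment variable."
        )
    return key
```

Reading from the process environment keeps the key out of command lines and tool parameters.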
Speech generation typically takes 10–30 seconds. The skill uses a three-phase architecture: fast submit, Cron poll, and sync fallback.
Important: Never pass GIGGLE_API_KEY in exec's env parameter. The API key is read from the system environment variable.
TTS is usually 10–30 seconds; Phase 3 sync wait can run up to --max-wait (default 120s). The user does not need to nag for progress.
- […] task_id, and set expectation: usually 10–30s, may need up to ~2 minutes on slow runs.
- […] --query, and always run Phase 3 --query --poll as fallback; do not wait for the user to say “check now.”
- […] --query): if still JSON/processing, paraphrase and say you are still waiting; from the 2nd JSON poll onward, on every other Cron tick you may send one minimal line (e.g. “Still generating…”) so the thread is not dead silent.
- […] task_id, remove Cron, suggest retry.
- […] --query when they ask; Phase 3 --poll can remain your primary path if appropriate.

Before submitting, you must guide the user to select voice and emotion. Do not use defaults.
Run --list-voices to get available voices:
python3 scripts/text_to_audio_api.py --list-voices
First send a message to the user: "Speech task submitted; it usually finishes in 10–30 seconds. I will check automatically and send the link as soon as it is ready, so there is no need to ask for progress."
# Must specify user-selected voice and emotion
python3 scripts/text_to_audio_api.py \
--text "The weather is nice today" \
--voice-id "Calm_Woman" \
--emotion "joy" \
--speed 1.2 \
--no-wait --json
# View available voices
python3 scripts/text_to_audio_api.py --list-voices
Response example:
{"status": "started", "task_id": "xxx"}
Immediately store task_id in memory (addMemory):
giggle-generation-speech task_id: xxx (submitted: YYYY-MM-DD HH:mm)
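As a rough illustration of the two steps above, the following sketch parses the submit response and builds the memory line in the format shown. The helper name `memory_note` is hypothetical, and treating any status other than "started" as an error is an assumption.

```python
import json
from datetime import datetime


def memory_note(stdout, now=None):
    """Build the memory line from the submit response (hypothetical helper)."""
    data = json.loads(stdout)
    # Assumption: only a "started" status with a task_id counts as success.
    if data.get("status") != "started" or not data.get("task_id"):
        raise ValueError(f"unexpected submit response: {stdout!r}")
    now = now or datetime.now()
    return (
        f"giggle-generation-speech task_id: {data['task_id']} "
        f"(submitted: {now:%Y-%m-%d %H:%M})"
    )
```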
Use the cron tool to register the polling job. Strictly follow the parameter format:
{
  "action": "add",
  "job": {
    "name": "giggle-generation-speech-<first 8 chars of task_id>",
    "schedule": {
      "kind": "every",
      "everyMs": 30000
    },
    "payload": {
      "kind": "systemEvent",
      "text": "Speech task poll: exec python3 scripts/text_to_audio_api.py --query --task-id <full task_id>, handle stdout per Cron logic. If stdout is non-JSON plain text, forward to user and remove Cron. If stdout is JSON (still processing), keep Cron; from the 2nd JSON poll onward, alternate: every other poll send user one brief line (e.g. Speech still generating…). If stdout is empty, remove Cron immediately, do not send message."
    },
    "sessionTarget": "main"
  }
}
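The job object above could be assembled programmatically along these lines. This is only a sketch: `build_cron_job` and `POLL_TEXT` are hypothetical names, and the field layout simply mirrors the JSON shown, filling in the job name from the first 8 characters of the task ID.

```python
# Payload text copied from the job definition above, with the task ID
# left as a placeholder to fill in.
POLL_TEXT = (
    "Speech task poll: exec python3 scripts/text_to_audio_api.py --query "
    "--task-id {task_id}, handle stdout per Cron logic. If stdout is non-JSON "
    "plain text, forward to user and remove Cron. If stdout is JSON (still "
    "processing), keep Cron; from the 2nd JSON poll onward, alternate: every "
    "other poll send user one brief line (e.g. Speech still generating…). If "
    "stdout is empty, remove Cron immediately, do not send message."
)


def build_cron_job(task_id, every_ms=30000):
    """Return the cron tool arguments for one polling job (hypothetical helper)."""
    return {
        "action": "add",
        "job": {
            "name": f"giggle-generation-speech-{task_id[:8]}",
            "schedule": {"kind": "every", "everyMs": every_ms},
            "payload": {
                "kind": "systemEvent",
                "text": POLL_TEXT.format(task_id=task_id),
            },
            "sessionTarget": "main",
        },
    }
```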
Cron trigger handling (based on exec stdout):
| stdout pattern | Action |
|----------------|--------|
| Non-empty plain text (not starting with {) | Forward to user as-is, remove Cron |
| stdout empty | Already pushed, remove Cron immediately, do not send message |
| JSON (starts with {, has "status" field) | Keep Cron; per this section, from 2nd JSON poll onward send a brief line every other poll |
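The table above can be read as a small decision function. The sketch below is one possible reading, not the skill's code: the action names are hypothetical, text that starts with `{` but is not valid JSON with a "status" field is assumed to be forwarded as plain text, and "every other poll from the 2nd onward" is assumed to mean polls 2, 4, 6, and so on.

```python
import json


def classify_stdout(stdout):
    """Map exec stdout to a Cron action (action names are hypothetical)."""
    text = stdout.strip()
    if not text:
        return "remove_silently"  # already pushed: remove Cron, send nothing
    if text.startswith("{"):
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and "status" in data:
            return "keep_polling"  # still processing
    # Assumption: anything else is treated as final plain text.
    return "forward_and_remove"


def should_send_progress(json_poll_count):
    """True on the 2nd, 4th, 6th... JSON poll (assumed reading of the rule)."""
    return json_poll_count >= 2 and json_poll_count % 2 == 0
```

For example, `classify_stdout('{"status": "processing"}')` returns `"keep_polling"`, while an empty string returns `"remove_silently"`.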
Execute this step whether or not Cron registration succeeded.
Immediately before this exec, send the user a short line: Waiting for synthesis (usually quick, up to ~2 minutes)—please wait.
python3 scripts/text_to_audio_api.py --query --task-id <task_id> --poll --max-wait 120
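For intuition, a bounded wait like the one requested with --poll and --max-wait might look roughly like this. It is a sketch under assumptions: `poll_until_done` is a hypothetical name, the 3-second interval is invented for illustration, and `query` stands in for whatever call returns the script's stdout.

```python
import time


def poll_until_done(query, max_wait=120.0, interval=3.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call query() until it stops returning JSON or max_wait is used up.

    Returns the final non-JSON output, or None on timeout.
    The interval value is an assumption, not taken from the script.
    """
    deadline = clock() + max_wait
    while True:
        out = query().strip()
        if not out.startswith("{"):
            return out  # final result (plain text, possibly empty)
        if clock() + interval > deadline:
            return None  # still processing when the time budget ran out
        sleep(interval)
```

Passing `clock` and `sleep` as parameters keeps the loop testable without real waiting.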
Handling logic:
When the user wants to see available voices, run:
python3 scripts/text_to_audio_api.py --list-voices
The script calls GET /api/v1/project/preset_tones and displays voice_id, name, style, gender, age, language to the user.
Audio links returned to the user must be full signed URLs (with Policy, Key-Pair-Id, Signature query params). Do not strip response-content-disposition=attachment when the API returns it. The script only normalizes ~ → %7E; forward URLs as-is otherwise.
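The URL rule above can be sketched with the standard library. The function names and the example URL are hypothetical; the only transformation applied is the ~ to %7E replacement the text describes, and the second helper merely reports which signature parameters are absent.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED_PARAMS = ("Policy", "Key-Pair-Id", "Signature")


def normalize_tilde(url):
    """Replace ~ with %7E and leave everything else untouched."""
    return url.replace("~", "%7E")


def missing_signature_params(url):
    """List required signature params absent from the URL's query string."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return [name for name in REQUIRED_PARAMS if name not in query]


# Illustrative URL only; real links come from the API response.
url = ("https://cdn.example.com/a.mp3?Policy=abc~def&Key-Pair-Id=K1"
       "&Signature=s~g&response-content-disposition=attachment")
print(normalize_tilde(url))
print(missing_signature_params(url))
```

A non-empty result from `missing_signature_params` suggests the link was truncated somewhere before reaching the user.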
When the user initiates a new speech generation request, you must run Phase 1 to submit a new task. Do not reuse an old task_id from memory.
For the in-flight task, follow Phases 2–3 and this section—do not wait for the user to ask. For an older task, query that task_id when the user asks (or poll if they want updates).
| Parameter | Required | Default | Description |
|-----------|----------|--------|-------------|
| --text | yes | - | Text to synthesize |
| --voice-id | yes | - | Voice ID; must get via --list-voices and guide user to choose |
| --emotion | yes | - | Emotion: joy, sad, neutral, angry, surprise, etc. Guide user to choose |
| --speed | no | 1 | Speaking rate multiplier |
| --list-voices | - | - | Get available voice list |
| --query | - | - | Query task status |
| --task-id | required for query | - | Task ID |
| --poll | no | - | Sync poll with --query |
| --max-wait | no | 120 | Max wait seconds |
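To show how the parameters in the table combine for a submission, here is a sketch that builds the argument list for the exec tool. `build_submit_command` is a hypothetical helper; the --no-wait and --json flags are taken from the submit example earlier in this document rather than from the table.

```python
def build_submit_command(text, voice_id, emotion, speed=1.0):
    """Build the Phase 1 submit command as an argv list (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text to synthesize must not be empty")
    if not voice_id or not emotion:
        # The skill requires the user to choose both; no defaults are applied.
        raise ValueError("voice_id and emotion must be chosen by the user")
    return [
        "python3", "scripts/text_to_audio_api.py",
        "--text", text,
        "--voice-id", voice_id,
        "--emotion", emotion,
        "--speed", str(speed),
        "--no-wait", "--json",
    ]
```

Building an argv list rather than a shell string avoids quoting problems when the text contains spaces or quotes.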
Before each speech generation, complete this interaction:
Run --list-voices, display the list, and have the user choose. Do not use the default voice.