skills/polymer-pay-x402engine-audio/SKILL.md
USE THIS SKILL WHEN: the user wants to generate speech audio from text (TTS) or transcribe audio files to text. Provides pay-per-use text-to-speech and transcription via x402engine through the Polymer Pay proxy.
npx skillsauth add polymerdao/pay-apis polymer-pay-x402engine-audio
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
x402engine provides pay-per-call text-to-speech and audio transcription endpoints. Convert text to speech using OpenAI or ElevenLabs voices, or transcribe audio files to text with speaker diarization. No separate x402engine API key is needed: the Polymer Pay proxy handles payment automatically, and your Polymer Pay API key is the only credential you supply.
All requests route through the Polymer Pay proxy. Include your Polymer Pay API key in every request:
{
"headers": {
"Content-Type": "application/json",
"x-polymer-pay-api-key": "{{POLYMER_PAY_API_KEY}}"
}
}
Base URL: https://pay.polymerlabs.org/proxy/https/x402engine.app
To get a Polymer Pay API key, sign up at https://my.pay.polymerlabs.org/dashboard/api-keys.
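As a rough illustration of how the base URL and the header fit together, here is a minimal Python sketch that uses only the standard library. The helper names `proxied_url` and `auth_headers` are hypothetical and not part of any Polymer Pay SDK, and reading the key from a `POLYMER_PAY_API_KEY` environment variable is an assumption made for this example.

```python
import os
from typing import Dict, Optional

BASE_URL = "https://pay.polymerlabs.org/proxy/https/x402engine.app"


def proxied_url(path: str) -> str:
    """Join an x402engine path such as '/api/tts/openai' onto the proxy base URL."""
    return BASE_URL + "/" + path.lstrip("/")


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return the headers that every request through the proxy carries.

    Assumption: when no key is passed explicitly, it is read from the
    POLYMER_PAY_API_KEY environment variable.
    """
    key = api_key or os.environ.get("POLYMER_PAY_API_KEY")
    if not key:
        raise ValueError("Polymer Pay API key is missing")
    return {
        "Content-Type": "application/json",
        "x-polymer-pay-api-key": key,
    }
```

Keeping the key out of source code and in the environment is a common convention, not something the proxy requires.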
Generate speech audio from text using OpenAI's TTS models.
Pricing: $0.01
{
"method": "POST",
"url": "https://pay.polymerlabs.org/proxy/https/x402engine.app/api/tts/openai",
"headers": {
"Content-Type": "application/json",
"x-polymer-pay-api-key": "{{POLYMER_PAY_API_KEY}}"
},
"body": {
"text": "Hello, this is a test of text-to-speech generation.",
"voice": "alloy"
}
}
Available voices: alloy, echo, fable, onyx, nova, shimmer
Response: Returns audio data (MP3 or other format). With curl, add -o output.mp3 to save it to a file.
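The request described above could be issued from Python roughly as follows. This is a sketch under stated assumptions: `build_openai_tts_request` and `save_speech` are hypothetical helper names, the voice check simply mirrors the list in this document, and the response body is assumed to be raw audio bytes that can be written straight to disk.

```python
import json
import urllib.request

TTS_OPENAI_URL = "https://pay.polymerlabs.org/proxy/https/x402engine.app/api/tts/openai"
# Voices listed in this document for the OpenAI endpoint.
OPENAI_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}


def build_openai_tts_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the OpenAI TTS endpoint."""
    if voice not in OPENAI_VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {sorted(OPENAI_VOICES)}")
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        TTS_OPENAI_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-polymer-pay-api-key": api_key,
        },
    )


def save_speech(request: urllib.request.Request, out_path: str) -> int:
    """Send the request and write the response bytes to out_path.

    Every call is billed, so sending is kept separate from building.
    Returns the number of bytes written.
    """
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return len(audio)
```

Because the audio may not always be MP3, it may be worth checking the response's Content-Type before choosing a file extension.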
Generate ultra-realistic speech using ElevenLabs voices.
Pricing: $0.02
{
"method": "POST",
"url": "https://pay.polymerlabs.org/proxy/https/x402engine.app/api/tts/elevenlabs",
"headers": {
"Content-Type": "application/json",
"x-polymer-pay-api-key": "{{POLYMER_PAY_API_KEY}}"
},
"body": {
"text": "Welcome to the future of AI-generated speech.",
"voice": "rachel"
}
}
Response: Returns audio data. ElevenLabs voices are more natural and expressive but cost twice as much per call ($0.02 versus $0.01 for OpenAI).
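To make the price difference between the two TTS endpoints concrete, the following sketch totals the cost of a batch of calls. It assumes the quoted prices are flat per-call fees that do not depend on text length, which this document does not state explicitly; `tts_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Per-call prices quoted in this document, in US dollars.
PRICE_PER_CALL = {
    "openai": Decimal("0.01"),
    "elevenlabs": Decimal("0.02"),
}


def tts_cost(provider: str, calls: int) -> Decimal:
    """Total cost of `calls` requests, assuming flat per-call pricing."""
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return PRICE_PER_CALL[provider] * calls


if __name__ == "__main__":
    for provider in PRICE_PER_CALL:
        # Prints "openai 2.50" and then "elevenlabs 5.00".
        print(provider, tts_cost(provider, 250))
```

`Decimal` is used instead of `float` so that cent amounts add up exactly.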
Transcribe audio files to text with speaker diarization.
Pricing: $0.10
{
"method": "POST",
"url": "https://pay.polymerlabs.org/proxy/https/x402engine.app/api/transcribe",
"headers": {
"Content-Type": "multipart/form-data",
"x-polymer-pay-api-key": "{{POLYMER_PAY_API_KEY}}"
},
"body": {
"file": "@recording.mp3"
}
}
Response: JSON with transcribed text, speaker diarization labels, and timestamps.
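The upload described above is a multipart/form-data request with a single `file` part. A minimal standard-library sketch of building one by hand is shown below; `build_transcribe_request` is a hypothetical helper, and the `audio/mpeg` default for the part's MIME type is an assumption suited to MP3 input. Note that when the body is assembled manually, the Content-Type header has to carry the boundary string.

```python
import uuid
import urllib.request

TRANSCRIBE_URL = "https://pay.polymerlabs.org/proxy/https/x402engine.app/api/transcribe"


def build_transcribe_request(
    filename: str, audio: bytes, api_key: str, mime_type: str = "audio/mpeg"
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload with one 'file' part."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        f"Content-Type: {mime_type}\r\n\r\n".encode(),
        audio,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        TRANSCRIBE_URL,
        data=body,
        method="POST",
        headers={
            # The boundary must appear in the header so the server can split the parts.
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "x-polymer-pay-api-key": api_key,
        },
    )
```

To send it, read `recording.mp3` in binary mode, pass the bytes to the helper, and open the resulting request with `urllib.request.urlopen`; the response can then be parsed with `json.load`. The exact field names in the returned JSON are not given in this document, so inspect a real response before relying on any of them.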
Tip: when calling the TTS endpoints with curl, use -o filename.mp3 to save output.