fal-audio/SKILL.md
Text-to-speech and speech-to-text using fal.ai audio models. Use when the user requests "Convert text to speech", "Transcribe audio", "Generate voice", "Speech to text", "TTS", "STT", or similar audio tasks.
npx skillsauth add abanoub-ashraf/manus-skills-import fal-audio
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Text-to-speech and speech-to-text using state-of-the-art audio models on fal.ai.
To discover the best and latest audio models, use the search API:
# Search for text-to-speech models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --category "text-to-speech"
# Search for speech-to-text models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --category "speech-to-text"
# Search for music generation models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "music generation"
Or use the search_models MCP tool with relevant keywords like "tts", "speech", "music".
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh [options]
Arguments:
--text - Text to convert to speech (required)
--model - TTS model (defaults to fal-ai/minimax/speech-2.8-turbo)
--voice - Voice ID or name (model-specific)
Examples:
# Basic TTS (fast, good quality)
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "Hello, welcome to the future of AI."
# High quality with MiniMax HD
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "This is premium quality speech." \
--model "fal-ai/minimax/speech-2.8-hd"
# Natural voices with ElevenLabs
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "Natural sounding voice generation" \
--model "fal-ai/elevenlabs/tts/eleven-v3"
# Multi-language TTS
bash /mnt/skills/user/fal-audio/scripts/text-to-speech.sh \
--text "Bonjour, bienvenue dans le futur." \
--model "fal-ai/chatterbox/text-to-speech/multilingual"
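The four examples above each trade speed, quality, or language coverage against one another. As a minimal sketch, the choice can be wrapped in a small lookup. The preset names (`fast`, `hd`, `natural`, `multilingual`) and the function name `tts_model_for` are hypothetical and not part of the skill; only the model IDs come from the examples above.

```shell
# Map a hypothetical preset name to one of the model IDs used in the examples.
# Prints the model ID on stdout; returns 1 for an unknown preset.
tts_model_for() {
  case "$1" in
    fast)         echo "fal-ai/minimax/speech-2.8-turbo" ;;
    hd)           echo "fal-ai/minimax/speech-2.8-hd" ;;
    natural)      echo "fal-ai/elevenlabs/tts/eleven-v3" ;;
    multilingual) echo "fal-ai/chatterbox/text-to-speech/multilingual" ;;
    *)
      echo "unknown preset: $1" >&2
      return 1
      ;;
  esac
}
```

The result could then be passed along as `--model "$(tts_model_for hd)"` when calling text-to-speech.sh.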
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh [options]
Arguments:
--audio-url - URL of audio file to transcribe (required)
--model - STT model (defaults to fal-ai/whisper)
--language - Language code (optional, auto-detected)
Examples:
# Transcribe with Whisper
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh \
--audio-url "https://example.com/audio.mp3"
# Transcribe with speaker diarization
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh \
--audio-url "https://example.com/meeting.mp3" \
--model "fal-ai/elevenlabs/speech-to-text/scribe-v2"
# Transcribe specific language
bash /mnt/skills/user/fal-audio/scripts/speech-to-text.sh \
--audio-url "https://example.com/spanish.mp3" \
--language "es"
Use the search_models MCP tool or search-models.sh to find the best current model, then call mcp__fal-ai__generate with the discovered modelId.
Generating speech...
Model: fal-ai/minimax/speech-2.8-turbo
Speech generated!
Audio URL: https://v3.fal.media/files/abc123/speech.mp3
Duration: 5.2s
Transcribing audio...
Model: fal-ai/whisper
Transcription complete!
Text: "Hello, this is the transcribed text from the audio file."
Duration: 12.5s
Language: en
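If another script needs a single value from this output, the "Label: value" lines can be picked out with a small helper. This is only a sketch that assumes the scripts print exactly the line format shown in the samples above; `get_field` is a hypothetical name, not something the skill provides.

```shell
# Read script output on stdin and print the value of the first line that
# starts with "<label>: ". Prints nothing if no such line exists.
get_field() {
  awk -v key="$1: " '
    index($0, key) == 1 {              # the line begins with "Label: "
      print substr($0, length(key) + 1)
      exit
    }
  '
}
```

For instance, piping the speech sample through `get_field "Audio URL"` would print only the URL line's value, and `get_field "Text"` would do the same for a transcription (including the surrounding quotes).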
Here's the generated speech:
[Download audio](https://v3.fal.media/files/.../speech.mp3)
• Duration: 5.2s | Model: Maya TTS
Here's the transcription:
"Hello, this is the transcribed text from the audio file."
• Duration: 12.5s | Language: English
Text-to-speech: search the text-to-speech category. Consider quality vs speed tradeoffs.
Music generation: search for music generation. Some models specialize in vocals, others in instrumentals.
Speech-to-text: search the speech-to-text category. Consider whether you need speaker diarization or multi-language support.
Error: Generated audio is empty
Check that your text is not empty and contains valid content.
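One way to catch this case before any request is made is to test the text locally. The sketch below treats whitespace-only input as empty, which is an assumption on my part about what counts as "valid content"; `has_content` is a hypothetical helper name.

```shell
# Return 0 if the argument contains at least one non-whitespace character,
# 1 otherwise (empty or whitespace-only).
has_content() {
  [ -n "$(printf '%s' "$1" | tr -d '[:space:]')" ]
}
```

A caller could guard the TTS call with `has_content "$text" || { echo "text is empty" >&2; exit 1; }`.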
Error: Audio format not supported
Supported formats: MP3, WAV, M4A, FLAC, OGG
Convert your audio to a supported format.
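A rough pre-check against the format list above can be done from the URL's file extension. This is a sketch under the assumption that the extension reflects the actual encoding, which is not always true for hosted files; `is_supported_audio` is a hypothetical name.

```shell
# Return 0 if the URL's file extension is one of the supported formats
# listed above (MP3, WAV, M4A, FLAC, OGG), 1 otherwise.
is_supported_audio() {
  local path ext
  path="${1%%[?#]*}"   # drop any query string or fragment
  ext="${path##*.}"    # keep the text after the last dot
  ext="$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    mp3|wav|m4a|flac|ogg) return 0 ;;
    *) return 1 ;;
  esac
}
```

A URL that fails this check is a candidate for conversion before it is passed to speech-to-text.sh with `--audio-url`.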
Warning: Could not detect language, defaulting to English
Specify the language explicitly with the --language option.