transcript-fixer/SKILL.md
Corrects speech-to-text transcription errors in meeting notes, lectures, and interviews using dictionary rules and AI. Learns patterns to build personalized correction databases. Use when working with transcripts containing ASR/STT errors, homophones, or Chinese/English mixed content requiring cleanup.
npx skillsauth add fernandezbaptiste/claude-code-skills transcript-fixer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Correct speech-to-text transcription errors through dictionary-based rules, AI-powered corrections, and automatic pattern detection. Build a personalized knowledge base that learns from each correction.
Python execution must use uv - never use system Python directly.
If uv is not installed:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows PowerShell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Recommended: Use Enhanced Wrapper (auto-detects API key, opens HTML diff):
# First time: Initialize database
uv run scripts/fix_transcription.py --init
# Process transcript with enhanced UX
uv run scripts/fix_transcript_enhanced.py input.md --output ./corrected
The enhanced wrapper automatically:
- ANTHROPIC_BASE_URL)

Alternative: Use Core Script Directly:
# 1. Set API key (if not auto-detected)
export GLM_API_KEY="<api-key>" # From https://open.bigmodel.cn/
# 2. Add common corrections (5-10 terms)
uv run scripts/fix_transcription.py --add "错误词" "正确词" --domain general
# 3. Run full correction pipeline
uv run scripts/fix_transcription.py --input meeting.md --stage 3
# 4. Review learned patterns after 3-5 runs
uv run scripts/fix_transcription.py --review-learned
Output files:
- *_stage1.md - Dictionary corrections applied
- *_stage2.md - AI corrections applied (final version)
- *_对比.html - Visual diff (open in browser for best experience)

Generate word-level diff (recommended for reviewing corrections):
uv run scripts/generate_word_diff.py original.md corrected.md output.html
This creates an HTML file showing word-by-word differences with clear highlighting:
- japanese 3 pro → 🟢 Gemini 3 Pro (complete word replacements)

Input transcript (meeting.md):
今天我们讨论了巨升智能的最新进展。
股价系统需要优化,目前性能不够好。
After Stage 1 (meeting_stage1.md):
今天我们讨论了具身智能的最新进展。 ← "巨升"→"具身" corrected
股价系统需要优化,目前性能不够好。 ← Unchanged (not in dictionary)
After Stage 2 (meeting_stage2.md):
今天我们讨论了具身智能的最新进展。
框架系统需要优化,目前性能不够好。 ← "股价"→"框架" corrected by AI
Learned pattern detected:
✓ Detected: "股价" → "框架" (confidence: 85%, count: 1)
Run --review-learned after 2 more occurrences to approve
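The Stage 1 behavior in the example above can be sketched in a few lines. This is a minimal illustration, assuming the dictionary stage is a plain substring replacement; `apply_dictionary` and the one-entry `corrections` dictionary are hypothetical and are not taken from the skill's scripts.

```python
def apply_dictionary(text: str, corrections: dict[str, str]) -> str:
    """Replace each known wrong term with its correction, longest terms first."""
    for wrong in sorted(corrections, key=len, reverse=True):
        text = text.replace(wrong, corrections[wrong])
    return text


# Hypothetical dictionary containing only the term corrected in Stage 1 above.
corrections = {"巨升": "具身"}
transcript = "今天我们讨论了巨升智能的最新进展。\n股价系统需要优化,目前性能不够好。"

stage1 = apply_dictionary(transcript, corrections)
print(stage1)
```

The second line is left untouched because "股价" is not in the dictionary, which is why the example relies on the AI stage to correct it.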
The three-stage pipeline stores corrections in ~/.transcript-fixer/corrections.db:
- uv run scripts/fix_transcription.py --init
- --add "错误词" "正确词" --domain <domain>
- --input file.md --stage 3
- --review-learned and --approve high-confidence suggestions

Stages: Dictionary (instant, free) → AI via GLM API (parallel) → Full pipeline
Domains: general, embodied_ai, finance, medical, or custom names including Chinese (e.g., 火星加速器, 具身智能)
Learning: Patterns appearing ≥3 times at ≥80% confidence move from AI to dictionary
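The learning rule above amounts to a two-part threshold check. The sketch below shows one way to express it; the `LearnedPattern` class and `ready_for_dictionary` function are hypothetical names for illustration, and only the two thresholds come from the text.

```python
from dataclasses import dataclass

MIN_COUNT = 3          # from the rule above: at least 3 occurrences
MIN_CONFIDENCE = 0.80  # from the rule above: at least 80% confidence


@dataclass
class LearnedPattern:
    """Hypothetical record of an AI correction seen across runs."""
    wrong: str
    correct: str
    count: int
    confidence: float


def ready_for_dictionary(pattern: LearnedPattern) -> bool:
    """True when the pattern meets both thresholds stated in the rule."""
    return pattern.count >= MIN_COUNT and pattern.confidence >= MIN_CONFIDENCE


first_sighting = LearnedPattern("股价", "框架", count=1, confidence=0.85)
after_three_runs = LearnedPattern("股价", "框架", count=3, confidence=0.85)

print(ready_for_dictionary(first_sighting))    # False
print(ready_for_dictionary(after_three_runs))  # True
```

This matches the worked example, where a pattern at 85% confidence with a count of 1 still needs two more occurrences before it can be approved through --review-learned.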
See references/workflow_guide.md for detailed workflows, references/script_parameters.md for complete CLI reference, and references/team_collaboration.md for collaboration patterns.
MUST save corrections after each fix. This is the skill's core value.
After fixing errors manually, immediately save to dictionary:
uv run scripts/fix_transcription.py --add "错误词" "正确词" --domain general
See references/iteration_workflow.md for complete iteration guide with checklist.
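When several manual fixes need saving at once, the --add command shown above can be generated per pair. The following is a rough sketch that uses only the --add and --domain flags documented here; `build_add_command` and `save_corrections` are hypothetical helpers, not part of the skill, and the default `dry_run=True` only builds the commands without running them.

```python
import subprocess

SCRIPT = "scripts/fix_transcription.py"


def build_add_command(wrong: str, correct: str, domain: str = "general") -> list[str]:
    """Build the argv for one `--add` call, mirroring the command shown above."""
    return ["uv", "run", SCRIPT, "--add", wrong, correct, "--domain", domain]


def save_corrections(pairs, domain: str = "general", dry_run: bool = True):
    """Build one command per (wrong, correct) pair; execute them unless dry_run."""
    commands = [build_add_command(wrong, correct, domain) for wrong, correct in pairs]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


cmds = save_corrections([("巨升", "具身"), ("错误词", "正确词")], domain="embodied_ai")
for cmd in cmds:
    print(" ".join(cmd))
```

Passing arguments as a list avoids shell quoting problems with Chinese terms or terms containing spaces.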
When the GLM API is unavailable (503, network issues), the script outputs a [CLAUDE_FALLBACK] marker.
Claude Code should then:
- --add

MUST read references/database_schema.md before any database operations.
Quick reference:
# View all corrections
sqlite3 ~/.transcript-fixer/corrections.db "SELECT * FROM active_corrections;"
# Check schema version
sqlite3 ~/.transcript-fixer/corrections.db "SELECT value FROM system_config WHERE key='schema_version';"
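The same schema-version lookup can be done from Python with the standard sqlite3 module. This is a sketch under the assumption that `system_config` has the `key` and `value` columns used in the query above; `schema_version` and `open_read_only` are hypothetical helper names, and opening the file read-only is a precaution chosen here, not something the skill prescribes.

```python
import sqlite3
from pathlib import Path
from typing import Optional

DB_PATH = Path.home() / ".transcript-fixer" / "corrections.db"


def open_read_only(path: Path = DB_PATH) -> sqlite3.Connection:
    """Open the database read-only so an inspection cannot modify it."""
    return sqlite3.connect(path.as_uri() + "?mode=ro", uri=True)


def schema_version(conn: sqlite3.Connection) -> Optional[str]:
    """Return the stored schema_version value, or None if the row is absent."""
    row = conn.execute(
        "SELECT value FROM system_config WHERE key = ?", ("schema_version",)
    ).fetchone()
    return row[0] if row else None
```

A typical call would be `schema_version(open_read_only())`, which raises `sqlite3.OperationalError` if the database has not been initialized yet.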
| Stage | Description | Speed | Cost |
|-------|-------------|-------|------|
| 1 | Dictionary only | Instant | Free |
| 2 | AI only | ~10s | API calls |
| 3 | Full pipeline | ~10s | API calls |
Scripts:
- ensure_deps.py - Initialize shared virtual environment (run once, optional)
- fix_transcript_enhanced.py - Enhanced wrapper (recommended for interactive use)
- fix_transcription.py - Core CLI (for automation)
- generate_word_diff.py - Generate word-level diff HTML for reviewing corrections
- examples/bulk_import.py - Bulk import example

References (load as needed):
- database_schema.md (read before DB operations), iteration_workflow.md (dictionary iteration best practices)
- installation_setup.md, glm_api_setup.md, workflow_guide.md
- quick_reference.md, script_parameters.md, dictionary_guide.md
- sql_queries.md, file_formats.md, architecture.md, best_practices.md
- troubleshooting.md, team_collaboration.md

Verify setup health with uv run scripts/fix_transcription.py --validate. Common issues:
- --init
- export GLM_API_KEY="<key>" (obtain from https://open.bigmodel.cn/)
- ~/.transcript-fixer/ ownership

See references/troubleshooting.md for detailed error resolution and references/glm_api_setup.md for API configuration.