.claude/skills/utf8-double-encoding-fix/SKILL.md
Fix UTF-8 double-encoding corruption where special characters like arrows (→, ↔) become garbled sequences like "âÂÂ" or "â†Â". Use when: (1) Non-ASCII characters display as mojibake after migration/serialization, (2) Arrows, emojis, or accented characters become Ã-prefixed garbage, (3) Content looks correct in source but corrupted after processing through gray-matter, YAML, or text pipelines. Covers detection via hex inspection and fix via latin-1 decode chain.
UTF-8 characters become garbled after passing through text processing pipelines.
For example, the arrow → (U+2192) becomes â or similar mojibake sequences.
This happens when UTF-8 bytes are incorrectly interpreted as Latin-1 (ISO-8859-1) and then re-encoded as UTF-8, sometimes multiple times.
Text shows âÂÂ, Ã©, Ã¼ instead of →, é, ü.
Check the raw bytes to understand the corruption level:
with open('corrupted-file.md', 'rb') as f:
    content = f.read()

# Find the corrupted section by searching for known text near it
pos = content.find(b'55 ')  # or other known text near the corruption
if pos == -1:
    raise SystemExit("Marker text not found; pick another search string")
print("Bytes:", content[pos:pos+20].hex())
print("As UTF-8:", content[pos:pos+20].decode('utf-8', errors='replace'))
Corruption patterns:
- c3 a2 c2 86 c2 92 = double-encoded → (one decode needed)
- c3 83 c2 a2 c3 82 c2 86 c3 82 c2 92 = triple-encoded → (two decodes needed)
The fix is to encode the text back to bytes as Latin-1, then decode those bytes as UTF-8, repeating until clean:
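files_to_fix = ['file1.md', 'file2.md']
MAX_PASSES = 3  # double- and triple-encoding need one and two passes

for filepath in files_to_fix:
    with open(filepath, 'rb') as f:
        content = f.read()
    text = content.decode('utf-8')

    # Undo one encoding layer per pass. Checking only for 'Ã' would miss a
    # double-encoded arrow, which shows up as 'â' plus two control characters.
    for _ in range(MAX_PASSES):
        try:
            repaired = text.encode('latin-1').decode('utf-8')
        except (UnicodeDecodeError, UnicodeEncodeError):
            break  # Can't decode further: the text is clean or mixed
        if repaired == text:
            break  # Pure ASCII, nothing to undo
        text = repaired

    # newline='' keeps the file's original line endings untouched
    with open(filepath, 'w', encoding='utf-8', newline='') as f:
        f.write(text)
    print(f"Fixed: {filepath}")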
Then verify the result:
with open('fixed-file.md', 'r', encoding='utf-8') as f:
    content = f.read()

# Check for the proper arrow character
if '→' in content:
    print("✓ Arrow character restored")
elif 'Ã' in content or 'â\x86\x92' in content:
    print("✗ Still corrupted - may need another decode pass")
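A whole-file Latin-1 round trip fails as soon as the file also contains correctly encoded characters above U+00FF, for example a healthy → next to a corrupted one. One possible workaround, sketched here under the assumption that the corruption went through plain Latin-1 (not Windows-1252), is to repair only runs of characters in the U+0080..U+00FF range. The helper name `repair_mojibake` and its `max_passes` parameter are hypothetical and not part of the original skill.

```python
import re

# Latin-1 mojibake of a multi-byte UTF-8 sequence always consists of two or
# more consecutive characters in the range U+0080..U+00FF.
_SUSPECT_RUN = re.compile(r'[\u0080-\u00ff]{2,}')


def repair_mojibake(text, max_passes=3):
    """Undo Latin-1 double-encoding run by run (hypothetical helper)."""

    def _fix_run(match):
        run = match.group(0)
        try:
            return run.encode('latin-1').decode('utf-8')
        except UnicodeDecodeError:
            return run  # Not valid UTF-8 underneath: leave the run alone

    for _ in range(max_passes):
        repaired = _SUSPECT_RUN.sub(_fix_run, text)
        if repaired == text:
            break  # Nothing left to undo
        text = repaired
    return text


double = '\u2192'.encode('utf-8').decode('latin-1')  # double-encoded arrow
print(repair_mojibake('ok \u2192 and 55 ' + double + ' 98'))
```

This is a heuristic: a legitimate pair such as "Ã©" that the author really meant would also be rewritten, and a corrupted run directly adjacent to a genuine Latin-1 character is left unrepaired, so reviewing the diff before committing is still advisable.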
After fixing:
- grep "→" should find the arrows (not grep "Ã")
Before (corrupted):
55 â 98 Lighthouse, system fonts
Blog LCP 5.6s â 3.1s (45% faster)
After (fixed):
55 → 98 Lighthouse, system fonts
Blog LCP 5.6s → 3.1s (45% faster)
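The encoding chain that causes this:
1. → is stored as UTF-8 bytes e2 86 92
2. Those bytes are misread as three Latin-1 characters: â, U+0086, U+0092 (tools that assume Windows-1252 show the last two as † and ’)
3. Those characters are re-encoded as UTF-8: c3 a2 c2 86 c2 92
4. The result shows as â, followed by two invisible control characters, when displayed
If this happens twice (triple-encoding), you need two decode passes.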
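The chain above can be reproduced in a few lines, which is a quick way to confirm the byte patterns before touching real files. This is a minimal sketch; the variable names are illustrative, and `bytes.hex(' ')` with a separator assumes Python 3.8 or later.

```python
arrow = '\u2192'  # →

original_bytes = arrow.encode('utf-8')
misread = original_bytes.decode('latin-1')   # three characters: â, U+0086, U+0092
double_encoded = misread.encode('utf-8')
triple_encoded = double_encoded.decode('latin-1').encode('utf-8')

print(original_bytes.hex(' '))   # e2 86 92
print(double_encoded.hex(' '))   # c3 a2 c2 86 c2 92
print(triple_encoded.hex(' '))   # c3 83 c2 a2 c3 82 c2 86 c3 82 c2 92

# Reversing the chain: each pass removes exactly one encoding layer.
once = triple_encoded.decode('utf-8').encode('latin-1')
twice = once.decode('utf-8').encode('latin-1')
print(once == double_encoded)         # True
print(twice.decode('utf-8') == arrow)  # True
```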
To prevent this:
- Specify charset=utf-8 in the Content-Type header when sending to the GitHub API, for example Content-Type: application/json; charset=utf-8
- Pass encoding='utf-8' when reading/writing files in Python
- In Node.js, call fs.readFileSync(path, 'utf-8') explicitly
Related symptoms:
- If you see \xe2\x86\x92 (escaped bytes), it's a different issue: the file was written in binary mode or the bytes weren't decoded at all
- If you see ? or �, the data was actually lost (replacement character) and may not be recoverable
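To illustrate the advice about passing encoding='utf-8' in Python, the sketch below writes a line to a temporary file and reads it back twice, once with the right codec and once with Latin-1. The file name `sample.md` is a placeholder; the point is only that the wrong codec on read reproduces the first step of the corruption.

```python
import os
import tempfile

text = '55 \u2192 98 Lighthouse'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.md')  # placeholder file name

    # Explicit encoding on write: never rely on the platform default.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    with open(path, 'r', encoding='utf-8') as f:
        read_correctly = f.read()

    # Reading the same bytes as Latin-1 is the first step of the corruption.
    with open(path, 'r', encoding='latin-1') as f:
        read_wrongly = f.read()

print(read_correctly == text)  # True
print(read_wrongly == text)    # False
# The misread text is still recoverable because no bytes were lost.
print(read_wrongly.encode('latin-1').decode('utf-8') == text)  # True
```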