skills/unicode-security-anti-pattern/SKILL.md
Security anti-pattern for Unicode-related vulnerabilities (CWE-176). Use when generating or reviewing code that handles usernames, displays text, validates input, or compares strings. Detects confusable characters, normalization issues, and bidirectional text attacks.
npx skillsauth add igbuend/grimbard unicode-security-anti-pattern
Severity: Medium
Applications fail to handle Unicode character representation variants, enabling username spoofing, phishing, and validation bypasses through:
exe.pdf as fdp.exe).
The anti-pattern is processing Unicode strings without normalization, confusable detection, or control character stripping.
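To make the three failure modes concrete, the short Python sketch below compares strings that render identically (or nearly so) but differ at the code point level. The sample strings are illustrative assumptions, not taken from any real system.

```python
import unicodedata

# Confusable characters: the first letter of `cyrillic` is CYRILLIC SMALL LETTER A (U+0430).
latin = "admin"
cyrillic = "\u0430dmin"
print(latin == cyrillic)  # False, although both render as "admin"

# Normalization: the same visible text in two different encodings.
composed = "caf\u00e9"     # "é" as a single precomposed code point
decomposed = "cafe\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(composed == decomposed)  # False
print(
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)  # True once both sides are normalized

# Control characters: a trailing ZERO WIDTH SPACE is invisible but still counted.
padded = "admin\u200b"
print(len(padded), padded == latin)  # 6 False
```

Each comparison succeeds only after the corresponding defence (confusable detection, normalization, or control character stripping) has been applied to both sides.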
# VULNERABLE: Comparing strings without normalization or confusable detection.
def authenticate_user(provided_username, password):
    # This example assumes `fetch_user_from_db` looks up the exact string stored in the DB.
    stored_user = fetch_user_from_db(provided_username)
    if stored_user and stored_user.password == hash_password(password):
        # The 'admin' account exists.
        # An attacker registers an account with username "аdmin" (Cyrillic 'а', U+0430).
        # The database stores "аdmin" as a separate user.
        # When the attacker logs in as "аdmin", `provided_username` is "аdmin"
        # and `fetch_user_from_db` finds the attacker's own "аdmin" user.
        # If another part of the application converts "аdmin" to "admin"
        # before a check such as `if username == "admin"`,
        # the two code paths disagree and "аdmin" may bypass that check.
        # Or, more directly, if `fetch_user_from_db` is case- or normalization-insensitive:
        #   Attacker registers "Admin" (Latin capital A).
        #   Legitimate user is "admin" (Latin small a).
        #   Both may resolve to the same internal user, or one user can spoof another.
        return True
    return False

# Another example: allowing confusable characters in domain names.
# An attacker registers "pаypal.com" (with Cyrillic 'а').
# This looks identical to "paypal.com" (with Latin 'a'), enabling phishing.
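One way to see that the two domains above are different is to convert each to its ASCII-compatible (Punycode) form. The sketch below uses the `idna` codec built into Python, which implements the older IDNA 2003 rules; the helper name `to_ascii_hostname` is a hypothetical one chosen for this example.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII-compatible form of a hostname (built-in IDNA 2003 codec)."""
    return hostname.encode("idna").decode("ascii")


genuine = "paypal.com"
spoofed = "p\u0430ypal.com"  # second letter is CYRILLIC SMALL LETTER A (U+0430)

print(to_ascii_hostname(genuine))  # unchanged: the name is already pure ASCII
print(to_ascii_hostname(spoofed))  # the spoofed label is rewritten with an "xn--" prefix
```

Showing the `xn--` form in logs or security-sensitive UI makes the spoofed name stand out, at the cost of also making legitimate internationalized names harder to read.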
# SECURE: Normalize, filter, and compare Unicode strings consistently.
import unicodedata
import re

def normalize_and_sanitize_username(username):
    # 1. Normalize to canonical form (NFC) for consistent comparison.
    #    NFC composes base characters and combining marks into precomposed
    #    equivalents where they exist.
    normalized = unicodedata.normalize('NFC', username)

    # 2. Strip zero-width and bidirectional control characters.
    #    Prevents display manipulation and hidden content.
    sanitized = re.sub(r'[\u200B-\u200F\u202A-\u202E\u2066-\u2069]', '', normalized)

    # 3. Apply confusable detection (recommended for critical identifiers).
    #    Convert to "skeleton" form or use a confusables database.
    #    (Implementation depends on specific libraries/algorithms.)

    # 4. Enforce an allowlist of permitted characters.
    #    For usernames, restrict to ASCII alphanumerics and a few symbols.
    #    `fullmatch` already anchors the pattern at both ends.
    if not re.fullmatch(r'[a-zA-Z0-9_.-]+', sanitized):
        raise ValueError("Username contains disallowed characters.")
    return sanitized

def authenticate_user_secure(provided_username, password):
    # Normalize and sanitize every username the same way before storage and before comparison.
    sanitized_username = normalize_and_sanitize_username(provided_username)
    stored_user = fetch_user_from_db(sanitized_username)
    if stored_user and stored_user.password == hash_password(password):
        return True
    return False

# When displaying usernames or domain names, consider using Punycode for
# internationalized domain names (IDNs) to make spoofing more obvious to users.
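Step 3 of the secure example leaves confusable detection unimplemented. A minimal sketch of the "skeleton" idea follows, assuming a tiny hand-picked mapping table. `CONFUSABLE_MAP`, `skeleton`, and `is_confusable_with` are hypothetical names for illustration, and a real deployment would load the full Unicode confusables data instead of this sample.

```python
import unicodedata

# Hypothetical, hand-picked sample of look-alike characters.
# This is NOT the full Unicode confusables data set.
CONFUSABLE_MAP = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}


def skeleton(text: str) -> str:
    """Decompose, replace each known look-alike with its prototype, decompose again."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(CONFUSABLE_MAP.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def is_confusable_with(candidate: str, existing_names: list[str]) -> bool:
    """True if `candidate` differs from an existing name but shares its skeleton."""
    target = skeleton(candidate)
    return any(
        name != candidate and skeleton(name) == target
        for name in existing_names
    )


existing = ["admin", "alice"]
print(is_confusable_with("\u0430dmin", existing))  # True: Cyrillic look-alike of "admin"
print(is_confusable_with("bob", existing))         # False
```

A check like this matters mainly when the allowlist in step 4 is wider than ASCII; with the ASCII-only allowlist shown above, the Cyrillic look-alike is already rejected outright.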
\u200B) into inputs to see if they bypass length checks or string comparisons.
\u200B), bidirectional overrides (\u202E), and non-printing characters.
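The checks above can be sketched as a small test helper that reports every invisible format character in a string. The sketch assumes that Unicode general category `Cf` (format) is a reasonable proxy for "zero-width or bidirectional control"; `find_invisible_characters` and the sample payloads are hypothetical.

```python
import unicodedata


def find_invisible_characters(text: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, name) for each format-category (Cf) character."""
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


# A zero-width space changes length and equality without changing what is displayed.
payload = "admin\u200b"
print(len(payload), payload == "admin")    # 6 False
print(find_invisible_characters(payload))  # [(5, 'U+200B', 'ZERO WIDTH SPACE')]

# A right-to-left override hidden inside a file name.
filename = "report\u202efdp.exe"
print(find_invisible_characters(filename))  # [(6, 'U+202E', 'RIGHT-TO-LEFT OVERRIDE')]
```

Feeding payloads like these to registration, login, and file-upload endpoints shows whether length checks and comparisons run before or after such characters are stripped.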