
igbuend/unicode-security-anti-pattern

Name: unicode-security-anti-pattern
Author: igbuend

skills/unicode-security-anti-pattern/SKILL.md

npx skillsauth add igbuend/grimbard unicode-security-anti-pattern


Unicode Security Anti-Pattern

Severity: Medium

Summary

Applications fail to handle Unicode character representation variants, enabling username spoofing, phishing, and validation bypasses through:

Confusable Characters (Homoglyphs): Identical-looking characters from different scripts (Latin 'a' vs. Cyrillic 'а').
Normalization Issues: Multiple code point sequences for the same character (precomposed vs. base + combining accent), which compare as unequal unless normalized.
Zero-Width Characters: Non-printing characters hiding malicious content or altering string lengths.
Bidirectional Text Overrides: Control characters reordering display (a file whose name ends in U+202E followed by "fdp.exe" is displayed as ending in "exe.pdf", hiding the real .exe extension).
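The four variants above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the sample strings (`"admin"`, `"café"`, `"invoice"`) are illustrative values chosen here, not taken from a real incident.

```python
import unicodedata

# Confusables: visually identical, but different code points.
latin_admin = "admin"
spoofed_admin = "\u0430dmin"           # U+0430 CYRILLIC SMALL LETTER A
print(latin_admin == spoofed_admin)    # False

# Normalization: precomposed vs. base letter + combining accent.
precomposed = "caf\u00e9"              # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"              # 'e' + U+0301 COMBINING ACUTE ACCENT
print(precomposed == decomposed)       # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True

# Zero-width characters: invisible when rendered, but counted in the length.
padded = "ad\u200bmin"                 # U+200B ZERO WIDTH SPACE
print(len(padded))                     # 6

# Bidirectional override: the stored name really ends in ".exe",
# but a bidi-aware renderer shows the tail reversed, as "exe.pdf".
filename = "invoice\u202efdp.exe"      # U+202E RIGHT-TO-LEFT OVERRIDE
print(filename.endswith(".exe"))       # True
```

Each comparison that prints `False` is a place where two strings a human would read as the same are treated as different by the program, or the reverse.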

The Anti-Pattern

The anti-pattern is processing Unicode strings without normalization, confusable detection, or control character stripping.

BAD Code Example

# VULNERABLE: Looks up and compares usernames without normalization or confusable detection.
# `fetch_user_from_db` and `hash_password` are application helpers defined elsewhere.
def authenticate_user(provided_username, password):
    # The lookup uses the raw input string exactly as the client sent it.
    stored_user = fetch_user_from_db(provided_username)

    if stored_user and stored_user.password == hash_password(password):
        # Scenario 1 (homoglyph spoofing): a legitimate "admin" account exists.
        # An attacker registers "аdmin" (first letter is Cyrillic 'а', U+0430).
        # The two are distinct strings, so registration and login both succeed,
        # yet every user and reviewer sees what looks like "admin".
        #
        # Scenario 2 (inconsistent processing): if another layer folds or
        # normalizes strings differently from this code (for example a
        # case-insensitive or normalization-insensitive database lookup),
        # two distinct usernames may resolve to the same internal user, or a
        # check such as `if username == "admin"` may be bypassed.
        return True
    return False

# Another example: Allowing confusable characters in domain names.
# An attacker registers "pаypal.com" (with Cyrillic 'а').
# It looks identical to "paypal.com" (with Latin 'a'), enabling phishing.

GOOD Code Example

# SECURE: Normalize, filter, and compare Unicode strings consistently.
# `fetch_user_from_db` and `hash_password` are application helpers defined elsewhere.
import re
import unicodedata

# Zero-width characters and directional marks (U+200B-U+200F), bidirectional
# embeddings and overrides (U+202A-U+202E), word joiner (U+2060), bidirectional
# isolates (U+2066-U+2069), and zero-width no-break space / BOM (U+FEFF).
_INVISIBLE_CHARS = re.compile(r'[\u200B-\u200F\u202A-\u202E\u2060\u2066-\u2069\uFEFF]')
_ALLOWED_USERNAME = re.compile(r'[a-zA-Z0-9_.-]+')

def normalize_and_sanitize_username(username):
    # 1. Normalize to canonical composed form (NFC) so that equivalent sequences
    #    (precomposed vs. base + combining mark) compare equal.
    normalized = unicodedata.normalize('NFC', username)

    # 2. Strip zero-width and bidirectional control characters.
    #    Prevents display manipulation and hidden content.
    sanitized = _INVISIBLE_CHARS.sub('', normalized)

    # 3. Apply confusable detection (recommended for critical identifiers).
    #    Convert to "skeleton" form or use a confusables database.
    #    (Implementation depends on specific libraries/algorithms.)

    # 4. Enforce an allowlist of permitted characters.
    #    For usernames, restrict to ASCII alphanumerics and a few symbols.
    #    `fullmatch` anchors both ends, so no `^` or `$` is needed.
    if not _ALLOWED_USERNAME.fullmatch(sanitized):
        raise ValueError("Username contains disallowed characters.")

    return sanitized

def authenticate_user_secure(provided_username, password):
    # Apply the same normalization and sanitization at registration (before
    # storage) and at login (before lookup), so both sides always agree.
    sanitized_username = normalize_and_sanitize_username(provided_username)
    stored_user = fetch_user_from_db(sanitized_username)

    if stored_user and stored_user.password == hash_password(password):
        return True
    return False

# When displaying internationalized domain names (IDNs), consider showing the
# Punycode (xn--...) form so that spoofed look-alikes are obvious to users.
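The Punycode suggestion can be sketched with Python's built-in `idna` codec. The helper name `to_display_form` is hypothetical, and the built-in codec implements the older IDNA 2003 rules, so treat this as an illustration of the idea rather than a complete IDN display policy.

```python
def to_display_form(domain: str) -> str:
    """Return the ASCII (Punycode) form of a domain name for display.

    Labels that contain non-ASCII characters come back with an "xn--"
    prefix; pure-ASCII labels are returned unchanged.
    """
    return domain.encode("idna").decode("ascii")


genuine = to_display_form("paypal.com")
spoofed = to_display_form("p\u0430ypal.com")   # U+0430 CYRILLIC SMALL LETTER A

print(genuine)                      # paypal.com
print(spoofed.startswith("xn--"))   # True
```

Rendered as Unicode the two domains look the same; rendered in their ASCII form, the spoofed one is visibly different from the genuine name.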

Detection

Review string comparisons: Look for any comparisons of user-controlled strings, especially for authentication, authorization, or access control decisions.
Check input processing: See how input strings are handled from reception to storage and display. Are normalization steps applied consistently?
Test with confusable characters: Try registering usernames or domains that use homoglyphs (e.g., Cyrillic 'a' instead of Latin 'a') for common reserved names (admin, root) or well-known brands (paypal, apple).
Test with zero-width characters: Insert zero-width characters (e.g., \u200B) into inputs to see if they bypass length checks or string comparisons.
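The manual checks above can be partly automated. The following is a rough sketch with hypothetical helper names (`audit_string`, `zero_width_variants`); it flags strings worth a closer look and generates zero-width test inputs, and it assumes the same control-character ranges used in the GOOD example.

```python
import unicodedata

# Code points treated as invisible or bidirectional controls
# (U+200B-U+200F, U+202A-U+202E, U+2066-U+2069).
INVISIBLE = (
    set(range(0x200B, 0x2010))
    | set(range(0x202A, 0x202F))
    | set(range(0x2066, 0x206A))
)


def audit_string(value):
    """Return a list of human-readable findings for a user-controlled string."""
    findings = []
    if unicodedata.normalize("NFC", value) != value:
        findings.append("not NFC-normalized")
    for ch in value:
        code = ord(ch)
        if code in INVISIBLE:
            findings.append(f"invisible/bidi control U+{code:04X}")
        elif code > 0x7F:
            name = unicodedata.name(ch, "UNNAMED")
            findings.append(f"non-ASCII character U+{code:04X} ({name})")
    return findings


def zero_width_variants(name):
    """Build test inputs with U+200B inserted at each interior position."""
    return [name[:i] + "\u200b" + name[i:] for i in range(1, len(name))]


print(audit_string("admin"))         # []
print(audit_string("\u0430dmin"))    # one finding naming the Cyrillic letter
print(len(zero_width_variants("root")))  # 3
```

Feeding the generated variants into registration and login forms shows whether length checks or comparisons treat them as distinct from the reserved name.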

Prevention

[ ] Normalize all Unicode input: Convert all strings to NFC (Normalization Form C) before validation, storage, or comparison.
[ ] Strip dangerous control characters: Remove zero-width spaces (\u200B), bidirectional overrides (\u202E), and other non-printing characters.
[ ] Implement confusable detection: For critical identifiers (usernames, domains), check for homoglyphs using skeleton forms or confusables databases.
[ ] Restrict character sets: For sensitive identifiers, limit to well-defined character sets (ASCII alphanumeric preferred).
[ ] Apply consistently: Use identical Unicode processing (normalization, stripping, filtering) throughout the application (input, storage, comparison, display).
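For the confusable-detection item, a minimal sketch of the idea follows. The mapping table is a tiny hand-written subset chosen for illustration, not the Unicode confusables data, and `skeleton` here is a simplified stand-in for the skeleton algorithm; a real deployment would load the full data set or use a maintained library. All function names are hypothetical.

```python
import unicodedata

# Illustrative subset only: a few Cyrillic letters that resemble Latin ones.
CONFUSABLE_TO_LATIN = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
}


def skeleton(value):
    """Map look-alike characters to a common prototype (simplified)."""
    decomposed = unicodedata.normalize("NFD", value)
    mapped = "".join(CONFUSABLE_TO_LATIN.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)


def is_confusable_with(candidate, reserved):
    """True if the strings differ but collapse to the same skeleton."""
    return candidate != reserved and skeleton(candidate) == skeleton(reserved)


def scripts_used(value):
    """Approximate each letter's script by the first word of its Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in value if ch.isalpha()}


print(is_confusable_with("\u0430dmin", "admin"))  # True
print(is_confusable_with("admin", "admin"))       # False
print(sorted(scripts_used("p\u0430ypal")))        # ['CYRILLIC', 'LATIN']
```

Comparing skeletons at registration time catches look-alikes of reserved names, while the mixed-script check is a cheaper heuristic that flags identifiers combining, say, Latin and Cyrillic letters.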

Related Security Patterns & Anti-Patterns

Encoding Bypass Anti-Pattern: Unicode issues are a specific type of encoding manipulation that can bypass security filters.
Missing Input Validation Anti-Pattern: Failure to handle Unicode correctly is a form of improper input validation.
Cross-Site Scripting (XSS) Anti-Pattern: Malicious Unicode characters can sometimes be used to bypass XSS filters.

References

OWASP Top 10 A05:2025 - Injection
OWASP GenAI LLM05:2025 - Improper Output Handling
OWASP API Security API8:2023 - Security Misconfiguration
CWE-176: Improper Handling of Unicode Encoding
CAPEC-71: Using Unicode Encoding to Bypass Validation Logic
Unicode Security Considerations
Unicode Confusables
Source: sec-context


Security anti-pattern for Unicode-related vulnerabilities (CWE-176). Use when generating or reviewing code that handles usernames, displays text, validates input, or compares strings. Detects confusable characters, normalization issues, and bidirectional text attacks.

4 stars

development

Updated Apr 5, 2026


npx skillsauth add igbuend/grimbard unicode-security-anti-pattern

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Status   Scanner          Description                                      %
Clean    Trivy            Container and dependency vulnerability scanner   95%
Clean    Semgrep          Static code analysis for vulnerabilities         95%
Clean    mcp-scan (Snyk)  Model Context Protocol security validation       95%
Skipped  Snyk (dep)       Open source security scanning                    50%
Skipped  Socket.dev       Supply chain security analysis                   50%
Skipped  VirusTotal       Multi-engine malware detection                   50%
Skipped  CrowdStrike      Advanced threat intelligence                     50%
Skipped  OSV-Scanner      Open Source Vulnerability database check         50%
Skipped  OWASP Dep-Check                                                   50%

Last scanned: Apr 5, 2026, 8:21 PM | 37.1s | 1 file scanned


Related Skills

igbuend/xss-anti-pattern
development
Verified | Trusted | Community
Security anti-pattern for Cross-Site Scripting vulnerabilities (CWE-79). Use when generating or reviewing code that renders HTML, handles user input in web pages, uses innerHTML/document.write, or builds dynamic web content. Covers Reflected, Stored, and DOM-based XSS. AI code has 86% XSS failure rate.
4 | SKILL.md | Updated Apr 5, 2026

igbuend/xpath-injection-anti-pattern
development
Verified | Trusted | Community
Security anti-pattern for XPath injection vulnerabilities (CWE-643). Use when generating or reviewing code that queries XML documents, constructs XPath expressions, or handles user input in XML operations. Detects unescaped quotes and special characters in XPath queries.
4 | SKILL.md | Updated Apr 5, 2026

igbuend/weak-password-hashing-anti-pattern
development
Verified | Trusted | Community
Security anti-pattern for weak password hashing (CWE-327, CWE-759). Use when generating or reviewing code that stores or verifies user passwords. Detects use of MD5, SHA1, SHA256 without salt, or missing password hashing entirely. Recommends bcrypt, Argon2, or scrypt.
4 | SKILL.md | Updated Apr 5, 2026

igbuend/weak-encryption-anti-pattern
development
Verified | Trusted | Community
Security anti-pattern for weak encryption (CWE-326, CWE-327). Use when generating or reviewing code that encrypts data, handles encryption keys, or uses cryptographic modes. Detects DES, ECB mode, static IVs, and custom crypto implementations.
4 | SKILL.md | Updated Apr 5, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/igbuend/grimbard.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r grimbard/skills/unicode-security-anti-pattern ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

igbuend/grimbard

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT