skills/encoding-bypass-anti-pattern/SKILL.md
Security anti-pattern for encoding bypass vulnerabilities (CWE-838). Use when generating or reviewing code that handles URL encoding, Unicode normalization, or character set conversions before security validation. Detects validation before normalization and double-encoding issues.
Severity: High
Encoding bypass evades security checks by sending the same payload in an alternate encoding. It occurs when validation happens before decoding or normalization: the encoded payload looks safe to the check but becomes malicious after later processing. It bypasses WAFs and input filters, enabling XSS, SQL injection, and path traversal.
Flawed order of operations: validate, then decode/normalize. Security checks run on the encoded data, but the application later uses the decoded version, re-introducing the vulnerability.
# VULNERABLE: Validation happens before Unicode normalization.
import unicodedata

def is_safe_username(username):
    # This check is flawed because it doesn't account for Unicode variants.
    if '<' in username or '>' in username:
        return False
    return True

def create_user_profile(username):
    if not is_safe_username(username):
        raise ValueError("Invalid characters in username.")

    # The application later normalizes the username for display or storage.
    # The full-width less-than sign '＜' (U+FF1C) was not caught by the check.
    # NFKC normalization maps it to the standard '<' (U+003C), enabling XSS.
    normalized_username = unicodedata.normalize('NFKC', username)

    # This will render the malicious script tag.
    return f"<div>Welcome, {normalized_username}</div>"

# Attacker's input: '＜script＞alert(1)＜/script＞'
# is_safe_username returns True.
# The normalized output becomes '<div>Welcome, <script>alert(1)</script></div>'
# SECURE: Normalize then validate.
import unicodedata

def is_safe_username(username):
    # This check is now effective because it runs on the canonical form of the input.
    if '<' in username or '>' in username:
        return False
    return True

def create_user_profile(username):
    # First, normalize the input to its canonical form.
    normalized_username = unicodedata.normalize('NFKC', username)

    # Then, perform the security validation on the normalized data.
    if not is_safe_username(normalized_username):
        raise ValueError("Invalid characters in username.")

    # Now it's safe to use the normalized username.
    return f"<div>Welcome, {normalized_username}</div>"
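The same ordering applies when input may be percent-encoded more than once. The following is a minimal sketch rather than a definitive implementation: `canonicalize`, `validate_username`, and the round limit `MAX_ROUNDS` are hypothetical names and values chosen for illustration. It decodes and normalizes repeatedly until the value stops changing, and only then validates.

```python
import unicodedata
from urllib.parse import unquote

MAX_ROUNDS = 5  # Hypothetical limit; deeper nesting is treated as hostile.

def canonicalize(value: str) -> str:
    """Percent-decode and NFKC-normalize until the value no longer changes."""
    for _ in range(MAX_ROUNDS):
        candidate = unicodedata.normalize("NFKC", unquote(value))
        if candidate == value:
            return value  # Stable: no further decoding step changes it.
        value = candidate
    raise ValueError("Input is still changing after repeated decoding.")

def is_safe_username(username: str) -> bool:
    # Same character check as above, applied to the canonical form only.
    return "<" not in username and ">" not in username

def validate_username(raw: str) -> str:
    canonical = canonicalize(raw)
    if not is_safe_username(canonical):
        raise ValueError("Invalid characters in username.")
    return canonical
```

One design trade-off of this sketch: a legitimate value containing a literal percent sequence is decoded as well, so it suits fields where such sequences are never expected.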
BAD:
// VULNERABLE: Validation runs before the final URL decoding step (path traversal)
const express = require('express');
const path = require('path');
const app = express();

app.get('/file/:filename', (req, res) => {
  // Express has already percent-decoded req.params once.
  const filename = req.params.filename;

  // Check for path traversal - but a double-encoded value is still encoded here
  if (filename.includes('..')) {
    return res.status(400).send('Invalid filename');
  }

  // A second decode runs AFTER the check (often hidden in a helper or proxy).
  // Attack: GET /file/%252e%252e%252f%252e%252e%252fetc%252fpasswd
  // Express decodes %25 to %, so filename = "%2e%2e%2f%2e%2e%2fetc%2fpasswd" and passes the check
  // After this decode: "../../etc/passwd"
  const decoded = decodeURIComponent(filename);
  const filePath = path.join('/uploads', decoded);
  res.sendFile(filePath);
});

// Double encoding: %252F becomes %2F, then becomes /
GOOD:
// SECURE: Decode fully, normalize, then validate
const express = require('express');
const path = require('path');
const app = express();

app.get('/file/:filename', (req, res) => {
  // Express already decoded once; decode again to catch double encoding
  let filename;
  try {
    filename = decodeURIComponent(req.params.filename);
  } catch (err) {
    // decodeURIComponent throws URIError on malformed sequences
    return res.status(400).send('Invalid filename');
  }

  // Normalize to canonical form
  filename = path.normalize(filename);

  // Now validate the normalized path
  if (filename.includes('..') || path.isAbsolute(filename)) {
    return res.status(400).send('Invalid filename');
  }

  // Safe to use
  const filePath = path.join('/uploads', filename);
  res.sendFile(filePath);
});
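A substring check for `..` can be complemented by a containment check on the resolved path. The sketch below illustrates that idea in Python (used here only for illustration, not as a port of the Express handler). `UPLOAD_ROOT` and `resolve_upload` are hypothetical names, and the `/uploads` directory is assumed from the example above.

```python
from pathlib import Path
from urllib.parse import unquote

UPLOAD_ROOT = Path("/uploads")  # Assumed upload directory.

def resolve_upload(name: str, root: Path = UPLOAD_ROOT) -> Path:
    """Decode the name, resolve it under root, and reject anything outside root."""
    root = root.resolve()
    # Decode first so the check sees the string the file system will see.
    candidate = (root / unquote(name)).resolve()
    # After resolution '..' segments are collapsed; root must be an ancestor.
    if root not in candidate.parents:
        raise ValueError("Path escapes the upload directory.")
    return candidate
```

With this shape the decision depends on where the path actually points, not on which characters the input happens to contain, so a new encoding variant that yields `../` is rejected without a new string rule.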
BAD:
// VULNERABLE: SQL injection via URL decoding bypass
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.sql.*;

// Assumes an open java.sql.Connection field named `connection`.
public void searchUser(String encodedQuery) throws SQLException {
    // Validate before decoding
    if (encodedQuery.contains("'") || encodedQuery.contains("--")) {
        throw new SecurityException("Invalid characters");
    }

    // Decode after validation (the Charset overload requires Java 10+)
    String query = URLDecoder.decode(encodedQuery, StandardCharsets.UTF_8);

    // Attack: encodedQuery = "admin%27%20OR%20%271%27%3D%271"
    // After decode: "admin' OR '1'='1" - bypasses the check
    String sql = "SELECT * FROM users WHERE name = '" + query + "'";
    Statement stmt = connection.createStatement();
    ResultSet rs = stmt.executeQuery(sql);
}
GOOD:
// SECURE: Decode then validate (but use parameterized queries)
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.sql.*;
import java.util.regex.Pattern;

// Assumes an open java.sql.Connection field named `connection`.
public void searchUser(String encodedQuery) throws SQLException {
    // Decode to canonical form first (the Charset overload requires Java 10+)
    String query = URLDecoder.decode(encodedQuery, StandardCharsets.UTF_8);

    // Validate the decoded form
    if (!Pattern.matches("^[a-zA-Z0-9_]+$", query)) {
        throw new SecurityException("Invalid characters");
    }

    // Use parameterized query (best practice)
    String sql = "SELECT * FROM users WHERE name = ?";
    PreparedStatement stmt = connection.prepareStatement(sql);
    stmt.setString(1, query);
    ResultSet rs = stmt.executeQuery();
}
Python:
- unicodedata.normalize()
- urllib.parse.unquote()
- html.unescape()
JavaScript/Node.js:
- decodeURIComponent()
- path.normalize()
- Buffer.from(input, 'base64')
Java:
- URLDecoder.decode()
- Normalizer.normalize()
- StringEscapeUtils.unescapeHtml()
- Paths.get().normalize()
PHP:
- urldecode()
- html_entity_decode()
- realpath()
Search Patterns:
normalize\(|decode\(|unescape\(|URLDecoder|decodeURIComponent
Common Encoding Bypass Techniques:
- %3c for <, %2e%2e%2f for ../
- %253c for < (decoded twice)
- ＜ (U+FF1C) for <
- &lt; or &#60; for <
- \u003c for <
- %u003c or %c0%bc for <
- ..%2f, ..%5c, %2e%2e/
Manual Testing:
- %3cscript%3e, ..%2f..%2f
- %253cscript%253e, ..%252f..%252f
- ＜script＞, ..／
- &lt;script&gt;, &#60;script&#62;
- %u003cscript%u003e
Automated Testing:
Example Test:
# Test that validation occurs after normalization.
# search_user and SecurityException stand in for a Python port of the Java
# searchUser example; the Python code in this document does not define them.
def test_encoding_bypass_prevention():
    # Full-width Unicode variant of '<script>' (U+FF1C and U+FF1E)
    malicious_input = "＜script＞alert(1)＜/script＞"
    try:
        create_user_profile(malicious_input)
        assert False, "Should reject encoded malicious input"
    except ValueError:
        pass  # Expected

    # Double URL encoding
    encoded_input = "%253cscript%253e"
    try:
        search_user(encoded_input)
        assert False, "Should reject double-encoded input"
    except SecurityException:
        pass  # Expected
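Test inputs like these can be generated instead of typed by hand. The following is a rough sketch under stated assumptions: `to_fullwidth` and `encoding_variants` are hypothetical helpers, and the four variants shown are only a subset of the techniques listed in this document.

```python
from urllib.parse import quote

def to_fullwidth(text: str) -> str:
    """Map printable ASCII (U+0021..U+007E) to full-width forms (U+FF01..U+FF5E)."""
    return "".join(
        chr(ord(ch) + 0xFEE0) if 0x21 <= ord(ch) <= 0x7E else ch for ch in text
    )

def encoding_variants(payload: str) -> dict:
    """Return several encodings of one payload, keyed by technique name."""
    url_once = quote(payload, safe="")  # safe="" also encodes '/'
    return {
        "url": url_once,
        "double_url": quote(url_once, safe=""),  # '%' becomes '%25'
        "fullwidth": to_fullwidth(payload),
        "html_entity": "".join(f"&#{ord(ch)};" for ch in payload),
    }

if __name__ == "__main__":
    for technique, variant in encoding_variants("<script>").items():
        print(technique, variant)
```

Each variant can then be passed to the function under test, with the expectation that every one is rejected in the same way as the plain payload.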
Burp Suite Test:
# Intruder payload positions
GET /file/§..%2f..%2fetc%2fpasswd§
# Payload list (encoding variants)
..%2f..%2f
..%252f..%252f
../../
%2e%2e%2f%2e%2e%2f
../.