plugins/idasql/skills/data/SKILL.md
Query IDA strings, bytes, and binary data. Use when asked to search strings, find byte patterns, rebuild string tables, or analyze binary content.
npx skillsauth add allthingsida/idasql-skills dataInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Use this skill when user asks to:
Route to:
xrefs for relationship expansion from matched addressesanalysis for triage synthesis from data signalsdebugger when findings should drive patch/breakpoint actions-- 1) Validate string availability
SELECT COUNT(*) AS strings FROM strings;
-- 2) Sample high-value strings
SELECT content, printf('0x%X', address) AS addr
FROM strings
WHERE length >= 8
ORDER BY length DESC
LIMIT 40;
-- 3) If expected strings are missing, rebuild once
SELECT rebuild_strings();
Interpretation guidance:
xrefs immediately for execution context.byte_search table over full instruction scans.rebuild_strings() and validate with COUNT(*) FROM strings.LIKE, regex-like pattern narrowing, module/function join filters).funcs.grep, not strings or byte_search.data -> xrefs to convert data hits into code paths/functions.data -> analysis for risk scoring and campaign-level insight.data -> debugger for actionable patch/watchpoint setup.String literals found in the binary. IDA maintains a cached string list that can be configured.
| Column | Type | Description |
|--------|------|-------------|
| address | INT | String address |
| length | INT | String length |
| type | INT | String type (raw encoding bits) |
| type_name | TEXT | Type name: ascii, utf16, utf32 |
| width | INT | Char width (0=1-byte, 1=2-byte, 2=4-byte) |
| width_name | TEXT | Width name: 1-byte, 2-byte, 4-byte |
| layout | INT | String layout (0=null-terminated, 1-3=pascal) |
| layout_name | TEXT | Layout name: termchr, pascal1, pascal2, pascal4 |
| encoding | INT | Encoding index (0=default) |
| content | TEXT | String content (the actual text — not value or text) |
String Type Encoding: IDA stores string type as a 32-bit value:
-- Find error messages
SELECT content, printf('0x%X', address) as addr FROM strings WHERE content LIKE '%error%';
-- ASCII strings only
SELECT * FROM strings WHERE type_name = 'ascii';
-- UTF-16 strings (common in Windows)
SELECT * FROM strings WHERE type_name = 'utf16';
-- Count strings by type
SELECT type_name, layout_name, COUNT(*) as count
FROM strings GROUP BY type_name, layout_name ORDER BY count DESC;
Important: For new analysis (exe/dll), strings are auto-built. For existing databases (i64/idb), strings are already saved. If you see 0 strings unexpectedly, run SELECT rebuild_strings() once to rebuild the list. See String List Surfaces section below.
Use strings + xrefs + funcs directly. This is the canonical pattern.
-- Find call sites/functions referencing error-like strings
SELECT
s.content as string_value,
printf('0x%X', x.from_ea) as ref_addr,
(SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as func_name
FROM strings s
JOIN xrefs x ON x.to_ea = s.address
WHERE s.content LIKE '%error%' OR s.content LIKE '%fail%'
ORDER BY func_name, ref_addr;
-- Functions with most string references
SELECT
(SELECT name FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) as func_name,
COUNT(*) as string_refs
FROM strings s
JOIN xrefs x ON x.to_ea = s.address
GROUP BY func_name
ORDER BY string_refs DESC
LIMIT 10;
| Function | Description |
|----------|-------------|
| bytes(addr, n) | Read n raw bytes as hex string |
| bytes_raw(addr, n) | Read n bytes as BLOB |
For row-shaped byte workflows, bytes is a pure mapped-byte table:
SELECT ea, value
FROM bytes
WHERE ea >= 0x401000 AND ea < 0x401010
ORDER BY ea;
Use heads when you need IDA item size/type metadata.
Use byte_search for raw bytes/opcodes. It requires WHERE pattern = ...; matched_hex is an output column, not the search input.
| Column | Type | Description |
|--------|------|-------------|
| address | INT | Match address |
| matched_hex | TEXT | Matched bytes rendered as hex text |
| matched_bytes | BLOB | Matched bytes as raw bytes |
| size | INT | Match size in bytes |
| pattern | HIDDEN TEXT | Required IDA byte pattern input |
| start_ea | HIDDEN INT | Optional inclusive lower bound |
| end_ea | HIDDEN INT | Optional exclusive upper bound |
| max_results | HIDDEN INT | Optional generator cap |
Pattern syntax (IDA native):
"48 8B 05" - Exact bytes (hex, space-separated)"48 ? 05" or "48 ?? 05" - ? = any byte wildcard (whole byte only)"(01 02 03)" - Alternatives (match any of these bytes)Note: Nibble wildcards and regex are not supported in byte patterns.
Example:
-- Find all matches for a pattern
SELECT address, matched_hex, size
FROM byte_search
WHERE pattern = '48 8B ? 00'
LIMIT 10;
-- First match only
SELECT printf('0x%llX', address) AS addr
FROM byte_search
WHERE pattern = 'CC CC CC'
ORDER BY address
LIMIT 1;
-- Search with alternatives
SELECT address, matched_hex
FROM byte_search
WHERE pattern = 'E8 (01 02 03 04)'
LIMIT 20;
Optimization Pattern: Find functions using specific instruction
To answer "How many functions use RDTSC instruction?" efficiently:
-- Count unique functions containing RDTSC (opcode: 0F 31)
SELECT COUNT(DISTINCT f.address) as count
FROM byte_search b
JOIN funcs f ON b.address >= f.address AND b.address < f.end_ea
WHERE b.pattern = '0F 31';
-- List those functions with names
SELECT DISTINCT
f.address as func_ea,
f.name as func_name
FROM byte_search b
JOIN funcs f ON b.address >= f.address AND b.address < f.end_ea
WHERE b.pattern = '0F 31';
This is much faster than scanning all disassembly lines because:
byte_search uses IDA's native binary searchfuncs table instead of scanning every instructiongrep for named entities such as functions, labels, structs, enums, and members.strings when the user is searching literal string contents inside the binary.byte_search when the target is a raw byte or opcode pattern.IDA maintains a cached list of strings. Use rebuild_strings() to detect and cache strings, COUNT(*) FROM strings for the current count, and strings for row-level analysis.
| Surface | Description |
|---------|-------------|
| rebuild_strings() | Rebuild with ASCII + UTF-16, minlen 5 (default) |
| rebuild_strings(minlen) | Rebuild with custom minimum length |
| rebuild_strings(minlen, types) | Rebuild with custom length and type mask |
| SELECT COUNT(*) FROM strings | Current string-list count (optimized without row materialization) |
Type mask values:
1 = ASCII only (STRTYPE_C)2 = UTF-16 only (STRTYPE_C_16)4 = UTF-32 only (STRTYPE_C_32)3 = ASCII + UTF-16 (default)7 = All types-- Check current string count
SELECT COUNT(*) AS strings FROM strings;
-- Rebuild with defaults (ASCII + UTF-16, minlen 5)
SELECT rebuild_strings();
-- Rebuild with shorter minimum length
SELECT rebuild_strings(4);
-- Rebuild with specific types
SELECT rebuild_strings(5, 1); -- ASCII only
SELECT rebuild_strings(5, 7); -- All types (ASCII + UTF-16 + UTF-32)
-- Typical workflow: rebuild then query
SELECT rebuild_strings();
SELECT * FROM strings WHERE content LIKE '%error%';
IMPORTANT - Agent Behavior for String Queries: When the user asks about strings (e.g., "show me the strings", "what strings are in this binary"):
SELECT rebuild_strings() to ensure strings are detectedstrings tableThe rebuild_strings() function configures IDA's string detection with sensible defaults (ASCII + UTF-16, minimum length 5) and rebuilds the string list. This ensures the user gets results even if the database had no prior string analysis.
| Table/Function | Architecture | Notes |
|----------------|-------------|-------|
| COUNT(*) FROM strings | Cached table count path | O(1) current string-list count |
| strings | Cached | Rebuilt on demand via rebuild_strings(); fast once cached |
| byte_search | Native binary search table | Much faster than iterating instructions table |
| bytes() | Direct read | O(1) per address, no table overhead |
Key rules:
rebuild_strings() before the first string query on a new database or after making code/data changes that may create new strings.byte_search uses IDA's native binary search engine; for "find functions containing opcode X", join matches to funcs by containment instead of scanning the instructions table.strings table is a snapshot of the cached string list. If IDA's analysis creates new strings after your initial query, call rebuild_strings() again.strings + xrefs + funcs JOIN pattern is canonical — IDA's xref index makes the JOIN fast.Categorize strings by security relevance for rapid threat assessment:
-- Categorize strings by security relevance
SELECT rebuild_strings();
WITH categorized AS (
SELECT address, content,
CASE
WHEN content LIKE '%password%' OR content LIKE '%passwd%' OR content LIKE '%secret%'
THEN 'credential'
WHEN content LIKE '%http://%' OR content LIKE '%https://%' OR content LIKE '%ftp://%'
THEN 'url'
WHEN content LIKE '%error%' OR content LIKE '%fail%' OR content LIKE '%exception%'
THEN 'error'
WHEN content LIKE '%debug%' OR content LIKE '%trace%' OR content LIKE '%assert%'
THEN 'debug'
WHEN content LIKE '%.exe%' OR content LIKE '%.dll%' OR content LIKE '%.sys%'
THEN 'file_path'
WHEN content LIKE '%HKEY_%' OR content LIKE '%SOFTWARE\\%'
THEN 'registry'
ELSE 'other'
END AS category
FROM strings
WHERE length >= 5
)
SELECT category, COUNT(*) AS count,
GROUP_CONCAT(SUBSTR(content, 1, 60), ' | ') AS samples
FROM categorized
WHERE category != 'other'
GROUP BY category
ORDER BY count DESC;
Functions referencing security-relevant strings AND having significant size are high-priority targets:
-- Suspicious functions: reference interesting strings AND are non-trivial
WITH interesting_strings AS (
SELECT address, content FROM strings
WHERE content LIKE '%password%' OR content LIKE '%encrypt%'
OR content LIKE '%decrypt%' OR content LIKE '%http%'
OR content LIKE '%socket%' OR content LIKE '%connect%'
),
string_funcs AS (
SELECT DISTINCT (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) AS func_addr,
s.content AS matched_string
FROM interesting_strings s
JOIN xrefs x ON x.to_ea = s.address
WHERE (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) IS NOT NULL
)
SELECT (SELECT name FROM funcs WHERE sf.func_addr >= address AND sf.func_addr < end_ea LIMIT 1) AS func_name,
printf('0x%X', sf.func_addr) AS addr,
f.size AS func_size,
GROUP_CONCAT(sf.matched_string, ' | ') AS strings_referenced
FROM string_funcs sf
JOIN funcs f ON f.address = sf.func_addr
GROUP BY sf.func_addr
ORDER BY f.size DESC
LIMIT 20;
Cross-category correlation for identifying data exfiltration or C2 communication:
-- Functions touching both crypto and network strings
WITH crypto_refs AS (
SELECT DISTINCT (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) AS func_addr
FROM strings s JOIN xrefs x ON x.to_ea = s.address
WHERE s.content LIKE '%crypt%' OR s.content LIKE '%aes%'
OR s.content LIKE '%cipher%' OR s.content LIKE '%hash%'
),
network_refs AS (
SELECT DISTINCT (SELECT address FROM funcs WHERE x.from_ea >= address AND x.from_ea < end_ea LIMIT 1) AS func_addr
FROM strings s JOIN xrefs x ON x.to_ea = s.address
WHERE s.content LIKE '%socket%' OR s.content LIKE '%connect%'
OR s.content LIKE '%send%' OR s.content LIKE '%recv%'
OR s.content LIKE '%http%'
)
SELECT (SELECT name FROM funcs WHERE c.func_addr >= address AND c.func_addr < end_ea LIMIT 1) AS func_name,
printf('0x%X', c.func_addr) AS addr
FROM crypto_refs c
JOIN network_refs n ON n.func_addr = c.func_addr;
tools
IDA type system. Use when asked to create, modify, or apply structs, unions, enums, typedefs, or parse C declarations.
databases
Complete idasql SQL function reference catalog. Use when looking up function signatures, parameters, or usage examples.
development
Query IDA disassembly. Use when asked about functions, segments, instructions, blocks, operands, control flow, or raw code structure.
development
Decompile and analyze IDA functions. Use when asked for pseudocode, ctree AST analysis, local variables, labels, or decompiler-driven cleanup.