offensive-coding/windows-internals-dev/SKILL.md
Auth/lab dev: Windows internals; PEB/TEB, PE/COFF, syscalls, unwinding, memory/heap, tokens, kernel objects, ETW/AMSI telemetry.
Foundational Windows internals for programmatic work: writing implants and loaders, reversing your own binaries, building EDR/AV tooling, or understanding what the kernel actually does under a Win32 call. This is structural and mechanical knowledge — if you need tool usage, look in offensive-tools/; if you need language-specific patterns, look in *-patterns skills.
What this skill gives you: the offsets, structures, data flows, and invariants you need to touch Windows at the NTAPI/undocumented/kernel-struct level without guessing. Every claim here is either stable across the supported build range or explicitly flagged as version-dependent.
Rule of thumb: if your code uses only documented Win32 APIs, this skill is not needed. If you are walking a PEB pointer, parsing UNWIND_INFO, resolving SSNs, spoofing a call stack, patching ETW-TI, or touching a kernel object directly, start here.
If the question is "what function do I call" — wrong skill. If the question is "what does the OS actually do when I call this, and what can I touch directly instead" — right skill.
Everything below is a pointer into a reference file. Load only what you need.
| Domain | File | Covers |
|--------|------|--------|
| Process/thread state blocks | references/peb-teb.md | PEB, TEB, LDR data, InMemoryOrderModuleList, ApiSetMap, module walking, hash-based resolution |
| PE / COFF | references/pe-format.md | DOS/NT headers, sections, exports (incl. forwarded), imports, relocations, TLS callbacks, .pdata, COFF object files |
| Syscalls | references/syscalls.md | Syscall ABI (x64/ARM64), SSN resolution strategies, direct/indirect dispatch, gate variants, stub layout |
| Exception handling | references/exception-unwind.md | SEH, VEH, UNWIND_INFO, RtlLookupFunctionEntry, RtlVirtualUnwind, KiUserExceptionDispatcher, call-stack spoofing frames |
| Memory | references/memory-management.md | Virtual memory (NtAllocate/Protect/Write/Read), sections, VADs, NT heap vs Segment heap, LFH |
| Threads / APCs | references/threads-apcs.md | Thread creation, suspension, CONTEXT struct, user/kernel/Special APCs, thread hijacking primitives |
| Tokens / privileges | references/tokens-privileges.md | TOKEN struct, privileges, integrity levels, impersonation vs primary, DuplicateTokenEx semantics |
| Kernel objects | references/kernel-objects.md | EPROCESS, ETHREAD, KPCR/KPRCB, handle table, object manager namespace, kernel callbacks |
| Mitigation surface | references/evasion-surface.md | ETW/ETW-TI, AMSI, CFG/XFG/CET, VBS/HVCI/CG, userland hook detection/unhooking |
Non-negotiable invariants. Violating any of these causes silent corruption that surfaces later.
x64
| Invariant | Rule |
|---|---|
| Stack alignment | rsp % 16 == 0 at every call site |
| Shadow space | 32 bytes reserved by caller before every call (sub rsp, 0x28 min with alignment) |
| Integer args (1–4) | rcx, rdx, r8, r9 |
| Integer args (5+) | Stack, caller-allocated |
| Float args (1–4) | xmm0–xmm3 |
| Integer return | rax |
| Callee-saved (int) | rbx, rbp, rsi, rdi, r12–r15 |
| Callee-saved (xmm) | xmm6–xmm15 |
| Syscall number | rax (user-mode SSN before syscall) |
| Syscall arg 1 | r10, not rcx: the syscall instruction overwrites rcx with the return address (and r11 with RFLAGS), so the userland stub must mov r10, rcx |
| TEB | gs:[0x30] |
| PEB | gs:[0x60] (or [[gs:0x30] + 0x60]) |
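
As a rough illustration of the stack-alignment and shadow-space rows, the sketch below computes how many bytes a non-leaf x64 function would subtract from rsp in its prologue. The helper name `frame_alloc_size` and its parameters are hypothetical and not taken from this skill or its reference files. It assumes the function is entered with rsp % 16 == 8 (the caller's `call` has just pushed the return address) and that any register pushes happen before the `sub rsp`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes to subtract from rsp in a prologue so that
 * every call site inside the function sees rsp % 16 == 0.
 *   max_call_args : largest argument count of any function called
 *   local_bytes   : bytes of local variables (rounded up to 8 here)
 *   pushed_regs   : number of 8-byte register pushes done before the sub
 * Assumption: rsp % 16 == 8 on entry (return address already pushed). */
static uint32_t frame_alloc_size(uint32_t max_call_args,
                                 uint32_t local_bytes,
                                 uint32_t pushed_regs)
{
    /* Shadow space always covers 4 slots, even if the callee takes fewer. */
    uint32_t slots  = max_call_args < 4 ? 4 : max_call_args;
    uint32_t locals = (local_bytes + 7u) & ~7u;
    uint32_t size   = slots * 8u + locals;

    /* Bytes already pushed below the caller's 16-byte-aligned rsp. */
    uint32_t below = 8u + pushed_regs * 8u;

    /* Everything is a multiple of 8, so one 8-byte pad is enough. */
    if ((below + size) % 16u != 0)
        size += 8u;

    assert((below + size) % 16u == 0);
    return size;
}
```

With no pushes and callees taking at most four arguments, this yields 0x28, which matches the `sub rsp, 0x28` minimum in the table. One extra push before the `sub` brings it down to 0x20.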
ARM64
| Invariant | Rule |
|---|---|
| Stack alignment | sp % 16 == 0 (hardware enforced at PSTATE transitions) |
| Integer args (1–8) | x0–x7 |
| Float args (1–8) | d0–d7 |
| Integer return | x0 |
| Callee-saved (int) | x19–x28, x29 (fp), x30 (lr) |
| Callee-saved (vec) | d8–d15 |
| Syscall number | x8 |
| Syscall instruction | svc #0 |
| TEB | x18 (platform register, reserved by Windows) |
| PEB | [x18 + 0x60] |
ARM64EC (x64 emulation compatible) uses a modified register mapping — x0..x15 map to rcx,rdx,r8,r9 etc. If writing ARM64EC ASM, consult the Microsoft ARM64EC ABI doc, not this table.
These are the 12 offsets you will look up most often. Full layouts in references/peb-teb.md.
| Structure | Field | Offset | Notes |
|---|---|---|---|
| TEB | ProcessEnvironmentBlock | 0x60 | PEB pointer |
| TEB | ThreadLocalStoragePointer | 0x58 | TLS slot array |
| TEB | LastErrorValue | 0x68 | GetLastError storage |
| PEB | BeingDebugged | 0x02 | 1 byte |
| PEB | ImageBaseAddress | 0x10 | Main .exe base |
| PEB | Ldr | 0x18 | PEB_LDR_DATA pointer |
| PEB | ProcessParameters | 0x20 | RTL_USER_PROCESS_PARAMETERS |
| PEB | ApiSetMap | 0x68 | API_SET_NAMESPACE pointer |
| PEB_LDR_DATA | InLoadOrderModuleList | 0x10 | Head of load order list |
| PEB_LDR_DATA | InMemoryOrderModuleList | 0x20 | Head of memory order list |
| LDR_DATA_TABLE_ENTRY | DllBase | 0x30 | Module base address |
| LDR_DATA_TABLE_ENTRY | BaseDllName | 0x58 | UNICODE_STRING |
When walking `InMemoryOrderModuleList`, the `Flink` you dereference points into `InMemoryOrderLinks` (offset 0x10) of the next entry, not its base. Subtract 0x10 to get the entry pointer.
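
The subtraction described above is the same arithmetic as the `CONTAINING_RECORD` macro. The sketch below shows it on mock structures only: `LIST_LINK`, `MOCK_ENTRY`, `RECORD_FROM_LINK` and `sum_ids` are hypothetical names, and the record merely mirrors the idea of a second list link embedded after the first one. It does not read any real loader data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the documented two-pointer LIST_ENTRY layout. */
typedef struct LIST_LINK {
    struct LIST_LINK *Flink;
    struct LIST_LINK *Blink;
} LIST_LINK;

/* Hypothetical record with two embedded links, so the second link
 * sits at a non-zero offset (0x10 when pointers are 8 bytes). */
typedef struct MOCK_ENTRY {
    LIST_LINK LoadOrderLinks;
    LIST_LINK MemoryOrderLinks;
    int       id;
} MOCK_ENTRY;

/* Recover the record base from a pointer to one of its embedded links. */
#define RECORD_FROM_LINK(link, type, field) \
    ((type *)((uint8_t *)(link) - offsetof(type, field)))

/* Walk a circular list threaded through MemoryOrderLinks and sum the ids.
 * The list head is a bare link, not a record, so it is never converted. */
static int sum_ids(LIST_LINK *head)
{
    assert(head != NULL);
    int total = 0;
    for (LIST_LINK *l = head->Flink; l != head; l = l->Flink) {
        MOCK_ENTRY *e = RECORD_FROM_LINK(l, MOCK_ENTRY, MemoryOrderLinks);
        total += e->id;
    }
    return total;
}
```

Using `offsetof` instead of a literal 0x10 keeps the adjustment tied to whichever link field the list is actually threaded through.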
- BaseDllName, compare. See peb-teb.md §Module Walk.
- DataDirectory[0] → iterate AddressOfNames / AddressOfOrdinals / AddressOfFunctions. See pe-format.md §Export Walk.
- mov eax, imm32. Halo's / Tartarus Gate fall back when the stub is hooked. Recycled Gate enumerates all Zw* exports and sorts by RVA to derive SSN from index. See syscalls.md.
- syscall; ret gadget — scan any Nt* stub body for the byte sequence 0F 05 C3. Used by Recycled Gate / indirect syscall paths to route all syscalls through ntdll. See syscalls.md §Indirect Dispatch.
- .pdata (DataDirectory[3]) for a RUNTIME_FUNCTION whose [BeginAddress, EndAddress) contains your RIP. Walk UnwindCodes[] to derive frame size and saved registers. See exception-unwind.md.
- UWOP_SET_FPREG terminator + UWOP_PUSH_NONVOL(rbp) second frame), plant fake return addresses on the real RSP matching those unwinders, then jmp into the target. The unwinder validates; the scanner walks the frame list and sees legitimate module addresses. See exception-unwind.md §Call-stack Spoofing.
- NtOpenProcessToken → NtQueryInformationToken(TokenElevation) / TokenUser → NtDuplicateToken(SecurityImpersonation) → NtSetInformationThread(ThreadImpersonationToken). See tokens-privileges.md.
- NtAllocateVirtualMemory or section + NtMapViewOfSection), write payload, create thread (NtCreateThreadEx) or queue APC (NtQueueApcThread[Ex2]). See threads-apcs.md.
- AmsiScanBuffer (amsi.dll) or EtwEventWrite / NtTraceEvent (ntdll.dll), flip page protection RW, write a 1-byte ret or mov eax, 0x80070057; ret prologue, restore protection. See evasion-surface.md.
- PspCreateProcessNotifyRoutine (array of 64 EX_CALLBACK_ROUTINE_BLOCK pointers with low-bit tagging) and nulls entries belonging to EDR drivers. See kernel-objects.md §Kernel Callbacks.

Need to invoke an Nt* function without touching ntdll's hooked stub?
│
├── Can ntdll.dll be walked? (99% yes — ntdll is first LDR entry)
│ │
│ ├── YES → Use PEB walk to resolve Nt* address
│ │ │
│ │ ├── Stub unhooked? → Hell's Gate (read SSN from stub at +4)
│ │ │
│ │ └── Stub hooked? → Halo's/Tartarus Gate (walk neighbors ±1)
│ │ OR Recycled Gate (sort Zw* exports by RVA)
│ │
│ └── NO (unusual — stripped PEB or sandbox) → Hard-coded SSN table
│ (brittle, version-specific)
│
├── Executing the syscall:
│ │
│ ├── Direct → Emit `syscall` instruction in your own .text
│ │ • Callstack has your module → red flag
│ │
│ ├── Indirect → Emit `call <syscall;ret gadget inside ntdll>`
│ │ • Callstack shows ntdll frame → clean
│ │ • Requires gadget discovery pass
│ │
│ └── Desync → Build spoofed 3-frame call stack, `jmp` into gadget
│ • Callstack shows chain of legit frames
│ • Full UNWIND_INFO math required
│
└── 6+ args? → 5th arg onwards goes on stack after shadow space
Stubs vary: see syscalls.md §Multi-arg variants
- LdrLoadDll)
- KnownDlls (section object namespace under \KnownDlls)
- NtOpenSection on the pre-mapped section, NtMapViewOfSection into process
- NtOpenFile → NtCreateSection(SEC_IMAGE) → NtMapViewOfSection
- IMAGE_TLS_DIRECTORY.AddressOfCallBacks array, call each with DLL_PROCESS_ATTACH
- DllMain with DLL_PROCESS_ATTACH
- PsSetLoadImageNotifyRoutine fires now)

Pitfall: TLS callbacks from a DLL loaded via LoadLibrary are not invoked for already-running threads, only for threads created after. They are invoked for the current thread on DLL_PROCESS_ATTACH. Statically-linked DLLs' TLS callbacks are fired for all existing threads at process init.
- NtCreateThreadEx)
- THREAD_CREATE_FLAGS_CREATE_SUSPENDED set → thread placed in waiting state
- LdrInitializeThunk (unless THREAD_CREATE_FLAGS_SKIP_THREAD_ATTACH) — runs DLL TLS callbacks, DllMain(DLL_THREAD_ATTACH) for every loaded DLL, then jumps to start address

Offensive note: SKIP_THREAD_ATTACH skips the entire LdrInit chain — useful for shellcode threads where you do not want DLL_THREAD_ATTACH to fire across every loaded DLL (which can alert hooks or crash if a DLL's DllMain mismatches).
- KiUserExceptionDispatcher in ntdll
- RtlDispatchException walks the function table starting at faulting RIP
- RtlLookupFunctionEntry → RtlVirtualUnwind → check for registered handler
- ExceptionContinueSearch / ContinueExecution
- UnhandledExceptionFilter → VEH chain → process terminate via NtTerminateProcess

Offensive use: VEH registration (RtlAddVectoredExceptionHandler) is a pre-SEH hook, runs before stack-based handlers. Common abuse: register VEH, trigger a fault, have VEH set ContextRecord->Rip to shellcode, return ExceptionContinueExecution. Leaves minimal forensic trace compared to thread creation.
When you apply any technique, the following signals are potentially visible. This is a high-level map; details in evasion-surface.md.
| Signal | Who sees it | What triggers it |
|---|---|---|
| Userland hooks | EDR in-process agent | Calling Nt* via ntdll stub |
| ETW (userland) | EDR agent or Microsoft-Windows-* consumers | EtwEventWrite inside ntdll/kernel32 wrappers |
| ETW-TI (kernel) | PPL service consuming kernel events | Memory writes to remote processes, SUSPEND/RESUME, APC queue, etc. |
| AMSI | Registered AMSI providers (Defender) | Script content scanned by AmsiScanBuffer |
| Kernel callbacks | EDR driver | Process/thread/image/registry create, handle open |
| Minifilter | EDR filesystem driver | File open/read/write/create |
| Handle open audit | Kernel | Any NtOpen* on protected process (LSASS, etc.) |
| CFG violation | Process | Indirect call to non-valid target |
| Shadow stack mismatch | CET-enabled process | ret address mismatches shadow stack top |
Important reality: modern advanced EDRs (Defender for Endpoint, Elastic, CrowdStrike recent builds) rely primarily on kernel callbacks and ETW-TI. Userland unhooking by itself does nothing against them. Plan your technique accordingly — see evasion-surface.md §EDR Architecture.
PEB walk invariants
- InLoadOrderModuleList is the main executable
- InMemoryOrderModuleList is ntdll.dll (loader init guarantees this)
- BaseDllName is a UNICODE_STRING (16 bytes: Length, MaximumLength, Buffer). Length is in bytes, not chars

PE parsing invariants
- e_magic == 0x5A4D ("MZ") and e_lfanew > 0 && e_lfanew < PE_file_size
- NT_Signature == 0x00004550 ("PE\0\0")
- OptionalHeader.Magic == 0x20B; the DataDirectory array starts at OptionalHeader + 0x70, so DataDirectory[0].VirtualAddress (the export directory RVA) lives at offset 0x70 of OptionalHeader on x64, or offset 0x88 of IMAGE_NT_HEADERS64 if counting from NT base (the 32-bit PE32 equivalents are 0x60 and 0x78)
- DataDirectory[0] range, it is a forwarder string "dll.function" — recursively resolve

Manual mapping / reflective DLL invariants
- AddressOfEntryPoint, which is usually CRT startup before user DllMain.
- DllMain can be a minimal shellcode attach gate, followed by an explicit reflective start export. Do not replace that with AddressOfEntryPoint just because it resembles the OS loader; verify the payload's reserved semantics and runtime markers first.
- TcpStream::connect_timeout is more likely a socket/runtime primitive issue than an earlier PE entrypoint issue.

Syscall invariants
- syscall: r10 = rcx. Kernel clobbers rcx with return address
- wow64cpu!CpupReturnFromSimulatedCode; direct syscall from 32-bit code does not work — must go through the transition
- svc #0 is the instruction; SSN goes in x8 (not w8), args in x0–x7, then stack

Unwind invariants
- UNWIND_INFO.CountOfCodes is rounded up to even; iterate in pairs
- UWOP_ALLOC_SMALL: size encoded as OpInfo * 8 + 8 (range 8..128)
- UWOP_ALLOC_LARGE: size in following 2 or 4 bytes; OpInfo == 0 → 2-byte scaled by 8, OpInfo == 1 → 4-byte raw
- UNW_FLAG_CHAININFO): last code entry is RUNTIME_FUNCTION of parent, not a code — follow it
- .pdata (leaf functions with no prologue) cannot be unwound — useless as spoof frames

Token invariants
- NtOpenProcessToken(NtCurrentProcess(), TOKEN_QUERY, ...) always succeeds — you own your own token
- NtOpenProcessToken on another process requires at minimum PROCESS_QUERY_LIMITED_INFORMATION on the process handle and TOKEN_DUPLICATE for impersonation
- DuplicateTokenEx with TokenImpersonation first, or use CreateProcessWithTokenW for new process
- SeImpersonatePrivilege

Memory invariants
- NtProtectVirtualMemory rounds up to page granularity — querying the old protection returns the protection of the first page in the range
- NtAllocateVirtualMemory with MEM_COMMIT over an already-committed region succeeds (idempotent commit) and preserves contents; over a reserved-but-not-committed region zero-fills
- PAGE_NOACCESS guard page raises STATUS_GUARD_PAGE_VIOLATION once, then converts to normal access — useful for trap-then-continue primitives

- stack-spoofing — Building call-stack spoof trampolines (Draugr / SilentMoonwalk / CHRYSALIS): frame-size math with SAVE_NONVOL safety filter, FF 23 gadget scanner with debug instrumentation, Win11 22H2+ empirical gadget inventory, C/Rust/Go trampoline skeletons.
- indirect-syscall — Building indirect syscall dispatchers: SSN resolution (Hell's / Halo's / Tartarus / RecycledGate / DWhisper), syscall;ret gadget discovery with caching, name obfuscation, per-language dispatcher implementations with arg-count variants.

Use those skills when writing code; use this one for the underlying structures and invariants.
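
To make the PE parsing invariants listed earlier concrete, here is a minimal sketch that checks them against a raw file buffer and reads the export directory RVA. The function name `pe_check_headers`, the `pe_check` result codes and the little-endian helpers are hypothetical. It assumes a PE32+ (64-bit) image and uses fixed offsets from the published PE format (DataDirectory at 0x70 in the optional header, which is 0x88 from the NT headers base) rather than `windows.h` types, so it compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum pe_check {
    PE_OK = 0, PE_TOO_SMALL, PE_BAD_MZ, PE_BAD_LFANEW, PE_BAD_SIG, PE_NOT_PE32PLUS
};

/* Little-endian reads that do not depend on host alignment. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Validate DOS and NT headers of a raw PE32+ file image in buf.
 * On success, *export_rva receives DataDirectory[0].VirtualAddress,
 * or 0 if the image declares no data directories. */
static enum pe_check pe_check_headers(const uint8_t *buf, size_t len,
                                      uint32_t *export_rva)
{
    assert(buf != NULL && export_rva != NULL);

    if (len < 0x40) return PE_TOO_SMALL;          /* IMAGE_DOS_HEADER size */
    if (rd16(buf) != 0x5A4D) return PE_BAD_MZ;    /* "MZ" */

    /* e_lfanew is signed on disk; a negative value reads as a huge
     * unsigned number here and fails the bounds check. */
    uint32_t lfanew = rd32(buf + 0x3C);
    if (lfanew == 0 || lfanew >= len || len - lfanew < 0x90)
        return PE_BAD_LFANEW;                     /* need bytes up to DataDirectory[0] */

    const uint8_t *nt = buf + lfanew;
    if (rd32(nt) != 0x00004550) return PE_BAD_SIG;          /* "PE\0\0" */
    if (rd16(nt + 0x18) != 0x20B) return PE_NOT_PE32PLUS;   /* OptionalHeader.Magic */

    /* NumberOfRvaAndSizes sits just before DataDirectory (0x18 + 0x6C). */
    uint32_t dir_count = rd32(nt + 0x84);
    *export_rva = dir_count >= 1 ? rd32(nt + 0x88) : 0;
    return PE_OK;
}
```

A fuller parser would also honor `SizeOfOptionalHeader` and translate the RVA through the section table before dereferencing it. Both are left out here to keep the header checks in focus.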
- references/peb-teb.md — PEB, TEB, LDR data, ApiSetMap, module walking, hash-based resolution
- references/pe-format.md — DOS/NT headers, sections, exports, imports, TLS callbacks, .pdata, COFF objects
- references/syscalls.md — Syscall ABI, SSN resolution (Hell's/Halo's/Tartarus/Recycled/HWSyscall), direct/indirect dispatch
- references/exception-unwind.md — SEH/VEH, UNWIND_INFO, RtlVirtualUnwind, KiUserExceptionDispatcher, call-stack spoofing
- references/memory-management.md — Virtual memory, sections, VADs, NT heap / Segment heap / LFH
- references/threads-apcs.md — Thread creation, CONTEXT, user/kernel/Special APCs, thread hijacking
- references/tokens-privileges.md — TOKEN struct, privileges, integrity levels, impersonation
- references/kernel-objects.md — EPROCESS, ETHREAD, KPCR/KPRCB, handle table, object manager, kernel callbacks
- references/evasion-surface.md — ETW/ETW-TI, AMSI, CFG/XFG/CET, VBS/HVCI/CG, hook detection/unhooking
- ntdoc.m417z.com) — searchable NTAPI reference