SKILLS/analyzing-email-headers-for-phishing-investigation/SKILL.md
Parse and analyze email headers to trace the origin of phishing emails, verify sender authenticity, and identify spoofing through SPF, DKIM, and DMARC validation.
npx skillsauth add pinkpixel-dev/skills-collection-1 analyzing-email-headers-for-phishing-investigationInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
# Export from Outlook: Open email > File > Properties > Internet Headers
# Export from Gmail: Open email > Three dots > Show original
# Export from Thunderbird: View > Message Source
# If working with EML file from forensic image
cp /mnt/evidence/Users/suspect/AppData/Local/Microsoft/Outlook/phishing_email.eml \
/cases/case-2024-001/email/
# If working with PST file, extract individual messages
pip install pypff
python3 << 'PYEOF'
import pypff
pst = pypff.file()
pst.open("/cases/case-2024-001/email/outlook.pst")
root = pst.get_root_folder()
def extract_messages(folder, path=""):
for i in range(folder.get_number_of_sub_messages()):
msg = folder.get_sub_message(i)
headers = msg.get_transport_headers()
subject = msg.get_subject()
if headers:
filename = f"/cases/case-2024-001/email/msg_{i}_{subject[:30]}.txt"
with open(filename, 'w') as f:
f.write(headers)
for i in range(folder.get_number_of_sub_folders()):
extract_messages(folder.get_sub_folder(i))
extract_messages(root)
PYEOF
# Parse headers using Python email library
python3 << 'PYEOF'
import email
from email import policy
with open('/cases/case-2024-001/email/phishing_email.eml', 'r') as f:
msg = email.message_from_file(f, policy=policy.default)
print("=== KEY HEADER FIELDS ===")
print(f"From: {msg['From']}")
print(f"To: {msg['To']}")
print(f"Subject: {msg['Subject']}")
print(f"Date: {msg['Date']}")
print(f"Message-ID: {msg['Message-ID']}")
print(f"Reply-To: {msg['Reply-To']}")
print(f"Return-Path: {msg['Return-Path']}")
print(f"X-Mailer: {msg['X-Mailer']}")
print(f"X-Originating-IP: {msg['X-Originating-IP']}")
print("\n=== RECEIVED HEADERS (bottom-up = chronological) ===")
received_headers = msg.get_all('Received')
if received_headers:
for i, header in enumerate(reversed(received_headers)):
print(f"\nHop {i+1}: {header.strip()}")
print("\n=== AUTHENTICATION RESULTS ===")
auth_results = msg.get_all('Authentication-Results')
if auth_results:
for result in auth_results:
print(result)
print(f"\nARC-Authentication-Results: {msg.get('ARC-Authentication-Results', 'Not present')}")
print(f"Received-SPF: {msg.get('Received-SPF', 'Not present')}")
print(f"DKIM-Signature: {msg.get('DKIM-Signature', 'Not present')}")
PYEOF
# Extract the envelope sender domain
SENDER_DOMAIN="example-corp.com"
# Check SPF record
dig TXT $SENDER_DOMAIN +short | grep "v=spf1"
# Example: "v=spf1 include:_spf.google.com include:sendgrid.net ~all"
# Check DKIM record (selector from DKIM-Signature header, e.g., "s=selector1")
DKIM_SELECTOR="selector1"
dig TXT ${DKIM_SELECTOR}._domainkey.${SENDER_DOMAIN} +short
# Check DMARC record
dig TXT _dmarc.${SENDER_DOMAIN} +short
# Example: "v=DMARC1; p=reject; rua=mailto:[email protected]; pct=100"
# Verify the sending IP against SPF
# Extract IP from first Received header
SENDING_IP="203.0.113.45"
# Manual SPF check using python
python3 << 'PYEOF'
import spf # pip install pyspf
result, explanation = spf.check2(
i='203.0.113.45',
s='[email protected]',
h='mail.example-corp.com'
)
print(f"SPF Result: {result}")
print(f"Explanation: {explanation}")
# Results: pass, fail, softfail, neutral, none, temperror, permerror
PYEOF
# Check if sending IP is in known malicious IP lists
# Query AbuseIPDB or VirusTotal
curl -s "https://api.abuseipdb.com/api/v2/check?ipAddress=${SENDING_IP}" \
-H "Key: YOUR_API_KEY" -H "Accept: application/json" | python3 -m json.tool
# WHOIS lookup on sender domain
whois $SENDER_DOMAIN | grep -iE '(registrar|creation|expiration|registrant|nameserver)'
# Check domain age (recently registered domains are suspicious)
# DNS record investigation
dig A $SENDER_DOMAIN +short
dig MX $SENDER_DOMAIN +short
dig NS $SENDER_DOMAIN +short
# Reverse DNS on sending IP
dig -x $SENDING_IP +short
# Check for lookalike/typosquatting domains
# Compare with legitimate domain using visual similarity
python3 << 'PYEOF'
import Levenshtein # pip install python-Levenshtein
legitimate = "microsoft.com"
suspicious = "micr0soft.com"
distance = Levenshtein.distance(legitimate, suspicious)
ratio = Levenshtein.ratio(legitimate, suspicious)
print(f"Edit distance: {distance}")
print(f"Similarity ratio: {ratio:.2%}")
if ratio > 0.8:
print("WARNING: Likely typosquatting/lookalike domain!")
PYEOF
# Check domain reputation on VirusTotal
curl -s "https://www.virustotal.com/api/v3/domains/${SENDER_DOMAIN}" \
-H "x-apikey: YOUR_VT_API_KEY" | python3 -m json.tool
# Check if the Reply-To differs from From (common phishing indicator)
python3 -c "
import email
with open('/cases/case-2024-001/email/phishing_email.eml') as f:
msg = email.message_from_file(f)
from_addr = email.utils.parseaddr(msg['From'])[1]
reply_to = email.utils.parseaddr(msg.get('Reply-To', msg['From']))[1]
if from_addr != reply_to:
print(f'WARNING: From ({from_addr}) != Reply-To ({reply_to})')
else:
print('From and Reply-To match')
"
# Extract URLs from email body
python3 << 'PYEOF'
import email
import re
from email import policy
with open('/cases/case-2024-001/email/phishing_email.eml', 'r') as f:
msg = email.message_from_file(f, policy=policy.default)
body = msg.get_body(preferencelist=('html', 'plain'))
if body:
content = body.get_content()
urls = re.findall(r'https?://[^\s<>"\']+', content)
print("=== URLs FOUND IN EMAIL BODY ===")
for url in set(urls):
print(f" {url}")
# Check for URL obfuscation (display text != href)
href_pattern = re.findall(r'<a[^>]*href=["\']([^"\']+)["\'][^>]*>(.*?)</a>', content, re.DOTALL)
print("\n=== HYPERLINK ANALYSIS ===")
for href, text in href_pattern:
display_url = re.findall(r'https?://[^\s<]+', text)
if display_url and display_url[0] != href:
print(f" MISMATCH: Display='{display_url[0]}' -> Actual='{href}'")
# Extract and hash attachments
print("\n=== ATTACHMENTS ===")
for part in msg.walk():
if part.get_content_disposition() == 'attachment':
filename = part.get_filename()
content = part.get_payload(decode=True)
import hashlib
sha256 = hashlib.sha256(content).hexdigest()
print(f" File: {filename}, Size: {len(content)}, SHA-256: {sha256}")
with open(f'/cases/case-2024-001/email/attachments/{filename}', 'wb') as af:
af.write(content)
PYEOF
# Submit attachment hashes to VirusTotal
# Submit URLs to URLhaus or PhishTank for reputation check
| Concept | Description | |---------|-------------| | SPF (Sender Policy Framework) | DNS record specifying authorized mail servers for a domain | | DKIM (DomainKeys Identified Mail) | Cryptographic signature verifying email content integrity | | DMARC | Policy framework combining SPF and DKIM for sender authentication | | Received headers | Server-added headers showing each hop in the delivery chain (read bottom to top) | | Return-Path | Envelope sender address used for bounce messages; may differ from From | | Message-ID | Unique identifier assigned by the originating mail server | | X-Originating-IP | Original sender IP address (added by some mail services) | | Header forgery | Attackers can forge From, Reply-To, and other headers but not Received chains |
| Tool | Purpose | |------|---------| | MXToolbox | Online email header analyzer and DNS lookup | | dig/nslookup | DNS record queries for SPF, DKIM, DMARC verification | | pyspf | Python SPF record validation library | | dkimpy | Python DKIM signature verification library | | PhishTool | Specialized phishing email analysis platform | | VirusTotal | URL and file reputation checking service | | AbuseIPDB | IP address reputation database | | whois | Domain registration information lookup |
Scenario 1: CEO Fraud / Business Email Compromise The email claims to be from the CEO but Reply-To points to a Gmail address, SPF fails because the sending IP is not authorized for the spoofed domain, DKIM is missing, and the From domain is a lookalike (ceo-company.com vs company.com).
Scenario 2: Credential Harvesting Phishing Email contains a link that displays "login.microsoft.com" but href points to a lookalike domain, the attachment is an HTML file containing a fake login page with credential exfiltration JavaScript, the sending domain was registered 3 days ago.
Scenario 3: Malware Delivery via Attachment Email with an Office document attachment containing macros, the sender domain passes SPF but the account was compromised, DKIM signature is valid (sent from legitimate infrastructure), attachment SHA-256 matches known malware on VirusTotal.
Scenario 4: Spear Phishing with Legitimate Service Attacker uses a legitimate email marketing service to send phishing, SPF and DKIM pass because the service is authorized, the phishing is in the content not the infrastructure, requires URL and content analysis rather than header authentication checks.
Email Header Analysis Report:
Subject: "Urgent: Invoice Payment Required"
From: [email protected] (SPOOFED)
Reply-To: [email protected] (MISMATCH)
Return-Path: <[email protected]>
Date: 2024-01-15 09:23:45 UTC
Delivery Path (4 hops):
Hop 1: mail-server.xyz [203.0.113.45] -> relay1.isp.com
Hop 2: relay1.isp.com -> mx.target-company.com
Hop 3: mx.target-company.com -> internal-filter.target.com
Hop 4: internal-filter.target.com -> mailbox
Authentication:
SPF: FAIL (203.0.113.45 not authorized for examp1e-corp.com)
DKIM: NONE (no signature present)
DMARC: FAIL (p=none, no enforcement)
Indicators of Phishing:
- Lookalike domain (examp1e-corp.com vs example-corp.com, 96% similar)
- From/Reply-To mismatch
- Domain registered 2 days before email sent
- URL in body points to credential harvesting page
- Attachment: invoice.xlsm (SHA-256: a3f2...) - Known malware on VT
Risk Level: HIGH
testing
When the user wants a full ASO health audit, review their App Store listing quality, or diagnose why their app isn't ranking. Also use when the user mentions "ASO audit", "ASO score", "why am I not ranking", "listing review", or "optimize my app store page". For keyword-specific research, see keyword-research. For metadata writing, see metadata-optimization.
testing
Clarify requirements before implementing. Use when serious doubts arise.
tools
Complete reference and build guide for ASI:One (ASI1) — the AI platform by Fetch.ai built for agentic, Web3-native applications. Use this skill IMMEDIATELY and ALWAYS when the user mentions ASI1, ASI:One, Fetch.ai AI API, building with ASI1, integrating ASI:One, asking about ASI1 models, tool calling with ASI1, ASI1 image generation, ASI1 agentic LLM, Agentverse, uagents, Agent Chat Protocol, structured output with ASI1, or OpenAI-compatible wrappers for ASI1. Also trigger when the user says things like "use ASI1 instead of OpenAI", "build an app with ASI:One", "ASI1 API", or references docs.asi1.ai. This skill covers everything needed to build production apps - setup, all models, all API features, tool calling, image gen, agentic orchestration, structured data, session management, streaming, LangChain integration, uagents / Agent Chat Protocol, and TypeScript/Node.js patterns.
data-ai
When the user wants to analyze their own app's actual performance data from App Store Connect — real downloads, revenue, IAP, subscriptions, trials, or country breakdowns synced via Appeeky Connect. Use when the user asks about "my downloads", "my revenue", "how is my app performing", "ASC data", "sales and trends", "my subscription numbers", "App Store Connect metrics", or wants to compare periods or top markets. For third-party app estimates, see app-analytics. For subscription analytics depth, see monetization-strategy.