skills/browser-probe/SKILL.md
Probe a URL with escalating headless browser configurations to detect CDN bot protection (Akamai, Cloudflare, DataDome, AWS WAF) and produce a browser-recipe.json that downstream playwright-cli consumers use to bypass blocking. Runs an automated escalation ladder: default headless → stealth script injection → system Chrome (TLS fingerprint fix) → persistent profile. Use BEFORE any playwright-cli interaction with an untrusted domain. Triggers on: browser probe, site blocked, headless blocked, CDN blocking, bot detection, browser recipe, can't load page, 403 error page, access denied.
npx skillsauth add catalan-adobe/skills browser-probeInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Detect CDN bot protection blocking headless Chrome and produce a browser recipe
for downstream playwright-cli consumers. Node 22+ required. No npm
dependencies.
Run this skill before any playwright-cli interaction with a domain you
haven't tested, or when a downstream script reports a blocked page. Common
triggers:
capture-snapshot.js produces empty/error snapshotsif [[ -n "${CLAUDE_SKILL_DIR:-}" ]]; then
PROBE_DIR="${CLAUDE_SKILL_DIR}/scripts"
else
PROBE_DIR="$(dirname "$(command -v browser-probe.js 2>/dev/null || \
find ~/.claude -path "*/browser-probe/scripts/browser-probe.js" \
-type f 2>/dev/null | head -1)")"
fi
node "$PROBE_DIR/browser-probe.js" "$URL" "$OUTPUT_DIR"
The script tries up to 5 browser configurations, stopping at the first success:
navigator.webdriver, plugins, languages)HeadlessChrome from HTTP UA header via --user-agent launch arg)--browser=chrome) + JS stealth + UA override (fixes TLS fingerprint detection)Output: $OUTPUT_DIR/probe-report.json
Load probe-report.json. Check firstSuccess:
Load stealth-config.md and match the
detectedSignals array against the Provider Signature Table.
Key interpretation rules:
cloudfront-block or stealth fails but stealth-ua succeeds →
CloudFront WAF UA-based blocking (matches HeadlessChrome in HTTP
User-Agent header). Common on pharma/enterprise sites. Simple fix,
no TLS concerns. stealth-ua is the minimum working config.cloudfront without cloudfront-block → CloudFront present but not
actively blocking. Default config may work.akamai-server or akamai-bot-manager → TLS fingerprint blocking.
System Chrome is the fix. Stealth + UA alone is insufficient.cloudflare-ray without cloudflare-challenge → Cloudflare present
but not actively blocking. Default config may work.cloudflare-challenge → Active JS challenge. System Chrome + stealth
datadome → Aggressive detection. System Chrome + stealth + UA required.aws-waf → Usually UA-based. Stealth + UA often sufficient.Write browser-recipe.json to $OUTPUT_DIR:
{
"url": "<probed URL>",
"generated": "<ISO timestamp>",
"cliConfig": {
"browser": {
"browserName": "chromium",
"launchOptions": { "channel": "<from firstSuccess step>" }
}
},
"stealthInitScript": "<full script from stealth-config.md if stealth was needed>",
"notes": "<1-2 sentence explanation of what was detected and why this config>"
}
Config mapping from firstSuccess:
| firstSuccess | cliConfig.launchOptions | stealthInitScript |
|---|---|---|
| default | {} (no channel, no args) | null (not needed) |
| stealth | {} (no channel, no args) | Full stealth script from reference |
| stealth-ua | { "args": ["--user-agent=<realistic UA>"] } | Full stealth script from reference |
| chrome | { "channel": "chrome", "args": ["--user-agent=<realistic UA>"] } | Full stealth script from reference |
| persistent | { "channel": "chrome", "args": ["--user-agent=<realistic UA>"] } | Full stealth script from reference |
If firstSuccess is persistent, add a "persistent": true field to the
recipe so consumers know to use --persistent.
If a configuration worked:
Browser probe complete for <url>.
Working config: <firstSuccess>
Detected: <detectedSignals or "no bot protection detected">
Recipe: <path to browser-recipe.json>
If all configurations failed:
Browser probe failed for <url>. No headless configuration could load the page.
Tried: default, stealth, stealth-ua, chrome, persistent
Detected signals: <detectedSignals>
Options:
1. Use --headed flag for manual browser interaction
2. Provide pre-captured data (DOM snapshot, screenshots) manually
3. Check if the URL requires authentication or VPN access
Do NOT produce a recipe when all steps fail. Do NOT silently continue with a broken configuration.
Any script using playwright-cli can consume browser-recipe.json:
cliConfig to a temp file (e.g., /tmp/probe-cli-config.json)stealthInitScript, write it to a temp file and add
it to the config's browser.initScript array (do NOT use
playwright-cli eval — eval only accepts pure expressions, not
multi-statement scripts)--config=/tmp/probe-cli-config.json to playwright-cli opengoto <url> and workflowIf recipe has "persistent": true, also pass --persistent to open.
tools
Reduce a webpage to a structural skeleton with semantic tokens. Two-phase pipeline: Phase 1 injects a browser script that tokenizes content ({TEXT}, {HEADING:n}, {IMAGE:WxH}, {CTA:label}, {LINK:label}, {INPUT:type}, {VIDEO}, {ICON}). Phase 2 applies LLM structural reasoning to collapse repeated patterns ({REPEAT:N}), remove decorative wrappers, strip utility classes, and produce skeleton.html + manifest.json. Use when migrating pages to EDS, analyzing page structure, extracting page blueprints, or preparing input for GenAI block generation. Triggers on: reduce page, page skeleton, page blueprint, extract structure, tokenize page, page reduction, structural skeleton, reduce URL.
tools
Capture a spatial hierarchy of rendered DOM elements from any webpage. Injects a pre-built script via playwright-cli that walks the DOM, detects layout grids, extracts backgrounds, prunes invisible nodes, promotes elements rendered outside their DOM parent (overlays, fixed navs, modals), and tags overlay nodes with occlusion metadata. Returns three outputs: LLM-friendly indented text, structured JSON tree, and a nodeMap mapping positional IDs to CSS selectors with background and overlay data. Use before page decomposition, overlay detection, brand extraction, or any workflow that needs structured page analysis. Triggers on: visual tree, capture tree, page structure, page hierarchy, DOM tree, capture visual, page analysis, extract tree.
tools
Summarize any video by analyzing both audio and visuals. Downloads via yt-dlp, extracts transcript (YouTube captions or Whisper), pulls scene-detected keyframes, and produces a multimodal summary with clickable timestamped YouTube links. Use this skill whenever the user wants to summarize a YouTube video, digest a talk or tutorial, get notes from a video, extract key points from a recording, or says things like "tl;dw", "summarize this video", "what's in this video", or pastes a YouTube URL and asks for a summary. Also triggers for non-YouTube URLs that yt-dlp supports.
development
Design and build web UIs with Adobe Spectrum 2 design system. Applies S2 layout principles, visual hierarchy, spacing, and component composition to produce accessible interfaces. Outputs vanilla CSS with Spectrum tokens (static pages) or Spectrum Web Components (interactive apps). Recommends tier based on complexity. Covers sp-theme setup, side-effect imports, overlay system, form patterns, --mod-* token customization, and 14 critical gotchas. Use for: spectrum 2 web, SWC, sp-button, sp-theme, build UI with spectrum, S2 layout, spectrum application, adobe design system, web component form, spectrum overlay.