skills/reduce-page/SKILL.md
Reduce a webpage to a structural skeleton with semantic tokens. Two-phase pipeline: Phase 1 injects a browser script that tokenizes content ({TEXT}, {HEADING:n}, {IMAGE:WxH}, {CTA:label}, {LINK:label}, {INPUT:type}, {VIDEO}, {ICON}). Phase 2 applies LLM structural reasoning to collapse repeated patterns ({REPEAT:N}), remove decorative wrappers, strip utility classes, and produce skeleton.html + manifest.json. Use when migrating pages to EDS, analyzing page structure, extracting page blueprints, or preparing input for GenAI block generation. Triggers on: reduce page, page skeleton, page blueprint, extract structure, tokenize page, page reduction, structural skeleton, reduce URL.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

```shell
npx skillsauth add catalan-adobe/skills reduce-page
```
Reduce any webpage to a minimal structural skeleton by combining browser-based content tokenization (Phase 1) with LLM structural reasoning (Phase 2).
Phase 1 (browser script): Injects the blueprint detector + tokenizer
into the live page. Detects sections, cleans the DOM (removes scripts,
invisible elements, styling tags, comments, tracking attributes), then
replaces content with tokens. Output: JSON with tokenizedHtml per section.
Phase 2 (you, the agent): Applies structural reasoning to the tokenized HTML — collapses repeated patterns, removes decorative wrappers, strips utility CSS classes, and generates the final skeleton + manifest.
/reduce-page <URL>
Optional flags the user may provide:
- --phase1-only: stop after Phase 1, output raw tokenized JSON
- --output <dir>: write files to a specific directory (default: cwd)

Step 0: locate the bundle script.

```bash
if [[ -n "${CLAUDE_SKILL_DIR:-}" ]]; then
  BUNDLE="${CLAUDE_SKILL_DIR}/scripts/reduce-page-bundle.js"
else
  BUNDLE="$(find ~/.claude \
    -path "*/reduce-page/scripts/reduce-page-bundle.js" \
    -type f 2>/dev/null | head -1)"
fi
```
Verify the path is non-empty before continuing. If missing, report an error: the skill's scripts directory needs the combined bundle.
Use the browser-universal skill to detect what browser tool is
available (playwright-cli, cmux-browser, or CDP). If browser-universal
is not available, fall back to playwright-cli directly.
If the page-prep skill is available, invoke it to dismiss cookie
banners, GDPR consent modals, and other overlays. Then reset fixed and sticky elements to relative positioning:

```javascript
[...document.body.querySelectorAll('*')].forEach(el => {
  const s = window.getComputedStyle(el);
  if (s.position === 'fixed' || s.position === 'sticky')
    el.style.position = 'relative';
});
```
Inject the bundle script found in Step 0 into the page. The bundle contains both the blueprint detector and the reducer.
After injection, execute in the page context:
```javascript
// Run detection
await window.xp.detectSections(document.body, window, {
  autoDetect: true,
  highlightBoxes: false,
  highlightSections: false,
});
// Run Phase 1 reduction (on clones — non-destructive)
const result = window.__reduceForSkill(document.body, window);
JSON.stringify(result);
```
Parse the returned JSON. This is the Phase 1 output:
```json
{
  "url": "https://example.com",
  "title": "Page Title",
  "viewport": { "width": 1280 },
  "templateHash": "...",
  "sections": [
    {
      "index": 0,
      "sectionType": "hero",
      "xpath": "/html/body/main/section[1]",
      "xpathWithDetails": "//section[@class='hero']",
      "tokenizedHtml": "<section class='hero'>...</section>",
      "layout": { "numCols": 2, "numRows": 1 },
      "features": ["hasHeading", "hasBackgroundImage", "hasCTA"],
      "section": null
    }
  ]
}
```
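Before moving on to Phase 2, it can help to sanity-check the parsed output. The following is a minimal sketch, assuming the JSON shape shown above; `summarizePhase1` is a hypothetical helper name, not something the bundle provides.

```javascript
// Sketch only: summarizePhase1 is a hypothetical helper, not part of the bundle.
// It parses the Phase 1 JSON string and checks the fields Phase 2 relies on.
function summarizePhase1(jsonText) {
  const output = JSON.parse(jsonText);
  if (!Array.isArray(output.sections)) {
    throw new Error('Phase 1 output has no "sections" array');
  }
  return output.sections.map((section) => {
    if (typeof section.tokenizedHtml !== 'string') {
      throw new Error(`Section ${section.index} is missing tokenizedHtml`);
    }
    return {
      index: section.index,
      sectionType: section.sectionType,
      xpath: section.xpath,
      htmlLength: section.tokenizedHtml.length,
    };
  });
}

// Example input mirroring the structure shown above.
const sample = JSON.stringify({
  url: 'https://example.com',
  sections: [
    {
      index: 0,
      sectionType: 'hero',
      xpath: '/html/body/main/section[1]',
      tokenizedHtml: "<section class='hero'>...</section>",
    },
  ],
});
console.log(summarizePhase1(sample));
```

Failing early on a missing `sections` array or `tokenizedHtml` field makes a failed injection easier to spot than an empty skeleton would.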
If --phase1-only was requested, write this JSON to
phase1-output.json and stop.
Read the Phase 2 rules and apply them to
each section's tokenizedHtml.
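Process each section:
- Collapse repeated patterns into {REPEAT:N}.
- Remove decorative wrappers.
- Strip utility CSS classes and attributes such as data-analytics-*, etc.
- Summarize forms as {FORM:N-fields} and navigation as {NAV:N-items}.
- Refine the section type where the structure allows it (e.g. unknown with tab panels → tabs).

Then generate the output files. skeleton.html contains all sections with comment separators: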
```html
<!-- section:0 type:hero xpath:/html/body/main/section[1] -->
<section class="hero">
  <h1>{HEADING:1}</h1>
  <p>{TEXT}</p>
  {CTA:Get Started}
  {IMAGE:1200x600}
</section>
<!-- section:1 type:cards xpath:/html/body/main/div[2] -->
<div class="cards-container">
  <div class="card">
    {IMAGE:400x300}
    <h3>{HEADING:3}</h3>
    <p>{TEXT}</p>
    <a>{LINK:Read more}</a>
  </div>
  <div class="card">
    {IMAGE:400x300}
    <h3>{HEADING:3}</h3>
    <p>{TEXT}</p>
    <a>{LINK:Read more}</a>
  </div>
  {REPEAT:4}
</div>
```
Pretty-print with 2-space indentation.
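The cards section above shows a collapsed pattern. In this skill the collapsing is done by structural reasoning, not by a fixed algorithm, but the idea can be sketched mechanically. The sketch below makes two assumptions that the Phase 2 rules may define differently: that two exemplars are kept, and that N in {REPEAT:N} is the total number of instances. `collapseRepeats` is a hypothetical helper name.

```javascript
// Sketch only: collapseRepeats is a hypothetical helper illustrating the idea.
// Assumption: N in {REPEAT:N} is the total number of instances in the run.
// Assumption: siblings are compared as exact strings after tokenization.
function collapseRepeats(siblings, keep = 2) {
  const out = [];
  let i = 0;
  while (i < siblings.length) {
    // Find the end of the run of siblings identical to siblings[i].
    let j = i + 1;
    while (j < siblings.length && siblings[j] === siblings[i]) j++;
    const runLength = j - i;
    if (runLength > keep) {
      // Keep a few exemplars, then summarize the whole run with one token.
      for (let k = 0; k < keep; k++) out.push(siblings[i]);
      out.push(`{REPEAT:${runLength}}`);
    } else {
      // Short runs are copied through unchanged.
      for (let k = i; k < j; k++) out.push(siblings[k]);
    }
    i = j;
  }
  return out;
}

const card =
  '<div class="card">{IMAGE:400x300}<h3>{HEADING:3}</h3><p>{TEXT}</p><a>{LINK:Read more}</a></div>';
const collapsed = collapseRepeats(['<h2>{HEADING:2}</h2>', card, card, card, card]);
console.log(collapsed.join('\n'));
```

Exact string comparison only works here because tokenization has already replaced the per-card content. Cards whose labels or image sizes differ would need the looser structural matching that Phase 2 applies.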
manifest.json — structured metadata per section. See Phase 2 rules for the full schema.
Write both files to the output directory.
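As a rough illustration of the skeleton.html layout, the sketch below joins processed sections with the comment separator format used in the example. `buildSkeleton` and the `html` field are hypothetical names; writing the result to disk and the 2-space pretty-printing are left out.

```javascript
// Sketch only: buildSkeleton is a hypothetical helper. `html` stands for a
// section's markup after Phase 2 processing; it is not a Phase 1 field.
function buildSkeleton(sections) {
  return sections
    .map(
      (s) =>
        `<!-- section:${s.index} type:${s.sectionType} xpath:${s.xpath} -->\n${s.html}`
    )
    .join('\n');
}

const skeleton = buildSkeleton([
  {
    index: 0,
    sectionType: 'hero',
    xpath: '/html/body/main/section[1]',
    html: '<section class="hero">\n  <h1>{HEADING:1}</h1>\n</section>',
  },
  {
    index: 1,
    sectionType: 'cards',
    xpath: '/html/body/main/div[2]',
    html: '<div class="cards-container">\n  {REPEAT:4}\n</div>',
  },
]);
console.log(skeleton);
```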
Print:
These tokens are produced by the browser script (Phase 1):
| Token | Source Signal |
|-------|-------------|
| {TEXT} | Non-empty text node |
| {HEADING:n} | <h1>-<h6> tag |
| {IMAGE:WxH} | <img> with both dimensions > 64px |
| {ICON} | <img>/<svg> with either dimension ≤ 64px |
| {VIDEO} | <video> or iframe with video domain src |
| {CTA:label} | <a> with styled background/border |
| {LINK:label} | <a> with href (plain style) |
| {INPUT:type} | <input> or <textarea> |
| {SELECT:N} | <select> with N options |
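The size rule that separates {IMAGE:WxH} from {ICON}, and the heading rule, can be illustrated with a short sketch. It mirrors the table only; the actual tokenizer lives in the bundle and may differ in details such as rounding. `imageToken` and `headingToken` are hypothetical helper names.

```javascript
// Sketch only: these helpers mirror the table's rules and are hypothetical;
// the real tokenizer is part of the bundle.
const ICON_MAX_PX = 64;

function imageToken(width, height) {
  // Either dimension at or below the threshold makes it an icon.
  if (width <= ICON_MAX_PX || height <= ICON_MAX_PX) return '{ICON}';
  return `{IMAGE:${width}x${height}}`;
}

function headingToken(tagName) {
  // Matches h1 through h6, case-insensitively; anything else is not a heading.
  const match = /^h([1-6])$/i.exec(tagName);
  return match ? `{HEADING:${match[1]}}` : null;
}

console.log(imageToken(1200, 600)); // {IMAGE:1200x600}
console.log(imageToken(24, 24));    // {ICON}
console.log(imageToken(400, 64));   // {ICON}
console.log(headingToken('H3'));    // {HEADING:3}
```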
- playwright-cli (preferred), cmux-browser, or CDP
- browser-universal: browser layer detection
- page-prep: overlay dismissal

The bundle at scripts/reduce-page-bundle.js is built from the
site-transfer-blueprint-detector
repo. To update:
```bash
cd <detector-repo>
npm run build # builds dist/detect.js
npm run build:skill # builds dist/reduce-for-skill.js
cat dist/detect.js dist/reduce-for-skill.js > <skills-repo>/skills/reduce-page/scripts/reduce-page-bundle.js
```