.claude/skills/baoyu-url-to-markdown/SKILL.md
Fetch any URL and convert to markdown using Chrome CDP. Saves the rendered HTML snapshot alongside the markdown, uses an upgraded Defuddle pipeline with better web-component handling and YouTube transcript extraction, and automatically falls back to the pre-Defuddle HTML-to-Markdown pipeline when needed. If local browser capture fails entirely, it can fall back to the hosted defuddle.md API. Supports two modes - auto-capture on page load, or wait for user signal (for pages requiring login). Use when user wants to save a webpage as markdown.
npx skillsauth add wallacedobbs428/thecalltaker baoyu-url-to-markdownInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Fetches any URL via Chrome CDP, saves the rendered HTML snapshot, and converts it to clean markdown.
Important: All scripts are located in the scripts/ subdirectory of this skill.
Agent Execution Instructions:
{baseDir}{baseDir}/scripts/<script-name>.ts${BUN_X} runtime: if bun installed → bun; if npx available → npx -y bun; else suggest installing bun{baseDir} and ${BUN_X} in this document with actual valuesScript Reference:
| Script | Purpose |
|--------|---------|
| scripts/main.ts | CLI entry point for URL fetching |
| scripts/html-to-markdown.ts | Markdown conversion entry point and converter selection |
| scripts/defuddle-converter.ts | Defuddle-based conversion |
| scripts/legacy-converter.ts | Pre-Defuddle legacy extraction and markdown conversion |
| scripts/markdown-conversion-shared.ts | Shared metadata parsing and markdown document helpers |
Check EXTEND.md existence (priority order):
# macOS, Linux, WSL, Git Bash
test -f .baoyu-skills/baoyu-url-to-markdown/EXTEND.md && echo "project"
test -f "${XDG_CONFIG_HOME:-$HOME/.config}/baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "xdg"
test -f "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "user"
# PowerShell (Windows)
if (Test-Path .baoyu-skills/baoyu-url-to-markdown/EXTEND.md) { "project" }
$xdg = if ($env:XDG_CONFIG_HOME) { $env:XDG_CONFIG_HOME } else { "$HOME/.config" }
if (Test-Path "$xdg/baoyu-skills/baoyu-url-to-markdown/EXTEND.md") { "xdg" }
if (Test-Path "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md") { "user" }
┌────────────────────────────────────────────────────────┬───────────────────┐ │ Path │ Location │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ .baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ Project directory │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ $HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ User home │ └────────────────────────────────────────────────────────┴───────────────────┘
┌───────────┬───────────────────────────────────────────────────────────────────────────┐ │ Result │ Action │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Found │ Read, parse, apply settings │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Not found │ MUST run first-time setup (see below) — do NOT silently create defaults │ └───────────┴───────────────────────────────────────────────────────────────────────────┘
EXTEND.md Supports: Download media by default | Default output directory | Default capture mode | Timeout settings
CRITICAL: When EXTEND.md is not found, you MUST use AskUserQuestion to ask the user for their preferences before creating EXTEND.md. NEVER create EXTEND.md with defaults without asking. This is a BLOCKING operation — do NOT proceed with any conversion until setup is complete.
Use AskUserQuestion with ALL questions in ONE call:
Question 1 — header: "Media", question: "How to handle images and videos in pages?"
Question 2 — header: "Output", question: "Default output directory?"
Question 3 — header: "Save", question: "Where to save preferences?"
After user answers, create EXTEND.md at the chosen location, confirm "Preferences saved to [path]", then continue.
Full reference: references/config/first-time-setup.md
| Key | Default | Values | Description |
|-----|---------|--------|-------------|
| download_media | ask | ask / 1 / 0 | ask = prompt each time, 1 = always download, 0 = never |
| default_output_dir | empty | path or empty | Default output directory (empty = ./url-to-markdown/) |
EXTEND.md → CLI mapping:
| EXTEND.md key | CLI argument | Notes |
|---------------|-------------|-------|
| download_media: 1 | --download-media | |
| default_output_dir: ./posts/ | --output-dir ./posts/ | Directory path. Do NOT pass to -o (which expects a file path) |
Value priority:
--download-media, -o, --output-dir)-captured.html filedefuddle.md/<url> and still save markdown# Auto mode (default) - capture when page loads
${BUN_X} {baseDir}/scripts/main.ts <url>
# Wait mode - wait for user signal before capture
${BUN_X} {baseDir}/scripts/main.ts <url> --wait
# Save to specific file
${BUN_X} {baseDir}/scripts/main.ts <url> -o output.md
# Save to a custom output directory (auto-generates filename)
${BUN_X} {baseDir}/scripts/main.ts <url> --output-dir ./posts/
# Download images and videos to local directories
${BUN_X} {baseDir}/scripts/main.ts <url> --download-media
| Option | Description |
|--------|-------------|
| <url> | URL to fetch |
| -o <path> | Output file path — must be a file path, not directory (default: auto-generated) |
| --output-dir <dir> | Base output directory — auto-generates {dir}/{domain}/{slug}.md (default: ./url-to-markdown/) |
| --wait | Wait for user signal before capturing |
| --timeout <ms> | Page load timeout (default: 30000) |
| --download-media | Download image/video assets to local imgs/ and videos/, and rewrite markdown links to local relative paths |
| Mode | Behavior | Use When |
|------|----------|----------|
| Auto (default) | Capture on network idle | Public pages, static content |
| Wait (--wait) | User signals when ready | Login-required, lazy loading, paywalls |
Wait mode workflow:
--wait → script outputs "Press Enter when ready"Each run saves two files side by side:
url, title, description, author, published, optional coverImage, and captured_at, followed by converted markdown content*-captured.html, containing the rendered page HTML captured from ChromeWhen Defuddle or page metadata provides a language hint, the markdown front matter also includes language.
The HTML snapshot is saved before any markdown media localization, so it stays a faithful capture of the page DOM used for conversion.
If the hosted defuddle.md API fallback is used, markdown is still saved, but there is no local -captured.html snapshot for that run.
Default: url-to-markdown/<domain>/<slug>.md
With --output-dir ./posts/: ./posts/<domain>/<slug>.md
HTML snapshot path uses the same basename:
url-to-markdown/<domain>/<slug>-captured.html
./posts/<domain>/<slug>-captured.html
<slug>: From page title or URL path (kebab-case, 2-6 words)
Conflict resolution: Append timestamp <slug>-YYYYMMDD-HHMMSS.md
When --download-media is enabled:
imgs/ next to the markdown filevideos/ next to the markdown fileConversion order:
https://defuddle.md/<url> API and save its markdown output directlyCLI output will show:
Converter: defuddle when Defuddle succeedsConverter: legacy:... plus Fallback used: ... when fallback was neededConverter: defuddle-api when local browser capture failed and the hosted API was used insteadBased on download_media setting in EXTEND.md:
| Setting | Behavior |
|---------|----------|
| 1 (always) | Run script with --download-media flag |
| 0 (never) | Run script without --download-media flag |
| ask (default) | Follow the ask-each-time flow below |
--download-media → markdown savedhttps:// in image/video links)AskUserQuestion:
--download-media (overwrites markdown with localized links)| Variable | Description |
|----------|-------------|
| URL_CHROME_PATH | Custom Chrome executable path |
| URL_DATA_DIR | Custom data directory |
| URL_CHROME_PROFILE_DIR | Custom Chrome profile directory |
Troubleshooting: Chrome not found → set URL_CHROME_PATH. Timeout → increase --timeout. Complex pages → try --wait mode. If markdown quality is poor, inspect the saved -captured.html and check whether the run logged a legacy fallback.
--wait and capture after the watch page is fully hydrated.https://defuddle.md/<url>. In shell form: curl https://defuddle.md/stephango.comCustom configurations via EXTEND.md. See Preferences section for paths and supported options.
documentation
Agentic memory system for writers - track characters, relationships, scenes, and themes
tools
Automate repetitive development tasks and workflows. Use when creating build scripts, automating deployments, or setting up development workflows. Handles npm scripts, Makefile, GitHub Actions workflows, and task automation.
development
Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices". Fetches latest Vercel guidelines and checks files against all rules.
development
Implement web accessibility (a11y) standards following WCAG 2.1 guidelines. Use when building accessible UIs, fixing accessibility issues, or ensuring compliance with disability standards. Handles ARIA attributes, keyboard navigation, screen readers, semantic HTML, and accessibility testing.