skills/markdown-fetch/SKILL.md
Fetch and extract web content as clean Markdown when provided with URLs. Use this skill whenever a user provides a URL (http/https link) that needs to be read, analyzed, summarized, or extracted. Converts web pages to Markdown with 80% fewer tokens than raw HTML. Handles all content types including JS-heavy sites, documentation, articles, and blog posts. Supports three conversion methods (auto, AI, browser rendering). Always use this instead of web_fetch when working with URLs - it's more efficient and provides cleaner output.
npx skillsauth add ckorhonen/claude-skills markdown-fetchInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Efficiently fetch web content as clean Markdown using the markdown.new service.
This skill should trigger automatically when:
# Fetch any URL
scripts/fetch.sh "https://example.com"
# Use browser rendering for JS-heavy sites
scripts/fetch.sh "https://example.com" --method browser
# Retain images in output
scripts/fetch.sh "https://example.com" --retain-images
When a user says:
auto (default) - Try Markdown-first, fall back to AI or browser as needed
ai - Use Cloudflare Workers AI for conversion
browser - Full browser rendering for JS-heavy content
--method <auto|ai|browser> - Conversion method
--retain-images - Keep image references in output
--output <file> - Save to file instead of stdout
Returns clean Markdown with metadata:
---
title: Page Title
url: https://example.com
method: auto
duration_ms: 725
fetched_at: 2026-03-07T12:00:00Z
---
# Content here...
The service handles:
Symptom: Empty output, truncated content, or "Error: No content in response"
Root Causes:
--method browser)Debugging & Workarounds:
# 1. Check if the page loads at all
curl -s "https://example.com" | head -c 500
# 2. Try browser rendering (handles JS-heavy sites)
scripts/fetch.sh "https://example.com" --method browser
# 3. Check what the service is getting
scripts/fetch.sh "https://example.com" 2>&1 | head -20
# 4. If still empty, the page may use iframes or require auth
Best Practice: Always use --method browser for single-page apps, social platforms, or content-heavy dashboards.
Symptom: HTTP 429 (Too Many Requests), 403 (Forbidden), or connection timeouts
Root Causes:
Debugging & Workarounds:
# 1. Check HTTP status code
curl -s -w "\n%{http_code}\n" -o /dev/null "https://example.com"
# 2. Test with curl directly (bypasses service)
curl -H "User-Agent: Mozilla/5.0" "https://example.com" | head -c 500
# 3. Add delays between requests (if fetching multiple URLs)
for url in url1 url2 url3; do
scripts/fetch.sh "$url"
sleep 5 # Wait 5 seconds between requests
done
# 4. If blocked, try from different IP or use residential proxy
# For markdown.new service: no built-in proxy rotation; consider mirror services
Best Practice: Respect site robots.txt and use sensible request delays when processing multiple pages. Some sites require retry-after headers.
Symptom: Garbled text, mojibake (weird characters), missing non-ASCII content (emojis, accents, CJK)
Root Causes:
Debugging & Workarounds:
# 1. Check the page's declared charset
curl -sI "https://example.com" | grep -i "charset"
# 2. Save raw HTML and inspect encoding
curl -s "https://example.com" > raw.html
file raw.html
head -c 200 raw.html | od -c # Show raw bytes
# 3. Try the markdown.new service and check output
scripts/fetch.sh "https://example.com" | file - # Check output encoding
# 4. If output is garbled, convert explicitly
scripts/fetch.sh "https://example.com" | iconv -f UTF-8 -t UTF-8 > fixed.md
Best Practice: Always verify output is valid UTF-8. If encountering garbled content, the page likely has a charset mismatch — notify the site owner or use --method browser (more robust).
Symptom: Page returns but content is missing, loads as empty, or shows loading spinners instead of data
Root Causes:
auto method only fetches HTML skeleton without executing JSDebugging & Workarounds:
# 1. Fetch with browser rendering (executes JS)
scripts/fetch.sh "https://example.com" --method browser
# 2. Check what the auto method returns
scripts/fetch.sh "https://example.com" --method auto
# 3. Inspect page source to confirm JS-rendering
curl -s "https://example.com" | grep -i "react\|vue\|angular\|<div id=\"app\""
# 4. If content still missing, page may require interaction
# (e.g., clicking buttons, scrolling) — no workaround in current service
Best Practice: Use --method browser by default for modern web apps, news sites, and any site built in the last 5 years. The auto method is faster but misses JS-rendered content.
Symptom: Returns login page instead of content, shows "Subscribe to read more", or HTTP 401/403
Root Causes:
Debugging & Workarounds:
# 1. Check if curl can access without auth
curl -s "https://example.com/article" | head -c 300
# 2. Identify authentication method
curl -sI "https://example.com" | grep -i "auth\|set-cookie\|www-authenticate"
# 3. Try public/preview version if available
# For paywalled sites, look for preview URLs or RSS feeds
curl -s "https://example.com/rss" | head -c 500
# 4. Check for paywall indicators
curl -s "https://example.com/article" | grep -i "paywall\|subscribe\|login required"
# 5. If member-only, request your credentials in the conversation
# (Don't embed in scripts — use prompt)
Best Practice: This skill cannot bypass authentication or paywalls by design. For protected content:
Symptom: HTTP error codes (5xx), timeout, or malformed JSON responses
Root Causes:
Debugging & Workarounds:
# 1. Check service health
curl -s "https://markdown.new/health" 2>&1 | head
# 2. Verify request format
jq -n --arg url "https://example.com" --arg method "auto" \
'{url: $url, method: $method, retain_images: false}' | jq .
# 3. Retry with exponential backoff
for attempt in {1..3}; do
result=$(scripts/fetch.sh "https://example.com" 2>&1)
if [[ $? -eq 0 ]]; then
echo "$result"
break
fi
sleep $((2 ** attempt)) # 2s, 4s, 8s
done
# 4. If service is down, no workaround available
# — try again later or use alternative (curl, browser, etc.)
Best Practice: The service is stateless and generally reliable. If failures persist, fall back to curl directly or ask the user for an alternative source.
No content returned?
├─ Check if site requires JavaScript
│ └─ YES → Use --method browser
│ └─ NO → Continue
├─ Check if site requires authentication
│ └─ YES → Use authenticated request (see Paywall section)
│ └─ NO → Continue
├─ Check for encoding issues
│ └─ YES → Garbled text → likely site charset mismatch
│ └─ NO → Continue
└─ Service may be down → Wait and retry
Getting rate-limited (HTTP 429)?
├─ Add delays between requests (5-10 seconds)
└─ If persistent, try different time or request fewer pages
Truncated or broken output?
└─ Try --method browser (more robust but slower)
Can't access paywalled content?
└─ No workaround — try public preview, RSS, or archive services
documentation
Create or expand an Idea.md / IDEA.md file from a rough description, existing repo, conversation history, notes, or other early-stage product inputs. Use when the user asks to "write an Idea.md", "turn this into an idea file", "capture this product idea", "expand this concept", or wants a repo-grounded concept brief before validation, PRD, or implementation work.
development
Write structured implementation plans from specs or requirements before touching code. Use when given a spec, requirements doc, or feature description, when user says "plan this out", "write a plan for", "how should we implement", or before starting any multi-step coding task.
testing
Expert guidance for video editing with ffmpeg, encoding best practices, and quality optimization. Use when working with video files, transcoding, remuxing, encoding settings, color spaces, or troubleshooting video quality issues.
development
Opinionated constraints for building better interfaces with agents. Use when building UI components, implementing animations, designing layouts, reviewing frontend accessibility, or working with Tailwind CSS, motion/react, or accessible primitives like Radix/Base UI.