skills/cf-browser/SKILL.md
Browse and scrape websites using Cloudflare's Browser Rendering REST API. Use when the agent needs to fetch rendered web content, extract structured data from pages, take screenshots, or scrape specific elements via CSS selectors. Triggers on tasks like "scrape this site", "get listings from this page", "extract data from this URL", "take a screenshot of this page", "browse this website", or any task requiring headless browser access to read, crawl, or extract information from live web pages. Also use when WebFetch is insufficient (JS-heavy sites, SPAs, pages requiring cookies, or when structured extraction is needed).
npx skillsauth add rarestg/rarestg-skills cf-browserInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Browse and scrape the web via Cloudflare's Browser Rendering REST API. Every call is a single POST request — no browser setup, no Puppeteer scripts.
Requires two env vars (confirm they're set before making calls):
CF_ACCOUNT_ID — Cloudflare account IDCF_API_TOKEN — API token with Browser Rendering - Edit permissionUse cfbr.sh for all API calls. It handles auth headers and the base URL:
# JSON endpoints
cfbr.sh <endpoint> '<json_body>'
# Screenshot (binary) — optional third arg for output filename
cfbr.sh screenshot '<json_body>' output.png
| Goal | Endpoint | When to use |
|---|---|---|
| Read page content for analysis | markdown | Default choice — clean, token-efficient |
| Extract specific elements | scrape | Know the CSS selectors for what you need |
| Extract structured data with AI | json | Need typed objects, don't know exact selectors |
| Get full rendered DOM | content | Need raw HTML for parsing or debugging |
| Discover pages / crawl | links | Building a sitemap or finding subpages |
| Visual inspection | screenshot | Need to see the page layout or debug visually |
| DOM + visual in one shot | snapshot | Need both HTML and a screenshot |
For full endpoint details and parameters, see api.md.
Follow this sequence when scraping a site for structured data (e.g. rental listings, product catalogs, job boards):
Start with markdown to see what content is on the page and how it's structured:
cfbr.sh markdown '{"url":"https://target-site.com/listings", "gotoOptions":{"waitUntil":"networkidle0"}}'
If the page is an SPA or loads content dynamically, networkidle0 ensures JS finishes executing. If you know a specific element that signals content is ready, use waitForSelector instead — it's faster:
{"url":"...", "waitForSelector": ".listing-card"}
From the markdown/HTML, identify repeating patterns (listing cards, table rows, etc.) and their CSS selectors. If unclear from markdown alone, use screenshot to visually inspect:
cfbr.sh screenshot '{"url":"https://target-site.com/listings", "screenshotOptions":{"fullPage":true}, "gotoOptions":{"waitUntil":"networkidle0"}}' listings.png
Option A: CSS selectors (when you know the DOM structure)
cfbr.sh scrape '{
"url": "https://target-site.com/listings",
"gotoOptions": {"waitUntil": "networkidle0"},
"elements": [
{"selector": ".listing-card .title"},
{"selector": ".listing-card .price"},
{"selector": ".listing-card .address"},
{"selector": ".listing-card a"}
]
}'
The scrape endpoint returns text, html, attributes (including href), and position/dimensions for each match. Correlate results across selectors by index (first title matches first price, etc.).
Option B: AI extraction (when structure is complex or unknown)
cfbr.sh json '{
"url": "https://target-site.com/listings",
"gotoOptions": {"waitUntil": "networkidle0"},
"prompt": "Extract all rental listings with title, price, address, bedrooms, and link",
"response_format": {
"type": "json_schema",
"schema": {
"type": "object",
"properties": {
"listings": {
"type": "array",
"items": {
"type": "object",
"properties": {
"title": {"type": "string"},
"price": {"type": "string"},
"address": {"type": "string"},
"bedrooms": {"type": "string"},
"url": {"type": "string"}
},
"required": ["title", "price"]
}
}
}
}
}
}'
Prefer scrape when selectors are clear — it's deterministic and free. Use json when the page structure is messy or you need semantic interpretation (incurs Workers AI charges).
Use links to find pagination URLs:
cfbr.sh links '{"url":"https://target-site.com/listings"}'
Look for ?page=2, next, or load-more patterns. Repeat extraction for each page.
Infinite-scroll pages are a limitation — the API is stateless (one request = one browser session), so there's no way to scroll, wait for new content to load, and then extract in a single call. For these pages, look for an underlying API or URL parameters (e.g. ?page=2, ?offset=20) that serve paginated data directly.
SPA / empty results — Add "gotoOptions": {"waitUntil": "networkidle0"} or "waitForSelector": "<selector>".
Slow pages — Increase timeout: "gotoOptions": {"timeout": 60000}.
Heavy pages — Strip unnecessary resources:
{"rejectResourceTypes": ["image", "stylesheet", "font", "media"]}
Auth-gated pages — Pass session cookies:
{"cookies": [{"name": "session", "value": "abc123", "domain": "target-site.com", "path": "/"}]}
Bot detection — Cloudflare Browser Rendering is always identified as a bot. The userAgent field changes what the site sees but will not bypass bot protection. If a site blocks the request, there is no workaround via this API.
markdown is the best default for content extraction — it's clean, compact, and LLM-ready.networkidle0 or waitForSelector on any modern site. Without it you'll get incomplete content.rejectResourceTypes dramatically speeds up text-only operations. Always strip images/fonts/stylesheets when you only need text.scrape results are ordered by DOM position — correlate across selectors by array index.tools
Break large code changes into small, stacked pull requests using vanilla git and the gh CLI. Auto-trigger when implementing a feature or change that spans multiple logical steps, touches several files, or will exceed ~200 changed lines. Also trigger on "stack PRs", "break this into smaller PRs", "stacked diffs", or "create a PR stack". Do NOT trigger for single-file fixes, small bug fixes, or changes under ~200 lines that are a single logical unit.
development
Export a PR's clean inline review comments, CodeRabbit outside-diff comments, and CodeRabbit nitpicks into local files, then triage review feedback through a stack-aware orchestrator workflow with durable reply drafts for follow-up PRs. Use when given a GitHub pull request URL and asked to work through review comments without relying on noisy raw API blobs.
development
Run reusable Graphify-led architecture analysis for codebases using semantic graphs, optional subagent extraction, graph synthesis, source-search validation, graph-shape review, and follow-up refactor planning. Use when asked to analyze repo architecture, god nodes, surprising edges, topology, module boundaries, or graph-derived cleanup/refactor opportunities.
testing
Run a meaningful coding ticket through a delegated delivery workflow: tighten the ticket, assign one ticket owner, delegate implementation, get an independent review, scan for high-value simplification, validate the result, and return a compact outcome packet. Use when the user wants structured agent execution with clear scope, ownership, and review rather than a single-pass implementation. Triggers on: "delegate this ticket", "use a sub-PM", "run this through worker and reviewer", "own this ticket end to end", "send this for independent review", or "close this ticket out with review". Skip trivial fixes and tasks that are still too vague to delegate.