skills/bun-guides-html-rewriter-extract-links/SKILL.md
Extract links from a webpage using HTMLRewriter
Bun's HTMLRewriter API can efficiently extract links from HTML content. You register handlers with chained .on() calls, each pairing a CSS selector with callbacks for the elements, text, and attributes you want to process. The example below collects every link on a webpage. You can pass .transform a Response, Blob, or string.
async function extractLinks(url: string) {
  const links = new Set<string>();
  const response = await fetch(url);

  const rewriter = new HTMLRewriter().on("a[href]", {
    element(el) {
      const href = el.getAttribute("href");
      if (href) {
        links.add(href);
      }
    },
  });

  // Wait for the response to be processed
  await rewriter.transform(response).blob();
  console.log([...links]); // ["https://bun.com", "/docs", ...]
}

// Extract all links from the Bun website
await extractLinks("https://bun.com");
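Because the handlers do not depend on where the HTML comes from, the same pattern can be tried without a network request. The following is a minimal sketch rather than part of the original guide: extractLinksFromHTML is a hypothetical helper name, the sample markup is made up, and the code assumes it runs under Bun, where HTMLRewriter is a global. It wraps the string in a Response so the handlers run while the body is read.

```typescript
// Hypothetical helper (not a Bun API): collect href values from an HTML string.
// Assumes the Bun runtime, where HTMLRewriter is available as a global.
async function extractLinksFromHTML(html: string): Promise<string[]> {
  const links = new Set<string>();

  const rewriter = new HTMLRewriter().on("a[href]", {
    element(el) {
      const href = el.getAttribute("href");
      if (href) links.add(href);
    },
  });

  // Handlers fire as the body is consumed, so read it to completion.
  await rewriter.transform(new Response(html)).text();
  return [...links];
}

// Made-up sample markup for illustration
const sampleHTML = `
  <a href="/docs">Docs</a>
  <a href="https://bun.com/blog">Blog</a>
  <a href="/docs">Docs again</a>
  <a name="no-href">Skipped</a>
`;

console.log(await extractLinksFromHTML(sampleHTML));
```

The Set collapses the repeated /docs link into one entry, and the a[href] selector never matches the anchor that has no href attribute.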
When scraping websites, you often want to convert relative URLs (like /docs) to absolute URLs. Here's how to handle URL resolution:
async function extractLinksFromURL(url: string) {
  const response = await fetch(url);
  const links = new Set<string>();

  const rewriter = new HTMLRewriter().on("a[href]", {
    element(el) {
      const href = el.getAttribute("href");
      if (href) {
        // Convert relative URLs to absolute
        try {
          const absoluteURL = new URL(href, url).href;
          links.add(absoluteURL);
        } catch {
          // Keep the raw value if it cannot be parsed as a URL
          links.add(href);
        }
      }
    },
  });

  // Wait for the response to be processed
  await rewriter.transform(response).blob();
  return [...links];
}

const websiteLinks = await extractLinksFromURL("https://example.com");
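To make the resolution step concrete, here is a small sketch of how the standard URL constructor treats different kinds of href values against a base. The helper name resolveLink and the sample URLs are assumptions chosen for illustration; unlike the function above, this variant returns null for unparseable input instead of keeping the raw string.

```typescript
// Hypothetical helper: resolve an href against the page URL.
// Returns null when the value cannot be parsed as a URL.
function resolveLink(href: string, base: string): string | null {
  try {
    return new URL(href, base).href;
  } catch {
    return null;
  }
}

// Made-up base URL and sample hrefs
const base = "https://example.com/guides/intro";

console.log(resolveLink("/docs", base)); // "https://example.com/docs"
console.log(resolveLink("install", base)); // "https://example.com/guides/install"
console.log(resolveLink("../api", base)); // "https://example.com/api"
console.log(resolveLink("#top", base)); // "https://example.com/guides/intro#top"
console.log(resolveLink("https://bun.com", base)); // "https://bun.com/"
console.log(resolveLink("http://", base)); // null (no host, so parsing fails)
```

Values that are already absolute, including non-HTTP schemes such as mailto:, pass through unchanged, so a scraper that only wants web pages may still need to filter on the protocol of the resolved URL.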
See Docs > API > HTMLRewriter for complete documentation on HTML transformation with Bun.