skills/web-scraper/SKILL.md
Modern web scraping with structured data extraction. Fetch web pages, extract content using CSS selectors, parse structured data (JSON-LD, Open Graph, meta tags), and handle pagination.
npx skillsauth add sahiixx/moltworker web-scraper
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Modern web scraping with intelligent content extraction.
node /path/to/skills/web-scraper/scripts/fetch.js https://example.com
node /path/to/skills/web-scraper/scripts/extract.js https://example.com --selector "h1,h2,p"
node /path/to/skills/web-scraper/scripts/metadata.js https://example.com
Fetch web page content with smart extraction.
Usage:
node fetch.js <url> [OPTIONS]
Options:
--output <fmt> - Output format: text, html, markdown (default: text)
--timeout <ms> - Request timeout (default: 30000)
--user-agent <ua> - Custom User-Agent string
--headers <json> - Custom headers as JSON
--follow - Follow redirects (default: true)
Extract specific elements using CSS selectors.
Usage:
node extract.js <url> --selector <css> [OPTIONS]
Options:
--selector <css> - CSS selector (required)
--attr <name> - Extract attribute instead of text
--multiple - Return all matches (default: first only)
--json - Output as JSON array
Extract structured metadata from pages.
Usage:
node metadata.js <url> [OPTIONS]
Extracts: title, meta description, canonical URL, Open Graph tags, Twitter Card tags, and JSON-LD data.
Extract and analyze links from a page.
Usage:
node links.js <url> [OPTIONS]
Options:
--internal - Only internal links
--external - Only external links
--filter <pattern> - Filter by URL pattern
--format <fmt> - Output: json, csv, list
Parse and process XML sitemaps.
Usage:
node sitemap.js <url> [OPTIONS]
Options:
--discover - Auto-discover sitemap from robots.txt
--filter <pattern> - Filter URLs by pattern
--limit <n> - Limit number of URLs
node fetch.js https://blog.example.com/article --output markdown
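As a rough illustration of how the fetch.js options (--timeout, --user-agent, --headers, --follow) could map onto a request, the sketch below builds an options object for the fetch() built into Node 18+. The buildRequestOptions helper and its argument names are hypothetical and not taken from the script itself, which may handle these flags differently.

```javascript
// Hypothetical helper: maps fetch.js-style flags onto options for the
// built-in fetch() in Node 18+. This is an assumption, not the script's code.
function buildRequestOptions({ timeout = 30000, userAgent, headers, follow = true } = {}) {
  // --headers is documented as a JSON string, so parse it when present.
  const merged = headers ? { ...JSON.parse(headers) } : {};
  if (userAgent) merged["User-Agent"] = userAgent;
  return {
    headers: merged,
    // "follow" chases redirects; "manual" hands back the 3xx response as-is.
    redirect: follow ? "follow" : "manual",
    // AbortSignal.timeout() aborts the request after the given milliseconds.
    signal: AbortSignal.timeout(timeout),
  };
}

const requestOptions = buildRequestOptions({
  timeout: 5000,
  userAgent: "example-bot/1.0",
  headers: '{"Accept-Language": "en"}',
});
console.log(requestOptions.headers, requestOptions.redirect);

// Network call, left commented out:
// const res = await fetch("https://blog.example.com/article", requestOptions);
```

Keeping the option mapping in a pure function makes it easy to check without touching the network.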
node extract.js https://shop.example.com --selector "a.product-link" --attr href --multiple
node metadata.js https://example.com
Output:
{
"title": "Example Page",
"description": "Page description",
"openGraph": {
"title": "Example OG Title",
"image": "https://example.com/image.jpg",
"type": "website"
}
}
node links.js https://example.com --external --format csv
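One plausible way to implement the --internal / --external split is to resolve each href against the page URL and compare hostnames. The sketch below does that with the standard URL class. classifyLinks is a hypothetical name, and treating "internal" as "same hostname" is an assumption; links.js may define it differently (for example, by including subdomains).

```javascript
// Hypothetical sketch of an --internal / --external split; the real
// links.js may classify links differently.
function classifyLinks(pageUrl, hrefs) {
  const base = new URL(pageUrl);
  const result = { internal: [], external: [] };
  for (const href of hrefs) {
    let url;
    try {
      // Resolve relative links against the page URL.
      url = new URL(href, base);
    } catch {
      continue; // skip malformed hrefs
    }
    // Ignore non-web schemes such as mailto: or javascript:.
    if (url.protocol !== "http:" && url.protocol !== "https:") continue;
    // Assumption: "internal" means the same hostname as the page.
    const bucket = url.hostname === base.hostname ? result.internal : result.external;
    bucket.push(url.href);
  }
  return result;
}

const classified = classifyLinks("https://example.com/docs/", [
  "/about",
  "guide.html",
  "https://other.example.org/page",
  "mailto:hi@example.com",
]);
console.log(classified.external.join("\n"));
```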
node sitemap.js https://example.com/sitemap.xml --filter "/blog/"
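To make the sitemap options more concrete, here is a minimal sketch of pulling URLs out of sitemap XML and applying --filter and --limit, plus a helper for the robots.txt lookup that --discover implies. Both function names are hypothetical, and the filter is assumed to be a plain substring match; sitemap.js may instead treat it as a regular expression or glob.

```javascript
// Hypothetical sketch: collect <loc> URLs from sitemap XML. A regex is
// enough for simple, well-formed sitemaps; it does not decode XML entities
// such as &amp; and does not handle CDATA sections.
function parseSitemap(xml, { filter, limit } = {}) {
  const urls = [];
  for (const match of xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)) {
    urls.push(match[1]);
  }
  // Assumption: --filter is a substring match on the URL.
  const filtered = filter ? urls.filter((u) => u.includes(filter)) : urls;
  return limit ? filtered.slice(0, limit) : filtered;
}

// Hypothetical sketch of --discover: robots.txt lists sitemaps on
// "Sitemap:" lines (the directive name is case-insensitive).
function sitemapsFromRobots(robotsTxt) {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => line.match(/^\s*sitemap:\s*(\S+)/i))
    .filter(Boolean)
    .map((m) => m[1]);
}

const sitemapXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/first-post</loc></url>
  <url><loc>https://example.com/blog/second-post</loc></url>
</urlset>`;

console.log(parseSitemap(sitemapXml, { filter: "/blog/", limit: 1 }));
console.log(sitemapsFromRobots("User-agent: *\nSitemap: https://example.com/sitemap.xml"));
```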
# Page Title
Main content extracted and converted to markdown...
## Section Heading
Paragraph text with [links](https://example.com).
{
"url": "https://example.com",
"selector": "h2",
"matches": [
{ "text": "First Heading", "html": "<h2>First Heading</h2>" },
{ "text": "Second Heading", "html": "<h2>Second Heading</h2>" }
],
"count": 2
}
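Because the result is plain JSON, it can be consumed directly by another script. The sketch below assumes the object shape shown above (url, selector, matches, count); matchTexts is a hypothetical helper, not part of the skill.

```javascript
// Assumes the JSON shape shown above: { url, selector, matches: [...], count }.
function matchTexts(jsonString) {
  const output = JSON.parse(jsonString);
  // Guard against an unexpected shape instead of failing later.
  if (!Array.isArray(output.matches)) {
    throw new TypeError("expected a 'matches' array in the extract output");
  }
  return output.matches.map((m) => m.text.trim());
}

// Stand-in for text captured from the script's stdout.
const rawOutput = JSON.stringify({
  url: "https://example.com",
  selector: "h2",
  matches: [
    { text: "First Heading", html: "<h2>First Heading</h2>" },
    { text: "Second Heading", html: "<h2>Second Heading</h2>" },
  ],
  count: 2,
});
console.log(matchTexts(rawOutput));
```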
{
"url": "https://example.com",
"title": "Page Title",
"description": "Meta description",
"canonical": "https://example.com/page",
"openGraph": {
"title": "OG Title",
"description": "OG Description",
"image": "https://example.com/og-image.jpg",
"type": "article"
},
"twitterCard": {
"card": "summary_large_image",
"site": "@example"
},
"jsonLd": [
{ "@type": "Article", "headline": "Article Title" }
]
}
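For readers curious how fields like openGraph and jsonLd can be derived from a page, the following is a minimal sketch that works on an HTML string using regular expressions. It is not the code in metadata.js; extractMetadata is a hypothetical name, and the patterns assume simple markup where property comes before content in each meta tag. A real HTML parser would be more robust.

```javascript
// Hypothetical sketch, not the actual metadata.js. Regexes are adequate for
// simple markup only; attribute order and quoting vary in real pages.
function extractMetadata(html) {
  const openGraph = {};
  // Matches <meta property="og:KEY" content="VALUE"> with property first.
  const ogPattern = /<meta\s+property=["']og:([^"']+)["']\s+content=(["'])(.*?)\2[^>]*>/gi;
  for (const [, key, , value] of html.matchAll(ogPattern)) {
    openGraph[key] = value;
  }

  const jsonLd = [];
  const ldPattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  for (const [, body] of html.matchAll(ldPattern)) {
    try {
      jsonLd.push(JSON.parse(body));
    } catch {
      // Skip blocks that are not valid JSON.
    }
  }

  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  return { title: titleMatch ? titleMatch[1].trim() : null, openGraph, jsonLd };
}

const sampleHtml = `<html><head>
<title>Page Title</title>
<meta property="og:title" content="OG Title">
<meta property="og:type" content="article">
<script type="application/ld+json">{"@type": "Article", "headline": "Article Title"}</script>
</head></html>`;

console.log(JSON.stringify(extractMetadata(sampleHtml), null, 2));
```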