skills/firecrawl-automation/SKILL.md
Automate web crawling and data extraction with Firecrawl -- scrape pages, crawl sites, extract structured data, batch scrape URLs, and map website structures through the Composio Firecrawl integration
Run Firecrawl web crawling and extraction directly from Claude Code. Scrape individual pages, crawl entire sites, extract structured data with AI, batch process URL lists, and map website structures without leaving your terminal.
Toolkit docs: composio.dev/toolkits/firecrawl
https://rube.app/mcp
Fetch content from a URL in multiple formats with optional browser actions for dynamic pages.
Tool: FIRECRAWL_SCRAPE
Key parameters:
- url (required) -- fully qualified URL to scrape
- formats -- output formats: markdown (default), html, rawHtml, links, screenshot, json
- onlyMainContent (default true) -- extract main content only, excluding nav/footer/ads
- waitFor -- milliseconds to wait for JS rendering (default 0)
- timeout -- max wait in ms (default 30000)
- actions -- browser actions before scraping (click, write, wait, press, scroll)
- includeTags / excludeTags -- filter by HTML tags
- jsonOptions -- for structured extraction with schema and/or prompt

Example prompt: "Scrape the main content from https://example.com/pricing as markdown"
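As a rough illustration of how these parameters fit together, the sketch below assembles an argument payload for FIRECRAWL_SCRAPE in Python. The helper name `build_scrape_args` is hypothetical, and passing `formats` as a list is an assumption; how the payload actually reaches the tool depends on your MCP client.

```python
import json


def build_scrape_args(url, formats=None, only_main_content=True,
                      wait_for_ms=0, timeout_ms=30000):
    """Assemble an argument payload for FIRECRAWL_SCRAPE (hypothetical helper)."""
    # The parameter list asks for a fully qualified URL.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be fully qualified (http:// or https://)")
    return {
        "url": url,
        # Assumption: formats is sent as a list; markdown is the documented default.
        "formats": formats or ["markdown"],
        "onlyMainContent": only_main_content,
        "waitFor": wait_for_ms,   # ms to wait for JS rendering
        "timeout": timeout_ms,    # max wait in ms
    }


# Give a JS-heavy pricing page two seconds to render before scraping.
args = build_scrape_args("https://example.com/pricing", wait_for_ms=2000)
print(json.dumps(args, indent=2))
```

Validating the URL on the client side is a design choice of this sketch, meant to catch a malformed URL before a tool call is spent on it.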
Discover and scrape multiple pages from a website with configurable depth, path filters, and concurrency.
Tool: FIRECRAWL_CRAWL_V2
Key parameters:
- url (required) -- starting URL for the crawl
- limit (default 10) -- max pages to crawl
- maxDiscoveryDepth -- depth limit from the root page
- includePaths / excludePaths -- regex patterns for URL paths
- allowSubdomains -- include subdomains (default false)
- crawlEntireDomain -- follow sibling/parent links, not just children (default false)
- sitemap -- include (default), skip, or only
- prompt -- natural language to auto-configure crawler settings
- scrapeOptions_formats -- output format for each page
- scrapeOptions_onlyMainContent -- main content extraction per page

Example prompt: "Crawl the docs section of firecrawl.dev, max 50 pages, only paths matching docs"
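Because includePaths and excludePaths are regex patterns, it can help to preview locally which paths they would select before spending crawl credits. The sketch below does that in Python. It assumes the patterns are tested against the URL path and that exclusions take precedence; the crawler's real matching rules may differ. The `path_allowed` helper and the changelog exclusion are hypothetical.

```python
import re
from urllib.parse import urlparse

# Argument payload using parameter names from the list above.
crawl_args = {
    "url": "https://firecrawl.dev",
    "limit": 50,
    "includePaths": [r"^/docs(/.*)?$"],
    "excludePaths": [r"^/docs/changelog(/.*)?$"],  # hypothetical exclusion
    "allowSubdomains": False,
    "sitemap": "include",
}


def path_allowed(url, include, exclude):
    """Local approximation of include/exclude filtering (an assumption)."""
    path = urlparse(url).path or "/"
    if any(re.search(pattern, path) for pattern in exclude):
        return False
    # An empty include list is treated as "allow everything not excluded".
    return not include or any(re.search(pattern, path) for pattern in include)


candidates = [
    "https://firecrawl.dev/docs/api",
    "https://firecrawl.dev/docs/changelog/v2",
    "https://firecrawl.dev/blog/post",
]
for candidate in candidates:
    allowed = path_allowed(candidate, crawl_args["includePaths"],
                           crawl_args["excludePaths"])
    print(candidate, "->", "crawl" if allowed else "skip")
```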
Extract structured JSON data from web pages using AI with a natural language prompt or JSON schema.
Tool: FIRECRAWL_EXTRACT
Key parameters:
- urls (required) -- array of URLs to extract from (max 10 in beta). Supports wildcards like https://example.com/blog/*
- prompt -- natural language description of what to extract
- schema -- JSON Schema defining the desired output structure
- enable_web_search -- allow crawling links outside initial domains (default false)

At least one of prompt or schema must be provided.
Check extraction status with FIRECRAWL_EXTRACT_GET using the returned job id.
Example prompt: "Extract company name, pricing tiers, and feature lists from https://example.com/pricing"
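The example prompt above could be paired with a JSON Schema so the output has a predictable shape. The Python sketch below builds such a payload and enforces the two constraints stated in the parameter list: at most 10 URLs, and at least one of prompt or schema. The `build_extract_args` helper and the schema's field names are hypothetical, chosen only to mirror the prompt.

```python
import json

MAX_URLS = 10  # beta limit stated in the parameter list


def build_extract_args(urls, prompt=None, schema=None, enable_web_search=False):
    """Assemble an argument payload for FIRECRAWL_EXTRACT (hypothetical helper)."""
    urls = list(urls)
    if not urls:
        raise ValueError("urls must contain at least one URL")
    if len(urls) > MAX_URLS:
        raise ValueError(f"at most {MAX_URLS} URLs are allowed, got {len(urls)}")
    if prompt is None and schema is None:
        raise ValueError("provide at least one of prompt or schema")
    args = {"urls": urls, "enable_web_search": enable_web_search}
    if prompt is not None:
        args["prompt"] = prompt
    if schema is not None:
        args["schema"] = schema
    return args


# Hypothetical schema mirroring the example prompt.
pricing_schema = {
    "type": "object",
    "properties": {
        "company_name": {"type": "string"},
        "pricing_tiers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price": {"type": "string"},
                    "features": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["name"],
            },
        },
    },
    "required": ["company_name", "pricing_tiers"],
}

extract_args = build_extract_args(
    ["https://example.com/pricing"],
    prompt="Extract company name, pricing tiers, and feature lists",
    schema=pricing_schema,
)
print(json.dumps(extract_args, indent=2))
```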
Scrape many URLs concurrently with shared configuration for efficient bulk data collection.
Tool: FIRECRAWL_BATCH_SCRAPE
Key parameters:
- urls (required) -- array of URLs to scrape
- formats -- output format for all pages (default markdown)
- onlyMainContent (default true) -- main content extraction
- maxConcurrency -- parallel scrape limit
- ignoreInvalidURLs (default true) -- skip bad URLs instead of failing the batch
- location -- geolocation settings with country code
- actions -- browser actions applied to each page
- blockAds (default true) -- block advertisements

Example prompt: "Batch scrape these 20 product page URLs as markdown with ad blocking"
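Before submitting a batch, it can be worth cleaning the URL list locally. The Python sketch below drops duplicates, flags entries that ignoreInvalidURLs would probably skip, and then assembles the payload. The helpers, the sample URLs, the concurrency value of 5, and passing `formats` as a list are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def dedupe_urls(urls):
    """Drop blank entries and exact duplicates while preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = url.strip()
        if key and key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def looks_valid(url):
    """Cheap local check; the service's own validation may be stricter."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


raw_urls = [
    "https://example.com/products/1",
    "https://example.com/products/2",
    "https://example.com/products/1",  # duplicate
    "not-a-url",                       # left in the batch, expected to be skipped
]

urls = dedupe_urls(raw_urls)
invalid = [u for u in urls if not looks_valid(u)]

batch_args = {
    "urls": urls,
    "formats": ["markdown"],     # assumption: sent as a list
    "onlyMainContent": True,
    "maxConcurrency": 5,         # arbitrary illustrative value
    "ignoreInvalidURLs": True,   # skip bad URLs instead of failing the batch
    "blockAds": True,
}
print(f"{len(urls)} unique URLs, {len(invalid)} flagged as likely invalid")
```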
Discover all URLs on a website from a starting URL, useful for planning crawls or auditing site structure.
Tool: FIRECRAWL_MAP_MULTIPLE_URLS_BASED_ON_OPTIONS
Key parameters:
- url (required) -- starting URL (must be https:// or http://)
- search -- guide URL discovery toward specific page types
- limit (default 5000, max 100000) -- max URLs to return
- includeSubdomains (default true) -- include subdomains
- ignoreQueryParameters (default true) -- dedupe URLs differing only by query params
- sitemap -- include, skip, or only

Example prompt: "Map all URLs on docs.example.com, focusing on API reference pages"
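The limits in this parameter list lend themselves to a small client-side check. The Python sketch below builds a payload for the map tool and rejects values outside the documented bounds. `build_map_args` is a hypothetical helper, and optional fields are omitted when unset so the tool's own defaults apply.

```python
DEFAULT_LIMIT = 5_000
MAX_LIMIT = 100_000
SITEMAP_MODES = ("include", "skip", "only")


def build_map_args(url, search=None, limit=DEFAULT_LIMIT,
                   include_subdomains=True, ignore_query_parameters=True,
                   sitemap=None):
    """Assemble an argument payload for the map tool (hypothetical helper)."""
    if not url.startswith(("https://", "http://")):
        raise ValueError("url must start with https:// or http://")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if sitemap is not None and sitemap not in SITEMAP_MODES:
        raise ValueError(f"sitemap must be one of {SITEMAP_MODES}")
    args = {
        "url": url,
        "limit": limit,
        "includeSubdomains": include_subdomains,
        "ignoreQueryParameters": ignore_query_parameters,
    }
    # Only send optional fields that were explicitly set.
    if search is not None:
        args["search"] = search
    if sitemap is not None:
        args["sitemap"] = sitemap
    return args


map_args = build_map_args("https://docs.example.com", search="API reference")
print(map_args)
```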
Track crawl progress, retrieve results, and cancel runaway jobs.
Tools: FIRECRAWL_CRAWL_GET, FIRECRAWL_GET_THE_STATUS_OF_A_CRAWL_JOB, FIRECRAWL_CANCEL_A_CRAWL_JOB
- FIRECRAWL_CRAWL_GET -- get status, progress, credits used, and crawled page data
- FIRECRAWL_CANCEL_A_CRAWL_JOB -- stop an active or queued crawl

Both require the cr