.claude/skills/ts-firecrawl/SKILL.md
Convert any website into clean, structured data with Firecrawl — API-first web scraping service. Use when someone asks to "turn a website into markdown", "scrape website for LLM", "Firecrawl", "extract website content as clean text", "crawl and convert to structured data", or "scrape website for RAG". Covers single-page scraping, full-site crawling, structured extraction, and LLM-ready output.
npx skillsauth add eliferjunior/Claude firecrawlInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Firecrawl is an API that scrapes websites and returns clean, LLM-ready content. Point it at any URL and get back markdown, HTML, or structured data — no selectors to write, no anti-bot handling, no browser management. It handles JavaScript rendering, proxy rotation, and content extraction automatically. Built for feeding web content into LLMs, RAG pipelines, and data workflows.
npm install @mendable/firecrawl-js
# Or Python: pip install firecrawl-py
# Self-hosted: docker run -p 3002:3002 mendableai/firecrawl
// scrape.ts — Convert any URL to clean markdown
import FirecrawlApp from "@mendable/firecrawl-js";
const firecrawl = new FirecrawlApp({
apiKey: process.env.FIRECRAWL_API_KEY,
// apiUrl: "http://localhost:3002" // For self-hosted
});
// Scrape a single page
const result = await firecrawl.scrapeUrl("https://docs.example.com/getting-started", {
formats: ["markdown", "html"], // Get both formats
});
console.log(result.markdown); // Clean markdown content
console.log(result.metadata); // Title, description, language, etc.
// crawl.ts — Crawl an entire site
const crawlResult = await firecrawl.crawlUrl("https://docs.example.com", {
limit: 100, // Max pages to crawl
scrapeOptions: {
formats: ["markdown"],
},
});
// Process all pages
for (const page of crawlResult.data) {
console.log(`${page.metadata.title}: ${page.markdown.length} chars`);
// Feed into your RAG pipeline, vector DB, etc.
}
// extract.ts — Extract structured data using LLM
import { z } from "zod";
const ProductSchema = z.object({
name: z.string(),
price: z.number(),
currency: z.string(),
rating: z.number().optional(),
inStock: z.boolean(),
features: z.array(z.string()),
});
const result = await firecrawl.scrapeUrl("https://shop.example.com/product/123", {
formats: ["extract"],
extract: {
schema: ProductSchema,
},
});
console.log(result.extract);
// { name: "Widget Pro", price: 49.99, currency: "USD", rating: 4.5, inStock: true, features: [...] }
// rag-ingest.ts — Crawl docs site and ingest into vector DB
import FirecrawlApp from "@mendable/firecrawl-js";
import { ChromaClient } from "chromadb";
const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });
const chroma = new ChromaClient();
const collection = await chroma.getOrCreateCollection({ name: "docs" });
// Crawl documentation site
const crawl = await firecrawl.crawlUrl("https://docs.myproduct.com", {
limit: 500,
scrapeOptions: { formats: ["markdown"] },
});
// Chunk and store in vector DB
for (const page of crawl.data) {
const chunks = splitIntoChunks(page.markdown, 1000); // 1000 char chunks
await collection.add({
ids: chunks.map((_, i) => `${page.metadata.sourceURL}-chunk-${i}`),
documents: chunks,
metadatas: chunks.map(() => ({
source: page.metadata.sourceURL,
title: page.metadata.title,
})),
});
}
function splitIntoChunks(text: string, size: number): string[] {
const chunks: string[] = [];
for (let i = 0; i < text.length; i += size) {
chunks.push(text.slice(i, i + size));
}
return chunks;
}
User prompt: "I want a chatbot that answers questions about my product documentation."
The agent will use Firecrawl to crawl the docs site, convert to markdown, chunk the content, store in a vector database, and build a RAG query pipeline.
User prompt: "Track when our competitor updates their pricing page."
The agent will schedule periodic Firecrawl scrapes, compare markdown diffs between runs, and alert on significant changes.
scrapeUrl for single pages — fast, returns markdown + metadatacrawlUrl for entire sites — follows links, respects limitsdocker run mendableai/firecrawl for sensitive dataformats array — request only what you need (markdown, html, extract)development
Expert guidance for Fireworks AI, the platform for running open-source LLMs (Llama, Mixtral, Qwen, etc.) with enterprise-grade speed and reliability. Helps developers integrate Fireworks' inference API, fine-tune models, and deploy custom model endpoints with function calling and structured output support.
tools
Expert guidance for Firebase, Google's platform for building and scaling web and mobile applications. Helps developers set up authentication, Firestore/Realtime Database, Cloud Functions, hosting, storage, and analytics using Firebase's SDK and CLI.
development
When the user needs to build file upload functionality for a web application. Use when the user mentions "file upload," "image upload," "upload endpoint," "multipart upload," "presigned URL," "S3 upload," "file validation," "upload to cloud storage," or "accept user files." Handles upload endpoints, file validation (type, size, magic bytes), cloud storage integration, and upload status tracking. For image/video processing after upload, see media-transcoder.
development
Organize and rename files based on content analysis. Use when a user asks to sort files into folders, rename files by pattern, organize a messy directory, categorize documents by type or content, deduplicate files, or clean up a downloads folder. Handles smart renaming, content-based sorting, and duplicate detection.