skills/firecrawl-scraper/SKILL.md
Web scraping of JavaScript-rendered scientific websites using Firecrawl API
npx skillsauth add lamm-mit/scienceclaw firecrawl-scraper
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill scrapes JavaScript-rendered scientific websites, database pages, and literature sources through the Firecrawl API and returns clean Markdown optimized for LLM processing. It handles dynamic content, authentication flows, and complex single-page applications that static scrapers cannot access.
It is particularly useful for scientific databases with JavaScript-heavy frontends, preprint servers, supplementary data pages, and research institution websites.
# Scrape a scientific database page as markdown
python3 skills/firecrawl-scraper/scripts/firecrawl_scrape.py --url "https://www.uniprot.org/uniprotkb/P53_HUMAN"
# Scrape and return raw HTML
python3 skills/firecrawl-scraper/scripts/firecrawl_scrape.py --url "https://www.rcsb.org/structure/1TUP" --format html
# Use explicit API key
python3 skills/firecrawl-scraper/scripts/firecrawl_scrape.py \
  --url "https://www.biorxiv.org/content/10.1101/2024.01.01.000001" \
  --api-key fc-yourkey123 \
  --format markdown
{
  "url": "https://www.uniprot.org/uniprotkb/P53_HUMAN",
  "content": "# Cellular tumor antigen p53\n\n**Organism:** Homo sapiens...",
  "format": "markdown",
  "title": "P53_HUMAN - Cellular tumor antigen p53",
  "links": [
    "https://www.uniprot.org/uniprotkb/P04637",
    "https://www.rcsb.org/structure/2OCJ"
  ]
}
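The JSON result above can be consumed from a downstream script. The following is a minimal sketch, assuming the script prints exactly one JSON object with the fields shown in the example (`url`, `content`, `format`, `title`, `links`). The helper name `summarize_scrape` and the `SAMPLE_OUTPUT` constant are hypothetical and are not part of the skill.

```python
import json
from urllib.parse import urlparse

# Sample output copied from the example above; the content field is
# truncated there, so it is truncated here as well.
SAMPLE_OUTPUT = r'''
{
  "url": "https://www.uniprot.org/uniprotkb/P53_HUMAN",
  "content": "# Cellular tumor antigen p53\n\n**Organism:** Homo sapiens...",
  "format": "markdown",
  "title": "P53_HUMAN - Cellular tumor antigen p53",
  "links": [
    "https://www.uniprot.org/uniprotkb/P04637",
    "https://www.rcsb.org/structure/2OCJ"
  ]
}
'''


def summarize_scrape(raw: str) -> dict:
    """Parse one scrape result and pull out a few fields (hypothetical helper)."""
    record = json.loads(raw)
    # First line of the scraped Markdown, usually the page heading.
    first_line = record["content"].splitlines()[0]
    # Distinct hosts that the page links to, sorted for stable output.
    link_hosts = sorted({urlparse(link).netloc for link in record.get("links", [])})
    return {
        "title": record["title"],
        "format": record["format"],
        "first_line": first_line,
        "link_hosts": link_hosts,
    }


if __name__ == "__main__":
    print(summarize_scrape(SAMPLE_OUTPUT))
```

In practice the `raw` argument would come from the captured stdout of `firecrawl_scrape.py` rather than from an embedded string.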
Set your Firecrawl API key as an environment variable:
export FIRECRAWL_API_KEY=fc-yourkey123
Or pass it directly via --api-key. Get a key at https://firecrawl.dev.
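The two ways of supplying the key can be combined in a wrapper script. This is a minimal sketch, assuming an explicitly passed key takes precedence over the `FIRECRAWL_API_KEY` environment variable; that precedence and the function name `resolve_api_key` are assumptions for illustration, not confirmed behavior of `firecrawl_scrape.py`.

```python
import os
from typing import Optional


def resolve_api_key(cli_value: Optional[str] = None) -> str:
    """Return the API key to use (hypothetical helper).

    Assumed precedence: an explicit value (as from --api-key) wins,
    otherwise fall back to the FIRECRAWL_API_KEY environment variable.
    """
    key = cli_value or os.environ.get("FIRECRAWL_API_KEY")
    if not key:
        raise RuntimeError(
            "No Firecrawl API key found: set FIRECRAWL_API_KEY or pass --api-key"
        )
    return key
```

Failing early with a clear message avoids sending an unauthenticated request and then having to interpret the API's error response.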