skills/scraperapi-research-agent/SKILL.md
Autonomous web research agent — takes a research question, uses ScraperAPI to discover and scrape relevant sources, uploads content as file artifacts to the Anthropic Files API, then feeds everything to Claude for synthesis into a cited research report. All in one flow. Use when user asks: "research X for me and give me a cited report", "investigate Y online and summarize what you find", "do a deep dive on Z using real web sources", "find information about X across multiple websites and cite your sources", "run the scraperapi research agent on this topic". Produces a structured markdown report with inline citations and a numbered source list. Invoke whenever the user wants multi-source web research that requires scraping real pages, not just answering from memory.
npx skillsauth add scraperapi/scraperapi-skills scraperapi-research-agentInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
End-to-end autonomous research: ScraperAPI finds and fetches sources → Anthropic Files API ingests them as cited documents → Claude synthesizes a report.
Run it:
# Install dependencies
pip install requests anthropic
# Set env vars
export SCRAPERAPI_API_KEY=your-key
export ANTHROPIC_API_KEY=your-key
# Run
python skills/scraperapi-research-agent/scripts/research_agent.py \
--question "What are the best practices for rate limiting in web APIs?" \
--max-sources 5 \
--output report.md
See scripts/research_agent.py for the full implementation.
Before starting a research run, establish:
--max-sources (5). Do not loop indefinitely.1. PLAN
↓ Claude decomposes the question into 2–3 targeted search queries
2. DISCOVER
↓ ScraperAPI google/search structured endpoint → list of (url, title, snippet)
3. DEDUPLICATE
↓ Filter to top N unique URLs (default: 5), skipping PDFs and low-quality domains
4. FETCH
↓ ScraperAPI scrape each URL as markdown (output_format=markdown)
↓ Skip pages returning < 200 characters (blocked, error pages)
5. UPLOAD
↓ Upload each scraped page to Anthropic Files API as a text/plain artifact
↓ Store file_id for each source
6. SYNTHESIZE
↓ Claude (claude-opus-4-8, adaptive thinking) reads all document artifacts
↓ Returns structured report with inline citations [1], [2]...
7. CLEAN UP
↓ Delete uploaded file artifacts from Anthropic
↓ Write or print the final report
STOP when: max_sources reached, or all queries exhausted (whichever comes first).
The agent stops when any of the following is true:
--max-sources reached (default: 5) — limits credit spend--max-credits exceeded — hard cap on ScraperAPI credit use (optional)Without stop conditions, a research loop will keep fetching until credits are gone.
| Flag | Default | Description |
|------|---------|-------------|
| --question | (required) | Research question |
| --max-sources | 5 | Max pages to scrape (credit budget) |
| --output | stdout | Write report to file |
| --country | us | ScraperAPI country code for geo-targeted results |
| --model | claude-opus-4-8 | Anthropic model for synthesis |
See assets/report_template.md for the report structure.
The report is a markdown document with:
[N] citations| Sources | Scraping credits | Anthropic tokens | Total estimate | |---------|-----------------|-----------------|----------------| | 3 | ~3 | ~15K in / ~2K out | Low | | 5 | ~5 | ~25K in / ~3K out | Medium | | 10 | ~10 | ~50K in / ~5K out | Higher |
Prompt caching applies to the scraped content on repeated runs for the same question.
GET https://api.scraperapi.com/structured/google/search — finds source URLsGET https://api.scraperapi.com/?output_format=markdown — fetches page contentSee ScraperAPI docs for rate limits and credit costs.
Requires ANTHROPIC_API_KEY with access to claude-opus-4-8 and the Files API beta.
development
SERP landscape analysis for SEO strategy decisions. Use this skill when the user wants to understand what a search results page actually looks like for their target keywords — including AI Overview presence and attribution, SERP feature composition, how Google is interpreting query intent, which competitors dominate specific keyword sets, and where organic rankings actually translate to visible traffic. Trigger on requests like "analyze the SERP for [keyword]," "why isn't my content getting traffic even though it ranks," "what does Google show for [keyword]," "which keywords are worth targeting," "is [keyword] dominated by AI Overviews," "who owns the SERP for [topic]," "SERP analysis," "keyword landscape," or any request to understand what's happening on a search results page before making a content or SEO strategy decision.
tools
Run a comprehensive SEO audit using ScraperAPI's live SERP and scraping tools — no setup required. Use this skill whenever the user wants to: audit SEO for a website, understand why a page isn't ranking, check SEO health, analyze keyword rankings, compare against competitors in search results, find content gaps, review on-page signals (titles, meta, headings, schema), diagnose a traffic drop, check indexation, or get prioritized SEO recommendations. Also trigger when the user says things like "why am I not showing up on Google," "my traffic dropped," "how do I rank for X," "what's wrong with my SEO," "SEO check," or "SEO review." This skill works out of the box — it uses the ScraperAPI MCP tools already connected to this session, with no CLI or API key setup needed.
development
Build and implement web scrapers using ScraperAPI. Use this skill whenever the user asks to build, write, create, or implement a scraper, or wants runnable code that extracts data from a website. Trigger on: "build me a scraper for [website]", "write a scraper that fetches product pages from [ecommerce site]", "I need to scrape [data] from [website]", "create a script that extracts [fields] from [URL]", "help me scrape [website] — I need [fields]", "write code to scrape [website]", "make a script that scrapes [website]", "implement a scraper for [URL]". Guides architectural decisions (structured endpoint vs. raw HTML, JS rendering, proxy tier, sync vs. async batch), then generates a complete runnable Python or Node.js script with retry logic, error handling, pagination, and credit estimation.
development
Use this skill whenever the user wants to check, track, or be alerted about product prices on Amazon, Walmart, or via Google Shopping. Trigger on: "monitor the price of this Amazon product", "did the price drop on [Walmart URL]?", "track these ASINs", "compare today's prices to last week", "alert me if [product] goes below $X", "what's the current price of [product]?", "check my price watchlist", "scrape the price of [URL]", "is [product] cheaper anywhere else?". Accepts ASINs, Amazon/Walmart product URLs, or free-text product queries for Google Shopping. Reads an optional baseline JSON file to detect changes, fetches live prices via ScraperAPI's structured endpoints, and reports increases, decreases, restocks, and out-of-stock transitions in a structured change report. Use this skill even when the user does not say the word "monitor" — any one-shot or recurring price-check request belongs here.