skills/tavily-crawl/SKILL.md
Crawl entire websites and extract content from multiple pages using Tavily's Crawl API. Use when you need to explore a documentation site, knowledge base, or multi-page resource. Supports path filtering and semantic instructions.
npx skillsauth add giggle-official/storyclaw-assistant tavily-crawl

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Traverse entire websites and extract content from multiple pages in one API call. Ideal for ingesting documentation sites, knowledge bases, or any multi-page resource into context.
Set TAVILY_API_KEY in ~/.openclaw/.env.local:
TAVILY_API_KEY=tvly-your-api-key-here
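A script that calls the API needs to read that key. The following is a minimal sketch, assuming the file holds plain `KEY=value` lines with no quoting or variable expansion. The helper names `load_env_file` and `get_tavily_key` are hypothetical and not part of any Tavily tooling.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    for line in Path(path).expanduser().read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def get_tavily_key(env_path="~/.openclaw/.env.local"):
    """Prefer the process environment, then fall back to the env file."""
    key = os.environ.get("TAVILY_API_KEY")
    if key:
        return key
    path = Path(env_path).expanduser()
    if path.exists():
        key = load_env_file(path).get("TAVILY_API_KEY")
    if not key:
        raise RuntimeError("TAVILY_API_KEY is not set")
    return key
```

Checking the environment first lets a shell export override the file, which is convenient when testing with a different key.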
Use this skill when tavily-extract would require too many individual URLs.

Use tavily-extract when you already have specific URLs. Use this skill when you need to discover and read multiple pages from a starting URL.
Warning: max_depth ≥ 2 can take minutes and return large amounts of data. Always start with max_depth=1 and limit=20.
Endpoint: POST https://api.tavily.com/crawl
| Field | Type | Default | Description |
| ------------------- | ------- | ------------ | ------------------------------------------------------- |
| url | string | Required | Root URL to begin crawling |
| max_depth | integer | 1 | Levels deep to follow links (1–5) |
| max_breadth | integer | 20 | Max links to follow per page |
| limit | integer | 50 | Total pages cap |
| instructions | string | null | Natural language focus guidance |
| chunks_per_source | integer | 3 | Relevant chunks per page (1–5, requires instructions) |
| extract_depth | string | "basic" | basic or advanced (for JS-rendered sites) |
| format | string | "markdown" | markdown or text |
| select_paths | array | null | Regex patterns to include (e.g. ["/docs/.*"]) |
| exclude_paths | array | null | Regex patterns to exclude (e.g. ["/blog/.*"]) |
| allow_external | boolean | true | Follow links to external domains |
| timeout | float | 150 | Max wait in seconds (10–150) |
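The ranges in the table can be checked before a request is sent. The sketch below builds a request body and rejects out-of-range values. The function name `build_crawl_payload` is hypothetical, and its `limit` default of 20 follows the conservative starting point recommended above rather than the API default of 50.

```python
def build_crawl_payload(url, *, max_depth=1, limit=20, instructions=None,
                        chunks_per_source=None, extract_depth="basic",
                        fmt="markdown", timeout=150, **extra):
    """Return a request body for POST /crawl, checking ranges from the table."""
    if not url:
        raise ValueError("url is required")
    if not 1 <= max_depth <= 5:
        raise ValueError("max_depth must be between 1 and 5")
    if not 10 <= timeout <= 150:
        raise ValueError("timeout must be between 10 and 150 seconds")
    if extract_depth not in ("basic", "advanced"):
        raise ValueError("extract_depth must be 'basic' or 'advanced'")
    if fmt not in ("markdown", "text"):
        raise ValueError("format must be 'markdown' or 'text'")

    payload = {
        "url": url,
        "max_depth": max_depth,
        "limit": limit,
        "extract_depth": extract_depth,
        "format": fmt,
        "timeout": timeout,
    }
    if chunks_per_source is not None:
        # The table notes that chunks_per_source requires instructions.
        if instructions is None:
            raise ValueError("chunks_per_source requires instructions")
        if not 1 <= chunks_per_source <= 5:
            raise ValueError("chunks_per_source must be between 1 and 5")
        payload["chunks_per_source"] = chunks_per_source
    if instructions is not None:
        payload["instructions"] = instructions
    # Remaining fields (select_paths, exclude_paths, allow_external, ...)
    # are passed through unchecked.
    payload.update(extra)
    return payload
```

For example, `build_crawl_payload("https://docs.example.com", max_depth=6)` raises `ValueError` locally instead of spending an API call on a request that the table marks as out of range.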
Response:
{
  "base_url": "https://docs.example.com",
  "results": [
    {
      "url": "https://docs.example.com/getting-started",
      "raw_content": "# Getting Started\n\n..."
    }
  ],
  "response_time": 8.3
}
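To get a feel for how much text a crawl returned before passing it on, the response can be summarized per page. This is a small sketch that uses only the fields shown in the sample above (`results`, `url`, `raw_content`); the function name `summarize_crawl` is hypothetical.

```python
import json

# The sample response from above, kept as raw text so the \n escapes
# are decoded by the JSON parser rather than by Python.
SAMPLE_RESPONSE = r"""
{
  "base_url": "https://docs.example.com",
  "results": [
    {
      "url": "https://docs.example.com/getting-started",
      "raw_content": "# Getting Started\n\n..."
    }
  ],
  "response_time": 8.3
}
"""


def summarize_crawl(response):
    """Return (url, character count) for each crawled page."""
    return [
        (page["url"], len(page.get("raw_content") or ""))
        for page in response.get("results", [])
    ]


data = json.loads(SAMPLE_RESPONSE)
for url, size in summarize_crawl(data):
    print(f"{url}: {size} characters")
```

Summing the character counts gives a rough signal for whether the results will fit into an LLM context or whether a narrower crawl is needed.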
curl -s -X POST https://api.tavily.com/crawl \
  -H "Authorization: Bearer $TAVILY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "max_depth": 1,
    "limit": 20
  }'
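The same request can be issued from a script. The sketch below mirrors the curl command with Python's standard library only. The function names are hypothetical, and the client-side socket timeout of 160 seconds is an assumption chosen to sit just above the API's 150-second maximum.

```python
import json
import urllib.request

CRAWL_URL = "https://api.tavily.com/crawl"


def make_crawl_request(api_key, payload):
    """Build (but do not send) the POST request shown in the curl command."""
    return urllib.request.Request(
        CRAWL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def crawl(api_key, payload, timeout=160):
    """Send the request and decode the JSON body. Performs a network call."""
    request = make_crawl_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `crawl(os.environ["TAVILY_API_KEY"], {"url": "https://docs.example.com", "max_depth": 1, "limit": 20})` would then return the decoded response. Keeping request construction separate from sending makes the headers and body easy to inspect without network access.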
Always use instructions + chunks_per_source when feeding results into LLM context. The response then contains only the relevant chunks instead of full pages, which prevents context overflow:
curl -s -X POST https://api.tavily.com/crawl \
  -H "Authorization: Bearer $TAVILY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "max_depth": 2,
    "instructions": "Find authentication guides and API reference",
    "chunks_per_source": 3,
    "select_paths": ["/docs/.*", "/api/.*"],
    "exclude_paths": ["/changelog/.*", "/blog/.*"]
  }'

curl -s -X POST https://api.tavily.com/crawl \
  -H "Authorization: Bearer $TAVILY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "max_depth": 2,
    "select_paths": ["/docs/.*"],
    "exclude_paths": ["/docs/legacy/.*"],
    "limit": 30
  }'
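Path patterns are easy to get subtly wrong, so it can help to try them against a few known URLs before crawling. The sketch below is a local approximation only. This document does not state how Tavily applies the patterns, so the code assumes they are matched against the URL path from its start and that exclude_paths takes precedence over select_paths. The function name `would_crawl` is hypothetical.

```python
import re
from urllib.parse import urlparse


def would_crawl(url, select_paths=None, exclude_paths=None):
    """Local preview of path filtering (assumed semantics, see lead-in)."""
    path = urlparse(url).path or "/"
    # Assumption: an exclude match always wins.
    if exclude_paths and any(re.match(p, path) for p in exclude_paths):
        return False
    # Assumption: with select_paths set, only matching paths are kept.
    if select_paths:
        return any(re.match(p, path) for p in select_paths)
    return True


select = ["/docs/.*"]
exclude = ["/docs/legacy/.*"]
for candidate in (
    "https://example.com/docs/intro",
    "https://example.com/docs/legacy/v1",
    "https://example.com/blog/post",
):
    print(candidate, would_crawl(candidate, select, exclude))
```

Under these assumptions a pattern such as `/docs/.*` does not match the bare path `/docs`, because the regex requires the trailing slash.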
Use map when you only need to discover what pages exist before deciding which to crawl or extract:
curl -s -X POST https://api.tavily.com/map \
  -H "Authorization: Bearer $TAVILY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "max_depth": 2,
    "instructions": "Find all API reference and authentication pages"
  }'
The response contains a list of URLs only, which makes map much faster than crawl.
| max_depth | Typical Pages | Time         |
| --------- | ------------- | ------------ |
| 1         | 10–50         | Seconds      |
| 2         | 50–500        | Minutes      |
| 3+        | 500–5000+     | Many minutes |
- Start with max_depth=1 and limit=20, then increase if needed
- Always set limit to prevent runaway crawls
- Use instructions + chunks_per_source for agentic workflows; omit them only when saving full page data
- Use select_paths to focus on relevant site sections (docs, API, guides)
- Use map first to understand the site structure, then crawl or extract specific sections
- allow_external: false prevents leaving the target site
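The map-first tip can be sketched as follows: given a list of URLs returned by map, count the pages in each top-level section and turn the interesting sections into select_paths patterns. Both function names are hypothetical, and the sketch assumes the URLs are already available as a plain Python list.

```python
import re
from collections import Counter
from urllib.parse import urlparse


def count_sections(urls):
    """Count URLs by their first path segment ('' for the site root)."""
    counts = Counter()
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        counts[segments[0] if segments else ""] += 1
    return counts


def to_select_paths(sections):
    """Turn section names into regex patterns usable as select_paths."""
    return [f"/{re.escape(name)}/.*" for name in sections if name]


mapped_urls = [
    "https://docs.example.com/",
    "https://docs.example.com/docs/intro",
    "https://docs.example.com/docs/auth",
    "https://docs.example.com/api/users",
    "https://docs.example.com/blog/2024",
]
sections = count_sections(mapped_urls)
print(sections.most_common())
print(to_select_paths(["docs", "api"]))
```

Seeing the section counts first makes it easier to pick a sensible limit and to decide which sections to leave out with exclude_paths.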