.claude/skills/crawl-sites/SKILL.md
Crawl and extract content from configured sites using the provider engine
Crawl configured sites and extract structured content using the configured crawler engine.
--site <name>: crawl a specific configured site (default: all)
--output <path>: output directory for extracted content
--dry-run: show what would be crawled without executing
--diff: show changes since last crawl

Read the crawl provider from claude/config/agency.yaml under crawl.provider.
# agency.yaml
crawl:
  provider: "playwright" # or "wget", "scrapy", "webfetch"
  sites:
    - name: "docs"
      url: "https://docs.example.com"
      patterns: ["/**/*.html"]
    - name: "blog"
      url: "https://blog.example.com"
      patterns: ["/posts/*"]
The provider maps to a tool: ./claude/tools/crawl-{provider}
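A minimal sketch of this provider-to-tool mapping, assuming a POSIX-style shell. The function name `resolve_crawl_tool`, the `TOOLS_DIR` variable, and the `builtin:webfetch` marker are hypothetical and exist only for illustration; the skill itself specifies only the ./claude/tools/crawl-{provider} path and that webfetch needs no external tool.

```shell
# Hypothetical helper: resolve a provider name to its crawl tool path.
# TOOLS_DIR defaults to the directory named in this skill.
TOOLS_DIR="${TOOLS_DIR:-./claude/tools}"

resolve_crawl_tool() {
  local provider="$1"
  # webfetch uses the built-in WebFetch tool, so there is no script to locate.
  if [ "$provider" = "webfetch" ]; then
    printf '%s\n' "builtin:webfetch"
    return 0
  fi
  local tool="${TOOLS_DIR}/crawl-${provider}"
  # The tool must be a regular file with the executable bit set.
  if [ -f "$tool" ] && [ -x "$tool" ]; then
    printf '%s\n' "$tool"
    return 0
  fi
  printf 'crawl tool not found or not executable: %s\n' "$tool" >&2
  return 1
}
```

For example, `resolve_crawl_tool playwright` would print `./claude/tools/crawl-playwright` if that file exists and is executable, and return a non-zero status otherwise.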
Verify ./claude/tools/crawl-{provider} exists and is executable. If not:
webfetch provider: use the built-in WebFetch tool directly (no external tool needed)
To list the available provider tools: ls ./claude/tools/crawl-*

Read the crawl.sites array from agency.yaml. Each site entry has:
name: identifier for the site
url: base URL to crawl
patterns: URL patterns to include

If --site is specified, filter to that site only.
For each site, execute: ./claude/tools/crawl-{provider} {url} {patterns} {output}
Or, for the webfetch provider, use the WebFetch tool directly with each URL.
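The per-site invocation could be wrapped roughly as follows. This is a sketch that follows the `crawl-{provider} {url} {patterns} {output}` form literally; `crawl_site`, `TOOLS_DIR`, and the `DRY_RUN` variable are hypothetical names, and the webfetch case is deliberately left out because it has no external tool.

```shell
# Hypothetical wrapper: run the provider tool for a single site.
TOOLS_DIR="${TOOLS_DIR:-./claude/tools}"

crawl_site() {
  local provider="$1" url="$2" patterns="$3" output="$4"
  local tool="${TOOLS_DIR}/crawl-${provider}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command that would run instead of executing it.
    printf '%s %s %s %s\n' "$tool" "$url" "$patterns" "$output"
    return 0
  fi
  # Quote every argument so glob patterns reach the tool unexpanded.
  "$tool" "$url" "$patterns" "$output"
}
```

With the sample configuration, the "docs" site would correspond to a call like `crawl_site playwright "https://docs.example.com" "/**/*.html" "./out/docs"`, where the output path is an assumed value.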
Show the user:
With --diff, show what changed since the last crawl

Each crawl-{provider} tool must accept:
--patterns: comma-separated URL patterns
--output: output directory
--dry-run: list URLs without fetching

webfetch (uses built-in WebFetch)
crawl.sites section to agency.yaml
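The flags that each crawl-{provider} tool must accept (--patterns, --output, --dry-run) might be parsed along these lines. This is only a sketch: the function name `crawl_tool_main`, the positional URL argument, and the printed messages are assumptions, and the actual fetch is left as a placeholder.

```shell
# Hypothetical argument handling for a crawl-{provider} tool.
# A real tool script would end with: crawl_tool_main "$@"
crawl_tool_main() {
  local url="" patterns="" output="" dry_run=0
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --patterns)
        [ "$#" -ge 2 ] || { printf '%s\n' "--patterns needs a value" >&2; return 2; }
        patterns="$2"; shift 2 ;;
      --output)
        [ "$#" -ge 2 ] || { printf '%s\n' "--output needs a value" >&2; return 2; }
        output="$2"; shift 2 ;;
      --dry-run)
        dry_run=1; shift ;;
      --*)
        printf 'unknown option: %s\n' "$1" >&2; return 2 ;;
      *)
        url="$1"; shift ;;
    esac
  done
  [ -n "$url" ] || { printf '%s\n' "missing URL argument" >&2; return 2; }

  # Split the comma-separated patterns; globbing is disabled so that
  # patterns such as /**/*.html are not expanded against local files.
  local old_ifs="$IFS" pattern
  set -f
  IFS=","
  for pattern in $patterns; do
    if [ "$dry_run" -eq 1 ]; then
      printf 'would crawl: %s matching %s\n' "$url" "$pattern"
    else
      # Placeholder: a real provider would fetch and write into $output here.
      printf 'crawling: %s matching %s -> %s\n' "$url" "$pattern" "$output"
    fi
  done
  IFS="$old_ifs"
  set +f
}
```

Returning a non-zero status on unknown options or missing values lets the calling skill detect a misconfigured invocation before any crawl starts.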