Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

bjesuiter/jb-docs-scraper

Name: jb-docs-scraper
Author: bjesuiter

skills/jb-docs-scraper/SKILL.md

npx skillsauth add bjesuiter/skills jb-docs-scraper


Documentation Scraper

Scrape any documentation website into local markdown files. Uses crawl4ai for async web crawling.

Quick Start

# Scrape any documentation URL
uv run --with crawl4ai python ./references/scrape_docs.py <URL>

# Examples
uv run --with crawl4ai python ./references/scrape_docs.py https://mediasoup.org/documentation/v3/
uv run --with crawl4ai python ./references/scrape_docs.py https://docs.rombo.co/tailwind

Output goes to ./docs/<auto-detected-name>/ by default.

Prerequisites (First Time Only)

uv run --with crawl4ai playwright install

Usage

uv run --with crawl4ai python ./references/scrape_docs.py <URL> [OPTIONS]

Options

| Option | Description | Default |
|--------|-------------|---------|
| -o, --output PATH | Output directory | ./docs/<auto-detected-name> |
| --max-depth N | Maximum link depth | 6 |
| --max-pages N | Maximum pages to scrape | 500 |
| --url-pattern PATTERN | URL filter (glob) | Auto-detected |
| -q, --quiet | Suppress verbose output | False |
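As a rough illustration of how a command line with these options and defaults could be declared, here is a minimal argparse sketch. It is not the actual source of scrape_docs.py: the function name build_parser and the use of None to mean "auto-detect" are assumptions made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options table (not the real script)."""
    parser = argparse.ArgumentParser(prog="scrape_docs.py")
    parser.add_argument("url", help="Documentation URL to start from")
    # None stands in for "auto-detected" in this sketch (assumption).
    parser.add_argument("-o", "--output", metavar="PATH", default=None)
    parser.add_argument("--max-depth", type=int, default=6, metavar="N")
    parser.add_argument("--max-pages", type=int, default=500, metavar="N")
    parser.add_argument("--url-pattern", metavar="PATTERN", default=None)
    parser.add_argument("-q", "--quiet", action="store_true")
    return parser


args = build_parser().parse_args(
    ["https://example.com/docs/api/v2/", "--max-pages", "50", "--max-depth", "3"]
)
print(args.max_pages, args.max_depth, args.quiet)  # 50 3 False
```

Options that are not passed keep the defaults listed in the table, so a bare URL would crawl up to depth 6 and 500 pages.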

Examples

# Basic - scrape to ./docs/documentation_v3/
uv run --with crawl4ai python ./references/scrape_docs.py \
  https://mediasoup.org/documentation/v3/

# Custom output directory
uv run --with crawl4ai python ./references/scrape_docs.py \
  https://docs.rombo.co/tailwind \
  --output ./my-tailwind-docs

# Limit crawl scope
uv run --with crawl4ai python ./references/scrape_docs.py \
  https://tanstack.com/start/latest/docs/framework/react/overview \
  --max-pages 50 \
  --max-depth 3

# Custom URL pattern filter
uv run --with crawl4ai python ./references/scrape_docs.py \
  https://example.com/docs/api/v2/ \
  --url-pattern "*api/v2/*"
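To get a feel for what a glob such as "*api/v2/*" would keep, the sketch below filters a few made-up URLs with Python's standard fnmatch module. This assumes the script's pattern matching behaves like fnmatch-style globs, which the option table suggests but does not spell out; the URLs are hypothetical.

```python
from fnmatch import fnmatchcase

pattern = "*api/v2/*"

# Hypothetical URLs a crawler might discover
urls = [
    "https://example.com/docs/api/v2/",
    "https://example.com/docs/api/v2/users",
    "https://example.com/docs/api/v1/users",
    "https://example.com/blog/",
]

# In fnmatch globs, "*" matches any run of characters, including "/"
kept = [u for u in urls if fnmatchcase(u, pattern)]
for u in kept:
    print(u)
```

Only the two api/v2 URLs are printed. Trying a pattern this way before a crawl is a cheap check that it is neither too strict nor too loose.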

How It Works

Auto-detects domain and URL pattern from the input URL
Crawls using BFS (breadth-first search) strategy
Filters to stay within the documentation section
Converts pages to clean markdown
Saves with directory structure mirroring the URL paths
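The crawl and filter steps above can be sketched without any network access as a breadth-first walk over a small in-memory link graph. This is only an illustration of the BFS idea with depth and page limits; it does not use crawl4ai, and the LINKS graph and bfs_crawl function are hypothetical.

```python
from collections import deque
from fnmatch import fnmatchcase

# Hypothetical link graph standing in for real pages and their outgoing links
LINKS = {
    "https://example.com/docs/": [
        "https://example.com/docs/a",
        "https://example.com/docs/b",
        "https://example.com/blog/",
    ],
    "https://example.com/docs/a": [
        "https://example.com/docs/a/deep",
        "https://example.com/docs/",
    ],
}


def bfs_crawl(start, pattern, max_depth, max_pages):
    """Visit pages level by level, honoring the pattern and both limits."""
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in LINKS.get(url, []):
            if link in seen or not fnmatchcase(link, pattern):
                continue  # skip duplicates and out-of-scope URLs
            if len(visited) >= max_pages:
                return visited
            seen.add(link)
            visited.append(link)
            queue.append((link, depth + 1))
    return visited


start = "https://example.com/docs/"
print(len(bfs_crawl(start, "*example.com/docs/*", max_depth=1, max_pages=500)))  # 3
print(len(bfs_crawl(start, "*example.com/docs/*", max_depth=6, max_pages=500)))  # 4
print(len(bfs_crawl(start, "*example.com/docs/*", max_depth=6, max_pages=2)))    # 2
```

Because the queue is processed first-in, first-out, pages close to the start URL are visited before deeper ones, which is why a low --max-pages value tends to keep the top-level pages.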

Output Structure

docs/<name>/
  index.md           # Root page
  getting-started.md
  api/
    overview.md
    client.md
  guides/
    installation.md
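One plausible way to produce a layout like this is to strip the start URL's path from each page URL and append .md, with the start page itself becoming index.md. The helper below is a sketch under that assumption; url_to_markdown_path is a hypothetical name and the script's real mapping may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def url_to_markdown_path(start_url: str, page_url: str) -> PurePosixPath:
    """Hypothetical mapping from a page URL to a path under docs/<name>/."""
    root = urlparse(start_url).path.rstrip("/")
    page = urlparse(page_url).path.rstrip("/")
    # Keep only the part of the path below the start URL
    relative = page[len(root):] if page.startswith(root) else page
    relative = relative.strip("/")
    if not relative:
        return PurePosixPath("index.md")  # the root page
    return PurePosixPath(relative + ".md")


start = "https://example.com/docs/"
print(url_to_markdown_path(start, "https://example.com/docs/"))                  # index.md
print(url_to_markdown_path(start, "https://example.com/docs/getting-started/"))  # getting-started.md
print(url_to_markdown_path(start, "https://example.com/docs/api/overview"))      # api/overview.md
```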

Troubleshooting

| Issue | Solution |
|-------|----------|
| Playwright browser binaries are missing | Run uv run --with crawl4ai playwright install |
| Empty output | Check if URL pattern matches actual doc URLs. Try --url-pattern |
| Missing pages | Increase --max-depth or --max-pages |
| Wrong pages scraped | Use stricter --url-pattern |

Tips

Test first - Use --max-pages 10 to verify config before full crawl
Check output name - Script auto-detects from URL path segments
Safe to rerun - Existing files are overwritten and duplicates are skipped
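The basic example in this document maps https://mediasoup.org/documentation/v3/ to ./docs/documentation_v3/, which is consistent with joining the URL's path segments with underscores. The sketch below implements that guess. detect_output_name is a hypothetical helper, and both the host-name fallback and the result for single-segment URLs are assumptions rather than documented behavior.

```python
from urllib.parse import urlparse


def detect_output_name(url: str) -> str:
    """Hypothetical helper: derive a folder name from the URL path segments."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    if not segments:
        # Assumed fallback when the URL has no path at all
        return parsed.netloc.replace(".", "_")
    return "_".join(segments)


print(detect_output_name("https://mediasoup.org/documentation/v3/"))  # documentation_v3
print(detect_output_name("https://docs.rombo.co/tailwind"))           # tailwind
```

If the auto-detected name is not what you want, passing --output sidesteps the detection entirely.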


Use when scraping docs websites into local markdown files, crawling docs, or building AI-readable docs context.

development

Updated Jun 4, 2026

npx skillsauth add bjesuiter/skills jb-docs-scraper

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Jun 4, 2026, 4:37 AM (111.5s, 2 files scanned)




Install manually

# Clone the repo
git clone https://github.com/bjesuiter/skills.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r skills/skills/jb-docs-scraper ~/.claude/skills/

See the official Claude Code Skills documentation for the skills path.

Repository: bjesuiter/skills

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT