
mulatta/crwl-cli

Name: crwl-cli
Author: mulatta

crwl-cli/skills/SKILL.md

npx skillsauth add mulatta/skillz crwl-cli


Use

Use crwl-cli for public docs, articles, blogs, product pages, and index pages that need LLM-readable markdown or structured links. Do not use it for logged-in/private pages, login flows, clicking/typing, uploads, downloads, or browser automation.

Workflow

Choose approach before crawling:

| Situation | Approach |
| --- | --- |
| Single page (article, docs, blog post) | crwl-cli fetch URL |
| Multiple pages linked from one page (product listings, search results, index pages) | JSON links pipeline |
| Public CMS homepage with notices, menus, sliders, or portal links | --format json --scan-full-page --block-ads |
| JS-rendered content missing | Add --wait-for or --scan-full-page |
| Ad/tracker noise | Add --block-ads |
| Basic bot blocking on public page | Add --stealth --user-agent-mode random |

Never manually copy URLs from markdown output. For link discovery, crawl with --format json and extract .links with jq. Markdown links may be truncated or malformed; .links contains structured hrefs.

Core examples

# Single public page → filtered markdown
crwl-cli fetch https://docs.python.org/3/library/asyncio.html

# Limit noisy pages to main content
crwl-cli fetch https://docs.python.org/3/ --css "#content"

# Diagnose/render pipelines with structured output
crwl-cli fetch https://example.com --format json

# Raw markdown when filtered markdown misses content
crwl-cli fetch https://example.com --format raw

# JS-rendered content
crwl-cli fetch https://example.com --wait-for ".loaded"

# Quote URLs with query strings so the shell does not treat & as a control operator (or ? as a glob)
crwl-cli fetch 'https://grad.example.edu/site/index.do?epTicket=LOG&lang=en' \
  --format json --scan-full-page --block-ads

# Fast text-only crawl
crwl-cli fetch https://example.com --text-mode

Multi-step crawling

Use when a page links to multiple detail pages you need to read. Public CMS homepages often mix notices, menus, sliders, and portal/login links; extract structured links and follow only public content links.

# 1. Crawl listing/index page as JSON
crwl-cli fetch https://shop.example.com/products --format json > listing.json

# 2. Extract canonical detail URLs from .links, not .markdown
jq -r '.links.internal[] | select(.href | test("/products/")) | .href' listing.json > urls.txt

# 3. Batch crawl details
crwl-cli fetch --urls-file urls.txt --format json
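
Step 2 can be tightened before the batch crawl so that only public content links are followed. The sketch below is a hypothetical helper, filter_public_urls, and is not part of crwl-cli. It assumes the hrefs are absolute http(s) URLs, strips #fragments, drops links whose path looks like a login or account endpoint (the pattern is an assumption and should be tuned per site), and removes duplicates while keeping the original order.

```shell
#!/bin/sh
# Hypothetical helper (not part of crwl-cli): tidy a URL list before batch crawling.
# Reads URLs on stdin, one per line; writes the filtered list to stdout.
filter_public_urls() {
  sed -e 's/#.*$//' \
    | grep -E '^https?://' \
    | grep -Eiv '/(login|logout|signin|sign-in|sso|auth|account)([/?]|$)' \
    | awk '!seen[$0]++'
}

# Usage with the pipeline above (not run here):
#   jq -r '.links.internal[].href' listing.json | filter_public_urls > urls.txt
```

Filtering on the URL list rather than inside the jq expression keeps the site-specific select(...) test separate from the generic cleanup.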

--format json output includes:

{
  "url": "...",
  "success": true,
  "status_code": 200,
  "markdown": "...",
  "links": {
    "internal": [{ "href": "...", "text": "...", "title": "..." }],
    "external": [{ "href": "...", "text": "...", "title": "..." }]
  },
  "error": null
}
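
Because the JSON carries success, status_code, and error alongside markdown, a pipeline can check the result before trusting the content. A minimal sketch, assuming jq is installed and that a single-URL fetch writes one JSON object shaped like the example above; check_result is a hypothetical helper, not a crwl-cli command.

```shell
#!/bin/sh
# Hypothetical helper (not part of crwl-cli): validate one --format json result.
# Prints the markdown on success; prints the error to stderr and returns 1 otherwise.
check_result() {
  file=$1
  if jq -e '.success == true' "$file" >/dev/null; then
    jq -r '.markdown' "$file"
  else
    jq -r '"crawl failed: \(.url) status=\(.status_code) error=\(.error)"' "$file" >&2
    return 1
  fi
}

# Usage (not run here):
#   crwl-cli fetch https://example.com --format json > page.json
#   check_result page.json > page.md
```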

Options

Input

URL — single public URL to crawl.
--urls-file FILE — one URL per line. Empty lines and # comments ignored. Use for batch crawling URLs extracted from JSON links.
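
The file format described above can be previewed before a batch run. The sketch below is a hypothetical helper that prints the lines a reader would expect to be crawled, skipping blank lines and # comments. Whether crwl-cli itself tolerates leading whitespace or inline trailing comments is not stated, so this sketch only trims surrounding whitespace and treats a line as a comment when its first non-blank character is #.

```shell
#!/bin/sh
# Hypothetical helper (not part of crwl-cli): list the effective URLs in a urls file.
# Skips empty lines and lines whose first non-blank character is '#'.
effective_urls() {
  awk '{
    gsub(/^[ \t]+|[ \t]+$/, "")   # trim surrounding whitespace
    if ($0 == "" || substr($0, 1, 1) == "#") next
    print
  }' "$1"
}

# Usage (not run here):
#   effective_urls urls.txt | wc -l    # how many pages the batch would request
```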

Output

--format md|raw|json
- md default: filtered markdown for LLM reading.
- raw: unfiltered markdown for debugging missing content.
- json: structured output for pipelines and diagnostics.
--screenshot — capture a screenshot for rendering/debugging issues.

Scope / extraction

--css SELECTOR — limit extraction to a CSS selector.
--exclude-tags TAGS — comma-separated tags to exclude. Default: nav,footer,script,style.
--wait-for SELECTOR — wait for a CSS selector before extraction.
--scan-full-page — scroll through the full page before extraction; use for lazy-loaded public content.

Headless browser behavior

--text-mode — disable images for faster text-only crawls.
--block-ads — block common ad and tracker requests.
--stealth — enable Crawl4AI/Playwright stealth mode for basic bot blocking.
--user-agent-mode default|random — use default or randomized user agent.
--viewport WIDTHxHEIGHT — set viewport, e.g. 1920x1080.
--ignore-https-errors — ignore invalid TLS certificates.

Timing / cache

--timeout MS — page timeout in milliseconds. Default: 30000.
--cache — enable local cache. Default is off; use only when stale content is acceptable.

Not supported

Auth profiles or persistent browser profiles.
Cookie/session import.
Login flows or private pages, even when linked from an otherwise public page.
Clicking, typing, uploads, downloads.
Non-headless browser mode.
Arbitrary browser config passthrough.

Troubleshooting

| Problem | Try |
| --- | --- |
| Empty markdown | --format raw, --wait-for SELECTOR, or --scan-full-page |
| Too much noise | --css SELECTOR or --exclude-tags TAGS |
| Slow pages | --timeout 60000 |
| Images slow things down | --text-mode |
| Ad/tracker noise | --block-ads |
| Basic bot block | --stealth --user-agent-mode random |
| Need links | --format json, then read .links.internal[] / .external[] |
| Login required | Stop. crwl-cli is for public headless crawling only. |
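
The "Empty markdown" row can be turned into a small retry loop. The sketch below is a hypothetical wrapper, fetch_with_fallback, that tries a plain fetch, then --scan-full-page, then --format raw, and stops at the first non-empty output. It leaves out --wait-for because that flag needs a site-specific selector. The CRWL variable is an assumption added here so the binary can be swapped out; it is not a crwl-cli setting.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of crwl-cli): retry with the "Empty markdown"
# remedies from the table until some output appears.
# CRWL can point at another binary; it defaults to crwl-cli.
fetch_with_fallback() {
  url=$1
  for extra in "" "--scan-full-page" "--format raw"; do
    # $extra is intentionally unquoted: an empty entry adds no argument,
    # and "--format raw" splits into two arguments.
    out=$("${CRWL:-crwl-cli}" fetch "$url" $extra 2>/dev/null) || out=""
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  echo "no content for $url; try --wait-for SELECTOR or check whether the page needs a login" >&2
  return 1
}

# Usage (not run here):
#   fetch_with_fallback https://example.com > page.md
```

If every attempt comes back empty, the table's last row still applies: a page that requires login is out of scope, and the wrapper should not be extended to work around it.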


Headless crawler for public web pages. Use to extract clean markdown, structured links, and batch crawl docs/articles.

tools

Updated May 22, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add mulatta/skillz crwl-cli

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 22, 2026, 6:23 AM; 164.2s; 1 file scanned




Install manually

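# Clone the repo
git clone https://github.com/mulatta/skillz.git

# Copy into the Claude Code skills folder (global), one directory per skill,
# so the file lands at ~/.claude/skills/crwl-cli/SKILL.md
mkdir -p ~/.claude/skills/crwl-cli
cp -r skillz/crwl-cli/skills/. ~/.claude/skills/crwl-cli/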

Claude Code Skills — official skills path docs.

Repository: mulatta/skillz

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT