
openclaw/web-reader-pro

Name: web-reader-pro
Author: openclaw

0xcjl/web-reader-pro/SKILL.md

npx skillsauth add openclaw/skills web-reader-pro


Web Reader Pro - OpenClaw Skill

Overview

Web Reader Pro is an advanced web content extraction skill for OpenClaw that uses a multi-tier fallback strategy with intelligent routing, caching, and quality assessment.

Features

1. Three-Tier Fallback Strategy

Tier 1: Jina Reader API - Fast, reliable, best for most websites
Tier 2: Scrapling + Playwright - Dynamic content rendering for JS-heavy sites
Tier 3: WebFetch Fallback - Basic extraction for simple pages
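
The tier order above can be sketched as a simple chain that tries each extractor in turn. This is a minimal illustration, not the skill's actual code: `fetch_with_fallback`, the `Extractor` signature, and the three stand-in extractors are hypothetical and make no network calls.

```python
from typing import Callable, Optional

# Hypothetical extractor signature: takes a URL, returns a result dict or None.
Extractor = Callable[[str], Optional[dict]]


def fetch_with_fallback(url: str, tiers: list[tuple[str, Extractor]]) -> dict:
    """Try each tier in order and return the first successful result."""
    errors = {}
    for name, extractor in tiers:
        try:
            result = extractor(url)
        except Exception as exc:  # a failing tier must not abort the chain
            errors[name] = str(exc)
            continue
        if result is not None:
            return {**result, "url": url, "tier_used": name}
        errors[name] = "no content"
    raise RuntimeError(f"all tiers failed for {url}: {errors}")


# Stand-in extractors for illustration only; they perform no network access.
def fake_jina(url: str) -> Optional[dict]:
    raise TimeoutError("simulated Jina timeout")


def fake_scrapling(url: str) -> Optional[dict]:
    return {"title": "Example", "content": "rendered text"}


def fake_webfetch(url: str) -> Optional[dict]:
    return {"title": "", "content": "raw text"}


TIERS = [("jina", fake_jina), ("scrapling", fake_scrapling), ("webfetch", fake_webfetch)]
result = fetch_with_fallback("https://example.com", TIERS)
print(result["tier_used"])  # scrapling
```

Keeping the tiers in an ordered list means the sequence can be reordered per request without touching the loop.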

2. Jina Quota Monitoring

Tracks API call count with persistent counter
Warning alerts when approaching quota limits
Automatic fallback to lower-tier methods when quota exhausted
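
A persistent counter of this kind might look like the sketch below. The class name `QuotaCounter`, the JSON file layout, and the 80% warning ratio are assumptions made for illustration; the skill's real storage format is not documented here.

```python
import json
import tempfile
from pathlib import Path


class QuotaCounter:
    """Persist a call count to a JSON file and report usage against a limit."""

    def __init__(self, path: Path, limit: int = 100_000, warn_ratio: float = 0.8):
        self.path = path
        self.limit = limit
        self.warn_ratio = warn_ratio  # assumed warning threshold, not from the skill
        self.count = 0
        if path.exists():
            self.count = json.loads(path.read_text()).get("count", 0)

    def record_call(self) -> None:
        """Increment the counter and write it back to disk immediately."""
        self.count += 1
        self.path.write_text(json.dumps({"count": self.count}))

    def status(self) -> dict:
        ratio = self.count / self.limit
        return {
            "count": self.count,
            "limit": self.limit,
            "percentage": round(ratio * 100, 2),
            "warning": ratio >= self.warn_ratio,
            "exhausted": self.count >= self.limit,
        }


counter_file = Path(tempfile.mkdtemp()) / "jina_quota.json"
counter = QuotaCounter(counter_file, limit=10)
for _ in range(9):
    counter.record_call()

# A new instance reads the persisted count back from disk.
reloaded = QuotaCounter(counter_file, limit=10)
print(reloaded.status())
```

A caller could check `status()["exhausted"]` before each Tier 1 request and skip straight to the next tier when it is true.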

3. Smart Cache Layer

Short-term caching (configurable TTL, default 1 hour)
Cache key based on URL hash
Reduces redundant API calls
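
As a rough sketch of a URL-hash cache with a TTL: the example below assumes SHA-256 for the key and one JSON file per entry, neither of which is confirmed by the skill's documentation. `FileCache` and `cache_key` are hypothetical names.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path
from typing import Optional


def cache_key(url: str) -> str:
    """Derive a filesystem-safe key from the URL (SHA-256 is an assumed choice)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


class FileCache:
    def __init__(self, directory: Path, ttl: int = 3600):
        self.directory = directory
        self.ttl = ttl
        directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url: str) -> Path:
        return self.directory / f"{cache_key(url)}.json"

    def get(self, url: str, now: Optional[float] = None) -> Optional[dict]:
        path = self._path(url)
        if not path.exists():
            return None
        entry = json.loads(path.read_text())
        now = time.time() if now is None else now
        if now - entry["stored_at"] > self.ttl:
            path.unlink()  # expired: drop the stale entry
            return None
        return entry["result"]

    def set(self, url: str, result: dict, now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        entry = {"stored_at": stored_at, "result": result}
        self._path(url).write_text(json.dumps(entry))


cache = FileCache(Path(tempfile.mkdtemp()), ttl=3600)
cache.set("https://example.com", {"title": "Example"}, now=1000.0)
fresh = cache.get("https://example.com", now=2000.0)    # within the TTL
expired = cache.get("https://example.com", now=5000.0)  # 4000 s later, past the TTL
print(fresh, expired)
```

The optional `now` argument exists only so that expiry can be demonstrated without sleeping.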

4. Extraction Quality Scoring

Scores based on: word count, title detection, content density
Minimum quality threshold (default: 200 words + valid title)
Auto-escalation to next tier if quality below threshold
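
One way to combine the three signals into a score is sketched here. The 50/20/30 weighting and the definition of density (extracted length divided by raw payload length) are illustrative assumptions; only the 200-word-plus-title threshold comes from the description above.

```python
def quality_score(title: str, content: str, raw_length: int, min_words: int = 200) -> int:
    """Return a 0-100 score; the 50/20/30 weighting is an illustrative assumption."""
    words = len(content.split())
    word_part = min(words / min_words, 1.0) * 50
    title_part = 20 if title.strip() else 0
    # Content density: share of the raw payload that survived extraction.
    density = len(content) / raw_length if raw_length else 0.0
    density_part = min(density, 1.0) * 30
    return round(word_part + title_part + density_part)


def passes_threshold(title: str, content: str, min_words: int = 200) -> bool:
    """Documented default gate: at least min_words words plus a non-empty title."""
    return len(content.split()) >= min_words and bool(title.strip())


article = "word " * 250  # 250 words of placeholder text
score = quality_score("Example title", article, raw_length=2 * len(article))
print(score)  # 85
print(passes_threshold("Example title", article))  # True
print(passes_threshold("", article))               # False
```

A result that fails `passes_threshold` would be the trigger for escalating to the next tier.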

5. Domain-Level Routing Learning

Learns optimal extraction tier per domain
Persists learned routes in local JSON database
Adapts based on historical success rates
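
The learning step could be approximated by counting successes and attempts per domain and tier, as in this sketch. `RouteLearner`, the JSON schema, and the `min_attempts` cutoff are hypothetical; the skill's actual database layout may differ.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


class RouteLearner:
    """Track per-domain success and attempt counts for each tier in a JSON file."""

    def __init__(self, db_path: Path):
        self.db_path = db_path
        self.stats = json.loads(db_path.read_text()) if db_path.exists() else {}

    def record(self, url: str, tier: str, success: bool) -> None:
        domain = urlparse(url).netloc
        tier_stats = self.stats.setdefault(domain, {}).setdefault(
            tier, {"success": 0, "attempts": 0}
        )
        tier_stats["attempts"] += 1
        tier_stats["success"] += int(success)
        self.db_path.write_text(json.dumps(self.stats, indent=2))

    def best_tier(self, url: str, min_attempts: int = 2) -> Optional[str]:
        """Return the tier with the highest success rate, or None if undecided."""
        tiers = self.stats.get(urlparse(url).netloc, {})
        rated = {
            name: s["success"] / s["attempts"]
            for name, s in tiers.items()
            if s["attempts"] >= min_attempts
        }
        return max(rated, key=rated.get) if rated else None


learner = RouteLearner(Path(tempfile.mkdtemp()) / "routes.json")
for _ in range(2):
    learner.record("https://spa.example.com/a", "jina", success=False)
    learner.record("https://spa.example.com/b", "scrapling", success=True)

print(learner.best_tier("https://spa.example.com/c"))  # scrapling
print(learner.best_tier("https://unknown.example.org"))  # None
```

Requiring a minimum number of attempts keeps a single lucky or unlucky fetch from fixing the route for a whole domain.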

6. Retry with Exponential Backoff

Configurable max retries per tier (default: 3)
Exponential backoff: 1s, 2s, 4s, 8s...
Respects rate limits and transient failures
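
The backoff schedule above (1s, 2s, 4s, ...) corresponds to a delay of `base_delay * 2**attempt`. The helper below is a minimal sketch of that pattern; `retry_with_backoff` and its injectable `sleep` parameter are hypothetical and not part of the skill's API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func; on failure wait base_delay * 2**attempt, up to max_retries retries."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Demonstration with a function that fails twice, then succeeds.
delays: list[float] = []
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Passing delays.append as the sleep function records waits instead of sleeping.
outcome = retry_with_backoff(flaky, sleep=delays.append)
print(outcome, delays)  # ok [1.0, 2.0]
```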

Installation

# Install dependencies
pip install -r requirements.txt

# Install Scrapling (requires Node.js)
./scripts/install_scrapling.sh

# Or install Scrapling manually
npm install -g @scrapinghub/scrapling

Usage

Basic Usage

from scripts.web_reader_pro import WebReaderPro

reader = WebReaderPro()
result = reader.fetch("https://example.com")
print(result['title'])
print(result['content'])

Advanced Configuration

from scripts.web_reader_pro import WebReaderPro

reader = WebReaderPro(
    jina_api_key="your-jina-key",               # Optional; can also be set via the JINA_API_KEY env var
    cache_ttl=3600,                             # Cache TTL in seconds (default: 3600)
    quality_threshold=200,                      # Min word count for quality (default: 200)
    max_retries=3,                              # Max retries per tier (default: 3)
    enable_learning=True,                       # Enable domain learning (default: True)
    scrapling_path="/usr/local/bin/scrapling",  # Path to the scrapling binary
)

Result Format

{
    "title": "Page Title",
    "content": "Extracted content in markdown...",
    "url": "https://example.com",
    "tier_used": "jina|scrapling|webfetch",
    "quality_score": 85,
    "cached": False,
    "domain_learned_tier": "jina",
    "extracted_at": "2024-01-01T00:00:00Z"
}
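
For callers that want to check a result against the documented shape, a `TypedDict` is one option. This is a sketch only: `FetchResult` and `missing_fields` are hypothetical names, and the field types are inferred from the sample above rather than taken from the skill's source.

```python
from typing import TypedDict


class FetchResult(TypedDict):
    title: str
    content: str
    url: str
    tier_used: str
    quality_score: int
    cached: bool
    domain_learned_tier: str
    extracted_at: str


def missing_fields(result: dict) -> list[str]:
    """List the documented keys that are absent from a result dict."""
    return [key for key in FetchResult.__annotations__ if key not in result]


sample = {
    "title": "Page Title",
    "content": "Extracted content in markdown...",
    "url": "https://example.com",
    "tier_used": "jina",
    "quality_score": 85,
    "cached": False,
    "domain_learned_tier": "jina",
    "extracted_at": "2024-01-01T00:00:00Z",
}
print(missing_fields(sample))  # []
print(missing_fields({"title": "Page Title", "content": "..."}))
```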

Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| JINA_API_KEY | Jina Reader API key | Required for Tier 1 |
| WEB_READER_CACHE_DIR | Cache directory path | ~/.openclaw/cache/web-reader-pro/ |
| WEB_READER_LEARNING_DB | Learning database path | ~/.openclaw/data/web-reader-pro/routes.json |
| WEB_READER_JINA_QUOTA | Jina quota limit | 100000 |
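
Reading these variables with their documented defaults might look like the following sketch. The `Settings` dataclass and `load_settings` function are hypothetical; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    jina_api_key: Optional[str]
    cache_dir: Path
    learning_db: Path
    jina_quota: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping, applying the documented defaults."""
    cache_dir = env.get("WEB_READER_CACHE_DIR", "~/.openclaw/cache/web-reader-pro/")
    learning_db = env.get(
        "WEB_READER_LEARNING_DB", "~/.openclaw/data/web-reader-pro/routes.json"
    )
    return Settings(
        jina_api_key=env.get("JINA_API_KEY"),  # None means Tier 1 is unavailable
        cache_dir=Path(os.path.expanduser(cache_dir)),
        learning_db=Path(os.path.expanduser(learning_db)),
        jina_quota=int(env.get("WEB_READER_JINA_QUOTA", "100000")),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
settings = load_settings({"WEB_READER_JINA_QUOTA": "500"})
print(settings.jina_quota, settings.learning_db.name)  # 500 routes.json
```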

API Reference

WebReaderPro.fetch(url, force_refresh=False)

Fetch and extract content from a URL.

Parameters:

url (str): Target URL
force_refresh (bool): Bypass cache if True

Returns: Dict with title, content, metadata

WebReaderPro.fetch_with_tier(url, preferred_tier)

Fetch using a specific tier (bypassing automatic selection).

Parameters:

url (str): Target URL
preferred_tier (str): "jina", "scrapling", or "webfetch"

WebReaderPro.get_jina_status()

Get current Jina API quota usage.

Returns: Dict with count, limit, percentage, warnings

WebReaderPro.clear_cache(url=None)

Clear cache for specific URL or all URLs.

Parameters:

url (str, optional): Specific URL to clear, or None for all

WebReaderPro.get_domain_routes()

Get learned domain-to-tier mappings.

Returns: Dict of domain -> preferred tier

Tier Comparison

| Tier | Speed | JS Rendering | Best For | Cost |
|------|-------|--------------|----------|------|
| Jina | Fast | No | Static pages, articles | API calls |
| Scrapling | Medium | Yes | SPAs, dynamic content | CPU |
| WebFetch | Fastest | No | Simple pages, fallbacks | Free |

License

MIT


Advanced web content extraction skill for OpenClaw that uses a multi-tier fallback strategy (Jina → Scrapling → WebFetch) with intelligent routing, caching, quality scoring, and domain learning. Use it when reading article content, extracting web page text, scraping dynamic JS-heavy pages, or fetching WeChat official account articles.

3,729 stars

development

Updated Apr 10, 2026

Install globally

npx skillsauth add openclaw/skills web-reader-pro

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status |
|---------|-------------|--------|
| Trivy | Container and dependency vulnerability scanner | Clean (95%) |
| Semgrep | Static code analysis for vulnerabilities | Clean (95%) |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean (95%) |
| Snyk (dep) | Open source security scanning | Skipped (50%) |
| Socket.dev | Supply chain security analysis | Skipped (50%) |
| VirusTotal | Multi-engine malware detection | Skipped (50%) |
| CrowdStrike | Advanced threat intelligence | Skipped (50%) |
| OSV-Scanner | Open Source Vulnerability database check | Skipped (50%) |
| OWASP Dep-Check | | Skipped (50%) |

Last scanned: Apr 3, 2026, 12:33 AM · 232.0s · 1 file scanned


Related Skills

openclaw/mcdonalds-skill (tools)
Verified · Trusted · Community
Use when the user wants to connect to, test, or use the McDonalds service at mcp.mcd.cn, including checking authentication, probing MCP endpoints, listing tools, or calling McDonalds MCP tools through a reusable local CLI.
3,962 · SKILL.md · Updated Apr 10, 2026

openclaw/scrapebadger (development)
Verified · Trusted · Community
Web scraping platform: Twitter/X data, Vinted marketplace, and general web scraping API.
3,962 · SKILL.md · Updated Apr 10, 2026

openclaw/slowmist-security-cc (development)
Verified · Trusted · Community
SlowMist AI Agent Security Review: comprehensive security framework for skills, repositories, URLs, on-chain addresses, and products (Claude Code version).
3,962 · SKILL.md · Updated Apr 10, 2026

openclaw/humanizer-cn (data-ai)
Verified · Trusted · Community
Removes traces of AI writing from Chinese text so that it reads naturally. Based on Wikipedia's guide to the signs of AI writing; detects 24 AI patterns. Trigger phrases: humanizer-cn, 去除 AI 痕迹 (remove AI traces), 去除 AI 写作痕迹 (remove AI writing traces), 中文文本人性化 (humanize Chinese text).
3,962 · SKILL.md · Updated Apr 10, 2026

Install manually

# Clone the repo
git clone https://github.com/openclaw/skills.git

# Copy into Claude Code skills folder (global)
cp -r skills/0xcjl/web-reader-pro ~/.claude/skills/


Repository

openclaw/skills

3,729 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT