skills/capabilities/web-archive-scraper/SKILL.md
Search the Wayback Machine for archived versions of websites. Extract cached pages, customer lists, testimonials, and partner directories from sites that have changed or gone offline. Uses the free CDX API — no API key needed.
npx skillsauth add athina-ai/goose-skills web-archive-scraper
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Search the Wayback Machine (Internet Archive) for archived snapshots of websites. Fetch cached page content to find customer lists, testimonials, partner directories, and other information from sites that have changed or shut down.
The only dependency is the Python requests library. No API key is needed.
# Find all snapshots of a URL
python3 skills/web-archive-scraper/scripts/search_archive.py \
--url "https://botkeeper.com/customers"
# Search with date range
python3 skills/web-archive-scraper/scripts/search_archive.py \
--url "https://botkeeper.com" --from 2025-01-01 --to 2026-02-01
# Search all pages under a domain (prefix match)
python3 skills/web-archive-scraper/scripts/search_archive.py \
--url "https://botkeeper.com" --match prefix --limit 50
# Fetch the actual archived page content
python3 skills/web-archive-scraper/scripts/search_archive.py \
--url "https://botkeeper.com/customers" --fetch
# Output formats
python3 skills/web-archive-scraper/scripts/search_archive.py --url URL --output json
python3 skills/web-archive-scraper/scripts/search_archive.py --url URL --output csv
python3 skills/web-archive-scraper/scripts/search_archive.py --url URL --output summary
How it works:
- Queries web.archive.org/cdx/search/cdx for snapshots matching the URL
- Fetches archived page content when requested (uses the id_ modifier to skip the Wayback toolbar)

| Flag | Default | Description |
|------|---------|-------------|
| --url | required | Target URL to search in the archive |
| --match | exact | Match type: exact, prefix, host, domain |
| --from | none | Start date (YYYY-MM-DD) |
| --to | none | End date (YYYY-MM-DD) |
| --limit | 25 | Max number of snapshots to return |
| --fetch | false | Fetch and display the content of the most recent snapshot |
| --fetch-all | false | Fetch content of ALL matched snapshots (use with small --limit) |
| --status | 200 | HTTP status filter (set to "any" to include all) |
| --output | json | Output format: json, csv, summary |
| --collapse | day | Dedup level: none, day, month, year |
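
To make the flag table concrete, here is a minimal sketch of how these flags could translate into a CDX query string. The function name `build_cdx_url` and the `COLLAPSE_DIGITS` mapping are hypothetical and are not taken from `search_archive.py`. The query parameters themselves (`url`, `matchType`, `from`, `to`, `limit`, `filter`, `collapse`, `output`) are standard CDX server parameters. The script's actual mapping may differ.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Assumed mapping from --collapse levels to timestamp prefix lengths.
# CDX timestamps look like YYYYMMDDhhmmss: 4 digits = year, 6 = month, 8 = day.
COLLAPSE_DIGITS = {"year": 4, "month": 6, "day": 8}


def build_cdx_url(url, match="exact", date_from=None, date_to=None,
                  limit=25, status="200", collapse="day"):
    """Return a CDX query URL for the given flag values (illustrative only)."""
    params = [
        ("url", url),
        ("matchType", match),   # exact, prefix, host, or domain
        ("output", "json"),
        ("limit", str(limit)),
    ]
    if date_from:
        # YYYY-MM-DD becomes the YYYYMMDD timestamp prefix that CDX accepts
        params.append(("from", date_from.replace("-", "")))
    if date_to:
        params.append(("to", date_to.replace("-", "")))
    if status != "any":
        params.append(("filter", f"statuscode:{status}"))
    if collapse in COLLAPSE_DIGITS:
        params.append(("collapse", f"timestamp:{COLLAPSE_DIGITS[collapse]}"))
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_cdx_url("https://botkeeper.com", match="prefix", limit=50))
```

Building the URL in a pure function keeps the flag handling easy to inspect without touching the network.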
{
"url": "https://botkeeper.com/customers",
"timestamp": "20250915143022",
"datetime": "2025-09-15T14:30:22",
"status_code": "200",
"mime_type": "text/html",
"archive_url": "https://web.archive.org/web/20250915143022/https://botkeeper.com/customers",
"raw_url": "https://web.archive.org/web/20250915143022id_/https://botkeeper.com/customers",
"content": "..."
}
The content field is only populated when --fetch or --fetch-all is used.
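
As a rough illustration of how a record like the one above could be assembled, the sketch below converts CDX JSON output into record dictionaries. With `output=json`, the CDX server returns a list of rows whose first row holds the field names. The helper name `rows_to_records` is hypothetical, and the script's real implementation may differ.

```python
from datetime import datetime

WAYBACK = "https://web.archive.org/web"


def rows_to_records(cdx_rows):
    """Convert CDX JSON output (header row plus data rows) into record dicts."""
    if not cdx_rows:
        return []
    header, *rows = cdx_rows
    records = []
    for row in rows:
        fields = dict(zip(header, row))
        ts, original = fields["timestamp"], fields["original"]
        records.append({
            "url": original,
            "timestamp": ts,
            "datetime": datetime.strptime(ts, "%Y%m%d%H%M%S").isoformat(),
            "status_code": fields["statuscode"],
            "mime_type": fields["mimetype"],
            "archive_url": f"{WAYBACK}/{ts}/{original}",
            # The id_ suffix asks Wayback for the original page without its toolbar
            "raw_url": f"{WAYBACK}/{ts}id_/{original}",
        })
    return records


if __name__ == "__main__":
    sample = [
        ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
        ["com,botkeeper)/customers", "20250915143022",
         "https://botkeeper.com/customers", "text/html", "200", "ABC", "1234"],
    ]
    print(rows_to_records(sample)[0]["raw_url"])
```

The `digest` and `length` values in the sample row are placeholders, not real archive data.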
Free. The Wayback Machine CDX API requires no authentication or API key. Rate limit is ~15 requests/minute.
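
When fetching many snapshots (for example with --fetch-all), one way to stay near that rate is to enforce a minimum gap between requests. The sketch below is an assumption about how a caller might do this, not a description of what the script does. The `Throttle` class is hypothetical, and the 15 per minute figure is taken from the note above.

```python
import time


class Throttle:
    """Enforce a minimum gap of 60 / per_minute seconds between calls."""

    def __init__(self, per_minute=15, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / per_minute
        self._clock = clock      # injectable so the logic can be tested offline
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(per_minute=15)
    for i in range(3):
        throttle.wait()          # call before each archive request
        print("request", i)
```

Calling `throttle.wait()` before each request spaces calls about four seconds apart at the default setting.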