skills/document-hunter/SKILL.md
Searches and retrieves documents from free public sources using automated browser navigation. Use when research needs primary source documents like court filings, government reports, or public records.
npx skillsauth add bitwize-music-studio/claude-ai-music-skills document-hunter
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Input: $ARGUMENTS
You are an automated document hunter using browser automation (Playwright) to systematically search and download primary source documents from free public archives.
When invoked:
You automate the tedious work of hunting down primary source documents across multiple free public archives.
Important Disclaimers:
- Requires Playwright (pip install playwright && playwright install chromium)

| Source | URL | Best For |
|--------|-----|----------|
| DocumentCloud | documentcloud.org | PACER docs journalists uploaded |
| CourtListener | courtlistener.com | RECAP crowdsourced documents |
| Scribd | scribd.com | User-uploaded court docs |
| Justia | justia.com | Appellate opinions |
| DOJ | justice.gov | Indictments, press releases |
| SEC | sec.gov/litigation | Complaints, settlements |
See site-patterns.md for automation strategies for each source.
⚠️ Primary source PDFs should NOT be committed to Git (too large)
PDFs go to {documents_root}/artists/[artist]/albums/[genre]/[album]/ (mirrored structure from content_root).
{documents_root}/artists/[artist]/albums/[genre]/[album]/
├── indictment.pdf
├── plea-agreement.pdf
└── manifest.json
# Primary source PDFs - too large for Git
*.pdf
primary-sources/
# Check Playwright
pip list | grep playwright
# Install if needed
pip install playwright beautifulsoup4 requests
playwright install chromium
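The same check can be done from inside a generated Python script before any browser is launched. This is a minimal sketch; the function name `playwright_available` is hypothetical, and it only confirms the Python package is importable, not that the Chromium browser binary has been downloaded.

```python
import importlib.util


def playwright_available() -> bool:
    """Return True if the `playwright` package can be found in this environment.

    Note: this does not verify that `playwright install chromium` has been run.
    """
    return importlib.util.find_spec("playwright") is not None


if __name__ == "__main__":
    if playwright_available():
        print("Playwright package found")
    else:
        print("Playwright missing: pip install playwright && playwright install chromium")
```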
Resolve document storage path:
- Call resolve_path("documents", album_slug), which returns {documents_root}/artists/{artist}/albums/{genre}/{album}/
- Create the directory: mkdir -p {resolved_path}

Generate and run a Python script that:
See site-patterns.md for code templates.
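The path-resolution and directory-creation step could look roughly like the sketch below. It assumes the path layout shown in this document; `resolve_documents_path` and `ensure_documents_dir` are hypothetical helpers written for illustration and are not the plugin's actual `resolve_path` implementation.

```python
from pathlib import Path


def resolve_documents_path(documents_root: str, artist: str, genre: str, album: str) -> Path:
    """Build {documents_root}/artists/{artist}/albums/{genre}/{album}/ (hypothetical helper)."""
    return Path(documents_root) / "artists" / artist / "albums" / genre / album


def ensure_documents_dir(path: Path) -> Path:
    """Equivalent of `mkdir -p`: create the directory and any missing parents."""
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Using `exist_ok=True` keeps the step idempotent, so re-running a hunt for the same album does not fail on an existing directory.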
DOCUMENT HUNT COMPLETE
======================
Case: [case name]
Date: [date]
DOCUMENTS FOUND: X
- documentcloud_indictment.pdf (2.3 MB) - DocumentCloud
- courtlistener_complaint.pdf (1.1 MB) - CourtListener
- doj_press_release.pdf (0.5 MB) - DOJ
SOURCES SEARCHED:
✓ DocumentCloud - 3 documents
✓ CourtListener - 1 document
✓ Scribd - 0 documents
✓ DOJ - 1 document
⚠ SEC - blocked (use DOJ alternative)
STILL NEEDED:
- Trial transcript (not found in free sources)
- Sentencing memo (may require PACER)
MANIFEST: {documents_root}/artists/[artist]/albums/[genre]/[album]/manifest.json
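A report in the layout above could be assembled from the hunt results along these lines. This is a sketch under assumptions: the `FoundDocument` class, the `format_report` function, and the shape of `source_status` (source name mapped to an ok flag and a detail string) are all hypothetical and not taken from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class FoundDocument:
    filename: str
    size_bytes: int
    source: str


def format_report(case_name, date, documents, source_status, still_needed, manifest_path):
    """Render the summary text; source_status maps name -> (ok, detail)."""
    lines = [
        "DOCUMENT HUNT COMPLETE",
        "======================",
        f"Case: {case_name}",
        f"Date: {date}",
        f"DOCUMENTS FOUND: {len(documents)}",
    ]
    for doc in documents:
        size_mb = doc.size_bytes / 1_000_000  # decimal megabytes
        lines.append(f"- {doc.filename} ({size_mb:.1f} MB) - {doc.source}")
    lines.append("SOURCES SEARCHED:")
    for name, (ok, detail) in source_status.items():
        mark = "✓" if ok else "⚠"
        lines.append(f"{mark} {name} - {detail}")
    lines.append("STILL NEEDED:")
    for item in still_needed:
        lines.append(f"- {item}")
    lines.append(f"MANIFEST: {manifest_path}")
    return "\n".join(lines)
```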
The RECAP browser extension crowdsources PACER documents.
What it does:
Location: ${CLAUDE_PLUGIN_ROOT}/tools/extensions/recap-extension/
Setup:
cd tools/extensions
curl -L "https://github.com/freelawproject/recap-chrome/releases/download/2.8.6/chrome-release.zip" -o recap.zip
unzip recap.zip -d recap-extension
rm recap.zip
In {documents_root}/artists/[artist]/albums/[genre]/[album]/ (not in git):
{documents_root}/artists/[artist]/albums/[genre]/[album]/
├── manifest.json # Complete catalog with metadata
├── documentcloud_*.pdf # From DocumentCloud
├── courtlistener_*.pdf # From CourtListener
├── doj_*.pdf # From DOJ
└── download-documents.py # Reproducibility script
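The source-prefixed filenames in the tree above suggest a simple naming rule. The sketch below is one way to produce such names; the exact slug rule (lowercase, non-alphanumerics collapsed to underscores) and the function name `source_filename` are assumptions inferred from the example filenames, not something the skill specifies.

```python
import re


def source_filename(source: str, title: str) -> str:
    """Build a '<source>_<title-slug>.pdf' name (assumed convention)."""
    prefix = re.sub(r"[^a-z0-9]+", "", source.lower())
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"{prefix}_{slug}.pdf"
```

Prefixing by source keeps documents with the same title from different archives from overwriting each other.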
In {content_root}/.../[album]/SOURCES.md (in git):
PDF: {documents_root}/artists/[artist]/albums/[genre]/[album]/indictment.pdf

{
"case_name": "Dorr et al. v. USIA",
"search_date": "2025-01-23T12:00:00",
"sources_searched": ["DocumentCloud", "CourtListener", "DOJ"],
"documents_found": [
{
"source": "DocumentCloud",
"title": "Great Molasses Flood Investigation",
"filename": "documentcloud_molasses_investigation.pdf",
"url": "https://...",
"size": 2400000
}
]
}
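A manifest with these fields could be written as in the following sketch. The field names come from the example above; the helper names `build_manifest` and `write_manifest`, and the choice to pass documents as plain dicts, are assumptions made for illustration.

```python
import json
from datetime import datetime
from pathlib import Path

DOCUMENT_KEYS = ("source", "title", "filename", "url", "size")


def build_manifest(case_name, sources_searched, documents, search_date=None):
    """Assemble the manifest dict; each document is a dict with DOCUMENT_KEYS."""
    when = search_date or datetime.now()
    return {
        "case_name": case_name,
        # Second precision, matching the example timestamp format
        "search_date": when.replace(microsecond=0).isoformat(),
        "sources_searched": list(sources_searched),
        "documents_found": [{key: doc[key] for key in DOCUMENT_KEYS} for doc in documents],
    }


def write_manifest(directory: Path, manifest: dict) -> Path:
    """Write manifest.json into the album's documents directory."""
    path = Path(directory) / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return path
```

Copying only the known keys means a missing field raises a `KeyError` early instead of producing an incomplete manifest.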