skills/deep-research/SKILL.md
Autonomous deep OSINT research engine for the Bingaman/McSherry guardianship case. Builds intelligence reports through iterative "waves" of web searches, Apify scraping, cross-referencing, and lead generation. Each wave produces new findings that generate leads for the next wave. Uses 132+ Apify actors across 9 categories, 7 OSINT MCP tools (Sherlock, Maigret, Holehe, TheHarvester, SpiderFoot, GHunt, Blackbird), CourtListener API, Firecrawl, and the full osint-skill social media arsenal (55+ actors for Instagram, Facebook, TikTok, YouTube, Google Maps).

ALWAYS trigger this skill when Levi says ANY of these (including typos):

- "deep research" / "deep reserach" / "deep reseach"
- "continue deep research" / "start deep research" / "resume research"
- "keep researching" / "run more waves" / "next wave"
- "continue OSINT" / "continue scorched earth OSINT" / "continue operation scorched earth"
- "scrape" / "investigate" / "dig into"
- "search for" / "search the web" / "look on the web" / "look online"
- "find out about" / "find info on" / "find information"
- "look into" / "look up" / "lookup"
- "research" / "reserach" / "reasearch" / "reseach"
- "what can you find on" / "what do we know about"
- "who is" (followed by a person name in the investigation)
- "check on" / "pull up" / "look at" (when referring to a person, org, or entity)
- "web search" / "online search" / "google"
- "osint" / "intel" / "intelligence"
- "apify" / "scrape" / "crawl"
- "court records" / "courtlistener" / "court listener"
- "deep research scrape" / "full arsenal" / "hit it with everything"
- "username search" / "email search" / "sherlock" / "maigret" / "holehe"

This skill is designed for COLD START recovery. Even with a brand new PC, a fresh Claude Code install, no saved sessions -- if the data files exist, this skill tells Claude exactly how to pick up where it left off.
npx skillsauth add ValorInvestigator/claude-plugin-toolkit deep-research

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When Levi starts deep research, he is LEAVING. Run wave after wave autonomously until:
KEEP GOING UNTIL YOU CAN'T.
NEVER use the apify-slash-rag-web-browser tool (also called mcp__Apify__apify-slash-rag-web-browser).
This tool has been REMOVED from the MCP server configuration. It is a weak, general-purpose Google search + scrape that:
If you find yourself about to call apify-slash-rag-web-browser, STOP and do this instead:
- call-actor with apify/google-search-scraper (SERP results, 10x more queries)
- call-actor with apify/website-content-crawler (clean markdown)
- call-actor with apify/playwright-scraper (headless browser)
- call-actor with jirimoravcik/pdf-text-extractor (free)
- call-actor with easyapi/google-news-scraper (structured news)
- call-actor with automation-lab/court-records-scraper
- WebFetch built-in tool (no Apify needed)

All schemas are in references/SCHEMAS.md -- no fetch-actor-details needed for these 10 actors.
Before scraping ANY new target type, you MUST follow this 3-step pipeline:
Step 1 -- DISCOVER: Run search-actors with keywords related to the target
Example: search-actors with keywords "Oregon court records" or "nonprofit 990"
Step 2 -- VALIDATE: Run fetch-actor-details on the best match to get input schema
Example: fetch-actor-details for "automation-lab/court-records-scraper"
Step 3 -- EXECUTE: Run call-actor with the validated schema
Example: call-actor with the exact input format from Step 2
Skip Steps 1-2 ONLY for actors already in the QUICK ACTOR REFERENCE table below (schemas pre-built).
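The discover / validate / execute rule above, including the skip for pre-built schemas, can be sketched roughly as follows. This is an illustration only: the `tools` mapping stands in for the three MCP tools as plain callables, and the schema shape (`{"required": [...]}`), the function name `scrape_target`, and its parameters are all assumptions, not the real MCP interface.

```python
from typing import Any, Callable, Dict, List, Optional


def scrape_target(
    keywords: str,
    payload: Dict[str, Any],
    tools: Dict[str, Callable[..., Any]],
    prebuilt_schemas: Dict[str, Dict[str, Any]],
    known_actor: Optional[str] = None,
) -> Any:
    """Run the 3-step pipeline, skipping Steps 1-2 for pre-built actors."""
    if known_actor is not None and known_actor in prebuilt_schemas:
        # Actor is in the quick reference table: schema is already known.
        actor = known_actor
        schema = prebuilt_schemas[known_actor]
    else:
        # Step 1 -- DISCOVER: find candidate actors for this target type.
        candidates: List[str] = tools["search-actors"](keywords)
        if not candidates:
            raise LookupError(f"no actor found for {keywords!r}")
        actor = candidates[0]
        # Step 2 -- VALIDATE: fetch the input schema before calling anything.
        schema = tools["fetch-actor-details"](actor)

    # Refuse to execute with a payload that does not satisfy the schema
    # (hypothetical schema shape: a list of required field names).
    missing = [key for key in schema.get("required", []) if key not in payload]
    if missing:
        raise ValueError(f"payload for {actor} is missing fields: {missing}")

    # Step 3 -- EXECUTE: call the actor with the validated input.
    return tools["call-actor"](actor, payload)
```

Passing the tools in as callables keeps the ordering rule testable without touching Apify; in a real session the three steps are MCP tool calls rather than Python functions.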
A PreToolUse hook in settings.json HARD BLOCKS the RAG browser. Additionally:
- Run bash ~/.claude/skills/deep-research/scripts/search-barrage.sh "query" general to generate diverse payloads
- See references/SCHEMAS.md for pre-built actor input schemas (no fetch-actor-details needed)

| Script | Usage | Purpose |
|--------|-------|---------|
| scripts/diagnose.sh | Run FIRST every session | Check all tool availability |
| scripts/search-barrage.sh "query" type | type: person/org/legal/general | Generate diverse tool call payloads |
| scripts/validate-wave.sh | Run before git commit | Pre-commit quality check |
| scripts/courtlistener.py --query "terms" --court or | Legal research | CourtListener API wrapper |
| Category | Go-To Actor | Free? |
|----------|------------|-------|
| Google SERP | apify/google-search-scraper | No |
| Web Crawling | apify/website-content-crawler | Yes |
| News | easyapi/google-news-scraper | No |
| Email/Contact | caprolok/website-email-phone-finder | No |
| Court Records | automation-lab/court-records-scraper | No |
| PDF Extraction | jirimoravcik/pdf-text-extractor | Yes |
| Headless Browser | apify/playwright-scraper | Yes |
Full catalog: references/ACTOR_INVENTORY.md | Pre-built schemas: references/SCHEMAS.md
| Tool | Input | What It Does |
|------|-------|-------------|
| sherlock_username_search | username | 399+ social media sites |
| maigret_username_search | username | 3000+ sites |
| holehe_email_search | email | 120+ platform check |
| theharvester_domain_search | domain | Emails, subdomains |
Full reference: references/OSINT_MCP_TOOLS.md
Read in order:
1. references/FILE_LOCATIONS.md -- paths, tokens, git workflow
2. memory/MEMORY.md -- master index
3. memory/active_work.md -- wave log
4. memory/OPEN_LOOPS.md -- open leads
5. reports/NETWORK_INTEL_REPORT.md -- first 10 lines (version, wave count)

- Phase 1: Search barrage (8-10 parallel, multiple tool types)
- Phase 2: Deep dive (2-3 best findings, specialized actors)
- Phase 2.5: Gap analysis (every 5th wave)
- Phase 3: Integrate (update report, sources, leads)
- Phase 4: Memory update (active_work, OPEN_LOOPS)
- Phase 5: Git commit and push
Full protocol: references/WAVE_PROTOCOL.md
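The phase order for a given wave, with gap analysis slotted in on every 5th wave, might be computed like this. It is a minimal sketch: the function name `phases_for_wave` and the assumption that waves are numbered from 1 are illustrative and do not come from references/WAVE_PROTOCOL.md.

```python
from typing import List

# Phases that run on every wave, in order.
BASE_PHASES = [
    "Phase 1: Search barrage",
    "Phase 2: Deep dive",
    "Phase 3: Integrate",
    "Phase 4: Memory update",
    "Phase 5: Git commit and push",
]

GAP_ANALYSIS = "Phase 2.5: Gap analysis"


def phases_for_wave(wave_number: int) -> List[str]:
    """Return the ordered phases for a wave (assumed to be numbered from 1)."""
    if wave_number < 1:
        raise ValueError("wave_number must be >= 1")
    phases = list(BASE_PHASES)
    if wave_number % 5 == 0:
        # Gap analysis sits between the deep dive and integration.
        phases.insert(2, GAP_ANALYSIS)
    return phases
```

Waves 1 through 4 get the five base phases; wave 5, wave 10, and so on get six, with gap analysis in third position.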
Eastern Oregon healthcare governance network. Same people control hospital boards, Medicaid CCO ($600M+), behavioral health org, and employ attorneys on both sides of guardianship disputes.
Key entities: EOCCO, GOBHI, Moda/ODS, Good Shepherd/GRH Foundation, Baum Family, OJD/Keffer
Full playbooks: references/INVESTIGATION_PLAYBOOKS.md
Token: 48b2b5fd5858fe301286d216f6d1f4146507910c
Base URL: https://www.courtlistener.com/api/rest/v4/
Quick search: python3 scripts/courtlistener.py --query "terms" --court or
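For orientation, a request against the base URL above could be assembled along these lines. This is a sketch, not the contents of scripts/courtlistener.py: the `search/` endpoint with `q` and `court` parameters and the `Token` authorization scheme are assumptions about the CourtListener REST API that should be checked against its documentation, and `build_search_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://www.courtlistener.com/api/rest/v4/"


def build_search_request(query: str, court: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a search request for the given terms and court."""
    # Assumed query parameters: free-text terms and a court identifier.
    params = urllib.parse.urlencode({"q": query, "court": court})
    url = urllib.parse.urljoin(BASE_URL, "search/") + "?" + params
    # Assumed auth scheme: the token is sent in the Authorization header.
    return urllib.request.Request(url, headers={"Authorization": f"Token {token}"})
```

In practice the token would be read from the environment (for example `os.environ["COURTLISTENER_TOKEN"]`, a hypothetical variable name) and the request passed to `urllib.request.urlopen`, so the credential never has to appear in a script or in version control.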
| File | Contents |
|------|----------|
| references/WAVE_PROTOCOL.md | Full 5-phase protocol + gap analysis |
| references/SCHEMAS.md | 10 pre-built actor input schemas |
| references/INVESTIGATION_PLAYBOOKS.md | Person/org/document investigation recipes |
| references/OSINT_MCP_TOOLS.md | 7 Docker OSINT tools |
| references/SOCIAL_MEDIA_ACTORS.md | 55+ social media actors |
| references/SEARCH_HIERARCHY.md | 8-tier search engine priority |
| references/ACTOR_INVENTORY.md | 132-actor catalog |
| references/FILE_LOCATIONS.md | Paths, tokens, git workflow |