skills/healthcare/healthcare-providers-extract/SKILL.md
Extracts structured practitioner data from healthcare practice websites. Returns names, credentials, specialties, contact info, and education for every provider on a practice's site. Use when user asks to extract, pull, or list doctors, providers, or staff from practice websites. Triggers: "extract doctors from", "pull providers from", "who are the providers at", "build a provider database", "list all doctors at", "scrape the team page", "get practitioner data from". Accepts practice URLs (pasted, CSV, Google Sheet) or discovers practices via Google Maps when given specialty + location. Single sites or 100+ URLs. Do NOT use for filling data gaps — use healthcare-providers-enrich instead. Do NOT use for credential validation — use healthcare-providers-verify instead. Do NOT use for discovering practices — use market-finder or local-places instead. Do NOT use for general extraction — use nimble-web-expert instead.
npx skillsauth add nimbleway/agent-skills healthcare-providers-extractInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Structured practitioner extraction from healthcare practice websites, powered by Nimble's web data APIs.
User request: $ARGUMENTS
Before running any commands, read references/nimble-playbook.md for Claude Code
constraints (no shell state, no &/wait, sub-agent permissions, communication style).
Follow the transport selection + standard preflight from references/nimble-playbook.md — pick CLI or MCP at session start, then run the standard preflight calls (date calc, today, profile, memory index) in parallel.
Also simultaneously — run WSA discovery and setup:
mkdir -p ~/.nimble/memory/{reports,healthcare-providers-extract/checkpoints}ls ~/.nimble/memory/healthcare-providers-extract/checkpoints/ 2>/dev/nullreferences/wsa-reference.md. Layer 2 (session-specific) runs after Step 1 when
you know the user's specialty.Classify discovered agents into phases and validate with nimble agent get per
references/wsa-reference.md.
From the preflight results:
references/profile-and-onboarding.md, stopnimble CLI calls: nimble --client-source skill-healthcare-providers-extract <subcommand>. MCP path: not yet supported — see references/nimble-playbook.md for status.references/nimble-playbook.md:
last_runs.healthcare-providers-extract is today, check
for existing report at ~/.nimble/memory/reports/healthcare-providers-extract-*[today].md.
If found, ask: "Already ran today. Run again for fresh data?"Parse $ARGUMENTS for input type using the Input Parsing Pattern from
references/nimble-playbook.md. Key routing:
If input is clear, confirm and ask one shaping question (plain text, not AskUserQuestion):
"Extracting providers from N practice sites. Quick questions:
- Healthcare vertical? (ophthalmology, dental, dermatology, general, or other)
- Quick scan (names + credentials only) or full extraction (all 5 fields)?"
If input is ambiguous, use AskUserQuestion (counts as 1 of max 2 prompts):
What practice sites should I extract providers from?
- Paste URLs directly (one per line)
- Provide a CSV file path or Google Sheet URL with practice URLs
- Or describe what you're looking for (e.g., "ophthalmologists in Austin, TX") and I'll find practices first
Skip questions the user already answered in their initial message.
Only if the user provided a specialty + location instead of URLs.
Two input paths into discovery:
Path A — Fresh discovery. User gave specialty + location. Run Layer 2 WSA discovery for session-specific agents:
nimble agent list --limit 50 --search "[specialty]"
nimble agent list --limit 50 --search "[directory-user-mentioned]"
See references/wsa-reference.md for the full discovery strategy, agent evaluation
criteria, and healthcare discovery prioritization.
Run all discovery-phase agents simultaneously. Validate params with
nimble agent get first.
Path B — Market-finder handoff. User ran market-finder first and wants to
extract providers from those results. Read the market-finder output:
cat ~/.nimble/memory/market-finder/{slug}/entities.json 2>/dev/null
Extract practice records. Note: Google Maps results contain place_url (a Maps
link) but not the practice's actual website URL. Proceed to Step 2b to resolve
real website URLs before site mapping.
After either path: Deduplicate by domain. Present discovered practices:
"Found N practices for [specialty] in [location] across [M] data sources. Proceeding to extract providers from these sites..."
Fallback — if no discovery WSAs were found, or results are sparse (< 3):
nimble search --query "[specialty] in [location]" --max-results 20 --search-depth lite
Discovery sources (Google Maps, Yelp, BBB) return listing URLs, not practice website URLs. Before site mapping, resolve the actual website for each practice:
website
field in the structured output. Use it if present.website field, extract the Maps listing
to find the practice website link:
nimble extract --url "[maps-listing-url]" --format markdown
nimble search --query "[practice-name] [city] official website" --max-results 3 --search-depth lite
Skip practices where no website URL can be resolved — note them in the "Data Quality Summary" output section.
Follow the Site Mapping Pattern from references/nimble-playbook.md for each
practice URL. Skill-specific settings:
references/provider-extraction-patterns.mdsite:[domain] doctors OR providers OR teamFor 6+ practices, use sub-agents (see Sub-Agent Strategy below).
Save checkpoint: ~/.nimble/memory/healthcare-providers-extract/checkpoints/{slug}/mapping.json
WSA shortcuts first: If WSA discovery found agents that extract provider data from healthcare directories, use those for matching practices — structured WSA output is higher quality than parsed markdown.
For all other practices, follow the Page Extraction with Retry pattern from
references/nimble-playbook.md. Scale using the Scaled Execution pattern from
the same reference.
Save checkpoint: ~/.nimble/memory/healthcare-providers-extract/checkpoints/{slug}/extraction.json
Parse extracted markdown to identify providers and their fields. Read
references/provider-extraction-patterns.md for the 5 core fields, credential
regex patterns, and specialty keywords.
For each extracted page:
references/provider-extraction-patterns.mdBuild structured records:
{
"name": "Dr. Jane Smith",
"credentials": "MD, FACS",
"specialty": "Retinal Surgery",
"contact": {"phone": "(555) 123-4567", "scheduling_url": "..."},
"education": "Fellowship: Bascom Palmer Eye Institute",
"source_url": "https://practice.com/our-doctors",
"practice_name": "Shore Center for Eye Care",
"practice_url": "https://practice.com",
"confidence": "High"
}
Follow the Entity Deduplication and Entity Confidence Scoring patterns from
references/nimble-playbook.md. Skill-specific dedup rules and the 5-field
confidence criteria are in references/provider-extraction-patterns.md.
Present results grouped by practice, sorted by confidence within each practice.
# Provider Extraction: [N] Providers from [M] Practices
*[Date] | [H] High, [M] Medium, [L] Low confidence*
## TL;DR
Extracted [N] providers from [M] practice websites. [H] with complete profiles,
[L] with partial data. [Key finding: e.g., "12 of 15 providers are board-certified"].
## [Practice Name] ([domain])
| # | Name | Credentials | Specialty | Contact | Education | Confidence |
|---|------|------------|-----------|---------|-----------|------------|
| 1 | Dr. Jane Smith | MD, FACS | Retinal Surgery | (555) 123-4567 | Fellowship: Bascom Palmer | High |
| 2 | Dr. John Doe | OD | General Ophthalmology | [Book](url) | Residency: Wills Eye | Medium |
[Repeat per practice]
## Data Quality Summary
- **Complete profiles (High):** [N] providers
- **Partial profiles (Medium):** [N] providers — missing: [list common gaps]
- **Minimal profiles (Low):** [N] providers — missing: [list common gaps]
## Sources
[Clickable URL for every page extracted, grouped by practice]
Source links are mandatory. Every provider record must trace back to a source URL.
Make all Write calls simultaneously:
~/.nimble/memory/reports/healthcare-providers-extract-{slug}-{date}.md~/.nimble/memory/healthcare-providers-extract/{slug}/providers.jsonlast_runs.healthcare-providers-extract in
~/.nimble/business-profile.json (only if profile exists)references/memory-and-distribution.md: update
index.md rows for all affected entity files, append a log.md entry for this run.Always offer distribution -- do not skip. Follow
references/memory-and-distribution.md for connector detection and sharing flow.
Notion: full provider table as a dated subpage. Slack: TL;DR with provider count and confidence breakdown only.
Enrichment from discovered WSAs: If Step 0 found enrichment-phase agents (reviews, regulatory, practice details), offer them as immediate follow-ups:
"I also found [N] WSAs that could enrich this data: [brief list]. Want me to run reputation checks or regulatory lookups on these providers/practices?"
See references/wsa-reference.md for enrichment phase mapping and fallback chains.
Sibling skill suggestions:
Next steps:
- Run
healthcare-providers-enrichto fill data gaps (NPI lookup, board certification verification, additional contact info)- Run
healthcare-providers-verifyto validate credentials and license status- Run
market-finderto discover more practice URLs in this area
For batch extraction (6+ practices), use nimble-researcher agents
(agents/nimble-researcher.md) to parallelize site mapping and extraction.
Follow the sub-agent spawning rules from references/nimble-playbook.md
(bypassPermissions, batch max 4, explicit Bash instruction, fallback on failure).
Spawn pattern: One agent per practice (or per batch of 3 practices for large jobs). Each agent runs Steps 3-5 for its assigned practices and returns structured provider records.
Single-practice optimization: If only 1-2 practices, run directly from the main context instead of spawning agents.
Fallback: If any agent fails, run those extractions directly from the main context. Never leave gaps in the output.
See references/nimble-playbook.md for the standard error table (missing API key,
429, 401, empty results, extraction garbage). Skill-specific errors:
--render per the shared pattern)nimble search --query "[practice name] [location] doctors" --max-results 5 --search-depth litedevelopment
Builds Databricks data products from live web data, end to end: discovers the right Nimble web-data agents, scrapes into Delta tables, and produces an AI/BI dashboard and/or a deployed Databricks App — a table → dashboard → app workflow, for production data products or quick demos. Use whenever a request pairs live or scraped web data WITH a Databricks destination — e.g. "scrape Amazon/Walmart prices into a Delta table and build a dashboard", "load Zillow/Instagram/Maps/search results into Databricks and build a dashboard or app", "showcase Nimble + Databricks to a prospect". Prefer it over nimble-web-expert or competitor-intel when the data lands in Databricks. Do NOT use for one-off web fetches or CSV exports with no Databricks destination — use nimble-web-expert instead. Do NOT use for competitor or company research briefings — use competitor-intel or company-deep-dive instead. Do NOT use for generic Databricks work with no Nimble/web-data angle — use the official databricks-* skills instead.
development
Finds qualified candidates for a role by searching LinkedIn, Indeed, GitHub, and other professional platforms using Nimble Web Search Agents. Accepts a job description, role title, or freeform request and returns a ranked candidate list with profiles, skills, and contact signals. Use this skill when the user wants to find, source, or recruit candidates for a role. Common triggers: "find candidates for", "source engineers in", "who can I hire for", "find me a [role]", "recruiting for", "talent search", "find a [role] in [city]", "build a candidate list", "sourcing for [role]", "who's available for", "find potential hires". Also triggers on a pasted job description followed by a sourcing request. Do NOT use for job market research or salary benchmarking — use market-finder instead. Do NOT use for researching a single known person — use company-deep-dive or meeting-prep instead.
development
Get web data now — fast, incremental, immediately responsive to what the user needs. The only way Claude can access live websites. USE FOR: - Fetching any URL or reading any webpage - Scraping prices, listings, reviews, jobs, stats, docs from any site - Discovering URLs on a site before bulk extraction - Calling public REST/XHR API endpoints - Web search and research (8 focus modes) - Bulk crawling website sections Must be pre-installed and authenticated. Run `nimble --version` to verify. For building reusable extraction workflows to run at scale over time, use nimble-agent-builder instead.
development
A building experience: create, test, validate, refine, and publish extraction workflows based on existing or new Nimble agents. For users who want to invest in a durable, reusable workflow for a specific domain — not get data immediately. Trigger phrases: "set up extraction for X site", "I need to extract from this site regularly", "build an agent for", "create a reusable scraper", "generate a Nimble agent", "refine my agent", "add a field to my agent", or when the user wants to run extraction at scale. For getting data immediately, use nimble-web-expert instead.