skills/capabilities/conference-speaker-scraper/SKILL.md
Extract speaker names, titles, companies, and bios from conference websites. Supports direct HTML scraping and Apify web scraper fallback for JS-heavy sites. Use for pre-event research and outreach targeting.
npx skillsauth add gooseworks-ai/goose-skills conference-speaker-scraperInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Extract speaker names, titles, companies, and bios from conference website /speakers pages. Supports direct HTML scraping with multiple extraction strategies, plus Apify fallback for JS-heavy sites.
No API key needed for direct scraping mode.
# Scrape speakers from a conference page
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
--url "https://example.com/speakers"
# Use Apify for JS-heavy sites
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
--url "https://example.com/speakers" --mode apify
# Custom conference name (otherwise inferred from URL)
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
--url "https://example.com/speakers" --conference "Sage Future 2026"
# Output formats
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output json # default
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output csv
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output summary
Fetches the page HTML and tries multiple extraction strategies in order, using whichever returns the most results:
<h2>/<h3> + <p> structures<script type="application/ld+json"> with speaker dataUses apify/cheerio-scraper actor with a custom page function that targets common speaker card selectors. Standard POST/poll/GET dataset pattern.
| Flag | Default | Description |
|------|---------|-------------|
| --url | required | Conference speakers page URL |
| --conference | inferred | Conference name (otherwise inferred from URL domain) |
| --mode | direct | direct (HTML scraping) or apify (Apify cheerio scraper) |
| --output | json | Output format: json, csv, or summary |
| --token | env var | Apify token (only needed for apify mode) |
| --timeout | 300 | Max seconds for Apify run |
{
"name": "Jane Smith",
"title": "VP of Finance",
"company": "Acme Corp",
"bio": "Jane leads the finance transformation at...",
"linkedin_url": "https://linkedin.com/in/janesmith",
"image_url": "https://...",
"conference": "Sage Future 2026",
"source_url": "https://sagefuture2026.com/speakers"
}
apify/cheerio-scraper -- minimal Apify creditsHTML scraping is inherently fragile across conference sites. The multi-strategy approach maximizes coverage, but JS-heavy sites will require Apify mode. When direct scraping returns 0 results, try --mode apify.
development
End-to-end skill that turns a single reference image into a fully-installed, example-rendered style preset for the goose-graphics composite. Analyzes the image, writes the slim style spec, registers it in styles/index.json, generates all 7 format examples using the standard brief, renders PNGs via Playwright, and updates examples/manifest.json. Invoke with /goose-graphics-create-style.
development
Evaluate YC batch companies for investment — scrapes the YC directory, researches each company and its founders (work history, LinkedIn, website), assesses founder-company fit, and exports to Google Sheets with priority rankings. Use when asked to evaluate YC companies, research a YC batch, screen startups, or do due diligence on YC companies.
tools
Take screenshots of any website using Notte browser automation. Use when asked to screenshot, capture, or snap a webpage.
development
Search the web, platforms, and datasets. Use when asked to search, find, look up, research, or discover information from the web, YouTube, Amazon, eBay, news, academic sources, or any online platform.