skills/research/SKILL.md
This skill should be used when the user asks to "find papers", "search academic literature", "find citations", "literature search", "find research on", "what does the literature say about", or any request to search for academic papers across multiple sources.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add edwinhu/workflows research
Multi-source academic search with deduplication, DOI resolution, and journal filtering.
Always read ${CLAUDE_SKILL_DIR}/../google-scholar/domain-knowledge.local.md before presenting results.
NEVER run the sources manually in sequence. ALWAYS use the research script. This is not negotiable.
uv run python3 "${CLAUDE_SKILL_DIR}/scripts/research.py" "<query>" [--n 50] [--min-citations N]
The script parallelizes all sources and DOI resolution automatically. Doing it manually serializes everything and triples wall time.
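The internals of research.py are not shown here. As a rough sketch of why fanning out helps, assume three hypothetical fetchers (`fetch_bib`, `fetch_lookup`, and `fetch_consensus` are stand-ins, not the script's actual functions, and the delays are scaled-down placeholders). Running them on a thread pool makes wall time track the slowest source rather than the sum of all of them:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real sources; names and delays are assumptions.
def fetch_bib(query):
    time.sleep(0.01)
    return [{"title": f"{query} (bib hit)", "sources": ["bib"]}]

def fetch_lookup(query):
    time.sleep(0.05)
    return [{"title": f"{query} (lookup hit)", "sources": ["lookup"]}]

def fetch_consensus(query):
    time.sleep(0.10)
    return [{"title": f"{query} (consensus hit)", "sources": ["consensus"]}]

FETCHERS = {"bib": fetch_bib, "lookup": fetch_lookup, "consensus": fetch_consensus}

def search_all(query):
    """Run every fetcher concurrently and return {source_name: papers}."""
    with ThreadPoolExecutor(max_workers=len(FETCHERS)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in FETCHERS.items()}
        # result() blocks until that fetcher finishes; all of them are already running.
        return {name: future.result() for name, future in futures.items()}

if __name__ == "__main__":
    start = time.perf_counter()
    results = search_all("mandatory disclosure")
    elapsed = time.perf_counter() - start
    print({name: len(papers) for name, papers in results.items()}, f"{elapsed:.2f}s")
```

Calling the same three fetchers one after another would take roughly the sum of their delays, which is the serialization cost the paragraph above warns about.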
| Source | Search type | Strength | Default |
|--------|------|----------|---------|
| scholar lookup | Keyword/citation-ranked | Finance classics, foundational papers | ✅ |
| consensus CLI | Empirical corpus, sorted by citations | Accounting/finance empirical literature | ✅ |
| Paperpile bib | Personal library (My Library.bib) | Papers already in your collection | ✅ |
| scholar search | NL semantic | Law reviews, conceptual literature | opt-in (--scholar-search) |
scholar search is opt-in because it shares rate limits with scholar lookup and 429s when run in parallel. Add --scholar-search when you specifically want semantic/NL results.
The script outputs a JSON array. Each paper has:
{
  "title": "...",
  "authors": ["..."],
  "year": 2023,
  "journal": "...",            // original journal label (may be SSRN)
  "journal_resolved": "...",   // CrossRef-resolved journal (present if SSRN label was resolved)
  "doi": "...",
  "citations": 150,
  "takeaway": "...",
  "url": "...",
  "sources": ["lookup", "consensus"]   // all sources that returned this paper
}
After running the script, read ${CLAUDE_SKILL_DIR}/../google-scholar/domain-knowledge.local.md and cross-reference each paper's effective journal (use journal_resolved if present, else journal) against the trusted list:
- sources: ["lookup", "consensus"] (multiple sources) = higher confidence
- bib source = already in user's library (flag with 📚)

★ [Title](url) — Authors (Year), *Journal*, N citations [sources]
> Takeaway: ...
📚 ★ [Title](url) — Authors (Year), *Journal* [in your library]
> Takeaway: ...
Trusted papers first (sorted by citations desc), then non-trusted in a collapsed table.
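The selection and ordering rules above can be sketched in a few lines. This is a minimal illustration, not the skill's own code: it assumes the trusted list has already been parsed into a set of journal names (the format of domain-knowledge.local.md is not shown here) and that matching is a case-insensitive exact comparison. The function names `effective_journal` and `split_by_trust` are hypothetical.

```python
def effective_journal(paper):
    """Prefer the CrossRef-resolved venue; fall back to the original label."""
    return paper.get("journal_resolved") or paper.get("journal") or ""

def split_by_trust(papers, trusted_journals):
    """Return (trusted, other), each sorted by citations descending.

    Assumption: a paper is trusted when its effective journal matches a
    trusted name exactly, ignoring case.
    """
    trusted_lower = {name.lower() for name in trusted_journals}
    by_citations = sorted(papers, key=lambda p: p.get("citations") or 0, reverse=True)
    trusted, other = [], []
    for paper in by_citations:
        bucket = trusted if effective_journal(paper).lower() in trusted_lower else other
        bucket.append(paper)
    return trusted, other

if __name__ == "__main__":
    # Placeholder records, not real papers.
    sample = [
        {"title": "A", "journal": "SSRN", "journal_resolved": "Journal X", "citations": 10},
        {"title": "B", "journal": "Journal X", "citations": 50},
        {"title": "C", "journal": "SSRN", "citations": 99},
    ]
    trusted, other = split_by_trust(sample, {"Journal X"})
    print([p["title"] for p in trusted], [p["title"] for p in other])
```

Paper A counts as trusted only because `journal_resolved` is consulted before `journal`; reading the raw SSRN label would have hidden its venue.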
- Running the sources manually in sequence → STOP. Use uv run python3 research.py "<query>".
- Calling mcp__consensus__search → STOP. It is rate-limited to 3 results; the script uses the CLI binary automatically.
- Reading the journal field when journal_resolved is present → STOP. The SSRN label hides the real venue; always prefer journal_resolved.

# Standard search
uv run python3 "${CLAUDE_SKILL_DIR}/scripts/research.py" "mandatory disclosure"
# With citation floor
uv run python3 "${CLAUDE_SKILL_DIR}/scripts/research.py" "poison pill" --min-citations 50
# More results from Consensus
uv run python3 "${CLAUDE_SKILL_DIR}/scripts/research.py" "corporate governance" --n 100
# Disable streaming (wait for all sources, output pretty-printed JSON)
uv run python3 "${CLAUDE_SKILL_DIR}/scripts/research.py" "mandatory disclosure" --no-stream
In batch mode (--no-stream), the script waits for every enabled source before emitting anything. Consensus takes ~60s, so results from the fast sources (bib <1s, Scholar ~10s) sit unused until it finishes.
In streaming mode, the script emits one NDJSON line per event as it happens:
{"event": "source", "source": "bib", "papers": [...]}
{"event": "source", "source": "scholar-lookup", "papers": [...]}
{"event": "source", "source": "scholar-search", "papers": [...]}
{"event": "source", "source": "consensus", "papers": [...]}
{"event": "final", "papers": [...]}
- source events: raw papers from each source as it completes (may have duplicates across sources)
- final event: deduplicated + CrossRef-resolved unified set

Process source events as they arrive to present early results; use final for the complete deduplicated list. Pass --no-stream for batch mode (pretty-printed JSON after all sources complete).
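A consumer for the event stream could look roughly like the sketch below. It assumes only the two event shapes shown above and reads NDJSON from any iterable of lines, such as the script's piped stdout. The helper name `consume_events` is hypothetical and not part of the skill.

```python
import json
import sys

def consume_events(lines):
    """Yield (kind, source, papers) for each NDJSON event line.

    kind is "source" or "final"; source is None for the final event.
    Blank lines and unrecognized event types are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        kind = event.get("event")
        if kind == "source":
            yield "source", event.get("source"), event.get("papers", [])
        elif kind == "final":
            yield "final", None, event.get("papers", [])

if __name__ == "__main__":
    # Reads the stream from stdin, e.g. when the research script is piped in.
    for kind, source, papers in consume_events(sys.stdin):
        if kind == "source":
            print(f"early results from {source}: {len(papers)} papers")
        else:
            print(f"final deduplicated set: {len(papers)} papers")
```

Early source events are useful for showing something quickly, but only the final event should feed the trusted/non-trusted presentation, since earlier events may contain duplicates.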