Paper discovery via recommendation APIs (OpenAlex, CrossRef citation networks)
Leverage the OpenAlex and CrossRef APIs to discover related papers, traverse citation networks, and build comprehensive reading lists programmatically.
OpenAlex indexes over 250 million academic works and provides a free API that requires no key.
OpenAlex base URL: https://api.openalex.org
CrossRef base URL: https://api.crossref.org
Use OpenAlex's concept graph and citation data to discover related work from seed papers.
import requests

HEADERS = {"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai)"}
WORK_ID = "W2741809807"  # OpenAlex work ID

# Get the seed paper's concepts
response = requests.get(
    f"https://api.openalex.org/works/{WORK_ID}",
    headers=HEADERS
)
paper = response.json()
concepts = [c["id"] for c in paper.get("concepts", [])[:3]]

# Find works sharing the same concepts, sorted by citations
for concept_id in concepts:
    related = requests.get(
        "https://api.openalex.org/works",
        params={"filter": f"concepts.id:{concept_id}", "sort": "cited_by_count:desc", "per_page": 10},
        headers=HEADERS
    )
    for w in related.json().get("results", []):
        print(f"[{w.get('publication_year')}] {w.get('title')} (citations: {w.get('cited_by_count')})")
import requests

def search_crossref(query, limit=10, sort="is-referenced-by-count"):
    """Search CrossRef for papers sorted by citation count."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query": query, "rows": limit, "sort": sort, "order": "desc"},
        headers={"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai; mailto:[email protected])"}
    )
    return resp.json().get("message", {}).get("items", [])

results = search_crossref("transformer attention mechanism")
for w in results:
    title = w.get("title", [""])[0] if w.get("title") else ""
    print(f" {title} - Cited by: {w.get('is-referenced-by-count', 0)}")
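CrossRef items nest several fields: titles arrive as a list and dates as `date-parts` arrays. A small normalizing helper can make the results easier to merge with OpenAlex output. The following is a minimal sketch, not part of the original skill; `summarize_crossref_item` is a hypothetical helper name, and the sample item is hand-written rather than real API output.

```python
def summarize_crossref_item(item):
    """Flatten one CrossRef work item into a small dict (hypothetical helper)."""
    titles = item.get("title") or []
    # "issued" holds {"date-parts": [[year, month, day]]}; any part may be missing
    date_parts = (item.get("issued") or {}).get("date-parts") or [[]]
    first = date_parts[0] if date_parts else []
    year = first[0] if first else None
    return {
        "title": titles[0] if titles else "",
        "doi": item.get("DOI"),
        "year": year,
        "citations": item.get("is-referenced-by-count", 0),
    }


# Hand-written item shaped like one entry of a CrossRef "items" list
sample = {
    "title": ["An example title"],
    "DOI": "10.0000/example",
    "issued": {"date-parts": [[2017, 6]]},
    "is-referenced-by-count": 42,
}
summary = summarize_crossref_item(sample)
print(summary)
```

Applied to the output of `search_crossref`, this would give one flat record per paper, which is convenient for sorting or deduplicating by DOI.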
Walk the citation graph to discover foundational and derivative works.
# Forward citations: works that cite the seed paper
work_id = "W2741809807"
response = requests.get(
    "https://api.openalex.org/works",
    params={
        "filter": f"cites:{work_id}",
        "sort": "cited_by_count:desc",
        "per_page": 20
    },
    headers=HEADERS
)
for w in response.json().get("results", []):
    print(f" [{w.get('publication_year')}] {w.get('title')} ({w.get('cited_by_count')} cites)")
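Forward-citation queries for a well-known paper can return thousands of works, so it often helps to restrict them to recent years. OpenAlex joins filter conditions with commas, which it treats as AND. The sketch below assumes the `from_publication_date` filter and uses a hypothetical helper, `build_filter`, to assemble the filter string; the cutoff date is only an example.

```python
def build_filter(**conditions):
    """Join OpenAlex filter conditions with commas, which the API treats as AND.

    Hypothetical helper: keys are filter names, values are filter values.
    Conditions whose value is None are skipped.
    """
    return ",".join(
        f"{key}:{value}" for key, value in conditions.items() if value is not None
    )


# Citing works for the seed paper, limited to papers published since 2020
params = {
    "filter": build_filter(cites="W2741809807", from_publication_date="2020-01-01"),
    "sort": "cited_by_count:desc",
    "per_page": 20,
}
print(params["filter"])
```

The resulting `params` dict can be passed to `requests.get` in place of the one used for the unrestricted forward-citation query.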
# Backward citations: works referenced by the seed paper
response = requests.get(
    f"https://api.openalex.org/works/{work_id}",
    headers=HEADERS
)
paper = response.json()
ref_ids = paper.get("referenced_works", [])

# Fetch details for referenced works (one request per reference, capped at 20)
for ref_id in ref_ids[:20]:
    ref = requests.get(f"https://api.openalex.org/works/{ref_id.split('/')[-1]}", headers=HEADERS).json()
    print(f" [{ref.get('publication_year')}] {ref.get('title')} ({ref.get('cited_by_count')} cites)")
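Fetching each referenced work separately costs one request per reference. OpenAlex filters accept several values separated by `|` (OR), so references can be looked up in batches instead. The sketch below only builds the filter strings. It rests on assumptions that should be checked against the OpenAlex documentation: that the `openalex_id` filter accepts short IDs, and that 50 values per request is within the allowed limit. The helper names `chunked` and `build_reference_filters` are hypothetical.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_reference_filters(ref_ids, chunk_size=50):
    """Build one OpenAlex filter string per batch of referenced works.

    Assumptions: the `openalex_id` filter accepts short IDs joined with "|"
    (OR), and 50 values per request is a conservative batch size.
    """
    # referenced_works entries are full URLs; keep only the trailing work ID
    short_ids = [ref_id.split("/")[-1] for ref_id in ref_ids]
    return ["openalex_id:" + "|".join(batch) for batch in chunked(short_ids, chunk_size)]


# Synthetic reference list shaped like the referenced_works field
example_refs = [f"https://openalex.org/W{n}" for n in range(1, 121)]
filters = build_reference_filters(example_refs)
print(len(filters))
```

Each string can be passed as the `filter` parameter to the `/works` endpoint, with `per_page` set at least as large as the batch size so that one response covers the whole batch.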
Combine search, concept discovery, and citation traversal into a discovery pipeline:
| Step | Method | Purpose |
|------|--------|---------|
| 1. Seed selection | Manual or keyword search | Identify 3-5 highly relevant papers |
| 2. Expand via concepts | OpenAlex concept graph | Find thematically related work |
| 3. Forward citation | OpenAlex cites filter | Find recent derivative works |
| 4. Backward citation | referenced_works field | Find foundational papers |
| 5. Deduplicate | OpenAlex work ID matching | Remove duplicates across steps |
| 6. Rank & filter | Sort by year, citations, relevance | Prioritize reading order |
def build_reading_list(seed_ids, max_papers=50):
    """Build a ranked reading list from seed papers."""
    seen = set()
    candidates = []
    for seed_id in seed_ids:
        # Get concepts from seed paper
        paper = requests.get(f"https://api.openalex.org/works/{seed_id}", headers=HEADERS).json()
        concept_ids = [c["id"] for c in paper.get("concepts", [])[:2]]
        # Find related works via concepts
        for cid in concept_ids:
            related = requests.get(
                "https://api.openalex.org/works",
                params={"filter": f"concepts.id:{cid}", "sort": "cited_by_count:desc", "per_page": 20},
                headers=HEADERS
            ).json().get("results", [])
            for w in related:
                wid = (w.get("id") or "").split("/")[-1]
                if wid not in seen:
                    seen.add(wid)
                    candidates.append(w)
        # Get citing works
        citing = requests.get(
            "https://api.openalex.org/works",
            params={"filter": f"cites:{seed_id}", "sort": "cited_by_count:desc", "per_page": 20},
            headers=HEADERS
        ).json().get("results", [])
        for w in citing:
            wid = (w.get("id") or "").split("/")[-1]
            if wid not in seen:
                seen.add(wid)
                candidates.append(w)
    # Rank by recency first, then by citation count; "or 0" guards against null fields
    candidates.sort(key=lambda p: (p.get("publication_year") or 0, p.get("cited_by_count") or 0), reverse=True)
    return candidates[:max_papers]
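Sorting by year and then by citations puts the newest papers first regardless of their impact. One possible alternative for the "Rank & filter" step is to score each work by citations per year, which balances recency against influence. This is a sketch of a different ranking heuristic, not the method used in `build_reading_list`; the function names and the `current_year` parameter are hypothetical, and the sample records are hand-written.

```python
def citations_per_year(work, current_year):
    """Score a work by citations divided by its age in years (minimum age 1)."""
    year = work.get("publication_year") or current_year
    age = max(current_year - year + 1, 1)
    return (work.get("cited_by_count") or 0) / age


def rank_by_citation_rate(works, current_year):
    """Return works sorted by citations per year, highest first."""
    return sorted(works, key=lambda w: citations_per_year(w, current_year), reverse=True)


# Hand-written records using the same field names as OpenAlex work objects
works = [
    {"title": "Old classic", "publication_year": 2005, "cited_by_count": 2000},
    {"title": "Recent hit", "publication_year": 2023, "cited_by_count": 600},
    {"title": "New preprint", "publication_year": 2024, "cited_by_count": 10},
]
ranked = rank_by_citation_rate(works, current_year=2024)
print([w["title"] for w in ranked])
```

Passing `current_year` explicitly keeps the score reproducible; in practice it could come from `datetime.date.today().year`.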
- Set a descriptive User-Agent header
- Use the select parameter to reduce payload size
- Use page and per_page for pagination on large result sets
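The `select` and pagination practices can be combined in one small loop. The sketch below is illustrative only: `page_params` and `iter_works` are hypothetical helpers, the field list passed to `select` is an example, and the page fetcher is injected as a callable so the loop can be tried without network access.

```python
def page_params(filter_expr, page, per_page=100,
                fields=("id", "title", "publication_year", "cited_by_count")):
    """Build query parameters that request only selected fields for one page."""
    return {
        "filter": filter_expr,
        "select": ",".join(fields),
        "page": page,
        "per_page": per_page,
    }


def iter_works(fetch_page, filter_expr, per_page=100, max_pages=5):
    """Yield works page by page until a page is empty or max_pages is reached.

    `fetch_page` is a hypothetical callable that takes the params dict and
    returns the decoded JSON response, for example:
    lambda p: requests.get(URL, params=p, headers=HEADERS).json()
    """
    for page in range(1, max_pages + 1):
        results = fetch_page(page_params(filter_expr, page, per_page)).get("results", [])
        if not results:
            break
        yield from results


# Offline stand-in for the API: two pages of results, then empty pages
fake_pages = {1: [{"id": "W1"}, {"id": "W2"}], 2: [{"id": "W3"}]}

def fake_fetch(params):
    return {"results": fake_pages.get(params["page"], [])}

collected = list(iter_works(fake_fetch, "cites:W2741809807", per_page=2))
print([w["id"] for w in collected])
```

The `max_pages` cap is a deliberate safety limit so that a broad filter cannot trigger an unbounded number of requests.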