brycewang-stanford/Traversing Citation Networks

Name: Traversing Citation Networks
Author: brycewang-stanford

skills/05-kthorn-research-superpower/research/traversing-citations/SKILL.md

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research Traversing Citation Networks

Traversing Citation Networks

Overview

Intelligently follow citations backward (references) and forward (citing papers) using the Semantic Scholar API.

Core principle: Only follow citations relevant to the user's query. Avoid exponential explosion by filtering before traversing.

When to Use

Use this skill when:

Found a highly relevant paper (score ≥ 7)
Need to find related work
User asks "what papers cite this?"
Building comprehensive understanding of a topic

When NOT to use:

Paper scored < 7 (not relevant enough to follow)
Already at 50 papers (check with user first)
Citations look off-topic from abstract

Citation Traversal Strategy

1. Get Paper ID from Semantic Scholar

Lookup by DOI:

curl "https://api.semanticscholar.org/graph/v1/paper/DOI:10.1234/example.2023?fields=paperId,title,year"

Response:

{
  "paperId": "abc123def456",
  "title": "Paper Title",
  "year": 2023
}

Save paperId - needed for citations/references queries
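
A minimal sketch of building the step 1 lookup URL with Python's standard library. The helper name `paper_lookup_url` is hypothetical; the endpoint and field names come from the curl example above.

```python
from urllib.parse import quote

BASE_URL = "https://api.semanticscholar.org/graph/v1/paper"


def paper_lookup_url(doi, fields=("paperId", "title", "year")):
    """Build the DOI lookup URL from the curl example (hypothetical helper)."""
    # Keep "/" unescaped so the DOI appears exactly as in the curl example.
    return f"{BASE_URL}/DOI:{quote(doi, safe='/')}?fields={','.join(fields)}"


print(paper_lookup_url("10.1234/example.2023"))
```

The returned URL can be passed to any HTTP client. The `paperId` in the response is what the references and citations endpoints expect.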

2. Backward Traversal (References)

Get references from paper:

curl "https://api.semanticscholar.org/graph/v1/paper/abc123def456/references?fields=contexts,intents,title,year,abstract,externalIds&limit=100"

Response format:

{
  "data": [
    {
      "citedPaper": {
        "paperId": "xyz789",
        "title": "Referenced Paper Title",
        "year": 2020,
        "abstract": "...",
        "externalIds": {
          "DOI": "10.5678/referenced.2020",
          "PubMed": "87654321"
        }
      },
      "contexts": [
        "...as described in previous work [15]...",
        "...we used the method from [15] to..."
      ],
      "intents": ["methodology", "background"]
    }
  ]
}

Filter for relevance:

For each reference, check:

Context keywords: Do citation contexts mention user's query terms?
- Example: If user asks about "IC50 values", look for contexts mentioning "IC50", "activity", "potency"
Title match: Does title contain relevant keywords?
Intent: Is intent "methodology" or "result" (more relevant) vs "background" (less relevant)?

Scoring:

Context keywords match: +3 points
Title keywords match: +2 points
Intent is methodology/result: +2 points
Recent (< 5 years old): +1 point

Only add to queue if score ≥ 5
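
The backward scoring rubric above could be sketched as follows. The function name `score_reference` is hypothetical, and it assumes that a "keyword match" means a case-insensitive substring hit, which the skill does not specify.

```python
from datetime import date

RELEVANT_INTENTS = {"methodology", "result"}


def score_reference(entry, keywords, current_year=None):
    """Score one item of the references response with the rubric above."""
    current_year = current_year or date.today().year
    paper = entry.get("citedPaper") or {}
    terms = [k.lower() for k in keywords]

    def mentions(text):
        # Assumption: a match is a case-insensitive substring hit.
        text = (text or "").lower()
        return any(term in text for term in terms)

    score = 0
    if any(mentions(c) for c in entry.get("contexts") or []):
        score += 3  # context keywords match
    if mentions(paper.get("title")):
        score += 2  # title keywords match
    if RELEVANT_INTENTS & set(entry.get("intents") or []):
        score += 2  # intent is methodology/result
    year = paper.get("year")
    if year is not None and current_year - year < 5:
        score += 1  # recent (< 5 years old)
    return score
```

A reference is queued only when `score_reference(...) >= 5`. The `or {}` and `or []` guards are there because fields may come back as null.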

3. Forward Traversal (Citations)

Get papers citing this one:

curl "https://api.semanticscholar.org/graph/v1/paper/abc123def456/citations?fields=title,year,abstract,externalIds&limit=100"

Response format:

{
  "data": [
    {
      "citingPaper": {
        "paperId": "def456ghi",
        "title": "Newer Paper Citing This",
        "year": 2024,
        "abstract": "We extended the work of [original paper]...",
        "externalIds": {
          "DOI": "10.9012/citing.2024"
        }
      }
    }
  ]
}

Filter for relevance:

For each citing paper:

Title match: Keywords present in title?
Abstract match: User's query terms in abstract?
Recency: Newer papers often build on findings (prioritize < 2 years)
Citation count: If Semantic Scholar provides it (add citationCount to the requested fields), highly cited papers are more likely to be relevant

Scoring:

Title keywords match: +3 points
Abstract keywords match: +2 points
Recent (< 2 years): +2 points
Moderate recency (2-5 years): +1 point

Only add to queue if score ≥ 5
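
The forward scoring rubric can be sketched the same way. `score_citing_paper` is a hypothetical name. The sketch assumes case-insensitive substring matching and reads "2-5 years" as an age from 2 to 5 years inclusive.

```python
from datetime import date


def score_citing_paper(entry, keywords, current_year=None):
    """Score one item of the citations response with the rubric above."""
    current_year = current_year or date.today().year
    paper = entry.get("citingPaper") or {}
    terms = [k.lower() for k in keywords]

    def mentions(text):
        # Assumption: a match is a case-insensitive substring hit.
        text = (text or "").lower()
        return any(term in text for term in terms)

    score = 0
    if mentions(paper.get("title")):
        score += 3  # title keywords match
    if mentions(paper.get("abstract")):
        score += 2  # abstract keywords match
    year = paper.get("year")
    if year is not None:
        age = current_year - year
        if age < 2:
            score += 2  # recent
        elif age <= 5:
            score += 1  # moderate recency
    return score
```

With this rubric the maximum forward score is 7, so a paper needs a title match plus at least one other signal to reach the threshold of 5.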

4. Deduplication

Before adding to queue:

Check papers-reviewed.json:

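# papers_reviewed: DOIs loaded from papers-reviewed.json; queue: papers still to evaluate
for paper in candidates:
    doi = (paper.get("externalIds") or {}).get("DOI")  # externalIds may be null
    if doi in papers_reviewed:
        continue  # Already processed
    queue.append(paper)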

CRITICAL: After evaluating any paper from citation traversal, add it to papers-reviewed.json regardless of score. This prevents re-processing the same paper from multiple sources.

Track citation relationship in citations/citation-graph.json:

{
  "10.1234/example.2023": {
    "references": ["10.5678/ref1.2020", "10.5678/ref2.2021"],
    "cited_by": ["10.9012/cite1.2024", "10.9012/cite2.2024"]
  }
}

CRITICAL: Use ONLY citation-graph.json for citation tracking. Do NOT create custom files like forward_citation_pmids.txt or citation_analysis.md. All findings go in SUMMARY.md.
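
One way to keep citation-graph.json consistent is to record every edge in both directions. The sketch below works on the in-memory dictionary; `record_citation` is a hypothetical helper. Creating an entry for the cited paper as well as the citing one is an assumption, since the example above shows only a single key.

```python
import json


def record_citation(graph, citing_doi, cited_doi):
    """Record that citing_doi cites cited_doi, without duplicate entries."""
    for doi in (citing_doi, cited_doi):
        # Assumption: every paper gets an entry with both lists.
        graph.setdefault(doi, {"references": [], "cited_by": []})
    if cited_doi not in graph[citing_doi]["references"]:
        graph[citing_doi]["references"].append(cited_doi)
    if citing_doi not in graph[cited_doi]["cited_by"]:
        graph[cited_doi]["cited_by"].append(citing_doi)
    return graph


graph = {}
record_citation(graph, "10.1234/example.2023", "10.5678/ref1.2020")
record_citation(graph, "10.9012/cite1.2024", "10.1234/example.2023")
print(json.dumps(graph["10.1234/example.2023"], indent=2))
```

The resulting dictionary can be written to citations/citation-graph.json with `json.dump`.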

5. Process Queue

Add relevant citations to processing queue:

{
  "doi": "10.5678/referenced.2020",
  "title": "Referenced Paper",
  "relevance_score": 7,
  "source": "backward_from:10.1234/example.2023",
  "context": "Method citation - describes IC50 measurement protocol"
}

Then:

Evaluate using evaluating-paper-relevance skill
If relevant, extract data and potentially traverse its citations too

Smart Traversal Limits

To avoid explosion:

Only traverse papers scoring ≥ 7 in initial evaluation
Only follow citations scoring ≥ 5 in relevance filtering
Limit traversal depth to 2 levels (original → references → references of references)
Check with user after every 50 papers total

Breadth-first strategy:

Get all references + citations for current paper
Filter and score them
Add high-scoring ones to queue
Process next paper in queue
Repeat until queue empty or hit limit
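
A minimal sketch of the breadth-first strategy with the depth and size limits above. `get_scored_citations` is a hypothetical callable standing in for the API calls plus relevance scoring. The full-paper evaluation step (score ≥ 7 before traversing) is omitted for brevity.

```python
from collections import deque


def traverse(seed, get_scored_citations, max_depth=2, max_papers=50, min_score=5):
    """Breadth-first traversal with depth and size limits.

    get_scored_citations(doi) is a hypothetical callable returning
    (doi, relevance_score) pairs for a paper's references and citations.
    """
    queue = deque([(seed, 0)])
    seen = {seed}
    processed = []
    while queue and len(processed) < max_papers:
        doi, depth = queue.popleft()
        processed.append(doi)
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor, score in get_scored_citations(doi):
            if score >= min_score and neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return processed
```

The `seen` set stops a paper from being queued twice when it is reachable from several sources. When the function returns because `max_papers` was reached, that is the point to check with the user.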

Progress Reporting

Report as you traverse:

🔗 Analyzing citations for: "Original Paper Title"
   → Found 45 references, 12 look relevant
   → Found 23 citing papers, 8 look relevant
   → Adding 20 papers to queue

📄 [51/127] Following reference: "Method for measuring IC50"
   Source: Referenced by original paper in Methods section
   Abstract score: 7 → Fetching full text...

API Rate Limiting

Semantic Scholar limits:

Free tier: 100 requests per 5 minutes
With API key: 1000 requests per 5 minutes

Be efficient:

Request multiple fields in one call (?fields=title,abstract,externalIds,year)
Use limit=100 to get more results per request
Cache responses - don't re-fetch same paper

If rate limited:

Wait 5 minutes
Report to user: "⏸️ Rate limited by Semantic Scholar API. Waiting 5 minutes..."
Consider getting API key for higher limits
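
The "wait and report" behavior could look like the sketch below. `fetch` is a hypothetical callable returning a `(status_code, body)` pair, so the sketch is independent of any particular HTTP client. The retry count is an assumption.

```python
import time


def fetch_with_rate_limit(fetch, wait_seconds=300, max_attempts=3,
                          sleep=time.sleep, report=print):
    """Call fetch() and wait whenever it reports HTTP 429.

    fetch is a hypothetical callable returning (status_code, body).
    """
    status, body = None, None
    for attempt in range(1, max_attempts + 1):
        status, body = fetch()
        if status != 429:
            break
        if attempt < max_attempts:
            minutes = wait_seconds // 60
            report(f"Rate limited by Semantic Scholar API. Waiting {minutes} minutes...")
            sleep(wait_seconds)
    return status, body
```

Passing `sleep` and `report` as parameters keeps the function testable without real delays. If every attempt is rate limited, the last 429 is returned for the caller to handle.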

Integration with Other Skills

After traversing citations:

Queue now has N new papers to evaluate
For each, use evaluating-paper-relevance skill
If relevant, extract to SUMMARY.md
If highly relevant (≥9), traverse its citations too
Update citation-graph.json to track relationships

Quick Reference

| Task | API Endpoint |
|------|--------------|
| Get paper by DOI | GET /graph/v1/paper/DOI:{doi}?fields=paperId,title |
| Get references | GET /graph/v1/paper/{paperId}/references?fields=contexts,title,abstract,externalIds |
| Get citations | GET /graph/v1/paper/{paperId}/citations?fields=title,abstract,externalIds |
| Check if processed | Look up DOI in papers-reviewed.json |
| Filter relevance | Score based on context/title/intent/recency |

Relevance Filtering Checklist

Before adding citation to queue:

[ ] Check if already in papers-reviewed.json (skip if yes)
[ ] Score based on context/title keywords (need ≥ 5)
[ ] Verify external ID (DOI or PMID) exists
[ ] Add source tracking ("backward_from:DOI" or "forward_from:DOI")
[ ] Add to queue with metadata
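
The checklist can be folded into one gatekeeping function, sketched below. `make_queue_entry` is a hypothetical name. Storing a PMID under a `pmid` key and treating `papers_reviewed` as a set of identifiers are assumptions; the queue format above shows only a `doi` field.

```python
def make_queue_entry(paper, score, source_doi, direction, papers_reviewed, min_score=5):
    """Return a queue entry dict, or None if the paper fails the checklist."""
    if direction not in ("backward", "forward"):
        raise ValueError("direction must be 'backward' or 'forward'")
    external_ids = paper.get("externalIds") or {}
    doi = external_ids.get("DOI")
    pmid = external_ids.get("PubMed")
    identifier = doi or pmid
    if identifier is None:
        return None  # no external ID to track the paper by
    if identifier in papers_reviewed:
        return None  # already reviewed
    if score < min_score:
        return None  # below the relevance threshold
    return {
        "doi": doi,
        "pmid": pmid,  # assumption: not part of the documented queue format
        "title": paper.get("title"),
        "relevance_score": score,
        "source": f"{direction}_from:{source_doi}",
    }
```

A `None` result means "do not queue". The paper should still be added to papers-reviewed.json after evaluation, as required above.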

Common Mistakes

Not tracking all evaluated papers: Only adding relevant papers to papers-reviewed.json → Add EVERY paper after evaluation to prevent re-review
Creating custom analysis files: Making forward_citation_pmids.txt, CITATION_ANALYSIS.md, etc. → Use ONLY citation-graph.json and SUMMARY.md
Following all citations: Exponential explosion → Filter before adding to queue
Ignoring context: Citation might be tangential → Read context strings
Not deduplicating: Re-process same papers → Always check papers-reviewed.json before and after evaluation
Too deep: Following 5+ levels → Limit to 2 levels, check with user
Missing forward citations: Only checking references → Use both backward and forward
No rate limiting awareness: API blocks you → Add delays, handle 429 errors

Example Workflow

1. User asks: "Find selectivity data for BTK inhibitors"
2. Search finds Paper A (score: 9, has great IC50 data)
3. Traverse citations for Paper A:
   - References: 45 total, 12 relevant (mention "selectivity", "IC50")
   - Citations: 23 total, 8 relevant (newer papers on BTK)
4. Add 20 papers to queue
5. Evaluate first queued paper (score: 8)
6. Extract data, traverse its citations (add 5 more)
7. Continue until queue empty or user says stop

Next Steps

After traversing citations:

Process queued papers with evaluating-paper-relevance
Update SUMMARY.md with new findings
Check if reached checkpoint (50 papers or 5 minutes)
If checkpoint: ask user to continue or stop
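
The checkpoint test (50 papers or 5 minutes) is simple enough to state as code. This is a sketch only: `reached_checkpoint` is a hypothetical name, and counting papers since the last checkpoint rather than in total is an assumption.

```python
import time


def reached_checkpoint(papers_since_checkpoint, started_at, now=None,
                       paper_limit=50, time_limit_seconds=300):
    """Return True once 50 papers or 5 minutes have passed since the last checkpoint."""
    now = time.monotonic() if now is None else now
    return (papers_since_checkpoint >= paper_limit
            or now - started_at >= time_limit_seconds)
```

`started_at` is expected to come from `time.monotonic()`, so wall-clock adjustments do not affect the elapsed-time check.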

Smart backward and forward citation following via Semantic Scholar, with relevance filtering and deduplication

2,869 stars

testing

Updated Jul 16, 2026

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Jul 16, 2026, 5:04 AM (130.2s, 1 file scanned)

SKILL.md

name: Traversing Citation Networks
description: Smart backward and forward citation following via Semantic Scholar, with relevance filtering and deduplication
when_to_use: After finding relevant paper. When need to find related work. When following references or citations. When building citation graph. When exploring paper connections.
version: 1.0.0

Install manually

# Clone the repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.git

# Copy into Claude Code skills folder (global)
cp -r Awesome-Agent-Skills-for-Empirical-Research/skills/05-kthorn-research-superpower/research/traversing-citations ~/.claude/skills/

Repository: brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research (2,869 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT