brycewang-stanford/semantic-paper-radar

Name: semantic-paper-radar
Author: brycewang-stanford

skills/43-wentorai-research-plugins/skills/literature/discovery/semantic-paper-radar/SKILL.md

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research semantic-paper-radar

Semantic Paper Radar

Overview

Traditional literature search relies on keyword matching—you find papers that contain the exact terms you search for. Semantic paper discovery goes further by understanding the meaning of research content and finding papers that are conceptually related, even when they use different terminology. This is especially powerful for interdisciplinary research, where the same idea may be expressed in completely different vocabularies across fields.

The Semantic Paper Radar skill provides methods for using embedding-based semantic search, vector databases, and AI-powered synthesis to build a comprehensive, continuously updated view of the literature relevant to your research. It helps you surface papers that keyword search alone is likely to miss and synthesize findings across large bodies of work.

This skill covers setting up a personal semantic search index over your paper collection, querying public semantic search APIs, and using LLM-powered analysis to extract themes and connections from clusters of related papers.

Semantic Search Fundamentals

How Embedding-Based Search Works

Semantic search represents both your query and each paper as dense numerical vectors (embeddings) in a high-dimensional space. Papers whose embeddings are close to your query's embedding are semantically similar, regardless of the specific words used.

Key components:

Embedding model: Converts text to vectors. Models like SPECTER2, SciBERT, or general-purpose models like text-embedding-3-small work well for academic text.
Vector database: Stores and indexes embeddings for fast similarity search. Options include ChromaDB (local), Qdrant, Pinecone, or Weaviate.
Similarity metric: Cosine similarity is standard for comparing text embeddings.
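To make the similarity metric concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are made-up toy values, not real embeddings, and the function name is only illustrative; real embedding models produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (illustrative numbers only).
query = [1.0, 0.0, 1.0]
paper_a = [2.0, 0.0, 2.0]  # points in the same direction as the query
paper_b = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query, paper_a))  # close to 1: same direction
print(cosine_similarity(query, paper_b))  # 0: unrelated direction
```

Because the metric depends only on direction, a long abstract and a short query can still score as close neighbors.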

Using OpenAlex's Search API

OpenAlex indexes 250M+ works and supports search queries across all disciplines:

# Search works via the OpenAlex API (page size uses the documented `per-page` parameter)
curl "https://api.openalex.org/works?search=attention+mechanisms+for+graph+neural+networks&per-page=20"

The search endpoint performs relevance-ranked text matching over titles, abstracts, and full text; it does not compare embeddings. Combine it with concept or topic filters and citation data for more targeted discovery. For true semantic matching, build a local embedding index (see below).
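If you want to feed OpenAlex results into an embedding model, two small helpers are useful. The sketch below builds the search URL and rebuilds a plain-text abstract from the `abstract_inverted_index` field that OpenAlex returns for works (a mapping from each word to its positions). The helper names are hypothetical, and the sample index is made up for illustration; no network request is made here.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode

OPENALEX_WORKS = "https://api.openalex.org/works"


def build_search_url(query: str, per_page: int = 20) -> str:
    """Return a works-search URL; urlencode turns spaces into '+'."""
    params = {"search": query, "per-page": per_page}
    return f"{OPENALEX_WORKS}?{urlencode(params)}"


def abstract_from_inverted_index(index: Optional[Dict[str, List[int]]]) -> str:
    """Rebuild abstract text from a word -> positions mapping."""
    if not index:
        return ""  # some works have no abstract available
    positioned = [
        (position, word)
        for word, positions in index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(positioned))


# Made-up inverted index in the same shape as the API field.
sample_index = {"Semantic": [0], "search": [1, 3], "beats": [2]}
print(build_search_url("graph attention"))
print(abstract_from_inverted_index(sample_index))
```

The reconstructed abstract can then be passed to whichever embedding model you use for your local index.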

Building a Personal Semantic Index

For deeper control, build a local semantic search index over your own paper collection:

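import chromadb
from sentence_transformers import SentenceTransformer

# Initialize the embedding model and a persistent local index.
# Note: "allenai/specter2" is published as an adapter for the `adapters`
# library and may not load directly here, so this example swaps in the
# original SPECTER model, which ships in sentence-transformers format.
model = SentenceTransformer("sentence-transformers/allenai-specter")
client = chromadb.PersistentClient(path="./paper_index")
collection = client.get_or_create_collection(
    name="my_papers",
    metadata={"hnsw:space": "cosine"},  # rank neighbors by cosine distance
)

# Index a paper. SPECTER-style models expect "title[SEP]abstract" as input.
title = "Graph Attention v2"
abstract = "We propose a novel attention mechanism for graph neural networks..."
embedding = model.encode(f"{title}[SEP]{abstract}").tolist()
collection.add(
    documents=[abstract],
    embeddings=[embedding],
    metadatas=[{"title": title, "year": 2025, "arxiv_id": "2501.xxxxx"}],
    ids=["paper_001"],
)

# Query with natural language; results come back nearest-first.
results = collection.query(
    query_embeddings=[model.encode("message passing in GNNs").tolist()],
    n_results=10,
)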

This local index lets you search across all papers you have collected using natural language queries. As you add more papers, the index becomes a personalized discovery tool tuned to your specific research interests.
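The object returned by `collection.query` is a dictionary of parallel lists, one inner list per query. A minimal sketch of turning it into a ranked list of hits follows. It assumes the collection uses cosine space as configured above, so that similarity is one minus the reported distance; `format_hits` is a hypothetical helper name, and the `fake_results` dictionary is hand-written to mimic the result shape rather than real output.

```python
def format_hits(results: dict) -> list[dict]:
    """Flatten a single-query Chroma result into a list of hit records."""
    ids = results["ids"][0]
    distances = results["distances"][0]
    metadatas = results["metadatas"][0]
    hits = []
    for paper_id, distance, meta in zip(ids, distances, metadatas):
        hits.append({
            "id": paper_id,
            "title": (meta or {}).get("title", ""),
            # In cosine space, distance = 1 - cosine similarity.
            "similarity": 1.0 - distance,
        })
    return hits


# Hand-written stand-in for a query result (illustrative values only).
fake_results = {
    "ids": [["paper_001", "paper_002"]],
    "distances": [[0.25, 0.5]],
    "metadatas": [[{"title": "Graph Attention v2"}, None]],
}
for hit in format_hits(fake_results):
    print(hit["id"], hit["title"], hit["similarity"])
```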

Discovery Workflows

Concept Expansion Radar

Use semantic search to expand your awareness beyond your current reading:

Seed: Take the abstract of your current paper (or a paragraph describing your research question).
Search: Run it as a semantic query against a large corpus (OpenAlex, CrossRef, or your local index).
Filter: Remove papers you have already read. Sort by a combination of semantic similarity and recency.
Cluster: Group the top 50 results into thematic clusters using k-means or HDBSCAN on their embeddings.
Explore clusters: Each cluster represents a related subtopic. Read the most-cited paper in each cluster to understand the connection to your work.
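The filter-and-sort step can be sketched as a small scoring function. The weighting scheme here is an assumption made for illustration, not something the skill prescribes: recency decays linearly to zero over `horizon_years`, and `recency_weight` controls how much it counts relative to semantic similarity. All names and sample records are hypothetical.

```python
def rank_candidates(
    candidates: list[dict],
    already_read: set[str],
    current_year: int,
    recency_weight: float = 0.3,
    horizon_years: int = 10,
) -> list[dict]:
    """Drop papers already read, then sort by blended similarity and recency."""
    ranked = []
    for paper in candidates:
        if paper["id"] in already_read:
            continue
        age = max(0, current_year - paper["year"])
        recency = max(0.0, 1.0 - age / horizon_years)  # 1.0 = published this year
        score = (1 - recency_weight) * paper["similarity"] + recency_weight * recency
        ranked.append({**paper, "score": score})
    return sorted(ranked, key=lambda p: p["score"], reverse=True)


# Made-up candidates for illustration.
candidates = [
    {"id": "p1", "similarity": 0.9, "year": 2015},
    {"id": "p2", "similarity": 0.8, "year": 2025},
    {"id": "p3", "similarity": 0.95, "year": 2024},
]
for paper in rank_candidates(candidates, already_read={"p3"}, current_year=2025):
    print(paper["id"], round(paper["score"], 2))
```

With these sample values the slightly less similar but newer paper outranks the older one; lowering `recency_weight` reverses that.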

Cross-Disciplinary Bridge Detection

Semantic search excels at finding papers from other fields that address similar problems:

Describe your research problem in plain, non-technical language.
Run this as a semantic query without restricting to your field's journals or categories.
Review results from unexpected fields—these are potential interdisciplinary connections.
For each bridge paper, check its reference list for more domain-specific work in that field.
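One way to review results from unexpected fields is to group hits by field and set aside your own. The sketch below assumes each hit already carries a `field` label (for example taken from your index metadata) and a `similarity` score; the function name, the field labels, and the 0.6 cutoff are illustrative assumptions.

```python
from collections import defaultdict


def bridge_candidates(
    hits: list[dict],
    home_fields: set[str],
    min_similarity: float = 0.6,
) -> dict[str, list[dict]]:
    """Group sufficiently similar hits from outside the home fields by field."""
    by_field = defaultdict(list)
    for hit in hits:
        if hit["field"] in home_fields or hit["similarity"] < min_similarity:
            continue
        by_field[hit["field"]].append(hit)
    for field_hits in by_field.values():
        field_hits.sort(key=lambda h: h["similarity"], reverse=True)
    return dict(by_field)


# Made-up hits for illustration.
hits = [
    {"id": "a", "field": "Computer Science", "similarity": 0.9},
    {"id": "b", "field": "Ecology", "similarity": 0.7},
    {"id": "c", "field": "Ecology", "similarity": 0.8},
    {"id": "d", "field": "Economics", "similarity": 0.4},
]
for field, papers in bridge_candidates(hits, home_fields={"Computer Science"}).items():
    print(field, [p["id"] for p in papers])
```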

Novelty Radar

Set up periodic semantic searches to detect new papers in your area:

Define 3-5 "concept vectors" by encoding descriptions of your core research interests.
Weekly, search against newly published papers (last 7 days) from arXiv or OpenAlex.
Rank new papers by maximum similarity to any of your concept vectors.
Papers above your similarity threshold enter your reading queue automatically.
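The ranking and threshold steps above can be sketched as follows. Each new paper is scored by its best match against any concept vector, and only papers at or above the threshold are queued. The two-dimensional vectors and the 0.75 threshold are toy assumptions; in practice the vectors come from your embedding model and the threshold needs tuning against papers you already know are relevant.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def novelty_queue(
    new_papers: list[dict],
    concept_vectors: list[list[float]],
    threshold: float = 0.75,
) -> list[tuple[str, float]]:
    """Return (paper id, best similarity) pairs above threshold, best first."""
    queue = []
    for paper in new_papers:
        best = max(cosine(paper["embedding"], c) for c in concept_vectors)
        if best >= threshold:
            queue.append((paper["id"], best))
    return sorted(queue, key=lambda item: item[1], reverse=True)


# Toy concept vectors and newly published papers (illustrative values).
concepts = [[1.0, 0.0], [0.0, 1.0]]
new_papers = [
    {"id": "a", "embedding": [1.0, 0.1]},
    {"id": "b", "embedding": [1.0, 1.0]},
    {"id": "c", "embedding": [0.2, 1.0]},
]
for paper_id, score in novelty_queue(new_papers, concepts):
    print(paper_id, round(score, 3))
```

Taking the maximum rather than the average means a paper only needs to match one of your interests strongly to be queued.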

Semantic Synthesis

Once you have discovered a cluster of related papers, use AI-assisted synthesis to extract insights across the collection:

Theme Extraction

Feed the abstracts of a cluster of papers to an LLM and ask for:

Common themes and findings across the papers
Points of disagreement or contradiction
Methodological trends (what approaches are gaining vs. losing popularity)
Open questions that none of the papers fully address
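A minimal sketch of assembling such a request is shown below. It only builds the prompt text from a list of paper records; sending it to a model is left to whichever LLM client you use. The function name, the record fields, and the exact wording of the instructions are assumptions made for illustration.

```python
def build_theme_prompt(papers: list[dict]) -> str:
    """Combine numbered abstracts with the four synthesis questions."""
    header = (
        "You are assisting with a literature synthesis. "
        "Using only the numbered abstracts below, report:\n"
        "1. Common themes and findings across the papers\n"
        "2. Points of disagreement or contradiction\n"
        "3. Methodological trends (approaches gaining vs. losing popularity)\n"
        "4. Open questions that none of the papers fully address\n"
        "Cite papers by their bracketed number."
    )
    entries = [
        f"[{i}] {p['title']} ({p['year']})\n{p['abstract'].strip()}"
        for i, p in enumerate(papers, start=1)
    ]
    return header + "\n\n" + "\n\n".join(entries)


# Placeholder records for illustration.
cluster = [
    {"title": "Paper A", "year": 2024, "abstract": "First abstract text. "},
    {"title": "Paper B", "year": 2025, "abstract": "Second abstract text."},
]
print(build_theme_prompt(cluster))
```

Numbering the abstracts lets you trace each claimed theme back to specific papers and check it against the source text.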

Evidence Mapping

Create a structured evidence map from your semantic cluster:

| Theme | Supporting Papers | Contradicting Papers | Strength of Evidence |
|-------|-------------------|----------------------|----------------------|
| Theme A | [1], [3], [7] | [5] | Strong |
| Theme B | [2], [4] | None | Moderate |
| Theme C | [6] | [1], [8] | Contested |

This provides a bird's-eye view of where consensus exists and where debates remain open.

Gap Identification

Compare your research question against the semantic landscape of existing work. If your query lands in a region of embedding space that contains few indexed papers, that region may be a research gap, an area where your contribution would be most novel. Before drawing that conclusion, check that the sparsity is not simply a coverage gap in your index.
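One simple way to quantify how crowded the region around a query is: count the indexed papers whose similarity to the query exceeds a cutoff. The sketch below uses toy two-dimensional vectors; the function names and both thresholds (`min_similarity`, `max_neighbors`) are assumptions to be tuned, not values taken from the skill.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def neighborhood_density(
    query_vec: list[float],
    paper_vecs: list[list[float]],
    min_similarity: float = 0.8,
) -> int:
    """Number of indexed papers at least min_similarity close to the query."""
    return sum(1 for v in paper_vecs if cosine(query_vec, v) >= min_similarity)


def looks_like_gap(
    query_vec: list[float],
    paper_vecs: list[list[float]],
    min_similarity: float = 0.8,
    max_neighbors: int = 2,
) -> bool:
    """Flag the query region as sparse when it has few close neighbors."""
    return neighborhood_density(query_vec, paper_vecs, min_similarity) <= max_neighbors


# Toy vectors for illustration.
query = [1.0, 0.0]
sparse_index = [[1.0, 0.1], [0.0, 1.0], [0.1, 1.0], [-1.0, 0.0]]
dense_index = [[1.0, 0.0], [1.0, 0.1], [1.0, -0.1]]
print(looks_like_gap(query, sparse_index))
print(looks_like_gap(query, dense_index))
```

A sparse result is a prompt for a targeted manual search, not evidence of novelty on its own.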

References

OpenAlex API: https://api.openalex.org
SPECTER2 model: https://huggingface.co/allenai/specter2
ChromaDB: https://www.trychroma.com
ResearchGPT: https://github.com/mukulpatnaik/researchgpt
OpenAlex: https://openalex.org

Semantic literature discovery and synthesis using embeddings

1,312 stars

data-ai

Updated May 29, 2026

npx skillsauth add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research semantic-paper-radar

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 19, 2026, 2:15 AM · 29.9s · 1 file scanned

SKILL.md

name: semantic-paper-radar
description: Semantic literature discovery and synthesis using embeddings
emoji: 📡
category: literature
subcategory: discovery
keywords: ["semantic search", "embeddings", "literature synthesis", "paper discovery", "vector search", "knowledge mapping"]
source: wentor-research-plugins


Related Skills

brycewang-stanford/literature-review-tools

tools

Verified · Trusted · Community

Recommend AND run open-source AI tools, agents, Claude Code / Codex skills, and MCP servers for any stage of a literature review: searching, reading, extracting, synthesizing, screening, citation-checking, and paper writing. Use when the user asks "what tool should I use to..." OR "install/run/use <tool> to ..." for research/lit-review work: automating a survey or related-work section, PDF→Markdown extraction for LLMs (MinerU/marker/docling), PRISMA / systematic review (ASReview), citation-backed Q&A over PDFs (PaperQA2), wiring papers into Claude/Cursor via MCP (arxiv/paper-search/zotero servers), or chatting with a Zotero library. Ships a launcher (scripts/litrun.py) that installs each tool in an isolated venv and runs it. Curated catalog of 70+ vetted projects. Supports Chinese and English (for "literature review tool selection" and "one-click install/run").

3,109 · SKILL.md · Updated Jul 28, 2026

brycewang-stanford/auto-empirical-research-skills

development

Verified · Trusted · Community

Route empirical-research requests through the Auto-Empirical Research Skills catalog when this whole repository is installed as one skill in Codex, CodeBuddy, Claude Code, or another IDE. Use to choose and load the right vendored AERS skill for causal inference, econometrics, replication, data acquisition, manuscript writing, peer review and referee responses, citation checking, de-AIGC editing, or full empirical-paper workflows without reading the entire repository at once.

3,109 · SKILL.md · Updated Jun 27, 2026

brycewang-stanford/aer-preregistration

documentation

Verified · Trusted · Community

Use when the project collects primary data or runs a field, lab, or survey experiment, before the intervention begins — write the pre-analysis plan, size the sample from a power calculation, and register with the AEA RCT Registry. Apply after the design is chosen in aer-identification and before any outcome data are seen.

3,021 · SKILL.md · Updated Jul 23, 2026

brycewang-stanford/economist-data-skill

tools

Verified · Trusted · Community

Guide economists to authoritative data sources with explicit, confirmed data specifications before retrieval; interfaces with Playwright MCP to navigate portals and extract real data, not articles about data.

3,021 · SKILL.md · Updated Jul 23, 2026

Install manually

# Clone the repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r Awesome-Agent-Skills-for-Empirical-Research/skills/43-wentorai-research-plugins/skills/literature/discovery/semantic-paper-radar ~/.claude/skills/

Repository

brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT