skills/literature/search/citeseerx-api/SKILL.md
Search computer science literature via the CiteSeerX digital library
npx skillsauth add wentorai/research-plugins citeseerx-apiInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
CiteSeerX is a scientific literature digital library focusing on computer and information science, with 10M+ documents and 100M+ citations. It provides autonomous citation indexing — extracting and linking citations without manual curation. The API supports document search, citation lookup, and metadata retrieval. Free, no authentication required.
https://citeseerx.ist.psu.edu/api
# Keyword search
curl "https://citeseerx.ist.psu.edu/api/search?q=graph+neural+networks&start=0&rows=20"
# Search by title
curl "https://citeseerx.ist.psu.edu/api/search?q=title:attention+is+all+you+need"
# Search by author
curl "https://citeseerx.ist.psu.edu/api/search?q=author:hinton&rows=25"
# Filter by year
curl "https://citeseerx.ist.psu.edu/api/search?q=federated+learning&year=2024"
# Sort by citation count
curl "https://citeseerx.ist.psu.edu/api/search?q=reinforcement+learning&sort=citationCount+desc"
# Get document metadata
curl "https://citeseerx.ist.psu.edu/api/document?doi=10.1.1.123.456"
# Get citations for a document
curl "https://citeseerx.ist.psu.edu/api/citations?doi=10.1.1.123.456"
# Get citing documents
curl "https://citeseerx.ist.psu.edu/api/citedby?doi=10.1.1.123.456"
| Parameter | Description | Example |
|-----------|-------------|---------|
| q | Search query | q=deep+learning |
| start | Pagination offset | start=20 |
| rows | Results per page | rows=50 |
| sort | Sort field | citationCount desc |
| year | Filter by year | year=2024 |
| doi | CiteSeerX document ID | doi=10.1.1.123.456 |
{
"response": {
"numFound": 5200,
"docs": [
{
"id": "10.1.1.123.456",
"title": "Graph Neural Networks: A Review",
"authors": ["Zhou, Jie", "Cui, Ganqu"],
"year": 2020,
"abstract": "Graph neural networks have been widely applied...",
"venue": "AI Open",
"citationCount": 3500,
"url": "https://citeseerx.ist.psu.edu/doc/10.1.1.123.456"
}
]
}
}
import requests
BASE_URL = "https://citeseerx.ist.psu.edu/api"
def search_citeseerx(query: str, rows: int = 20,
sort_by_citations: bool = False) -> list:
"""Search CiteSeerX computer science literature."""
params = {
"q": query,
"rows": rows,
"start": 0,
}
if sort_by_citations:
params["sort"] = "citationCount desc"
resp = requests.get(f"{BASE_URL}/search", params=params, timeout=30)
resp.raise_for_status()
data = resp.json()
results = []
for doc in data.get("response", {}).get("docs", []):
results.append({
"id": doc.get("id"),
"title": doc.get("title"),
"authors": doc.get("authors", []),
"year": doc.get("year"),
"venue": doc.get("venue"),
"citations": doc.get("citationCount", 0),
"abstract": doc.get("abstract", "")[:300],
"url": doc.get("url"),
})
return results
def get_citations(doc_id: str) -> list:
"""Get papers cited by a document."""
resp = requests.get(
f"{BASE_URL}/citations",
params={"doi": doc_id},
timeout=30,
)
resp.raise_for_status()
return resp.json().get("citations", [])
def get_cited_by(doc_id: str) -> list:
"""Get papers that cite a document."""
resp = requests.get(
f"{BASE_URL}/citedby",
params={"doi": doc_id},
timeout=30,
)
resp.raise_for_status()
return resp.json().get("citedby", [])
# Example: find most-cited CS papers on a topic
papers = search_citeseerx("knowledge distillation",
rows=10, sort_by_citations=True)
for p in papers:
print(f"[{p['year']}] {p['title']} (cited: {p['citations']})")
# Example: citation chain analysis
if papers:
refs = get_citations(papers[0]["id"])
print(f"\nReferences of top paper ({len(refs)} citations):")
for r in refs[:5]:
print(f" -> {r.get('title', 'Unknown')}")
tools
10 document processing skills. Trigger: extracting text from PDFs, parsing references, document Q&A. Design: parsing pipelines (GROBID, marker) and structured extraction tools.
documentation
Guide to tldraw for infinite canvas whiteboarding and diagram creation
testing
Create graphical abstracts, schematic diagrams, and scientific illustrations
documentation
Create UML diagrams and architecture visualizations with PlantUML