arxiv-search/SKILL.md
Search arXiv for preprints and scholarly articles across physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering, systems science, and economics. Supports field-specific queries (title, author, abstract, category), boolean logic, date filtering, and bulk retrieval with pagination.
Install this skill globally with one command:
npx skillsauth add kltng/humanities-skills arxiv-search
Works with Claude Code, Cursor, and Windsurf.
Search and retrieve metadata for preprints on arXiv, the open-access repository for scholarly articles.
The arXiv API returns Atom 1.0 XML — there is no JSON option. You must parse XML to extract results. The Python script handles this automatically.
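As a rough sketch of what that parsing involves, the snippet below reads a small hand-written Atom sample with Python's standard-library `xml.etree.ElementTree`. The sample feed and the `parse_entries` helper are illustrative assumptions, not the code in scripts/arxiv_api.py, and only a few fields are extracted.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace used by the feed and entry elements.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hand-written sample standing in for an API response (not real output).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>An Example
      Title</title>
    <summary>Example abstract.</summary>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
  </entry>
</feed>"""


def parse_entries(xml_text):
    """Hypothetical helper: return one dict per <entry> in an Atom feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        entries.append({
            "id": entry.findtext(ATOM + "id", default="").strip(),
            # Titles can wrap across lines, so collapse runs of whitespace.
            "title": " ".join(title.split()),
            "summary": entry.findtext(ATOM + "summary", default="").strip(),
            "authors": [
                author.findtext(ATOM + "name", default="").strip()
                for author in entry.findall(ATOM + "author")
            ],
        })
    return entries
```

Every tag lookup has to carry the Atom namespace. Forgetting it is the usual reason a lookup silently returns nothing.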
Use export.arxiv.org, not arxiv.org:
http://export.arxiv.org/api/query?search_query=all:transformer&max_results=5
Using arxiv.org directly will not work for API queries.
Use AND, OR, ANDNOT — lowercase will not work:
search_query=ti:attention AND ti:transformer
search_query=cat:cs.AI ANDNOT cat:cs.CL
URL-encode parentheses as %28 and %29 and double quotes as %22:
search_query=ti:%22large+language+model%22
search_query=au:bengio+AND+%28cat:cs.LG+OR+cat:cs.AI%29
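Hand-encoding these characters is easy to get wrong, so one option is to let Python's standard `urllib.parse.urlencode` do it. The sketch below is an illustration only: `build_query_url` is a hypothetical helper, not part of scripts/arxiv_api.py. The colon is left unescaped purely so the output looks like the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "http://export.arxiv.org/api/query"


def build_query_url(search_query, max_results=10, start=0):
    """Hypothetical helper: return a fully encoded API URL for a raw query."""
    params = {
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    }
    # urlencode turns spaces into '+', parentheses into %28/%29, and
    # double quotes into %22; safe=":" leaves the field separator alone.
    return BASE_URL + "?" + urlencode(params, safe=":")


grouped = build_query_url("au:bengio AND (cat:cs.LG OR cat:cs.AI)", max_results=5)
phrase = build_query_url('ti:"large language model"', max_results=5)
```

Here `grouped` contains `au:bengio+AND+%28cat:cs.LG+OR+cat:cs.AI%29` and `phrase` contains `ti:%22large+language+model%22`, matching the hand-encoded forms.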
search_query=submittedDate:[202401010000+TO+202412312359]
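Format: YYYYMMDDHHMM in GMT, 24-hour time to the minute, with no separator between date and time (arXiv's documentation writes the pattern as YYYYMMDDTTTT, where TTTT stands for the time digits, not a literal T). Combine with other queries using AND.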
arXiv asks for a minimum 3-second delay between API calls. Results update once daily, so there is no reason to poll frequently.
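One simple way to honor that delay is a small limiter that sleeps only for the time still owed since the previous call. The `RateLimiter` class below is a hypothetical sketch, not the rate limiting implemented in scripts/arxiv_api.py. The injectable `clock` and `sleep` arguments exist only to make it easy to test.

```python
import time


class RateLimiter:
    """Hypothetical helper: keep at least min_interval seconds between calls."""

    def __init__(self, min_interval=3.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


limiter = RateLimiter(min_interval=3.0)
# Call limiter.wait() immediately before each API request.
```

The first call returns at once. Later calls pause only if fewer than three seconds have passed, so slow response handling does not add extra waiting.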
max_results caps at 2000 per request.

| Prefix | Field |
|--------|-------|
| ti | Title |
| au | Author |
| abs | Abstract |
| co | Comment |
| jr | Journal Reference |
| cat | Subject Category |
| rn | Report Number |
| id_list | Specific arXiv IDs (comma-separated, passed as separate param) |
| all | All fields simultaneously |
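Unlike the other rows, `id_list` is not written inside `search_query`. It travels as its own URL parameter. The sketch below shows one way to build such a URL. `build_id_list_url` is a hypothetical helper, and setting `max_results` to the number of IDs is a precaution based on arXiv's documented default page size of 10.

```python
from urllib.parse import urlencode

BASE_URL = "http://export.arxiv.org/api/query"


def build_id_list_url(arxiv_ids):
    """Hypothetical helper: return an API URL that fetches the given IDs."""
    ids = list(arxiv_ids)
    if not ids:
        raise ValueError("at least one arXiv ID is required")
    params = {
        # id_list is a separate parameter, comma-separated.
        "id_list": ",".join(ids),
        # Ask for as many results as IDs so longer lists are not cut short.
        "max_results": len(ids),
    }
    # Leave commas unescaped so the URL stays readable.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_id_list_url(["2301.07041", "2303.08774"])
```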
Use scripts/arxiv_api.py for programmatic access (zero dependencies):
from scripts.arxiv_api import ArxivAPI
api = ArxivAPI()
# Simple keyword search
results = api.search("all:transformer attention mechanism", max_results=10)
# Field-specific search
results = api.search("ti:diffusion AND cat:cs.CV", max_results=5)
# Author search
results = api.search("au:hinton AND cat:cs.LG", max_results=20)
# Date-filtered search
results = api.search(
    "cat:cs.AI AND submittedDate:[202401010000 TO 202412312359]",
    sort_by="submittedDate",
    sort_order="descending",
    max_results=50
)
# Fetch specific papers by arXiv ID
results = api.fetch_by_ids(["2301.07041", "2303.08774", "2305.10601"])
# Paginated retrieval
page1 = api.search("cat:cs.CL", max_results=100, start=0)
page2 = api.search("cat:cs.CL", max_results=100, start=100)
# Each result is a dict with keys:
# id, title, summary, authors, published, updated,
# categories, primary_category, links, comment, journal_ref, doi
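The two-page example above generalizes to a loop over `start` offsets. As a sketch, the hypothetical `page_windows` generator below yields `(start, max_results)` pairs and clamps the page size to the 2000-per-request cap. It is not part of scripts/arxiv_api.py, and the caller is assumed to know roughly how many results to fetch.

```python
MAX_PER_REQUEST = 2000  # per-request cap on max_results noted in this document


def page_windows(total, page_size=100):
    """Hypothetical helper: yield (start, max_results) pairs covering total results."""
    if total < 0 or page_size < 1:
        raise ValueError("total must be >= 0 and page_size >= 1")
    size = min(page_size, MAX_PER_REQUEST)
    for start in range(0, total, size):
        # The last window may be smaller than the page size.
        yield start, min(size, total - start)


windows = list(page_windows(250, page_size=100))
```

Each pair can be passed on as `api.search("cat:cs.CL", max_results=size, start=start)`, with a pause between calls to respect the delay arXiv asks for.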
| Category | Description |
|----------|-------------|
| cs.AI | Artificial Intelligence |
| cs.CL | Computation and Language (NLP) |
| cs.CV | Computer Vision |
| cs.LG | Machine Learning |
| cs.CR | Cryptography and Security |
| cs.DS | Data Structures and Algorithms |
| cs.SE | Software Engineering |
| math.AG | Algebraic Geometry |
| math.CO | Combinatorics |
| hep-th | High Energy Physics - Theory |
| quant-ph | Quantum Physics |
| stat.ML | Machine Learning (Statistics) |
| econ.GN | General Economics |
| q-bio.GN | Genomics |
Full list: https://arxiv.org/category_taxonomy
references/api_reference.md: Complete API specs for query parameters, response fields, and error handling
scripts/arxiv_api.py: Python client with XML parsing, rate limiting, and pagination support