openclaw-skills/arxiv/SKILL.md
用于按关键词、作者、分类或编号检索 arXiv 论文。
npx skillsauth add seaworld008/commonly-used-high-value-skills arxivInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Search and retrieve academic papers from arXiv via their free REST API. No API key, no dependencies — just curl.
| Action | Command |
|--------|---------|
| Search papers | curl "https://export.arxiv.org/api/query?search_query=all:QUERY&max_results=5" |
| Get specific paper | curl "https://export.arxiv.org/api/query?id_list=2402.03300" |
| Read abstract (web) | web_extract(urls=["https://arxiv.org/abs/2402.03300"]) |
| Read full paper (PDF) | web_extract(urls=["https://arxiv.org/pdf/2402.03300"]) |
The API returns Atom XML. Parse with grep/sed or pipe through python3 for clean output.
curl -s "https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5"
curl -s "https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5&sortBy=submittedDate&sortOrder=descending" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom'}
root = ET.parse(sys.stdin).getroot()
for i, entry in enumerate(root.findall('a:entry', ns)):
title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
arxiv_id = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
published = entry.find('a:published', ns).text[:10]
authors = ', '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
summary = entry.find('a:summary', ns).text.strip()[:200]
cats = ', '.join(c.get('term') for c in entry.findall('a:category', ns))
print(f'{i+1}. [{arxiv_id}] {title}')
print(f' Authors: {authors}')
print(f' Published: {published} | Categories: {cats}')
print(f' Abstract: {summary}...')
print(f' PDF: https://arxiv.org/pdf/{arxiv_id}')
print()
"
| Prefix | Searches | Example |
|--------|----------|---------|
| all: | All fields | all:transformer+attention |
| ti: | Title | ti:large+language+models |
| au: | Author | au:vaswani |
| abs: | Abstract | abs:reinforcement+learning |
| cat: | Category | cat:cs.AI |
| co: | Comment | co:accepted+NeurIPS |
# AND (default when using +)
search_query=all:transformer+attention
# OR
search_query=all:GPT+OR+all:BERT
# AND NOT
search_query=all:language+model+ANDNOT+all:vision
# Exact phrase
search_query=ti:"chain+of+thought"
# Combined
search_query=au:hinton+AND+cat:cs.LG
| Parameter | Options |
|-----------|---------|
| sortBy | relevance, lastUpdatedDate, submittedDate |
| sortOrder | ascending, descending |
| start | Result offset (0-based) |
| max_results | Number of results (default 10, max 30000) |
# Latest 10 papers in cs.AI
curl -s "https://export.arxiv.org/api/query?search_query=cat:cs.AI&sortBy=submittedDate&sortOrder=descending&max_results=10"
# By arXiv ID
curl -s "https://export.arxiv.org/api/query?id_list=2402.03300"
# Multiple papers
curl -s "https://export.arxiv.org/api/query?id_list=2402.03300,2401.12345,2403.00001"
After fetching metadata for a paper, generate a BibTeX entry:
{% raw %}
curl -s "https://export.arxiv.org/api/query?id_list=1706.03762" | python3 -c "
import sys, xml.etree.ElementTree as ET
ns = {'a': 'http://www.w3.org/2005/Atom', 'arxiv': 'http://arxiv.org/schemas/atom'}
root = ET.parse(sys.stdin).getroot()
entry = root.find('a:entry', ns)
if entry is None: sys.exit('Paper not found')
title = entry.find('a:title', ns).text.strip().replace('\n', ' ')
authors = ' and '.join(a.find('a:name', ns).text for a in entry.findall('a:author', ns))
year = entry.find('a:published', ns).text[:4]
raw_id = entry.find('a:id', ns).text.strip().split('/abs/')[-1]
cat = entry.find('arxiv:primary_category', ns)
primary = cat.get('term') if cat is not None else 'cs.LG'
last_name = entry.find('a:author', ns).find('a:name', ns).text.split()[-1]
print(f'@article{{{last_name}{year}_{raw_id.replace(\".\", \"\")},')
print(f' title = {{{title}}},')
print(f' author = {{{authors}}},')
print(f' year = {{{year}}},')
print(f' eprint = {{{raw_id}}},')
print(f' archivePrefix = {{arXiv}},')
print(f' primaryClass = {{{primary}}},')
print(f' url = {{https://arxiv.org/abs/{raw_id}}}')
print('}')
"
{% endraw %}
After finding a paper, read it:
# Abstract page (fast, metadata + abstract)
web_extract(urls=["https://arxiv.org/abs/2402.03300"])
# Full paper (PDF → markdown via Firecrawl)
web_extract(urls=["https://arxiv.org/pdf/2402.03300"])
For local PDF processing, see the ocr-and-documents skill.
| Category | Field |
|----------|-------|
| cs.AI | Artificial Intelligence |
| cs.CL | Computation and Language (NLP) |
| cs.CV | Computer Vision |
| cs.LG | Machine Learning |
| cs.CR | Cryptography and Security |
| stat.ML | Machine Learning (Statistics) |
| math.OC | Optimization and Control |
| physics.comp-ph | Computational Physics |
Full list: https://arxiv.org/category_taxonomy
The scripts/search_arxiv.py script handles XML parsing and provides clean output:
python scripts/search_arxiv.py "GRPO reinforcement learning"
python scripts/search_arxiv.py "transformer attention" --max 10 --sort date
python scripts/search_arxiv.py --author "Yann LeCun" --max 5
python scripts/search_arxiv.py --category cs.AI --sort date
python scripts/search_arxiv.py --id 2402.03300
python scripts/search_arxiv.py --id 2402.03300,2401.12345
No dependencies — uses only Python stdlib.
arXiv doesn't provide citation data or recommendations. Use the Semantic Scholar API for that — free, no key needed for basic use (1 req/sec), returns JSON.
# By arXiv ID
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300?fields=title,authors,citationCount,referenceCount,influentialCitationCount,year,abstract" | python3 -m json.tool
# By Semantic Scholar paper ID or DOI
curl -s "https://api.semanticscholar.org/graph/v1/paper/DOI:10.1234/example?fields=title,citationCount"
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300/citations?fields=title,authors,year,citationCount&limit=10" | python3 -m json.tool
curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300/references?fields=title,authors,year,citationCount&limit=10" | python3 -m json.tool
curl -s "https://api.semanticscholar.org/graph/v1/paper/search?query=GRPO+reinforcement+learning&limit=5&fields=title,authors,year,citationCount,externalIds" | python3 -m json.tool
curl -s -X POST "https://api.semanticscholar.org/recommendations/v1/papers/" \
-H "Content-Type: application/json" \
-d '{"positivePaperIds": ["arXiv:2402.03300"], "negativePaperIds": []}' | python3 -m json.tool
curl -s "https://api.semanticscholar.org/graph/v1/author/search?query=Yann+LeCun&fields=name,hIndex,citationCount,paperCount" | python3 -m json.tool
title, authors, year, abstract, citationCount, referenceCount, influentialCitationCount, isOpenAccess, openAccessPdf, fieldsOfStudy, publicationVenue, externalIds (contains arXiv ID, DOI, etc.)
python scripts/search_arxiv.py "your topic" --sort date --max 10curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:ID?fields=citationCount,influentialCitationCount"web_extract(urls=["https://arxiv.org/abs/ID"])web_extract(urls=["https://arxiv.org/pdf/ID"])curl -s "https://api.semanticscholar.org/graph/v1/paper/arXiv:ID/references?fields=title,citationCount&limit=20"curl -s "https://api.semanticscholar.org/graph/v1/author/search?query=NAME"| API | Rate | Auth | |-----|------|------| | arXiv | ~1 req / 3 seconds | None needed | | Semantic Scholar | 1 req / second | None (100/sec with API key) |
python3 -m json.tool for readabilityhep-th/0601001) vs new (2402.03300)https://arxiv.org/pdf/{id} — Abstract: https://arxiv.org/abs/{id}https://arxiv.org/html/{id}ocr-and-documents skillarxiv.org/abs/1706.03762 always resolves to the latest versionarxiv.org/abs/1706.03762v1 points to a specific immutable version<id> field returns the versioned URL (e.g., http://arxiv.org/abs/1706.03762v7)Papers can be withdrawn after submission. When this happens:
<summary> field contains a withdrawal notice (look for "withdrawn" or "retracted")testing
Orchestrating specialist AI agent teams as a meta-coordinator. Decomposes requests into minimum viable chains, spawns each as an independent session in AUTORUN modes, and drives to final output. Use when a task spans multiple specialist domains, requires parallel agent execution, or needs hub-and-spoke routing across the skill ecosystem.
tools
用于 Next.js App Router 模式开发,包含 RSC、Server Actions 和路由最佳实践。来源:skills.sh 10.2K installs。
tools
Deploy web projects to Netlify using the Netlify CLI (`npx netlify`). Use when the user asks to deploy, host, publish, or link a site/repo on Netlify, including preview and production deploys.
tools
Guides and best practices for working with Neon Serverless Postgres. Covers setup, connection methods, branching, autoscaling, scale-to-zero, read replicas, connection pooling, Neon Auth, and the Neon CLI, MCP server, REST API, TypeScript SDK, and Python SDK. Use when users ask about "Neon setup", "connect to Neon", "Neon project", "DATABASE_URL", "serverless Postgres", "Neon CLI", "neonctl", "Neon MCP", "Neon Auth", "@neondatabase/serverless", "@neondatabase/neon-js", "scale to zero", "Neon autoscaling", "Neon read replica", or "Neon connection pooling".