skills/arxiv/SKILL.md
Search, download, and summarize academic papers from arXiv. Use when user says "search arxiv", "download paper", "fetch arxiv", "arxiv search", "get paper pdf", or wants to find and save papers from arXiv to the local paper library.
npx skillsauth add wanshuiyin/Auto-claude-code-research-in-sleep arxivInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Search topic or arXiv paper ID: $ARGUMENTS
papers/ in the current project directory.arxiv_fetch.py, resolved per
shared-references/integration-contract.md §2
(Policy D1 — primary + fallback cascade). If unresolved (canonical
chain exhausted), fall back to the inline Python alternative
documented in Step 2.Overrides (append to arguments):
/arxiv "attention mechanism" - max: 20- return up to 20 results/arxiv "2301.07041" - download- download a specific paper by ID/arxiv "query" - dir: literature/- save PDFs to a custom directory/arxiv "query" - download: all- download all result PDFs
Parse $ARGUMENTS for directives:
2301.07041 or cs/0601001- max: N: override MAX_RESULTS (e.g., - max: 20)- dir: PATH: override PAPER_DIR (e.g., - dir: literature/)- download: download the first result's PDF after listing- download: all: download PDFs for all resultsIf the argument matches an arXiv ID pattern (YYMM.NNNNN or category/NNNNNNN), skip the search and go directly to Step 3.
Resolve $ARXIV_FETCHER via the canonical strict-safe chain (see
shared-references/integration-contract.md §2):
cd "$(git rev-parse --show-toplevel 2>/dev/null || pwd)" || exit 1
if [ -z "${ARIS_REPO:-}" ] && [ -f .aris/installed-skills.txt ]; then
ARIS_REPO=$(awk -F'\t' '$1=="repo_root"{print $2; exit}' .aris/installed-skills.txt 2>/dev/null) || true
fi
ARXIV_FETCHER=".aris/tools/arxiv_fetch.py"
[ -f "$ARXIV_FETCHER" ] || ARXIV_FETCHER="tools/arxiv_fetch.py"
[ -f "$ARXIV_FETCHER" ] || { [ -n "${ARIS_REPO:-}" ] && ARXIV_FETCHER="$ARIS_REPO/tools/arxiv_fetch.py"; }
[ -f "$ARXIV_FETCHER" ] || ARXIV_FETCHER=""
If $ARXIV_FETCHER is non-empty, run:
python3 "$ARXIV_FETCHER" search "QUERY" --max MAX_RESULTS
If $ARXIV_FETCHER is empty (Policy D1 cascade), fall back to inline Python:
python3 - <<'PYEOF'
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
NS = "http://www.w3.org/2005/Atom"
query = urllib.parse.quote("QUERY")
url = (f"http://export.arxiv.org/api/query"
f"?search_query={query}&start=0&max_results=MAX_RESULTS"
f"&sortBy=relevance&sortOrder=descending")
with urllib.request.urlopen(url, timeout=30) as r:
root = ET.fromstring(r.read())
papers = []
for entry in root.findall(f"{{{NS}}}entry"):
aid = entry.findtext(f"{{{NS}}}id", "").split("/abs/")[-1].split("v")[0]
title = (entry.findtext(f"{{{NS}}}title", "") or "").strip().replace("\n", " ")
abstract = (entry.findtext(f"{{{NS}}}summary", "") or "").strip().replace("\n", " ")
authors = [a.findtext(f"{{{NS}}}name", "") for a in entry.findall(f"{{{NS}}}author")]
published = entry.findtext(f"{{{NS}}}published", "")[:10]
cats = [c.get("term", "") for c in entry.findall(f"{{{NS}}}category")]
papers.append({
"id": aid,
"title": title,
"authors": authors,
"abstract": abstract,
"published": published,
"categories": cats,
"pdf_url": f"https://arxiv.org/pdf/{aid}.pdf",
"abs_url": f"https://arxiv.org/abs/{aid}",
})
print(json.dumps(papers, ensure_ascii=False, indent=2))
PYEOF
Present results as a table:
| # | arXiv ID | Title | Authors | Date | Category |
|---|------------|---------------------|----------------|------------|----------|
| 1 | 2301.07041 | Attention Is All... | Vaswani et al. | 2017-06-12 | cs.LG |
When a single paper ID is requested (either directly or from Step 2):
python3 "$ARXIV_FETCHER" search "id:ARXIV_ID" --max 1
# or fallback:
python3 -c "
import urllib.request, xml.etree.ElementTree as ET
NS = 'http://www.w3.org/2005/Atom'
url = 'http://export.arxiv.org/api/query?id_list=ARXIV_ID'
with urllib.request.urlopen(url, timeout=30) as r:
root = ET.fromstring(r.read())
# print full details ...
"
Display: title, all authors, categories, full abstract, published date, PDF URL, abstract URL.
When download is requested, for each paper ID to download:
# Using fetch script:
python3 "$ARXIV_FETCHER" download ARXIV_ID --dir PAPER_DIR
# Fallback:
mkdir -p PAPER_DIR && python3 -c "
import pathlib
import sys
import urllib.request
out = pathlib.Path('PAPER_DIR/ARXIV_ID.pdf')
if out.exists():
print(f'Already exists: {out}')
sys.exit(0)
req = urllib.request.Request(
'https://arxiv.org/pdf/ARXIV_ID.pdf',
headers={'User-Agent': 'arxiv-skill/1.0'},
)
with urllib.request.urlopen(req, timeout=60) as r:
out.write_bytes(r.read())
print(f'Downloaded: {out} ({out.stat().st_size // 1024} KB)')
"
After each download:
Downloaded: papers/2301.07041.pdf (842 KB)For each paper (downloaded or fetched by API):
## [Title]
- **arXiv**: [ID] - [abs_url]
- **Authors**: [full author list]
- **Date**: [published]
- **Categories**: [cs.LG, cs.AI, ...]
- **Abstract**: [full abstract]
- **Key contributions** (extracted from abstract):
- [contribution 1]
- [contribution 2]
- [contribution 3]
- **Local PDF**: papers/[ID].pdf (if downloaded)
Required when research-wiki/ exists in the project; skip silently
otherwise. When the wiki dir exists, resolve $WIKI_SCRIPT per the
canonical chain at
shared-references/wiki-helper-resolution.md
(Variant B — warn-and-skip), then ingest every paper returned by this
invocation:
if [ -d research-wiki/ ]; then
cd "$(git rev-parse --show-toplevel 2>/dev/null || pwd)" || exit 1
ARIS_REPO="${ARIS_REPO:-$(awk -F'\t' '$1=="repo_root"{print $2; exit}' .aris/installed-skills.txt 2>/dev/null)}"
WIKI_SCRIPT=".aris/tools/research_wiki.py"
[ -f "$WIKI_SCRIPT" ] || WIKI_SCRIPT="tools/research_wiki.py"
[ -f "$WIKI_SCRIPT" ] || { [ -n "${ARIS_REPO:-}" ] && WIKI_SCRIPT="$ARIS_REPO/tools/research_wiki.py"; }
[ -f "$WIKI_SCRIPT" ] || {
echo "WARN: research_wiki.py not found; arxiv results delivered, wiki ingest skipped. Fix: bash tools/install_aris.sh, export ARIS_REPO, or cp <ARIS-repo>/tools/research_wiki.py tools/." >&2
WIKI_SCRIPT=""
}
if [ -n "$WIKI_SCRIPT" ]; then
for each arxiv_id in results:
python3 "$WIKI_SCRIPT" ingest_paper research-wiki/ \
--arxiv-id "<arxiv_id>"
fi
fi
The helper handles metadata fetch, slug, dedup, page creation, index
rebuild, and log append in a single call — do not handwrite
papers/<slug>.md. See
shared-references/integration-contract.md
for the canonical-helper rule. Missed ingests can be backfilled later
with python3 "$WIKI_SCRIPT" sync research-wiki/ --arxiv-ids <id1>,<id2>,...
after resolving $WIKI_SCRIPT as above.
Summarize what was done:
Found N papers for "query"Downloaded: papers/2301.07041.pdf (842 KB) (for each download)Wiki-ingested N papers (if research-wiki/ was present)Suggest follow-up skills:
/research-lit "topic" - multi-source review: Zotero + Obsidian + local PDFs + web
/novelty-check "idea" - verify your idea is novel against these papers
2301.07041) and old (cs/0601001)/research-lit with - sources: web as a fallbackdata-ai
Generate and rank research ideas given a broad direction. Use when user says "找idea", "brainstorm ideas", "generate research ideas", "what can we work on", or wants to explore a research area for publishable directions.
development
Get a deep critical review of research from GPT using a secondary Codex agent. Use when user says "review my research", "help me review", "get external review", or wants critical feedback on research ideas, papers, or experimental results.
data-ai
Generate and rank research ideas given a broad direction. Use when user says "找idea", "brainstorm ideas", "generate research ideas", "what can we work on", or wants to explore a research area for publishable directions.
development
Autonomous multi-round research review loop. Repeatedly reviews using a secondary Codex agent, implements fixes, and re-reviews until positive assessment or max rounds reached. Use when user says "auto review loop", "review until it passes", or wants autonomous iterative improvement.