skills/research-lit/SKILL.md
Search and analyze research papers, find related work, summarize key ideas. Use when user says "find papers", "related work", "literature review", "what does this paper say", or needs to understand academic papers.
npx skillsauth add shaun-z/auto-claude-code-research-in-sleep research-lit
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Research topic: $ARGUMENTS
- codex: default reviewer, Codex MCP (xhigh). Override with `— reviewer: oracle-pro` for GPT-5.4 Pro via Oracle MCP. See shared-references/reviewer-routing.md.
- PAPER_LIBRARY paths:
  - papers/ in the current project directory
  - literature/ in the current project directory
  - a custom path listed in CLAUDE.md under ## Paper Library
- ARXIV_DOWNLOAD: when true, download the top 3-5 most relevant arXiv PDFs to PAPER_LIBRARY after search. When false (default), only fetch metadata (title, abstract, authors) via the arXiv API; no files are downloaded.
- … ARXIV_DOWNLOAD = true.

💡 Overrides:
- `/research-lit "topic" — paper library: ~/my_papers/`: custom local PDF path
- `/research-lit "topic" — sources: zotero, local`: only search Zotero + local PDFs
- `/research-lit "topic" — sources: zotero`: only search Zotero
- `/research-lit "topic" — sources: web`: only search the web (skip all local)
- `/research-lit "topic" — sources: web, semantic-scholar`: also search Semantic Scholar for published venue papers (IEEE, ACM, etc.)
- `/research-lit "topic" — sources: deepxiv`: only search via DeepXiv progressive retrieval
- `/research-lit "topic" — sources: all, deepxiv`: use default sources plus DeepXiv
- `/research-lit "topic" — arxiv download: true`: download top relevant arXiv PDFs
- `/research-lit "topic" — arxiv download: true, max download: 10`: download up to 10 PDFs
This skill checks multiple sources in priority order. All are optional — if a source is not configured or not requested, skip it silently.
Parse $ARGUMENTS for a — sources: directive:
- If `— sources:` is specified: search only the listed sources (comma-separated). Valid values: zotero, obsidian, local, web, semantic-scholar, deepxiv, exa, all.
- If it is not specified: default to all, which searches every available source in priority order (semantic-scholar, deepxiv, and exa are excluded from all; they must be explicitly listed).

Examples:
/research-lit "diffusion models" → all (default, no S2)
/research-lit "diffusion models" — sources: all → all (default, no S2)
/research-lit "diffusion models" — sources: zotero → Zotero only
/research-lit "diffusion models" — sources: zotero, web → Zotero + web
/research-lit "diffusion models" — sources: local → local PDFs only
/research-lit "topic" — sources: obsidian, local, web → skip Zotero
/research-lit "topic" — sources: web, semantic-scholar → web + S2 API (IEEE/ACM venue papers)
/research-lit "topic" — sources: deepxiv → DeepXiv only
/research-lit "topic" — sources: all, deepxiv → default sources + DeepXiv
/research-lit "topic" — sources: all, semantic-scholar → all + S2 API
/research-lit "topic" — sources: exa → Exa only (broad web + content extraction)
/research-lit "topic" — sources: all, exa → default sources + Exa web search
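
The resolution rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the skill's actual parser: the function name `parse_sources` and the choice to ignore unknown tokens are assumptions, while the source IDs, the priority order, and the opt-in behavior of semantic-scholar, deepxiv, and exa come from the text.

```python
import re

# Sources that "all" expands to, in priority order.
DEFAULT_SOURCES = ["zotero", "obsidian", "local", "web"]
# Opt-in sources: never included in "all", must be listed explicitly.
OPT_IN_SOURCES = ["semantic-scholar", "deepxiv", "exa"]
PRIORITY = DEFAULT_SOURCES + OPT_IN_SOURCES


def parse_sources(arguments: str) -> list[str]:
    """Return the requested source IDs in priority order (hypothetical helper)."""
    # The directive is introduced by an em dash, as in the examples above.
    match = re.search(r"—\s*sources:\s*([^—]*)", arguments)
    if match is None:
        return list(DEFAULT_SOURCES)  # no directive: behave like "all"

    chosen = set()
    for token in match.group(1).split(","):
        token = token.strip().lower()
        if token == "all":
            chosen.update(DEFAULT_SOURCES)
        elif token in PRIORITY:
            chosen.add(token)
        # Assumption: unknown tokens are silently ignored.
    return [source for source in PRIORITY if source in chosen]
```

For example, `parse_sources('"topic" — sources: all, exa')` yields the four default sources followed by `exa`, mirroring the last example line.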
| Priority | Source | ID | How to detect | What it provides |
|----------|--------|----|---------------|-----------------|
| 1 | Zotero (via MCP) | zotero | Try calling any mcp__zotero__* tool — if unavailable, skip | Collections, tags, annotations, PDF highlights, BibTeX, semantic search |
| 2 | Obsidian (via MCP) | obsidian | Try calling any mcp__obsidian-vault__* tool — if unavailable, skip | Research notes, paper summaries, tagged references, wikilinks |
| 3 | Local PDFs | local | Glob: papers/**/*.pdf, literature/**/*.pdf | Raw PDF content (first 3 pages) |
| 4 | Web search | web | Always available (WebSearch) | arXiv, Semantic Scholar, Google Scholar |
| 5 | Semantic Scholar API | semantic-scholar | tools/semantic_scholar_fetch.py exists | Published venue papers (IEEE, ACM, Springer) with structured metadata: citation counts, venue info, TLDR. Only runs when explicitly requested via — sources: semantic-scholar or — sources: web, semantic-scholar |
| 6 | DeepXiv CLI | deepxiv | tools/deepxiv_fetch.py and installed deepxiv CLI | Progressive paper retrieval: search, brief, head, section, trending, web search. Only runs when explicitly requested via — sources: deepxiv or — sources: all, deepxiv |
| 7 | Exa Search | exa | tools/exa_search.py and installed exa-py SDK | AI-powered broad web search with content extraction (highlights, text, summaries). Covers blogs, docs, news, companies, and research papers beyond arXiv/S2. Only runs when explicitly requested via — sources: exa or — sources: all, exa |
Graceful degradation: If no MCP servers are configured, the skill works exactly as before (local PDFs + web search). Zotero and Obsidian are pure additions.
Skip this step entirely if Zotero MCP is not configured.
Try calling a Zotero MCP tool (e.g., search). If it succeeds:
… /paper-write later)

📚 Zotero annotations are gold: they show what the user personally highlighted as important, which is far more valuable than generic summaries.
Skip this step entirely if Obsidian MCP is not configured.
Try calling an Obsidian MCP tool (e.g., search). If it succeeds:
… #diffusion-models, #paper-review)

📝 Obsidian notes represent the user's processed understanding, which is more valuable than raw paper content for understanding their perspective.
Before searching online, check if the user already has relevant papers locally:
1. Locate library: Check PAPER_LIBRARY paths for PDF files.
   Glob: papers/**/*.pdf, literature/**/*.pdf
2. De-duplicate against Zotero: If Step 0a found papers, skip any local PDFs already covered by Zotero results (match by filename or title).
3. Filter by relevance: Match filenames and first-page content against the research topic. Skip clearly unrelated papers.
4. Summarize relevant papers: For each relevant local PDF (up to MAX_LOCAL_PAPERS):
5. Build local knowledge base: Compile summaries into a "papers you already have" section. This becomes the starting point, and external search fills the gaps.
📚 If no local papers are found, skip to Step 1. If the user has a comprehensive local collection, the external search can be more targeted (focus on what's missing).
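
A minimal sketch of the locate-and-filter part of this step, assuming relevance is judged by filename only. Matching against first-page content, as the step describes, would need a PDF reader and is left out here. The function name `find_relevant_pdfs`, the token-overlap scoring, and the `max_papers` parameter (standing in for MAX_LOCAL_PAPERS) are illustrative assumptions.

```python
import re
from pathlib import Path


def _tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, so 'diffusion_models' matches 'Diffusion Models'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_relevant_pdfs(project_root, topic, max_papers,
                       library_dirs=("papers", "literature")):
    """Glob the library directories and keep PDFs whose filename overlaps the topic."""
    topic_tokens = _tokens(topic)
    scored = []
    for directory in library_dirs:
        base = Path(project_root) / directory
        if not base.is_dir():
            continue  # a missing library directory is skipped silently
        for pdf in base.glob("**/*.pdf"):
            overlap = len(topic_tokens & _tokens(pdf.stem))
            if overlap > 0:
                scored.append((overlap, pdf))
    # Highest overlap first; ties broken by path for a stable order.
    scored.sort(key=lambda item: (-item[0], str(item[1])))
    return [pdf for _, pdf in scored[:max_papers]]
```

Filename overlap is a crude filter; it is only meant to show where the "skip clearly unrelated papers" decision sits before any PDF content is read.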
arXiv API search (always runs, no download by default):
Locate the fetch script and search arXiv directly:
```bash
# Try to find arxiv_fetch.py
SCRIPT=$(find tools/ -name "arxiv_fetch.py" 2>/dev/null | head -1)
# If not found, check ARIS install
[ -z "$SCRIPT" ] && SCRIPT=$(find ~/.claude/skills/arxiv/ -name "arxiv_fetch.py" 2>/dev/null | head -1)
# Search arXiv API for structured results (title, abstract, authors, categories)
python3 "$SCRIPT" search "QUERY" --max 10
```
If arxiv_fetch.py is not found, fall back to WebSearch for arXiv (same as before).
The arXiv API returns structured metadata (title, abstract, full author list, categories, dates) — richer than WebSearch snippets. Merge these results with WebSearch findings and de-duplicate.
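
One way the merge-and-de-duplicate step could look, as a sketch under stated assumptions: each result is a dict with hypothetical `title` and `url` keys, arXiv API results are listed first so their richer metadata wins, and only new-style arXiv IDs (such as `NNNN.NNNNN`, with any version suffix ignored) are recognized.

```python
import re

# Matches new-style arXiv IDs in abs/ or pdf/ URLs; a trailing "v2" is not captured.
ARXIV_ID = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})", re.IGNORECASE)


def result_keys(result: dict) -> set[str]:
    """Identity keys for one result: its arXiv ID and its normalized title, when present."""
    keys = set()
    match = ARXIV_ID.search(result.get("url", ""))
    if match:
        keys.add("arxiv:" + match.group(1))
    title = re.sub(r"[^a-z0-9]+", " ", result.get("title", "").lower()).strip()
    if title:
        keys.add("title:" + title)
    return keys


def merge_results(api_results: list[dict], web_results: list[dict]) -> list[dict]:
    """Keep the first occurrence of each paper; API results take precedence."""
    seen = set()
    merged = []
    for result in [*api_results, *web_results]:
        keys = result_keys(result)
        if keys & seen:
            continue  # same arXiv ID or same normalized title already kept
        seen |= keys
        merged.append(result)
    return merged
```

Matching on either key catches a WebSearch hit that points at a non-arXiv mirror of a paper the API already returned.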
Semantic Scholar API search (only when semantic-scholar is in sources):
When the user explicitly requests — sources: semantic-scholar (or — sources: web, semantic-scholar), search for published venue papers beyond arXiv:
```bash
S2_SCRIPT=$(find tools/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
[ -z "$S2_SCRIPT" ] && S2_SCRIPT=$(find ~/.claude/skills/semantic-scholar/ -name "semantic_scholar_fetch.py" 2>/dev/null | head -1)
# Search for published CS/Engineering papers with quality filters
python3 "$S2_SCRIPT" search "QUERY" --max 10 \
  --fields-of-study "Computer Science,Engineering" \
  --publication-types "JournalArticle,Conference"
```
If semantic_scholar_fetch.py is not found, skip silently.
Why use Semantic Scholar? Many IEEE/ACM journal papers are NOT on arXiv. S2 fills the gap for published venue-only papers with citation counts and venue metadata.
De-duplication between arXiv and S2: Match by arXiv ID (S2 returns externalIds.ArXiv):
- For a paper found in both sources, check S2's venue/publicationVenue: if it has been published in a journal/conference (e.g. IEEE TWC, JSAC), use S2's metadata (venue, citationCount, DOI) as the authoritative version, since the published version supersedes the preprint. Keep the arXiv PDF link for download.
- S2 results without externalIds.ArXiv are venue-only papers not on arXiv; these are the unique value of this source.

DeepXiv search (only when deepxiv is in sources):
When the user explicitly requests — sources: deepxiv (or includes deepxiv in a combined source list), use the DeepXiv adapter for progressive retrieval:
```bash
python3 tools/deepxiv_fetch.py search "QUERY" --max 10
```
Then deepen only for the most relevant papers:
```bash
python3 tools/deepxiv_fetch.py paper-brief ARXIV_ID
python3 tools/deepxiv_fetch.py paper-head ARXIV_ID
python3 tools/deepxiv_fetch.py paper-section ARXIV_ID "Experiments"
```
If tools/deepxiv_fetch.py or the deepxiv CLI is unavailable, skip this source gracefully and continue with the remaining requested sources.
Why use DeepXiv? It is useful when a broad search should be followed by staged reading rather than immediate full-paper loading. This reduces unnecessary context while still surfacing structure, TLDRs, and the most relevant sections.
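
The staged reading described above could be driven by a small helper that builds the follow-up commands only for the top-ranked papers. This is a hedged sketch: the function name `staged_read_commands` and the `top_k` cutoff are assumptions, and the helper only constructs argument lists using the subcommands shown above; it does not run them.

```python
def staged_read_commands(ranked_ids, top_k=3, section="Experiments",
                         script="tools/deepxiv_fetch.py"):
    """Build brief -> head -> section commands for the top_k ranked arXiv IDs."""
    commands = []
    for arxiv_id in ranked_ids[:top_k]:
        # Cheapest view first, so reading can stop early for a weak match.
        commands.append(["python3", script, "paper-brief", arxiv_id])
        commands.append(["python3", script, "paper-head", arxiv_id])
        commands.append(["python3", script, "paper-section", arxiv_id, section])
    return commands
```

Each returned list can be passed to `subprocess.run` as-is; keeping the commands as data makes it easy to stop after the brief when a paper turns out to be off-topic.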
De-duplication against arXiv and S2:
… deepxiv as an additional source

Exa search (only when exa is in sources):
When the user explicitly requests — sources: exa (or includes exa in a combined source list), use the Exa tool for broad AI-powered web search with content extraction:
```bash
EXA_SCRIPT=$(find tools/ -name "exa_search.py" 2>/dev/null | head -1)
# Search for research papers with highlights
python3 "$EXA_SCRIPT" search "QUERY" --max 10 --category "research paper" --content highlights
# Search for broader web content (blogs, docs, news)
python3 "$EXA_SCRIPT" search "QUERY" --max 10 --content highlights
```
If tools/exa_search.py or the exa-py SDK is unavailable, skip this source gracefully and continue with the remaining requested sources.
Why use Exa? Exa provides AI-powered search across the broader web (blogs, documentation, news, company pages) with built-in content extraction. It fills a gap between academic databases (arXiv, S2) and generic WebSearch by returning richer content with each result.
De-duplication against arXiv, S2, and DeepXiv:
Optional PDF download (only when ARXIV_DOWNLOAD = true):
After all sources are searched and papers are ranked by relevance:
```bash
# Download top N most relevant arXiv papers
python3 "$SCRIPT" download ARXIV_ID --dir papers/
```
For each relevant paper (from all sources), extract:
Present as a structured literature table:
| Paper | Venue | Method | Key Result | Relevance to Us | Source |
|-------|-------|--------|------------|-----------------|--------|
Plus a narrative summary of the landscape (3-5 paragraphs).
If Zotero BibTeX was exported, include a references.bib snippet for direct use in paper writing.
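
As an illustration of the table format above, the following sketch renders extracted papers as markdown rows. The column names come from the table header; the function name `render_literature_table` and the dict-per-paper input shape are assumptions made for the example.

```python
COLUMNS = ["Paper", "Venue", "Method", "Key Result", "Relevance to Us", "Source"]


def render_literature_table(rows: list[dict]) -> str:
    """Render one markdown table row per paper; missing fields become empty cells."""

    def cell(value) -> str:
        # Escape pipes and flatten newlines so a value cannot break the table.
        return str(value).replace("|", "\\|").replace("\n", " ")

    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "|" + "|".join("-" * (len(name) + 2) for name in COLUMNS) + "|",
    ]
    for row in rows:
        lines.append("| " + " | ".join(cell(row.get(name, "")) for name in COLUMNS) + " |")
    return "\n".join(lines)
```

Leaving a cell empty when a field was not extracted keeps the gap visible instead of filling it with a guess.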
… literature/ or papers/

Required when research-wiki/ exists. Skip entirely (no action, no error) if the directory is absent. Per shared-references/integration-contract.md, this step follows the canonical ingest contract: business logic lives in tools/research_wiki.py, not in this prose.
📋 Research Wiki ingest (runs once, at end of research-lit):
```text
[ ] 1. Predicate: `research-wiki/` exists? If no, skip this step.
[ ] 2. For each of the top 8–12 relevant papers (arxiv IDs collected above):
         python3 tools/research_wiki.py ingest_paper research-wiki/ \
           --arxiv-id <id> [--thesis "<one-line>"] [--tags <t1>,<t2>]
[ ] 3. For each explicit relationship to an existing wiki entity,
       add an edge:
         python3 tools/research_wiki.py add_edge research-wiki/ \
           --from "paper:<slug>" --to "<target_node_id>" \
           --type <extends|contradicts|addresses_gap|inspired_by|...> \
           --evidence "<one-sentence quote or reasoning>"
[ ] 4. Confirm papers/<slug>.md files were created (helper prints
       "Paper ingested: ..."); if any failed with a network error,
       retry or fall back to the --title/--authors/--year manual form.
```
ingest_paper handles slug generation, arXiv metadata fetch, dedup
(skips an existing paper by arXiv id), page rendering, index.md
rebuild, query_pack.md rebuild, and log append in a single call —
do not manually write papers/<slug>.md. If the helper is
unavailable (e.g., offline on a non-ARIS machine), log the gap and let
/research-wiki sync --arxiv-ids … backfill later.
For non-arXiv sources (Semantic Scholar only, IEEE/ACM journals without arXiv mirrors, blog posts), pass manual metadata instead:
```bash
python3 tools/research_wiki.py ingest_paper research-wiki/ \
  --title "<full title>" --authors "A, B, C" --year <yyyy> \
  --venue "<venue>" [--external-id-doi "<doi>"] [--thesis "..."]
```
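
The choice between the arXiv form and the manual-metadata form can be sketched as a small command builder. This is an assumption-laden illustration, not part of tools/research_wiki.py: the helper names and the dict keys (`arxiv_id`, `title`, `authors`, `year`, `venue`, `doi`, `thesis`, `tags`) are hypothetical, and only flags shown in the commands above are emitted.

```python
from pathlib import Path


def wiki_exists(project_root=".") -> bool:
    """Predicate from the checklist: ingest only when research-wiki/ is present."""
    return (Path(project_root) / "research-wiki").is_dir()


def ingest_command(paper: dict, wiki_dir="research-wiki/",
                   script="tools/research_wiki.py") -> list[str]:
    """Build the ingest_paper argument list for one paper (hypothetical dict keys)."""
    command = ["python3", script, "ingest_paper", wiki_dir]
    if paper.get("arxiv_id"):
        # arXiv form: the helper fetches the metadata itself.
        command += ["--arxiv-id", paper["arxiv_id"]]
        if paper.get("tags"):
            command += ["--tags", ",".join(paper["tags"])]
    else:
        # Manual form for venue-only papers, blog posts, and similar sources.
        command += ["--title", paper["title"],
                    "--authors", ", ".join(paper["authors"]),
                    "--year", str(paper["year"])]
        if paper.get("venue"):
            command += ["--venue", paper["venue"]]
        if paper.get("doi"):
            command += ["--external-id-doi", paper["doi"]]
    if paper.get("thesis"):
        command += ["--thesis", paper["thesis"]]
    return command
```

A caller would check `wiki_exists()` once, then pass each returned list to `subprocess.run`, leaving slug generation and de-duplication to the helper script as the text requires.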
… mcp__zotero__search or mcp__zotero-mcp__search_items). Try the most common patterns and adapt.