LiYu0524/paper-research

Name: paper-research
Author: LiYu0524

skills/paper-research/SKILL.md

npx skillsauth add LiYu0524/Auto-Reasearch-Skills paper-research

Paper Research

Overview

Run a fast, reproducible “survey → shortlist → synthesize” loop for research topics, backed by small scripts that fetch arXiv metadata/PDFs/BibTeX, extract text, and generate structured Markdown briefs.

Quick start (recommended workflow)

1. Create a topic workspace directory (keep everything together):
   - Example: `notes/implicit-reasoning-survey/`
2. Search arXiv and (optionally) download PDFs:
   - Run: `python3 scripts/arxiv_survey.py --terms "implicit reasoning" "hidden chain-of-thought" "multilingual reasoning" --max-results 100 --download-pdfs --pdf-dir ./pdfs --out ./arxiv.jsonl`
3. Extract text (+ rough sections) from PDFs:
   - Run: `python3 scripts/pdf_extract.py --pdf-dir ./pdfs --out-dir ./texts --sections`
4. Fetch BibTeX for the found arXiv IDs:
   - Run: `python3 scripts/arxiv_bibtex.py --from-jsonl ./arxiv.jsonl --out ./refs.bib`
5. Generate a structured research brief (table + clusters + TODO slots for notes):
   - Run: `python3 scripts/generate_report.py --jsonl ./arxiv.jsonl --out ./REPORT.md`

Then ask Codex to synthesize (taxonomy/benchmarks/experiments) using REPORT.md + your notes.
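
The quick-start commands can also be assembled programmatically, which helps when the same loop is repeated for several topics. The following is a minimal sketch, assuming the scripts sit under `scripts/` relative to the current directory and accept exactly the flags shown above. `build_pipeline` is a hypothetical helper, not part of the skill, and it only builds the command lines without running them.

```python
from pathlib import Path


def build_pipeline(workspace, terms, max_results=100):
    """Return the four quick-start commands as argv lists (nothing is executed).

    Hypothetical helper: only flags quoted in the quick start are used.
    """
    ws = Path(workspace)
    jsonl = str(ws / "arxiv.jsonl")
    pdfs = str(ws / "pdfs")
    return [
        ["python3", "scripts/arxiv_survey.py", "--terms", *terms,
         "--max-results", str(max_results), "--download-pdfs",
         "--pdf-dir", pdfs, "--out", jsonl],
        ["python3", "scripts/pdf_extract.py", "--pdf-dir", pdfs,
         "--out-dir", str(ws / "texts"), "--sections"],
        ["python3", "scripts/arxiv_bibtex.py", "--from-jsonl", jsonl,
         "--out", str(ws / "refs.bib")],
        ["python3", "scripts/generate_report.py", "--jsonl", jsonl,
         "--out", str(ws / "REPORT.md")],
    ]


if __name__ == "__main__":
    import shlex

    # Print the commands in copy-pasteable form.
    for argv in build_pipeline("notes/implicit-reasoning-survey",
                               ["implicit reasoning", "hidden chain-of-thought"]):
        print(shlex.join(argv))
```

Passing each list to `subprocess.run(argv, check=True)` would stop the loop at the first failing step.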

Workflow decision tree

A) “I need a lit review plan + paper outline”

Do this:

1. Use the scripts to produce REPORT.md (table + clusters) and refs.bib.
2. Build a survey plan as a set of falsifiable questions + “what evidence would change my mind”.
3. Output deliverables (in this order):
   - Lit review plan (subtopics → why → representative papers to read first)
   - Benchmarks/metrics (existing + proposed) aligned to the hypothesis
   - Validation experiments (including representation/probing/interventions)
   - Paper outline + expected contributions

When relevant, include “fastest path to reproduce” (datasets, eval harnesses, probing code).

B) “I need a reproducibility-first shortlist”

Prioritize:

- Open-source repos (training recipe, evaluation harness, probing code)
- Clear protocol (hyperparams, seeds, compute, preprocessing)
- Reusable artifacts (scripts, configs, checkpoints, datasets)

Do this:

1. Run arxiv_survey.py with stricter terms and fewer results (e.g., 30–80).
2. Ask Codex to rank papers in REPORT.md by reproducibility criteria:
   - Code availability, license clarity, dataset accessibility, protocol completeness
3. Produce:
   - Ranked shortlist with repo links (if available)
   - “Reusable parts” per paper (eval harness / probing / training recipe)
   - Minimal reproduction plan (timeboxed: 2h / 1d / 1w)
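
The four reproducibility criteria can be turned into an explicit rubric so that the ranking is auditable. This is a minimal sketch, assuming each criterion is scored by hand on a 0 to 2 scale and weighted equally. The `ReproScore` class, the scale, and the placeholder paper names are hypothetical and not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class ReproScore:
    """Hand-assigned scores for one paper (hypothetical 0-2 scale)."""
    title: str
    code: int = 0       # code availability
    license: int = 0    # license clarity
    data: int = 0       # dataset accessibility
    protocol: int = 0   # protocol completeness

    @property
    def total(self) -> int:
        # Equal weights are an assumption; adjust if one criterion matters more.
        return self.code + self.license + self.data + self.protocol


def rank(papers):
    """Sort by total score, highest first; ties keep their input order."""
    return sorted(papers, key=lambda p: p.total, reverse=True)


if __name__ == "__main__":
    shortlist = rank([
        ReproScore("Paper A (placeholder)", code=1, license=0, data=2, protocol=1),
        ReproScore("Paper B (placeholder)", code=2, license=2, data=2, protocol=1),
    ])
    for paper in shortlist:
        print(paper.total, paper.title)
```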

C) “I need an evaluation suite + detection experiments (multilingual latent reasoning)”

Use this structure:

1. Hypothesis → operational definition (what counts as “English latent reasoning”)
2. Tasks:
   - Multi-step reasoning across languages (same semantics, different surface forms)
   - Translation-free reasoning (language-neutral, symbol-heavy, or synthetic)
   - Controlled prompts enforcing target-language output
3. Metrics that separate reasoning vs fluency:
   - Task accuracy, step-consistency proxies, calibration, controllability, latency
4. Representation-level detection:
   - Layer-wise language ID / probing on activations
   - Activation patching/interventions (swap “language subspace” signals)
   - Forced-language and mixed-language ablations
5. Expected signatures + failure modes (confounds: translation, tokenization, data mixture)

Use assets/experiment_checklist.md as the backbone checklist.
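
To make the layer-wise language ID idea concrete, the sketch below fits a nearest-centroid probe on per-layer vectors and reports how often it recovers the language label. It is an illustration under loud assumptions: the vectors are synthetic stand-ins for model activations, nearest-centroid replaces a trained linear probe, and `toy_layer`, `probe_accuracy`, and the layer names are hypothetical.

```python
import random


def centroid(rows):
    """Mean vector of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def probe_accuracy(train, test):
    """train/test: lists of (vector, language) pairs taken from one layer."""
    by_lang = {}
    for vec, lang in train:
        by_lang.setdefault(lang, []).append(vec)
    cents = {lang: centroid(vecs) for lang, vecs in by_lang.items()}
    hits = sum(
        min(cents, key=lambda name: sq_dist(vec, cents[name])) == lang
        for vec, lang in test
    )
    return hits / len(test)


def toy_layer(shift, n=50, seed=0):
    """Synthetic 'activations': languages differ by `shift` on the first axis."""
    rng = random.Random(seed)
    data = []
    for lang, sign in (("en", 1), ("zh", -1)):
        for _ in range(n):
            data.append(([sign * shift + rng.gauss(0, 1), rng.gauss(0, 1)], lang))
    return data


if __name__ == "__main__":
    # A layer with no language signal versus one with a strong signal.
    for name, shift in (("layer_without_signal", 0.0), ("layer_with_signal", 3.0)):
        acc = probe_accuracy(toy_layer(shift, seed=1), toy_layer(shift, seed=2))
        print(name, round(acc, 2))
```

In a real experiment the probe would be trained and evaluated on held-out activations per layer, and accuracy near chance on a control task would be needed before reading anything into a high score.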

Templates (assets/)

Copy and fill these as working docs:

- assets/research_brief.md → one-topic brief (taxonomy + top papers + open questions)
- assets/paper_comparison_table.md → consistent per-paper extraction fields
- assets/experiment_checklist.md → step-by-step experimental checklist

Scripts

All scripts are pure-Python (stdlib) where possible. pdf_extract.py supports optional extractors; if none are available, it prints a clear install hint.

`scripts/arxiv_survey.py`

Search arXiv via the official Atom API, write results to JSONL, and optionally download PDFs.
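
For orientation, this is roughly what a stdlib-only search against the arXiv Atom API involves. It is a sketch, not the script's actual implementation: joining the terms with `OR` over all fields, the record field names other than `arxiv_id`, and the helper names are assumptions.

```python
import json
import urllib.parse
import xml.etree.ElementTree as ET

API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(terms, max_results=100):
    """OR together quoted phrases over all fields (an assumed query shape)."""
    query = " OR ".join(f'all:"{t}"' for t in terms)
    params = {"search_query": query, "start": 0, "max_results": max_results}
    return f"{API}?{urllib.parse.urlencode(params)}"


def parse_feed(xml_text):
    """Turn an Atom feed into one dict per <entry>."""
    root = ET.fromstring(xml_text)
    records = []
    for entry in root.findall("atom:entry", ATOM):
        abs_url = entry.findtext("atom:id", default="", namespaces=ATOM)
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        records.append({
            "arxiv_id": abs_url.rsplit("/abs/", 1)[-1],
            "title": " ".join(title.split()),  # collapse wrapped whitespace
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
        })
    return records


def to_jsonl(records):
    """One JSON object per line, the layout JSONL readers expect."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


if __name__ == "__main__":
    # Print the request URL only; fetching it (e.g. with urllib.request.urlopen)
    # is left out so the sketch runs offline.
    print(build_query_url(["implicit reasoning", "multilingual reasoning"], 5))
```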

`scripts/arxiv_bibtex.py`

Fetch BibTeX from arxiv.org for a list of arXiv IDs or a JSONL produced by arxiv_survey.py.
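
A sketch of the ID-collection and fetch steps, under stated assumptions: each JSONL record carries an `arxiv_id` field, and arxiv.org serves a BibTeX entry at `/bibtex/<id>`. The script itself may use a different endpoint, and the helper names and the delay value are hypothetical.

```python
import json
import time
import urllib.request


def ids_from_jsonl(lines):
    """Collect unique arXiv IDs from JSONL lines, preserving first-seen order."""
    ids = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        arxiv_id = json.loads(line).get("arxiv_id")
        if arxiv_id and arxiv_id not in ids:
            ids.append(arxiv_id)
    return ids


def bibtex_url(arxiv_id):
    # Assumed URL pattern; verify against arxiv.org before relying on it.
    return f"https://arxiv.org/bibtex/{arxiv_id}"


def fetch_bibtex(ids, delay=3.0):
    """Download one entry per ID, pausing between requests to stay polite."""
    entries = []
    for arxiv_id in ids:
        with urllib.request.urlopen(bibtex_url(arxiv_id), timeout=30) as resp:
            entries.append(resp.read().decode("utf-8").strip())
        time.sleep(delay)
    return "\n\n".join(entries) + "\n"
```

Typical use would be `fetch_bibtex(ids_from_jsonl(open("arxiv.jsonl")))`, with the result written to `refs.bib`.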

`scripts/pdf_extract.py`

Extract text from PDFs into .txt and optionally produce rough section splits (heuristics).
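
The “rough section splits” can be approximated with a heading regex over the extracted text. This is one possible heuristic, not necessarily the one the script uses: the heading vocabulary and the `split_sections` helper are assumptions.

```python
import re

# Optional numbering ("1", "2.", "3.1") followed by a common section name
# alone on its line. The vocabulary is an assumption; extend as needed.
HEADING = re.compile(
    r"^(?:\d+(?:\.\d+)*\.?\s+)?"
    r"(abstract|introduction|related work|background|methods?|"
    r"experiments?|results|discussion|conclusions?|references)\s*$",
    re.IGNORECASE,
)


def split_sections(text):
    """Return (section_name, body) pairs; text before any heading is 'preamble'."""
    sections = [("preamble", [])]
    for line in text.splitlines():
        match = HEADING.match(line.strip())
        if match:
            sections.append((match.group(1).lower(), []))
        else:
            sections[-1][1].append(line)
    return [(name, "\n".join(body).strip()) for name, body in sections]


if __name__ == "__main__":
    sample = "A Toy Paper\nAbstract\nWe study X.\n1 Introduction\nHello.\nReferences\n[1] ..."
    for name, body in split_sections(sample):
        print(name, "->", len(body), "chars")
```

A heuristic like this misfires on a prose line that happens to consist of one heading word, which is why the splits are best treated as rough.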

`scripts/dedupe_jsonl.py`

Dedupe a JSONL file by arxiv_id and near-duplicate titles (useful when iterating queries).
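
A minimal sketch of the two-stage dedupe, assuming records are dicts with `arxiv_id` and `title` keys. Treating versions such as `v1` and `v2` as the same paper, the `difflib` similarity measure, and the 0.95 threshold are assumptions rather than the script's documented behaviour.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase and collapse punctuation so cosmetic differences vanish."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def base_id(arxiv_id):
    """Drop a trailing version suffix such as 'v2' (assumed desirable)."""
    return re.sub(r"v\d+$", "", arxiv_id)


def dedupe(records, threshold=0.95):
    """Keep the first record for each ID and each near-identical title."""
    kept, seen_ids = [], set()
    for rec in records:
        rid = base_id(rec.get("arxiv_id", ""))
        title = normalize(rec.get("title", ""))
        if rid and rid in seen_ids:
            continue
        if title and any(
            SequenceMatcher(None, title, normalize(k.get("title", ""))).ratio() >= threshold
            for k in kept
        ):
            continue
        if rid:
            seen_ids.add(rid)
        kept.append(rec)
    return kept
```

The pairwise title comparison is quadratic in the number of kept records, which is fine for a few hundred results but would need blocking or hashing for much larger sets.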

`scripts/generate_report.py`

Generate a structured Markdown report (table + clusters + TODO note slots) from arxiv.jsonl.
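
To show the shape of such a report, the sketch below renders a table with TODO note slots and a simple cluster summary. The column names, the `# Research brief` heading, and clustering by which search term appears in the title are all assumptions; the actual script may cluster differently.

```python
def md_escape(text):
    """Keep pipes and newlines from breaking a Markdown table row."""
    return text.replace("|", "\\|").replace("\n", " ")


def cluster_by_terms(records, terms):
    """Assign each record to the first term found in its title, else 'other'."""
    clusters = {t: [] for t in terms}
    clusters["other"] = []
    for rec in records:
        title = rec.get("title", "").lower()
        hit = next((t for t in terms if t.lower() in title), "other")
        clusters[hit].append(rec)
    return clusters


def render_report(records, terms=()):
    """Return Markdown: a table with TODO note slots, then cluster counts."""
    lines = ["# Research brief", "",
             "| arXiv ID | Title | Notes |", "| --- | --- | --- |"]
    for rec in records:
        lines.append(
            f"| {rec.get('arxiv_id', '')} | {md_escape(rec.get('title', ''))} | TODO |"
        )
    lines += ["", "## Clusters", ""]
    for name, members in cluster_by_terms(records, terms).items():
        lines.append(f"- {name}: {len(members)} paper(s)")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [{"arxiv_id": "0000.00001", "title": "A toy paper on implicit reasoning"}]
    print(render_report(demo, terms=["implicit reasoning"]))
```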

References

Read when you need query patterns or a report schema:

- references/arxiv_query_guide.md
- references/report_fields.md

Output quality bar (what “good” looks like)

- Prefer explicit assumptions + failure modes over broad claims.
- Prefer checklists and protocols over vague “future work”.
- Always separate: (1) claim, (2) evidence, (3) test that could falsify it.

End-to-end paper research support for arXiv/literature surveys, reproducibility-focused paper shortlisting, and experiment design. Use when you need to (1) search arXiv with complex queries, (2) download PDFs, extract text/sections, and fetch BibTeX, (3) dedupe/cluster results into a structured report, and (4) turn findings into a lit-review plan, benchmark/evaluation suite, and representation/probing experiment checklist (e.g., implicit reasoning, hidden-CoT, multilingual reasoning, cross-lingual alignment).

9 stars

testing

Updated Apr 23, 2026

Install this skill globally with one command: `npx skillsauth add LiYu0524/Auto-Reasearch-Skills paper-research`. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report:

- Trivy: container and dependency vulnerability scanner. Clean, 95%
- Semgrep: static code analysis for vulnerabilities. Clean, 95%
- mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%
- Snyk (dep): open source security scanning. Skipped, 50%
- Socket.dev: supply chain security analysis. Skipped, 50%
- VirusTotal: multi-engine malware detection. Skipped, 50%
- CrowdStrike: advanced threat intelligence. Skipped, 50%
- OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%
- OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 24, 2026, 2:53 AM (20.0s, 14 files scanned)

Install manually

# Clone the repo
git clone https://github.com/LiYu0524/Auto-Reasearch-Skills.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r Auto-Reasearch-Skills/skills/paper-research ~/.claude/skills/

Repository

LiYu0524/Auto-Reasearch-Skills

9 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT