<!--
This file is the original documentation of an open-source Skill, included for study and research reference only.
Collected and organized by CoPaper.AI | https://copaper.ai
Source repository: https://github.com/kthorn/research-superpower
Project name: research-superpower
License: MIT License
Date collected: 2026-04-02
Notice: Copyright of this file belongs to the original author. It is included here to give empirical social science researchers a central reference for AI Agent Skills. If this infringes your rights, please contact us for removal.
-->
---
name: Finding Open Access Papers
description: Use
Use Unpaywall to find legally available open access versions of papers that appear to be behind paywalls.
Core principle: Many paywalled papers have free versions (preprints, author manuscripts, institutional repositories). Unpaywall finds them.
Use this skill when:
Use BEFORE giving up on full text access
Simple REST API - no authentication required for reasonable usage
curl "https://api.unpaywall.org/v2/DOI?email=YOUR_EMAIL"
Parameters:
- DOI - The paper's DOI (URL-encoded if needed)
- email - User's email (required, for courtesy/contact)
IMPORTANT: Ask the user for their email at the start of the research session. Do NOT use a placeholder or made-up email address.
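The two parameters above can be combined into a request URL along these lines. This is a minimal sketch: `build_unpaywall_url` is a hypothetical helper name, and leaving `/` unencoded in the DOI is an assumption about what the endpoint accepts, based on the plain `10.xxxx/suffix` paths used elsewhere in this document.

```python
from urllib.parse import quote, urlencode

UNPAYWALL_BASE = "https://api.unpaywall.org/v2"


def build_unpaywall_url(doi: str, email: str) -> str:
    """Return the Unpaywall lookup URL for a DOI (hypothetical helper)."""
    if not email or "@" not in email:
        # The email parameter is required; refuse obviously empty values early.
        raise ValueError("A real contact email is required")
    # Keep "/" literal (assumption: the API accepts it in the path);
    # percent-encode spaces, "<", ">", parentheses and other unsafe characters.
    encoded_doi = quote(doi.strip(), safe="/")
    return f"{UNPAYWALL_BASE}/{encoded_doi}?{urlencode({'email': email})}"
```

Passing the DOI through `quote` matters mostly for older DOIs that contain characters such as `<`, `>`, or spaces; a DOI like 10.1038/nature12373 passes through unchanged.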
Example:
curl "https://api.unpaywall.org/v2/10.1038/nature12373?email=YOUR_EMAIL"
{
"doi": "10.1038/nature12373",
"title": "Paper Title",
"is_oa": true,
"best_oa_location": {
"url": "https://europepmc.org/articles/pmc3858213",
"url_for_pdf": "https://europepmc.org/articles/pmc3858213?pdf=render",
"version": "publishedVersion",
"license": "cc-by",
"host_type": "repository"
},
"oa_locations": [
{
"url": "https://europepmc.org/articles/pmc3858213",
"version": "publishedVersion"
},
{
"url": "https://arxiv.org/abs/1234.5678",
"version": "submittedVersion"
}
]
}
is_oa (boolean)
- true - Open access version available
- false - No free version found
best_oa_location (object or null)
oa_locations (array)
version types:
- publishedVersion - Final published version (best)
- acceptedVersion - Author's accepted manuscript (good)
- submittedVersion - Preprint before peer review (useful)
# Try DOI first
curl -L "https://doi.org/10.1234/example.2023"
# If paywall detected (403, subscription required, etc):
curl "https://api.unpaywall.org/v2/10.1234/example.2023?email=YOUR_EMAIL"
# Parse JSON response
response=$(curl -s "https://api.unpaywall.org/v2/DOI?email=EMAIL")
# Check if OA available (quote the variable so the JSON is passed to jq intact)
is_oa=$(echo "$response" | jq -r '.is_oa')
if [ "$is_oa" = "true" ]; then
  # Get best PDF URL, falling back to the landing page URL
  pdf_url=$(echo "$response" | jq -r '.best_oa_location.url_for_pdf // .best_oa_location.url')
  # Download (create the target directory first)
  mkdir -p papers
  curl -L -o "papers/paper.pdf" "$pdf_url"
fi
When OA found:
⚠️ Paper behind paywall at publisher
✓ Found open access version via Unpaywall!
Source: Europe PMC (published version)
PDF: https://europepmc.org/articles/pmc3858213?pdf=render
→ Downloading...
When no OA found:
⚠️ Paper behind paywall at publisher
✗ No open access version found via Unpaywall
Options:
- Request via institutional access
- Contact authors for preprint
- Continue with abstract only
If multiple locations available:
Priority order:
1. publishedVersion from publisher or PMC
2. acceptedVersion from institutional repository
3. submittedVersion from preprint server (arXiv, bioRxiv)
Add to full text fetching workflow:
Stage 2: Fetch Full Text
Try in order:
A. PubMed Central (free full text)
B. DOI resolution → If paywall, try Unpaywall
C. Unpaywall direct lookup
D. Preprints (bioRxiv, arXiv)
Updated workflow:
# 1. Try PMC
pmc_result=$(curl "https://eutils.ncbi.nlm.nih.gov/...")
if has_pmc_fulltext; then
  fetch_pmc
  exit 0
fi
# 2. Try DOI
doi_result=$(curl -L "https://doi.org/$doi")
if is_paywall; then
  # 3. Try Unpaywall
  unpaywall_result=$(curl "https://api.unpaywall.org/v2/$doi?email=$EMAIL")
  if has_oa; then
    fetch_unpaywall_pdf
    exit 0
  fi
fi
# 4. No full text available
report_no_fulltext
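The same "try in order" idea can be sketched in Python as a generic fallback chain. This is an illustration, not the skill's own implementation: `fetch_full_text` and the `Fetcher` type are hypothetical names, and each fetcher is assumed to be a function you supply (for PMC, DOI resolution, Unpaywall, and so on) that returns a URL or `None`.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes a DOI and returns a full-text URL, or None if that source has nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_full_text(
    doi: str, sources: Sequence[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each (name, fetcher) pair in order; return (source_name, url) for the first hit."""
    for name, fetch in sources:
        try:
            url = fetch(doi)
        except Exception as exc:
            # One failing source should not stop the chain; report it and move on.
            print(f"{name} failed for {doi}: {exc}")
            continue
        if url:
            return name, url
    # Mirrors step 4 of the workflow: no full text available.
    return None, None
```

Ordering the `sources` list as PMC, DOI, Unpaywall, preprints reproduces the Stage 2 order; a return value of `(None, None)` is the signal to report that no full text was found.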
Free tier (with email):
Best practices:
import os
import time

import requests

def find_open_access(doi, email):
    """
    Find an open access version via Unpaywall.
    Returns: (pdf_url, version, source) or (None, None, None)
    """
    url = f"https://api.unpaywall.org/v2/{doi}"
    params = {"email": email}
    try:
        response = requests.get(url, params=params, timeout=10)
        response.raise_for_status()
        data = response.json()
        if not data.get('is_oa'):
            return None, None, None
        best_loc = data.get('best_oa_location')
        if not best_loc:
            return None, None, None
        # Prefer the direct PDF link; fall back to the landing page URL
        pdf_url = best_loc.get('url_for_pdf') or best_loc.get('url')
        version = best_loc.get('version', 'unknown')
        source = best_loc.get('host_type', 'unknown')
        return pdf_url, version, source
    except (requests.RequestException, ValueError) as e:
        # Covers network failures, HTTP error statuses, and invalid JSON
        print(f"Error checking Unpaywall for {doi}: {e}")
        return None, None, None

# Usage
doi = "10.1038/nature12373"
email = "YOUR_EMAIL"  # Replace with the user's real email, never a placeholder
pdf_url, version, source = find_open_access(doi, email)
if pdf_url:
    print(f"Found {version} at {source}")
    print(f"PDF: {pdf_url}")
    # Download PDF (create the target directory and fail loudly on HTTP errors)
    os.makedirs('papers', exist_ok=True)
    response = requests.get(pdf_url, timeout=30)
    response.raise_for_status()
    with open(f'papers/{doi.replace("/", "_")}.pdf', 'wb') as f:
        f.write(response.content)
else:
    print("No open access version found")
time.sleep(0.1)  # Rate limiting
Repositories:
Preprint servers:
Publisher sites:
DOI not found:
{
"error": "true",
"message": "DOI not found"
}
→ Check DOI format, try alternative identifiers
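Checking the DOI format before retrying can be sketched as below. `normalize_doi` is a hypothetical helper, and the regular expression is a loose heuristic (an assumption, not an official DOI grammar): it only checks for `10.`, a numeric registrant code, a slash, and a non-empty suffix.

```python
import re
from typing import Optional

# Loose heuristic: "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common wrappers that users paste along with the DOI itself.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> Optional[str]:
    """Strip resolver/"doi:" prefixes and return the bare DOI, or None if it looks malformed."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    return doi if DOI_PATTERN.match(doi) else None
```

A `None` result suggests the identifier is not a DOI at all (for example a PMID or a bare article slug), which is the case where trying alternative identifiers makes sense.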
Network errors:
Malformed response:
- is_oa field
- oa_locations array if best_oa_location missing

| Task | Command |
|------|---------|
| Check if OA available | curl "https://api.unpaywall.org/v2/DOI?email=EMAIL" |
| Get best PDF URL | Parse .best_oa_location.url_for_pdf |
| List all OA sources | Parse .oa_locations[] |
| Check version type | Look at .version field |
| Download PDF | curl -L -o paper.pdf "$pdf_url" |
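The "list all OA sources" and "check version type" rows can be combined with the priority order given earlier (published, then accepted, then submitted). The sketch below is one way to do that, including the fallback to `oa_locations` when `best_oa_location` is missing; `pick_location`, `location_url`, and `VERSION_RANK` are hypothetical names, and only fields shown in the sample response are used.

```python
from typing import Any, Dict, Optional

# Lower rank = preferred, following the priority order in this document.
VERSION_RANK = {"publishedVersion": 0, "acceptedVersion": 1, "submittedVersion": 2}


def pick_location(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return best_oa_location, or the best-ranked entry of oa_locations as a fallback."""
    if not data.get("is_oa"):
        return None
    best = data.get("best_oa_location")
    if best:
        return best
    locations = data.get("oa_locations") or []
    if not locations:
        return None
    # Unknown or missing versions rank last; min() keeps the first entry among ties.
    return min(
        locations,
        key=lambda loc: VERSION_RANK.get(loc.get("version"), len(VERSION_RANK)),
    )


def location_url(loc: Dict[str, Any]) -> Optional[str]:
    """Prefer the direct PDF link, falling back to the landing page URL."""
    return loc.get("url_for_pdf") or loc.get("url")
```

Ranking by version alone is a simplification; host type or license could be added to the sort key if those matter for a given project.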
Called by:
- evaluating-paper-relevance - When full text not in PMC
- answering-research-questions - For highly relevant papers
Updates:
- papers-reviewed.json - Note if OA found
- SUMMARY.md - Include OA source info
Using placeholder email: Using a placeholder or made-up address → Ask user for their real email
Not including email: Required parameter, requests will fail
Checking every paper: Only check when needed (score ≥7, no PMC)
Ignoring version type: Published version better than preprint
Single source only: Check oa_locations array for alternatives
No rate limiting: Add delays even though no hard limit
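Two of the points above (check only when needed, and add delays) can be sketched together. This is an illustration under assumptions: `check_papers` is a hypothetical helper, the paper records are assumed to be dicts with `doi`, `score`, and `has_pmc` keys, and `lookup` is any function that takes a DOI and returns a result, such as a wrapper around an Unpaywall query. The 0.1 second delay mirrors the value used in the Python example earlier.

```python
import time
from typing import Any, Callable, Dict, Iterable


def check_papers(
    papers: Iterable[Dict[str, Any]],
    lookup: Callable[[str], Any],
    min_score: int = 7,
    delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Look up only relevant papers without PMC full text, pausing between requests."""
    results: Dict[str, Any] = {}
    for paper in papers:
        # Skip papers that score below the threshold or already have PMC full text.
        if paper.get("score", 0) < min_score or paper.get("has_pmc"):
            continue
        results[paper["doi"]] = lookup(paper["doi"])
        # Pause after each request, even though no hard limit is enforced.
        sleep(delay)
    return results
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.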
Successful when:
After finding OA version: