skills/fetch-preprint-recent/SKILL.md
Fetches preprints posted to bioRxiv or medRxiv within a given date window, then keyword-filters the results client-side. Wraps the public `api.biorxiv.org/details/{server}/{from}/{to}/{cursor}` endpoint, handles cursor pagination, normalizes records to a stable shape (doi, title, authors, abstract, date, server, version, url), and applies a keyword-OR match against title + abstract. Domain-neutral — usable for any biology / clinical preprint scan, not just one project. Use when user mentions bioRxiv, medRxiv, weekly preprint scan, fetch preprints, last-N-days preprints, or when a literature-scan agent needs structured preprint records.
Install this skill globally with one command: `npx skillsauth add lyndonkl/claude fetch-preprint-recent`. Works with Claude Code, Cursor, and Windsurf.
Fetch preprints from bioRxiv or medRxiv for a date window, normalize the records, and keyword-filter them.
- [ ] Step 1: Validate inputs (server, from, to, keywords)
- [ ] Step 2: Page through the details endpoint until messages.cursor exhausted
- [ ] Step 3: Normalize each record to the canonical shape
- [ ] Step 4: Dedupe within the window (keep latest version per DOI)
- [ ] Step 5: Keyword-filter on lowercased title + abstract
- [ ] Step 6: Return matched records + a summary (total fetched, total matched, pages fetched)
Step 1 — Validate inputs
Required:
- server: one of biorxiv or medrxiv (the API uses these literal strings)
- from: YYYY-MM-DD, inclusive
- to: YYYY-MM-DD, inclusive, must be ≥ from
- keywords: list of strings; may include multi-word phrases; case-insensitive matching

Do not run a window > 31 days without confirmation: flag it and confirm first. The API supports wider windows, but you almost never want to dump a month of preprints in one call without a stronger filter.
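The checks above can be sketched as a small validator. This is a minimal sketch, not the skill's actual implementation: the function name `validate_inputs`, the returned `needs_confirmation` flag, and the rule that `keywords` must be non-empty are assumptions made for illustration.

```python
from datetime import date

ALLOWED_SERVERS = {"biorxiv", "medrxiv"}
MAX_WINDOW_DAYS = 31  # soft cap described in the text


def validate_inputs(server, date_from, date_to, keywords):
    """Return (start, end, needs_confirmation); raise ValueError on bad input."""
    if server not in ALLOWED_SERVERS:
        raise ValueError(f"server must be one of {sorted(ALLOWED_SERVERS)}")
    # fromisoformat raises ValueError if the string is not a valid ISO date.
    start = date.fromisoformat(date_from)
    end = date.fromisoformat(date_to)
    if end < start:
        raise ValueError("to must be on or after from")
    # Assumption: an empty keyword list is treated as an error.
    if not keywords or not all(isinstance(k, str) and k.strip() for k in keywords):
        raise ValueError("keywords must be a non-empty list of non-empty strings")
    # Both endpoints are inclusive, so a same-day window spans 1 day.
    window_days = (end - start).days + 1
    return start, end, window_days > MAX_WINDOW_DAYS
```

Returning a flag instead of raising on wide windows keeps the "flag and confirm" behaviour in the caller's hands.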
Step 2 — Page through the endpoint
The endpoint is:
https://api.biorxiv.org/details/{server}/{from}/{to}/{cursor}
Start with cursor=0. The response looks like:
{
"messages": [{"status": "ok", "interval": "2026-05-04/2026-05-10", "cursor": "0", "count": 100, "total": 327}],
"collection": [ { record }, { record }, ... ]
}
For each page, process collection, increment cursor by 100 (the page size is fixed), and refetch until cursor + count >= total.
Use WebFetch with the URL. If WebFetch returns malformed JSON or a 5xx, retry once with a 2-second backoff; on second failure, return partial results with a fetch_errors field listing the failed cursors.
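The paging and retry rules can be sketched as the loop below. It is an illustration under stated assumptions: `fetch_page` is a hypothetical callable standing in for WebFetch (it takes a URL and returns the parsed JSON dict, raising on malformed JSON or a 5xx), and continuing to later cursors after a failed page is one reading of "partial results", not something the text prescribes.

```python
import time

PAGE_SIZE = 100  # fixed page size per the text


def fetch_all(fetch_page, server, date_from, date_to, backoff_seconds=2.0):
    """Page through the details endpoint using a caller-supplied fetcher."""
    records, fetch_errors, pages_fetched = [], [], 0
    cursor, total = 0, None
    while True:
        url = f"https://api.biorxiv.org/details/{server}/{date_from}/{date_to}/{cursor}"
        page = None
        for attempt in range(2):  # first try plus one retry
            try:
                page = fetch_page(url)
                break
            except Exception:
                if attempt == 0:
                    time.sleep(backoff_seconds)
        if page is None:
            fetch_errors.append(cursor)
            if total is None:
                break  # the first page failed, so the total is unknown
        else:
            pages_fetched += 1
            total = int(page["messages"][0].get("total", 0))
            records.extend(page.get("collection", []))
        cursor += PAGE_SIZE
        # Same as cursor + count >= total when every non-final page is full.
        if cursor >= total:
            break
    return {"records": records, "fetch_errors": fetch_errors, "pages_fetched": pages_fetched}
```

Passing the fetcher in as an argument keeps the loop testable without touching the network.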
Step 3 — Normalize each record
The API returns fields like doi, title, authors, author_corresponding, author_corresponding_institution, date, version, type, license, category, jatsxml, abstract, published, server. Reduce to:
{
"id": "10.1101/2026.05.07.123456", // doi
"title": "...",
"authors": ["Smith J", "Doe A", ...], // split the API's `authors` string on `;`
"abstract": "...",
"date": "2026-05-07",
"server": "biorxiv", // or "medrxiv"
"version": 1,
"category": "neuroscience", // bioRxiv subject area
"url": "https://www.biorxiv.org/content/10.1101/2026.05.07.123456v1",
"published_doi": null // populated if the preprint has been published; from `published` field
}
URL pattern: https://www.{server}.org/content/{doi}v{version} (no https://doi.org/ redirect — direct to the preprint server keeps the abstract page accessible).
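One way to express the reduction is sketched below. The function name `normalize` is hypothetical. Two behaviours are assumptions rather than documented API facts: unpublished preprints are assumed to carry `"NA"` or an empty value in `published`, and `server` is taken from the request rather than from the record.

```python
def normalize(raw, server):
    """Reduce one raw API record to the canonical shape described in the text."""
    doi = raw["doi"]
    version = int(raw.get("version") or 1)
    published = raw.get("published")
    return {
        "id": doi,
        "title": (raw.get("title") or "").strip(),
        # The API's `authors` field is a single string separated by `;`.
        "authors": [a.strip() for a in (raw.get("authors") or "").split(";") if a.strip()],
        "abstract": (raw.get("abstract") or "").strip(),
        "date": raw.get("date"),
        "server": server,  # assumption: use the requested server value
        "version": version,
        "category": raw.get("category"),
        "url": f"https://www.{server}.org/content/{doi}v{version}",
        # Assumption: "NA" or an empty value means not yet published.
        "published_doi": published if published and published != "NA" else None,
    }
```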
Step 4 — Dedupe within the window
The same DOI can appear with multiple version values if the authors revised mid-window. Keep the highest version per DOI. Drop the rest.
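A minimal sketch of the dedupe, assuming records are already normalized so that `id` holds the DOI and `version` is an integer. The helper name `dedupe_latest` is hypothetical.

```python
def dedupe_latest(records):
    """Keep only the highest-version record for each DOI."""
    latest = {}
    for rec in records:
        kept = latest.get(rec["id"])
        if kept is None or rec["version"] > kept["version"]:
            latest[rec["id"]] = rec
    # Dicts preserve insertion order, so DOIs stay in first-seen order.
    return list(latest.values())
```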
Step 5 — Keyword-filter
For each kept record, check whether any keyword (or phrase) appears in lowercase(title + " " + abstract). Match logic:
- Multi-word phrase, e.g. "protein language model" → must appear as a contiguous substring.
- Single word, e.g. "crispr" → must appear with word boundaries (don't match "crisper").

Track which keyword(s) matched per paper; the downstream paper-relevance-filter will use that signal.
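The two match rules can be sketched with Python's `re` module. The function name `matched_keywords` is hypothetical, and treating any keyword that contains a space as a phrase is an assumption about how the two cases are told apart.

```python
import re


def matched_keywords(record, keywords):
    """Return the keywords (in their original spelling) that match the record."""
    text = f"{record['title']} {record['abstract']}".lower()
    hits = []
    for keyword in keywords:
        needle = keyword.lower().strip()
        if " " in needle:
            # Multi-word phrase: contiguous substring match.
            found = needle in text
        else:
            # Single word: require word boundaries on both sides.
            found = re.search(r"\b" + re.escape(needle) + r"\b", text) is not None
        if found:
            hits.append(keyword)
    return hits
```

A record is kept when the returned list is non-empty, and the list itself can be stored as `matched_keywords` on the record. Note that `\b` only behaves as expected for keywords that start and end with a letter, digit, or underscore.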
Step 6 — Return
Return a payload like:
{
"server": "biorxiv",
"window": "2026-05-04/2026-05-10",
"fetched_total": 327,
"matched_total": 14,
"pages_fetched": 4,
"fetch_errors": [],
"records": [ {normalized record + "matched_keywords": [...]} , ... ]
}
Cache the raw API JSON (pre-normalization) to the agent's .cache/ directory under {YYYY-WW}-{server}.json so a re-run can skip the network if the user wants to re-synthesize without re-fetching.
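The cache filename could be derived as sketched below. The text does not say which week numbering `{YYYY-WW}` uses or which date it is keyed on, so this sketch assumes the ISO week of the window's start date. The helper name `cache_path` is hypothetical.

```python
from datetime import date
from pathlib import Path


def cache_path(cache_dir, window_start, server):
    """Build {YYYY-WW}-{server}.json from the window's start date."""
    # Assumption: ISO year and ISO week number of the start date.
    iso_year, iso_week, _ = date.fromisoformat(window_start).isocalendar()
    return Path(cache_dir) / f"{iso_year}-{iso_week:02d}-{server}.json"
```

On a re-run, the caller can check whether this path exists and load the raw JSON from it instead of fetching.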
Pattern A — One server, one window: standard call. Use this in a weekly digest.
Pattern B — Multi-week catch-up: call once per week, never one giant 28-day window. The cursor pagination is fine but the keyword filter is more honest at weekly granularity (matches the way papers are released and discussed).
Pattern C — Preprint-only follow-up of a known paper: if you already have a DOI, do not use this skill. Use a direct WebFetch on https://api.biorxiv.org/details/{server}/{doi} instead.
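For Pattern B, a long catch-up range can be split into consecutive windows of at most seven days, each of which becomes one call. This is a minimal sketch; the helper name `weekly_windows` is hypothetical.

```python
from datetime import date, timedelta


def weekly_windows(date_from, date_to):
    """Split an inclusive date range into consecutive windows of up to 7 days."""
    start, end = date.fromisoformat(date_from), date.fromisoformat(date_to)
    windows = []
    while start <= end:
        stop = min(start + timedelta(days=6), end)  # inclusive end of this window
        windows.append((start.isoformat(), stop.isoformat()))
        start = stop + timedelta(days=1)
    return windows
```

Each `(from, to)` pair can then be passed to the fetch step on its own, so the keyword filter runs at weekly granularity.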
from/to dates before the call.
10.1101/... string.

| Field | Source | Notes |
| -------------- | ------------------------------------------------------------- | --------------------------------------------------------------- |
| Endpoint | api.biorxiv.org/details/{server}/{from}/{to}/{cursor} | Same host serves both bioRxiv and medRxiv; only {server} varies |
| Auth | None | Public API. Be polite — don't hammer. |
| Page size | 100, fixed | Cursor is the offset into the window's results |
| Window cap | 31 days (soft); 7 days is the typical weekly call | Wider windows = thousands of records before keyword filter |
| Rate limit | Not formally documented; ~1 req/sec is safe | Backoff on 5xx |
| URL pattern | https://www.{server}.org/content/{doi}v{version} | Linkable to abstract page |
| Server values | biorxiv, medrxiv | Case-sensitive in path |