skills/literature-digest/SKILL.md
Produce the weekly Ersilia literature digest covering AI/ML for drug discovery, antibiotic and antimicrobial discovery, NTDs and AMR, and open science for global health — through an explicit LMIC and decolonisation lens. Use this skill whenever the user asks to prepare, run, or refresh the literature digest. Triggers include: "weekly literature digest", "literature digest for Ersilia", "/literature-digest", "lit digest this week", "what did we miss last week", "digest the literature". Always use this skill for digest requests even if the ask seems simple.
You produce the weekly literature digest for Ersilia Open Source Initiative — a curated markdown file covering AI/ML for drug discovery, antibiotic and antimicrobial discovery, NTDs and AMR, and open science for global health.
The digest is read by the whole Ersilia team. Assume shared internal context (Hub, H3D, GC-ADDA, Chemical Checker, Boltz-2, etc.) but write so a new team scientist can follow.
The relevance bar is three-pillar: an item qualifies if it touches any one of an AI/ML
method plausibly applicable to drug discovery, an Ersilia-relevant disease, or an
open-science / capacity-building / LMIC-policy story. Apply the equity lens throughout:
LMIC-led work (per references/lmic-countries.md) gets the 🌍 marker and a ranking bonus.
The user will invoke this skill with:
- A date window, e.g. --from 2026-05-13 --to 2026-05-20. Default: the last 7 days ending today.
- --out <path> to override the local staging path. Default: digests/YY-MM-DD-literature-digest.md (end date of the window, 2-digit year) relative to this skill folder. For 2026-05-21 the file is 26-05-21-literature-digest.md. The local file is a working copy; the canonical home is the remote repo ersilia-os/digests at literature/YY-MM-DD-literature-digest.md (Step 8).
- --force to override the recent-digest guards in Step 0.
- --dry-run to fetch and rank but not write the digest file (used for testing).

If anything is unclear, ask one focused question before proceeding. Never invent missing inputs.
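The defaults above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes "the last 7 days ending today" means a start date exactly seven days before the end date, which matches the 2026-05-13 to 2026-05-20 example.

```python
from datetime import date, timedelta
from pathlib import Path


def resolve_window(start=None, end=None, today=None):
    """Return (start, end); the default is the 7 days ending today."""
    end = end or today or date.today()
    start = start or end - timedelta(days=7)  # assumption: 7-day span
    if start > end:
        raise ValueError("--from must not be after --to")
    return start, end


def default_digest_path(end, base=Path("digests")):
    """Build digests/YY-MM-DD-literature-digest.md from the window end date."""
    return base / f"{end:%y-%m-%d}-literature-digest.md"


start, end = resolve_window(end=date(2026, 5, 21))
print(start, end, default_digest_path(end).name)
```

Keeping the filename derived from the window end date, rather than from the run date, means a late run for an earlier window still lands on the expected name.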
Run the steps in order. Track progress with TaskCreate / TaskUpdate if the run is
non-trivial.
Gate A — required MCPs. The Slack MCP (workspace ersilia-workspace) and the
Gmail MCP must both be available in this session. These are the two highest-signal
sources for the digest, and the spec explicitly forbids generating a digest without
them. To verify, attempt one cheap read against each:
- Slack: slack_search_channels with query="literature". If it errors with "MCP not available" / "tool not found" / similar, treat it as missing.
- Gmail: search_threads with query="newer_than:1d" (any small query). Same failure modes.

If either check fails, STOP. Tell the user which MCP is unavailable and refuse to proceed. Do not fetch from any other source either: a digest without Slack and Gmail is not a digest we ship. The user can re-invoke once the MCPs are connected.
Gate B — references not stale. Check whether the skill's reference files are overdue for their quarterly refresh:
python scripts/check_references_freshness.py
- If the script prints OK, continue.
- If it prints DUE, the references are past their 90-day refresh cadence. Pause the run and tell the user. Offer two choices: (a) run the refresh procedure documented in references/refresh-procedure.md now (it re-derives the topic / author / journal / Hub priors from the Hub catalogue, Slack #literature, Google Drive grants, and Gmail Scholar alerts), or (b) explicitly defer and proceed with stale references, recording the deferral in references/_state.json's refresh_log with an explanatory note. Never refresh silently; the changes are editorial.
- The script exits 0 for both outputs (OK and DUE) so an emergency digest can still ship if the user defers. Exit 1 means the state file itself is broken: STOP and fix that before continuing.

Gate C: no recent digest (remote first, then local). The digest is weekly, not
redundant — and the canonical home is the remote ersilia-os/digests repo, so the
remote is the authoritative check. Run both, remote first:
python scripts/check_remote_digest.py --days 7
python scripts/check_recent_digest.py --days 7
- If check_remote_digest.py exits non-zero, the remote was unreachable. STOP. Do not silently fall through to the local check, because a successful local run could clobber published work on retry. Surface the error to the user.
- If check_remote_digest.py prints a path on stdout, STOP. A digest already exists in the remote repo for this window. The default is never to run again: tell the user the remote path (and the html URL from stderr) and explicitly ask whether they want to override with --force. Do not assume yes.
- If check_recent_digest.py prints a path, STOP the same way (a local working copy exists; offer to upload the existing one instead of regenerating).

The cutoff is the date encoded in the filename (YY-MM-DD-literature-digest.md,
or the legacy YY-MM-DD-digest.md), not the filesystem mtime. A digest dated within
the last 7 days blocks a new run.
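A minimal sketch of a filename-based cutoff check follows. It is not the code in check_recent_digest.py; the helper names are hypothetical, 2-digit years are assumed to mean 20YY, and the 7-day boundary is assumed to be inclusive.

```python
import re
from datetime import date, timedelta

# Matches both the current and the legacy filename forms.
DIGEST_NAME = re.compile(r"^(\d{2})-(\d{2})-(\d{2})-(?:literature-)?digest\.md$")


def digest_date(filename):
    """Return the date encoded in a digest filename, or None if it does not match."""
    m = DIGEST_NAME.match(filename)
    if not m:
        return None
    yy, mm, dd = (int(g) for g in m.groups())
    try:
        return date(2000 + yy, mm, dd)  # assumption: 2-digit years mean 20YY
    except ValueError:
        return None  # e.g. month 13 or day 32


def blocking_digests(filenames, today, days=7):
    """Return the filenames whose encoded date falls within the last `days` days."""
    cutoff = today - timedelta(days=days)
    blocked = []
    for name in filenames:
        d = digest_date(name)
        if d is not None and cutoff <= d <= today:
            blocked.append(name)
    return blocked
```

Reading the date from the name rather than the mtime keeps the guard stable across clones, copies, and re-downloads, all of which reset filesystem timestamps.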
Read these reference files into context before fetching anything:
- references/search-landscape.md: topics, keywords, authors, journals, task taxonomy, and the ranking weights.
- references/lmic-countries.md: World Bank low/lower-middle income countries (the 🌍 rule).
- references/source-catalogue.md: which sources are in v1 vs. deferred, and the schema for normalised items.
- references/output-template.md: exact structure of the digest file.

Do not paraphrase these; quote them when relevant.
python scripts/parse_prior_digests.py --last 8 --out /tmp/seen.txt
This pulls the last 8 published digests from ersilia-os/digests/literature/ via
gh, downloads their bodies, and emits a flat list of DOIs, arXiv IDs, and URLs
to exclude. The remote repo is the source of truth — local digests/ files
are just working copies and may be stale.
- If the remote literature/ directory does not exist yet (first-ever run), the script emits an empty file and exits 0, which is fine.
- The --also-local flag is available for development; production runs do not use it.

Four MVP connectors, two patterns. Run the API-based ones (bioRxiv, Europe PMC) in parallel; the MCP-based ones (Slack, Gmail) need a two-step collection because the MCP is not callable from Python subprocesses.
Track per-connector status (success / failure + reason) as you go — it feeds the digest's connector semaphore (Step 7). Per-connector item counts and failure reasons are NOT included in the digest itself; the semaphore is the only connector signal that ships.
API-based — straight subprocess calls:
python scripts/fetch_biorxiv.py --from {start} --to {end} --out /tmp/bx.json
python scripts/fetch_europepmc.py --from {start} --to {end} --out /tmp/epmc.json
Slack — workspace handle is ersilia-workspace:
- Use slack_search_channels to locate #literature (or the closest match; log the channel name).
- Use slack_read_channel to read messages in the date range.
- Use slack_read_user_profile for a real name to use as attribution.
- Write the collected messages to /tmp/slack_raw.json. Each message needs text and ts; include user_real_name (or user_name), permalink, and channel_name whenever available.
- Then normalise:

python scripts/fetch_slack.py --raw /tmp/slack_raw.json --out /tmp/slack.json
Gmail — high-signal source: Google Scholar alerts, curated newsletters, publisher table-of-contents emails, and collaborator threads sharing links. Do not include the user's email address anywhere in the digest — label the connector by what it covers, never by the inbox it reads.
Use the Gmail MCP yourself, scoped to the user's inbox (the harness already knows
which account). The single highest-recall query is the user-maintained
Research-Updates Gmail label — they tag everything literature-relevant under
it. Pull from it exhaustively first, then top up with the per-sender buckets if
anything looks missing:
- Research-Updates label: label:Research-Updates after:YYYY/MM/DD (use the window start date). This subsumes the Scholar alerts, journal ToC e-alerts (Nature/NMI/Nat Comms/Comm Chem/Comm Med/Comm Bio, Cell Press, eLife, Wiley/ChemMedChem, ACS journal alerts for JCIM/J Med Chem/ACS Med Chem Lett/ACS Infect Dis/ACS Omega, Lancet, BMJ, npj Digital Medicine), and the curated weekly digests (Semantic Scholar, The Academic Digest, Nature Briefing).
- Per-sender buckets: from:[email protected] newer_than:7d (also [email protected]). Use these only if the label query returns nothing or is unavailable.

ToC alerts are dense. A single NMI/JCIM/Nat Comms email contains 10–30
paper links and may bust the MCP's per-thread output limit. When that happens
the get_thread tool error message gives you the path to the on-disk dump —
parse it with python3 + a regex on the HTML, then verify the resulting
DOIs/first-authors via Crossref (see step 6 below).
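The regex pass over the on-disk dump might look like the sketch below. The DOI pattern is a common heuristic and an assumption here, not an official grammar, and the function name is hypothetical. URL-encoded DOIs inside tracking links (for example with %2F) would need decoding first and are not handled.

```python
import html
import re

# Assumption: DOIs look like 10.<registrant>/<suffix>. The suffix stops at
# whitespace, quotes, angle brackets, and URL query/fragment markers.
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s\"'<>&?#]+", re.IGNORECASE)


def extract_dois(raw_html):
    """Return unique, lower-cased DOIs in order of first appearance."""
    text = html.unescape(raw_html)  # turn &amp; and friends back into characters
    seen, out = set(), []
    for match in DOI_RE.finditer(text):
        doi = match.group(0).rstrip(".,;)").lower()  # drop trailing punctuation
        if doi not in seen:
            seen.add(doi)
            out.append(doi)
    return out
```

Every DOI recovered this way is still only a candidate; the Crossref verification mentioned above remains the check that the identifier resolves to a real record.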
For each thread, call get_thread to get the body text. Assemble a JSON list
/tmp/gmail_raw.json with one object per thread:
{"id": "...", "subject": "...", "sender": "[email protected]",
"sender_name": "Google Scholar Alerts", "date": "YYYY-MM-DD",
"body_text": "...", "snippet": "...", "thread_url": "https://mail.google.com/..."}
Then normalise:
python scripts/fetch_gmail.py --raw /tmp/gmail_raw.json --out /tmp/gmail.json
The normaliser deliberately drops the raw sender (email address) from output so
that downstream digests cannot leak it; only the human-friendly display name
survives.
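As an illustration of that sender-dropping step, here is a minimal sketch that works from the raw thread shape shown above. It is not the code in fetch_gmail.py; the allow-list of fields and the function names are assumptions.

```python
import json

# Assumption: these are the fields worth carrying forward. The raw `sender`
# (an email address) is deliberately absent from the allow-list.
KEEP_FIELDS = ("id", "subject", "sender_name", "date", "body_text", "snippet", "thread_url")


def strip_sender(thread):
    """Copy a raw thread object, keeping only allow-listed fields."""
    return {k: thread[k] for k in KEEP_FIELDS if k in thread}


def normalise_file(raw_path, out_path):
    """Read the raw JSON list, strip each thread, and write the cleaned list."""
    with open(raw_path, encoding="utf-8") as fh:
        threads = json.load(fh)
    cleaned = [strip_sender(t) for t in threads]
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(cleaned, fh, ensure_ascii=False, indent=2)
    return cleaned
```

An allow-list is used instead of deleting one key so that any new field added to the raw objects later stays out of the output until someone opts it in. Addresses embedded in body_text are not touched by this sketch.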
Graceful degradation for API sources only. The two MCP sources (Slack and Gmail) are required (Step 0 already gated on this); if either becomes unreachable mid-run, that is a hard failure — abort with the same message as Gate A. The two API sources (bioRxiv, Europe PMC) can degrade: if one errors, mark it 🔴 in the connector semaphore and continue with the remaining three.
Phase B / C sources (arXiv, chemRxiv, Semantic Scholar, journal RSS, GitHub,
Hugging Face): not in MVP. If you see scripts for these in scripts/, run them
too; otherwise skip silently. They do not get a row in the connector semaphore
unless they actually exist as MVP connectors.
python scripts/dedup_and_rank.py \
--in /tmp/bx.json --in /tmp/epmc.json --in /tmp/slack.json --in /tmp/gmail.json \
--seen /tmp/seen.txt \
--landscape references/search-landscape.md \
--lmic references/lmic-countries.md \
--out /tmp/pool.json
This produces a top-~50 pool with score breakdowns. The pool is what you triage in Step 5 — do not fall back to the raw fetched data.
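To make the merge step concrete, here is a rough sketch of deduplicating across sources and against the seen list. It is not the logic of dedup_and_rank.py: the item fields doi, arxiv_id, url, and score are hypothetical (the real schema lives in references/source-catalogue.md), and the real ranking weights come from references/search-landscape.md.

```python
def item_key(item):
    """Pick a stable identity: DOI first, then arXiv ID, then URL."""
    for field in ("doi", "arxiv_id", "url"):
        value = (item.get(field) or "").strip().lower()
        if value:
            return value
    return None


def dedup_and_rank(sources, seen, limit=50):
    """Merge item lists, drop seen and duplicate items, return the top `limit` by score."""
    seen_keys = {s.strip().lower() for s in seen if s.strip()}
    best = {}
    for items in sources:
        for item in items:
            key = item_key(item)
            # Assumption: items with no identifier at all are dropped.
            if key is None or key in seen_keys:
                continue
            # Keep the highest-scoring copy when two sources report the same item.
            if key not in best or item.get("score", 0) > best[key].get("score", 0):
                best[key] = item
    ranked = sorted(best.values(), key=lambda it: it.get("score", 0), reverse=True)
    return ranked[:limit]
```

Lower-casing the key matters because DOIs are case-insensitive, so the same paper can arrive from two connectors with different capitalisation.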
Read /tmp/pool.json. For each item:
- Read references/hub-incorporation-criteria.md before triaging. The single most important question for in-scope items is: "could this become an Ersilia Model Hub entry?" Activity prediction, featurization, and property prediction together account for 86% of Ready Hub models, so weight items in those subtasks heavier than everything else.
- Assign chapters as defined in references/output-template.md. The 🤖 marker and trailing task emoji do the Hub-flagging work inline; there is no dedicated chapter.
- Keep the item count within the range given in references/output-template.md (aim 25–35). Adjust to what the week actually delivered; do not pad.

For every chosen item, write the entry following references/output-template.md exactly.
Specifically:
- Resolve the first author and publication date with a https://api.crossref.org/works/<doi> lookup (or https://api.crossref.org/works?query.title=...&filter=container-title:<journal> when only the title is known) and use message.author[0].family and message.published.date-parts[0]. Same rule for arXiv IDs (resolve via Crossref or arXiv's own API). If lookup fails, omit the author rather than guessing.
- Follow the entry format in references/output-template.md: [Author et al., *Venue*, YYYY](url) {ribbon} — **Title.** Combined why-matters + TL;DR.
- Ribbons (⭐ 🌍 🤖 🗃️ 💻) in fixed display order, only when load-bearing. See output-template.md for criteria.
- ersilia-os/digests.
- Authors line: name the lead institution.
- Venue line: preprint server or journal, plus the publication date.
- TL;DR: 1–2 sentences. Plain language. Write fresh. Never paste the abstract verbatim.
- Why it matters for Ersilia: required, one sentence, specific. Name the Hub model, NTD pipeline, partner institution, or open-source release that ties it to Ersilia.
- Links: paper URL, DOI, code, model/dataset if present.
- Where an item has a sharer, add **Shared by**: @{sharer} above the Links line.

Write to the default local staging path unless --out was provided:
skills/literature-digest/digests/{YY}-{MM}-{DD}-literature-digest.md
(2-digit year, end date of the window — e.g. 26-05-21-literature-digest.md).
This is the working copy; the canonical home is the remote repo published in Step 8.
Include:
references/output-template.md).

The digests/ folder is gitignored: the file lives locally but is not committed by
default.
The local file in digests/ is a working copy. The canonical home is
github.com/ersilia-os/digests at literature/{YY}-{MM}-{DD}-literature-digest.md.
Upload via:
python scripts/upload_digest.py --digest digests/{YY}-{MM}-{DD}-literature-digest.md
- The script uploads through gh (which must be authenticated; gh auth status checks this).
- It refuses to overwrite an existing remote digest unless --force is passed. This is belt-and-braces with Step 0 Gate C: if you somehow got past the pre-flight, the upload still won't clobber.
- It updates README.md under the ## Literature digests heading with a line like - [YYYY-MM-DD](literature/YY-MM-DD-literature-digest.md). Entries are kept in date-descending order; the operation is idempotent. Pass --no-readme to skip this step (rarely useful in production).
- It prints the html_url of the new digest file (and, if the README was updated, the README's html_url) on stdout. Hand those URLs to the user as the digest location. Do not present the local path as the primary artefact; the remote is canonical.
- If the upload is refused because the file already exists, ask the user before re-running with --force. Do not retry silently.

If the upload fails for a recoverable reason (network blip, gh auth lapsed), keep the local file intact and tell the user how to re-run just the upload step. Never delete the local file before a successful upload.
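The README indexing behaviour described in the list above (date-descending, idempotent) can be sketched as a pure string transformation. This is an illustration under assumptions, not the code in upload_digest.py: the function name is hypothetical, and the section is assumed to end at the next line that starts with "#".

```python
import re

HEADING = "## Literature digests"
ENTRY_RE = re.compile(r"^- \[(\d{4}-\d{2}-\d{2})\]\(")


def add_digest_entry(readme, iso_date):
    """Insert an index line under HEADING, newest first; no-op if already present."""
    yy_mm_dd = iso_date[2:]  # 2026-05-21 -> 26-05-21
    new_line = f"- [{iso_date}](literature/{yy_mm_dd}-literature-digest.md)"
    lines = readme.splitlines()
    if new_line in lines:
        return readme  # idempotent: nothing to do
    if HEADING not in lines:
        lines += ["", HEADING, ""]
    start = lines.index(HEADING) + 1
    # Assumption: the section runs until the next heading or the end of the file.
    end = next((i for i in range(start, len(lines)) if lines[i].startswith("#")), len(lines))
    body = lines[start:end]
    entries = [ln for ln in body if ENTRY_RE.match(ln)]
    others = [ln for ln in body if ln.strip() and not ENTRY_RE.match(ln)]
    entries.append(new_line)
    # ISO dates sort lexicographically, so a reverse string sort is date-descending.
    entries.sort(key=lambda ln: ENTRY_RE.match(ln).group(1), reverse=True)
    section = [""] + others + ([""] if others else []) + entries + [""]
    return "\n".join(lines[:start] + section + lines[end:]).rstrip("\n") + "\n"
```

Sorting on the four-digit date in the link text, rather than on the two-digit filename, avoids any ambiguity about the century.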
After (and only after) upload_digest.py exits 0, post a single notification
to #literature so the team sees the new digest. Use the Slack MCP directly:
slack_send_message(
channel_id = "C010067BP2Q", # #literature on ersilia-workspace
message = <rendered template from references/slack-alert-template.md>
)
- Use the template in references/slack-alert-template.md. Use the chapter short-forms. Compose the chapter list from chapters that actually appeared in the digest (skip empty ones).
- Do not post if the upload failed (non-zero exit from upload_digest.py), if --dry-run was set, or if the digest was generated but not actually pushed.
- On a --force overwrite, still post once; the team should know the digest has been updated.

See references/output-template.md for when to apply each.

digests/ is gitignored by default for a reason.

This skill is invoked manually by default. To run it weekly:
/schedule create literature-digest --cron "0 8 * * 1" --command "/literature-digest"
(Monday 08:00 local time.) The schedule skill handles the cron wiring; see its SKILL.md
for options. Self-scheduling is intentionally not built in — running this requires the
Slack and Gmail MCPs to be live in the session, which is easier to guarantee for a manual
run than a cron.
newsletter-drafting: a future flag could re-package the digest's high-impact
items as a newsletter block. Today this is manual.