skills/paper-relevance-filter/SKILL.md
Scores a candidate paper against a keyword watchlist and a relevance-criteria document, returning KEEP/DROP/REVIEW with a one-line rationale and a 0-100 relevance score. Combines keyword-match strength, criteria-fit, and a historical-context check (was this paper or its preprint already covered in a recent digest). Domain-neutral - usable for any literature-scan workflow. Use after fetching candidate papers from bioRxiv, medRxiv, or PubMed and before clustering or synthesis. Trigger keywords - paper relevance, filter papers, keep or drop, score papers, relevance rationale.
Decide whether a fetched paper belongs in this week's digest. Output is a per-paper decision (KEEP / DROP / REVIEW), a 0-100 score, and a one-line rationale that the user can audit.
- [ ] Step 1: Load relevance criteria + the watchlist + last-4-weeks kept-paper IDs
- [ ] Step 2: Score each paper on three axes (match, criteria, novelty)
- [ ] Step 3: Combine to a 0-100 score; map to KEEP / REVIEW / DROP via thresholds
- [ ] Step 4: Apply tie-breakers (cap output at requested max kept count)
- [ ] Step 5: Return decisions + a calibration summary
Step 1 — Inputs
The caller hands the skill:
- papers: list of normalized paper records (output of fetch-preprint-recent or fetch-pubmed-recent)
- watchlist: list of keywords/phrases (with optional weights; default weight 1.0)
- criteria: text from relevance-criteria.md describing what fits and what doesn't
- prior_ids: set of id values that appeared as KEEP in any of the last 4 digests (used for novelty)
- max_kept: target ceiling, e.g. 25 (the digest will not exceed this)

Step 2 — Three-axis scoring
For each paper, compute three sub-scores in [0, 1]:
Axis 1 — Match strength (0-1)
If keywords carry weights (some matter more than others), use the max weight among matched keywords as a multiplier capped at 1.0.
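A minimal sketch of the weighting rule above, taking it literally: the largest weight among the matched keywords becomes a multiplier, and that multiplier is capped at 1.0. The function name `apply_keyword_weight`, the `base_match` argument (the unweighted match score), and the choice to return 0.0 when nothing matched are illustrative assumptions, not part of the skill.

```python
def apply_keyword_weight(
    base_match: float,
    matched: list[str],
    weights: dict[str, float],
) -> float:
    """Scale an unweighted match score by the strongest matched keyword's weight."""
    if not matched:
        # Assumption: no matched keywords means no match strength at all.
        return 0.0
    # Keywords without an explicit weight fall back to the default of 1.0.
    max_weight = max(weights.get(keyword, 1.0) for keyword in matched)
    # Literal reading of the rule: the multiplier itself is capped at 1.0,
    # so weights can demote a match but never push it above the base score.
    multiplier = min(max_weight, 1.0)
    return base_match * multiplier
```

Under this reading a weight above 1.0 behaves like 1.0. If the intent is for heavier keywords to boost the score, the cap would instead apply to the product.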
Axis 2 — Criteria fit (0-1)
This is the qualitative axis. Read the abstract against the relevance-criteria document. The criteria typically state:
Score:
If the criteria document is silent on a paper's territory, default to 0.7 and flag for REVIEW.
Axis 3 — Novelty (0-1)
- 1.0: id is not in prior_ids and the title doesn't fuzzy-match any prior title
- 0.5: the title fuzzy-matches a paper in prior_ids without an exact id match (e.g., same DOI prefix 10.1101/... matched, this is a journal version); KEEP-worthy but tag as "journal version of preprint covered YYYY-WW"
- 0.0: exact id match in prior_ids (already covered)

Use the normalized title (lowercase, strip punctuation, collapse whitespace) for fuzzy matching. A Levenshtein ratio > 0.9 against any prior title counts as a match.
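The novelty check can be sketched roughly as follows. "Levenshtein ratio" has more than one definition in common use; this sketch assumes `1 - distance / max(len)`. The helper names, the `prior_titles` argument, and the 0.5 score for a fuzzy title match are assumptions made for illustration.

```python
import string


def normalize_title(title: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse runs of whitespace."""
    lowered = title.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute each cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the DP row
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed ratio definition: 1.0 for identical strings, 0.0 for disjoint."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def novelty(paper_id: str, title: str, prior_ids: set[str], prior_titles: list[str]) -> float:
    """Return 0.0 for an exact id match, 0.5 for a fuzzy title match, else 1.0."""
    if paper_id in prior_ids:
        return 0.0
    normalized = normalize_title(title)
    if any(similarity(normalized, normalize_title(t)) > 0.9 for t in prior_titles):
        return 0.5
    return 1.0
```

`string.punctuation` covers ASCII only, so titles with typographic quotes or dashes would need a broader punctuation filter before comparison.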
Step 3 — Combine and threshold
score = 100 * (0.45 * match + 0.45 * criteria + 0.10 * novelty)
Match and criteria carry equal weight (a paper that mentions your keywords once but is wildly out-of-topic should not score higher than one that's deeply on-topic with a single mention). Novelty is a small finger on the scale — enough to demote already-covered work but not enough to drop a genuinely important journal-version-of-preprint update.
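As a quick illustration of the formula, the sketch below combines the three sub-scores with the default weights. The function name `combine_score`, the range check, and the decision to return an unrounded float are assumptions; the skill does not say how the score is rounded.

```python
def combine_score(
    match: float,
    criteria: float,
    novelty: float,
    weights: tuple[float, float, float] = (0.45, 0.45, 0.10),
) -> float:
    """Weighted sum of the three axes, scaled to 0-100 (left unrounded)."""
    for name, value in (("match", match), ("criteria", criteria), ("novelty", novelty)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    w_match, w_criteria, w_novelty = weights
    return 100.0 * (w_match * match + w_criteria * criteria + w_novelty * novelty)


# Two novel papers with the same single keyword mention (match = 0.5):
off_topic = combine_score(match=0.5, criteria=0.0, novelty=1.0)  # about 32.5
on_topic = combine_score(match=0.5, criteria=1.0, novelty=1.0)   # about 77.5
```

With equal match strength, the criteria axis alone moves the paper from well below the DROP boundary to above the KEEP boundary.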
Decision thresholds (default; the calling agent may override):
| Score  | Decision | Notes                                                         |
| ------ | -------- | ------------------------------------------------------------- |
| 70-100 | KEEP     | Goes into the digest                                          |
| 50-69  | REVIEW   | Boundary cases: caller decides whether to escalate to user    |
| 0-49   | DROP     | Filtered out, reason logged in the dropped-papers section     |
Special-case override: if novelty == 0.0 (already in a prior digest), force DROP regardless of score. The papers section may still list it as "already covered" for traceability.
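The thresholds and the already-covered override could be expressed along these lines. The function name `decide` and its keyword arguments are hypothetical; the defaults mirror the table, and the override is checked first so it wins regardless of score.

```python
def decide(score: float, novelty: float, keep_at: float = 70.0, review_at: float = 50.0) -> str:
    """Map a 0-100 score to KEEP / REVIEW / DROP, honoring the novelty override."""
    # Already covered in a prior digest: force DROP no matter how high the score.
    if novelty == 0.0:
        return "DROP"
    if score >= keep_at:
        return "KEEP"
    if score >= review_at:
        return "REVIEW"
    return "DROP"
```

The table lists integer bands; this sketch treats any score at or above 70 as KEEP and any score at or above 50 as REVIEW, so an unrounded 69.9 lands in REVIEW.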
Step 4 — Tie-breakers when KEEP > max_kept
When more papers score ≥ 70 than max_kept:
- Keep at most max_kept papers, and demote the rest to REVIEW (not DROP: they were good enough but just couldn't fit). Surface this in the calibration summary.

Never demote to DROP what scored ≥ 70 unless explicitly forced.
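One way to apply the cap is sketched below. It assumes the overflow is chosen by ranking KEEP papers by score, highest first, which the text above does not state explicitly. The function name `apply_cap` is hypothetical.

```python
def apply_cap(decisions: list[dict], max_kept: int) -> tuple[list[dict], int]:
    """Demote the lowest-scoring KEEP papers to REVIEW once max_kept is exceeded.

    Returns the updated decisions (original order preserved) and the number
    of papers demoted, for the calibration summary.
    """
    kept = sorted(
        (d for d in decisions if d["decision"] == "KEEP"),
        key=lambda d: d["score"],
        reverse=True,  # assumption: higher score wins a slot
    )
    overflow_ids = {d["id"] for d in kept[max_kept:]}
    updated = []
    for d in decisions:
        if d["id"] in overflow_ids:
            # Demote to REVIEW, never to DROP: the paper still scored >= 70.
            d = {**d, "decision": "REVIEW"}
        updated.append(d)
    return updated, len(overflow_ids)
```

The input list is not mutated; demoted records are shallow copies with only the decision changed.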
Step 5 — Return
{
  "decisions": [
    {
      "id": "10.1101/2026.05.07.123456",
      "decision": "KEEP",
      "score": 84,
      "axes": {"match": 0.9, "criteria": 1.0, "novelty": 1.0},
      "rationale": "Title + abstract hit 'protein language model' twice; in-scope (primary methods paper, empirical); novel.",
      "tags": []
    },
    {
      "id": "PMID:39000000",
      "decision": "KEEP",
      "score": 72,
      "axes": {"match": 1.0, "criteria": 0.7, "novelty": 0.5},
      "rationale": "Strong keyword match; review article (criteria penalty); journal version of preprint covered 2026-15.",
      "tags": ["journal-version-of:2026-15"]
    },
    {
      "id": "PMID:39111111",
      "decision": "DROP",
      "score": 31,
      "axes": {"match": 0.5, "criteria": 0.0, "novelty": 1.0},
      "rationale": "'protein language model' appears once in abstract but the paper is a clinical trial enrollment report; out of scope.",
      "tags": ["look-alike-trap"]
    }
  ],
  "calibration": {
    "kept": 17,
    "review": 3,
    "dropped": 84,
    "force_dropped_already_covered": 2,
    "demoted_for_cap": 0,
    "stricter_pass_applied": false
  }
}
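The calibration block can be tallied from the decision records, roughly as sketched here. This assumes that force-dropped papers are also counted in `dropped`, and that `demoted_for_cap` and `stricter_pass_applied` are tracked by the caller during Step 4; neither is confirmed by the skill text. The function name `summarize` is hypothetical.

```python
from collections import Counter


def summarize(
    decisions: list[dict],
    demoted_for_cap: int = 0,
    stricter_pass_applied: bool = False,
) -> dict:
    """Build the calibration summary from per-paper decision records."""
    counts = Counter(d["decision"] for d in decisions)
    # A DROP with novelty 0.0 is treated as an already-covered force drop.
    force_dropped = sum(
        1
        for d in decisions
        if d["decision"] == "DROP" and d["axes"]["novelty"] == 0.0
    )
    return {
        "kept": counts["KEEP"],
        "review": counts["REVIEW"],
        "dropped": counts["DROP"],  # assumption: includes force-dropped papers
        "force_dropped_already_covered": force_dropped,
        "demoted_for_cap": demoted_for_cap,
        "stricter_pass_applied": stricter_pass_applied,
    }
```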
Pattern A — Strict weekly digest: defaults above. Tight thresholds; max_kept=25.
Pattern B — Catch-up over multiple weeks: run per-week with the same prior_ids growing each iteration. Don't pool all 3 weeks of papers and filter once — you'll lose the historical-context signal.
Pattern C — Topic deep-dive (user wants more, not less): relax max_kept to a high number (e.g. 100), keep thresholds, return the full ranked list. Only do this on explicit user request.
Pattern D — Sanity-check the watchlist itself: run with prior_ids = [] and look at calibration.dropped. If the same theme keeps getting dropped for criteria reasons, the watchlist may be drifting away from intent.
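Pattern B's per-week loop might look like the sketch below. `filter_week` stands in for whatever callable runs this skill on one week of papers; its name and signature are assumptions, as is the oldest-first ordering of `weeks`.

```python
from typing import Callable

FilterFn = Callable[[list[dict], set[str]], list[dict]]


def catch_up(weeks: list[list[dict]], prior_ids: set[str], filter_week: FilterFn) -> list[list[dict]]:
    """Filter each week separately, feeding earlier KEEPs into later weeks."""
    seen = set(prior_ids)  # copy so the caller's set is left untouched
    results = []
    for papers in weeks:  # assumed to be ordered oldest week first
        decisions = filter_week(papers, seen)
        results.append(decisions)
        # Papers kept this week count as "already covered" for the next one.
        seen.update(d["id"] for d in decisions if d["decision"] == "KEEP")
    return results
```

This sketch only grows the set. A faithful last-4-digests window would also need to expire older ids, which is left out here.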
relevance-criteria.md text. If something's not in the criteria, the answer is REVIEW with rationale "criteria silent", not "I think it fits."

| Decision | Score | Action by caller                                  |
| -------- | ----- | ------------------------------------------------- |
| KEEP     | 70+   | Include in digest, cluster, synthesize            |
| REVIEW   | 50-69 | Surface in a "boundary cases" section, ask user   |
| DROP     | 0-49  | Log in dropped-papers list with rationale         |
| Axis | Default weight | What it measures |
| -------- | -------------- | --------------------------------------------------- |
| Match | 0.45 | How strongly watchlist keywords appear in title/abs |
| Criteria | 0.45 | Qualitative fit against relevance-criteria.md |
| Novelty | 0.10 | Not already in last-4-weeks digests |