skills/bib-verify/SKILL.md
Verify a BibTeX file for hallucinated or fabricated references by cross-checking every entry against CrossRef, arXiv, and DBLP. Reports each reference as verified, suspect, or not found, with field-level mismatch details (title, authors, year, DOI). Use when the user wants to check a .bib file for fake citations, validate references in a paper, or audit bibliography entries for accuracy.
Install this skill globally with one command: `npx skillsauth add agentscope-ai/openjudge bib-verify`. Works with Claude Code, Cursor, and Windsurf.
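Check every entry in a `.bib` file against real academic databases using the OpenJudge `PaperReviewPipeline` in BibTeX-only mode. The input is a `.bib` file; each reference in the resulting report is marked `verified`, `suspect`, or `not_found`.

Install the dependencies with `pip install py-openjudge litellm`.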
| Info | Required? | Notes |
|------|-----------|-------|
| BibTeX file path | Yes | .bib file to verify |
| CrossRef email | No | Improves CrossRef API rate limits |

    # Verify a standalone .bib file
    python -m cookbooks.paper_review --bib_only references.bib

    # With a CrossRef email for better rate limits
    python -m cookbooks.paper_review --bib_only references.bib --email YOUR_EMAIL

    # Save the report to a custom path
    python -m cookbooks.paper_review --bib_only references.bib \
      --email YOUR_EMAIL --output bib_report.md

| Flag | Default | Description |
|------|---------|-------------|
| --bib_only | — | Path to .bib file (required for standalone verification) |
| --email | — | CrossRef mailto — improves rate limits, recommended |
| --output | auto | Output .md report path |
| --language | en | Report language: en or zh |
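
The `--email` flag maps onto CrossRef's `mailto` convention: the public CrossRef REST API accepts a contact address as a `mailto` query parameter and serves such requests from its "polite" pool. How the pipeline forwards the address internally is not shown in this document, so the following is only a minimal sketch of that public convention. `build_crossref_query` is a hypothetical helper name, and the title and address are placeholders.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def build_crossref_query(title, email=None, rows=3):
    """Build a CrossRef /works search URL for a reference title.

    Hypothetical helper for illustration; not part of OpenJudge.
    """
    params = {"query.bibliographic": title, "rows": rows}
    if email:
        # CrossRef's public API takes a contact address via `mailto`
        params["mailto"] = email
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


url = build_crossref_query("An Example Paper Title", email="user@example.org")
print(url)
```

Without an email the `mailto` parameter is simply omitted, which matches the flag being optional.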
Each reference entry is assigned one of three statuses:
| Status | Meaning |
|--------|---------|
| verified | Found in CrossRef / arXiv / DBLP with matching fields |
| suspect | Title or authors do not match any real paper — likely hallucinated or mis-cited |
| not_found | No match in any database — treat as fabricated |
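
One way to read the status table is as a small decision rule. The sketch below is an assumption about that rule, not the pipeline's actual logic: `classify` is a hypothetical function, and it assumes that "matching fields" means both title and authors agree with the database record.

```python
def classify(found, title_match=False, author_match=False):
    """Map lookup results to one of the three report statuses.

    Assumed rule for illustration only; the real pipeline may weigh
    fields differently.
    """
    if not found:
        # No database returned a candidate record
        return "not_found"
    if title_match and author_match:
        return "verified"
    # A record was found, but title or authors disagree
    return "suspect"


print(classify(found=False))
print(classify(found=True, title_match=True, author_match=True))
print(classify(found=True, title_match=True, author_match=False))
```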
Field-level details are shown for suspect entries:
- `title_match`: whether the title matches a real paper
- `author_match`: whether the author list matches
- `year_match`: whether the publication year is correct
- `doi_match`: whether the DOI resolves to the right paper
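
The four flags can be pictured as simple comparisons between a BibTeX entry and a database record. This is a minimal sketch under stated assumptions, not the OpenJudge implementation: `normalize`, `surnames`, and `compare_fields` are hypothetical helpers, matching is exact after normalization (a real checker would likely use fuzzy matching), and the entry and record are made-up data.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, drop BibTeX braces, accents and punctuation."""
    text = re.sub(r"[{}]", "", text)
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def surnames(authors):
    """Extract normalized surnames from a BibTeX 'A and B' author string."""
    names = set()
    for person in authors.split(" and "):
        person = person.strip()
        if not person:
            continue
        # BibTeX allows "Last, First" as well as "First Last"
        last = person.split(",")[0] if "," in person else person.split()[-1]
        names.add(normalize(last))
    return names


def compare_fields(entry, record):
    """Return the four field-level flags (hypothetical exact-match rule)."""
    return {
        "title_match": normalize(entry["title"]) == normalize(record["title"]),
        "author_match": surnames(entry["author"]) == surnames(record["author"]),
        # Note: a field missing on both sides compares equal here
        "year_match": str(entry.get("year")) == str(record.get("year")),
        "doi_match": entry.get("doi", "").lower() == record.get("doi", "").lower(),
    }


# Made-up entry and record, for illustration only
entry = {"title": "A Study of {G}raph Sampling",
         "author": "Doe, Jane and Müller, Max",
         "year": "2021", "doi": "10.1234/EXAMPLE"}
record = {"title": "A study of graph sampling.",
          "author": "Jane Doe and Max Muller",
          "year": 2020, "doi": "10.1234/example"}

flags = compare_fields(entry, record)
print(flags)
```

In this made-up case only the year disagrees, so a report would point at `year_match` as the mismatched field.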