harvard-library-catalog/SKILL.md
Search Harvard Library's 13M+ bibliographic records via LibraryCloud and retrieve MARC/MODS data via PRESTO. Use this skill whenever the user wants to look up books, manuscripts, finding aids, or other items in Harvard's library catalog, verify bibliographic information (title, author, ISBN, publication date), find digital collections, or retrieve detailed catalog records. Also triggers when a user extracts a book title from a document and wants to find its full bibliographic metadata.
npx skillsauth add kltng/humanities-skills harvard-library-catalog

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Search and retrieve bibliographic records from Harvard Library's catalog of 13M+ items.
The Item API uses field names as query parameters — not q=field:value Solr syntax.
CORRECT: https://api.lib.harvard.edu/v2/items.json?title=hamlet&name=shakespeare
WRONG: https://api.lib.harvard.edu/v2/items?q=title:hamlet
The q= parameter is for keyword search across all fields. Field-specific search uses dedicated parameters like title=, name=, subject=, identifier=, etc.
.json in the URL path

Responses are XML by default. To get JSON, append .json before the query string:
JSON: https://api.lib.harvard.edu/v2/items.json?title=hamlet
Dublin Core: https://api.lib.harvard.edu/v2/items.dc.json?title=hamlet
Default XML: https://api.lib.harvard.edu/v2/items?title=hamlet
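The two URL rules above (field names as query parameters, and the format suffix placed before the query string) can be sketched as a small helper. This is a minimal illustration, not part of the skill's script: `build_items_url` and `FORMAT_SUFFIX` are hypothetical names introduced here, and only the three response formats listed above are covered.

```python
from urllib.parse import urlencode

ITEMS_BASE = "https://api.lib.harvard.edu/v2/items"

# Suffix placed before the query string; "" keeps the default XML response.
FORMAT_SUFFIX = {"xml": "", "json": ".json", "dc_json": ".dc.json"}


def build_items_url(fmt="json", **fields):
    """Build an Item API URL from field-name parameters (hypothetical helper)."""
    if fmt not in FORMAT_SUFFIX:
        raise ValueError(f"unknown format: {fmt!r}")
    url = ITEMS_BASE + FORMAT_SUFFIX[fmt]
    query = urlencode(fields)  # each field becomes its own parameter
    return f"{url}?{query}" if query else url


print(build_items_url(title="hamlet", name="shakespeare"))
print(build_items_url(fmt="dc_json", title="hamlet"))
print(build_items_url(fmt="xml", q="hamlet"))  # q= is keyword search only
```

Field names containing a dot, such as dates.start, are not valid Python keyword names, so they would need to be passed as `**{"dates.start": 1900}`.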
PRESTO returns raw MARC, MODS, or Dublin Core for a single record by its HOLLIS number. It complements LibraryCloud when you need the original catalog record:
MARC: https://webservices.lib.harvard.edu/rest/marc/hollis/{HOLLIS_ID}
MODS: https://webservices.lib.harvard.edu/rest/mods/hollis/{HOLLIS_ID}
DC: https://webservices.lib.harvard.edu/rest/dc/hollis/{HOLLIS_ID}
PRESTO returns XML only and does not support JSON serialization. ISBN/barcode lookups may not work on all records.
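As a rough sketch of the PRESTO URL pattern above, the helper below assembles a record URL for one of the three listed formats. `presto_url` is a hypothetical name used for illustration only; the skill's own client exposes `get_presto_record` instead. The HOLLIS ID shown is the one used elsewhere in this document.

```python
PRESTO_BASE = "https://webservices.lib.harvard.edu/rest"
PRESTO_FORMATS = ("marc", "mods", "dc")  # formats listed above


def presto_url(hollis_id, fmt="marc"):
    """Return the PRESTO URL for one record (hypothetical helper)."""
    fmt = fmt.lower()
    if fmt not in PRESTO_FORMATS:
        raise ValueError(f"format must be one of {PRESTO_FORMATS}, got {fmt!r}")
    hollis_id = str(hollis_id).strip()
    if not hollis_id:
        raise ValueError("hollis_id must not be empty")
    return f"{PRESTO_BASE}/{fmt}/hollis/{hollis_id}"


print(presto_url("011557057", "mods"))
```

Because PRESTO returns XML only, the response body would then go to an XML parser such as `xml.etree.ElementTree` rather than a JSON decoder.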
LibraryCloud returns 403 without a User-Agent header. Always include one:
curl -H 'User-Agent: MyApp/1.0' 'https://api.lib.harvard.edu/v2/items.json?title=hamlet'
The Python script includes this automatically.
Exceeding the rate limit triggers a 5-minute lockout. The Python script handles this automatically.
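For callers not using the bundled script, the two requirements above (a User-Agent header on every request, and staying under the rate limit) might be handled along these lines. This is a sketch under stated assumptions: `make_request` and `Throttle` are hypothetical names, and `max_calls` and `period` are caller-supplied placeholders, not the service's documented limit.

```python
import time
import urllib.request
from collections import deque


def make_request(url, user_agent="MyApp/1.0"):
    """Return a Request carrying the User-Agent header LibraryCloud requires."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


class Throttle:
    """Sliding-window limiter: at most max_calls per `period` seconds.

    max_calls and period are assumptions chosen by the caller.
    """

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def _expire(self, now):
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self._clock()
        self._expire(now)
        if len(self._calls) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._expire(now)
        self._calls.append(now)


throttle = Throttle(max_calls=5, period=1.0)  # illustrative values only
throttle.wait()
req = make_request("https://api.lib.harvard.edu/v2/items.json?title=hamlet")
print(req.get_header("User-agent"))
```

In real use, `throttle.wait()` would be called immediately before each `urllib.request.urlopen(req)`.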
| Need | Method |
|------|--------|
| Search by title, author, subject, date | LibraryCloud Item API (field params) |
| Full-text keyword search | LibraryCloud Item API (q= param) |
| Look up by ISBN, LCCN, or other identifier | LibraryCloud identifier= or q= keyword |
| Browse digital collections | LibraryCloud collectionTitle= or Collections API |
| Get raw MARC record for a known HOLLIS ID | PRESTO /rest/marc/hollis/{id} |
| Faceted browsing (by language, date, genre) | LibraryCloud facets= parameter |
This is the primary use case — an LLM extracts a book title from a document and needs complete bibliographic data:
from scripts.harvard_api import HarvardLibraryAPI
api = HarvardLibraryAPI()
# 1. Search by title (and optionally author)
results = api.search(title="The Great Gatsby", name="Fitzgerald")
# 2. Get the first match's summary
if results:
    summary = api.summarize(results[0])
    # → title, author, publisher, date, ISBN, subjects, language, physical description
    # 3. For deeper data, get the original catalog record via PRESTO
    #    (MODS here; pass format="marc" for MARC)
    hollis_id = api.get_record_id(results[0])
    if hollis_id:
        mods_xml = api.get_presto_record(hollis_id, format="mods")
| Field | What it searches | Exact match? |
|-------|-----------------|-------------|
| q | All fields (keyword) | No |
| title | Title, subtitle, part name/number | Yes (title_exact) |
| name | All name fields (author, editor, etc.) | No |
| subject | All subject fields (topic, geographic, temporal) | Yes (subject_exact) |
| identifier | ISBN, LCCN, other system IDs | Yes |
| languageCode | ISO language code (e.g., chi, eng) | Yes |
| dateIssued | Publication date (YYYY) | Yes |
| dates.start / dates.end | Date range filter | — |
| genre | Genre/form (e.g., "Drawings", "Maps") | Yes (genre_exact) |
| repository | Harvard library name | Yes |
| isOnline | Has digital version (true/false) | — |
| recordIdentifier | HOLLIS/Alma record ID | Yes |
Combine fields freely: ?title=hamlet&name=shakespeare&languageCode=ger&dates.start=1900
- limit=N (default 10, max 250)
- start=N for offset-based pagination (up to ~30K results)
- cursor=* then cursor={nextCursor} for large result sets (up to 100K)

Add facets=field1,field2 to get value counts. Useful fields: name, subject, languageCode, genre, resourceType, repository, dateIssued.
?title=china&facets=languageCode,genre
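The paging and facet parameters above can be sketched as follows. This is an illustration only: the helper names are hypothetical, `start` is assumed to be 0-based, and `fetch_page` stands in for whatever function performs the request and extracts the next cursor, since the location of that value in the response is not described here.

```python
from urllib.parse import urlencode

MAX_LIMIT = 250  # per-request maximum stated above


def offset_pages(total, limit=MAX_LIMIT):
    """Yield (start, limit) pairs covering `total` records (start assumed 0-based)."""
    limit = max(1, min(limit, MAX_LIMIT))
    for start in range(0, total, limit):
        yield start, min(limit, total - start)


def page_query(fields, start=0, limit=10, facets=None):
    """Return a query string with paging and optional facet parameters."""
    params = dict(fields, start=start, limit=limit)
    if facets:
        params["facets"] = ",".join(facets)
    # safe="," keeps the facet list readable as facets=a,b
    return urlencode(params, safe=",")


def cursor_pages(fetch_page, max_results=100_000):
    """Walk a result set by cursor, starting from cursor=*.

    fetch_page(cursor) is a hypothetical callable that returns
    (records, next_cursor); next_cursor is falsy on the last page.
    """
    cursor, seen = "*", 0
    while cursor and seen < max_results:
        records, cursor = fetch_page(cursor)
        if not records:
            break
        seen += len(records)
        yield records


for start, limit in offset_pages(600):
    print(page_query({"title": "china"}, start=start, limit=limit))
print(page_query({"title": "china"}, facets=["languageCode", "genre"]))
```

Offset paging is the simpler choice for small result sets; for sets beyond the offset ceiling noted above, the cursor loop would be the one to use.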
Use scripts/harvard_api.py for programmatic access (zero dependencies):
from scripts.harvard_api import HarvardLibraryAPI
api = HarvardLibraryAPI()
# Keyword search
results = api.search(q="Chinese porcelain Ming dynasty")
# Field search
results = api.search(title="dream of the red chamber", languageCode="chi")
# With facets
results, facets = api.search_with_facets(subject="astronomy", facets=["genre", "dateIssued"])
# Pagination
all_results = api.search_all(title="peanuts", name="schulz", max_results=500)
# PRESTO lookup
marc_xml = api.get_presto_record("011557057", format="marc")
# Summarize a record
for r in results[:5]:
    print(api.summarize(r))
| Endpoint | URL |
|----------|-----|
| LibraryCloud Items | https://api.lib.harvard.edu/v2/items |
| LibraryCloud Collections | https://api.lib.harvard.edu/v2/collections |
| PRESTO (MARC/MODS/DC) | https://webservices.lib.harvard.edu/rest/{format}/hollis/{id} |
- references/api_reference.md: complete field reference with all searchable fields, facets, and query examples
- scripts/harvard_api.py: full-featured Python client with rate limiting, pagination, and record summarization