ondata/openalex

Name: openalex
Author: ondata

skills/openalex/SKILL.md

npx skillsauth add ondata/skills openalex


OpenAlex

Use this skill to run reliable OpenAlex API workflows from the shell.

IMPORTANT: Always write curl commands on a single line. Multi-line \ continuation breaks argument parsing in agent environments and will cause errors.

SECURITY: Never expose the actual value of OPENALEX_API_KEY anywhere — not in text responses, not in echoed commands, not in logs. Always reference it as $OPENALEX_API_KEY. If the key appears in any output, stop immediately and do not repeat it.

Definition of Done

A task is complete when:

Results

The API returns at least one result (or a clear "no results found" message)
Each result shows: title (display_name), year, citation count
Output is readable — not a raw JSON blob

Process

curl written on a single line
api_key included in every request
select= used to limit returned fields
jq used to format output

PDF download (when requested)

If PDF is available: file saved locally, path printed
If PDF is not available: clear message, exit code 2, no crash

Quick Start

Export API key:

export OPENALEX_API_KEY='...'

To verify it is set without printing the value:

[[ -n "${OPENALEX_API_KEY:-}" ]] && echo "key is set" || echo "ERROR: OPENALEX_API_KEY not set"

Run list query (works):

curl -sS --get 'https://api.openalex.org/works' --data-urlencode 'search="data quality" AND "open government data"' --data-urlencode 'filter=type:article,from_publication_date:2023-01-01' --data-urlencode 'sort=relevance_score:desc' --data-urlencode 'per-page=200' --data-urlencode 'select=display_name,publication_year,cited_by_count,doi' --data-urlencode "api_key=$OPENALEX_API_KEY" | jq '.results[] | {title:.display_name, year:.publication_year, cited_by:.cited_by_count, doi}'

Workflow

Define entity endpoint (works, authors, sources, etc.).
Build a search block with boolean logic (AND, OR, NOT, quotes, parentheses).
Add structured filter constraints (type/date/language/OA/citation fields).
Restrict output with select (root-level fields only).
Page results with page or cursor=*.
Extract fields via jq and save/transform as needed.

Iterative Validation Workflow

Use this when building or debugging non-trivial queries.

Start with a toy query (per-page=5 or per-page=10) and minimal select=.
Manually inspect 5-10 records for relevance and field quality (display_name, year, DOI).
Compare a baseline and a variant before scaling:
- baseline: filter=title.search:"..."
- variant: search=... with same filters
Tune one parameter at a time (search, filter, sort, per-page, pagination mode).
Scale only after validation (per-page=200, then cursor=* for deep pagination).
Log each run: command, key parameters, result count, and quick notes.

Avoid jumping directly from a paper/spec to a full extraction script without this short validation loop.
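The logging step of this loop can be as light as one tab-separated line per run. The following is a minimal sketch, assuming a hypothetical helper named log_run and a log file of your choosing; neither ships with the skill. It records query parameters and counts only, never the API key.

```shell
#!/usr/bin/env bash
# log_run LOGFILE PARAMS COUNT NOTE  (hypothetical helper, not part of the skill's scripts)
# Appends one tab-separated line: UTC timestamp, key parameters, result count, note.
# Pass only query parameters in PARAMS; never pass the api_key value.
log_run() {
  local logfile="$1" params="$2" count="$3" note="$4"
  printf '%s\t%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$params" "$count" "$note" >> "$logfile"
}
```

A baseline and a variant would then be logged as two calls, for example log_run runs.tsv 'filter=title.search:"open data"' 37 'baseline', which keeps the comparison reviewable before scaling up.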

Query Blocks

title.search=: searches only in the title — use this by default for focused results. Must be passed inside filter=, not as a standalone parameter: filter=title.search:"your query".
search=: full-text search across the entire document — use only when title-only matching is too restrictive.
search.semantic=: semantic/conceptual search (costs $0.001/request; requires API key).
filter=: exact/structured constraints; comma means AND.
sort=: relevance_score:desc, cited_by_count:desc, publication_date:desc, etc.
per-page=: 1..200. Default is 25 — always set per-page=200 for bulk queries (8× fewer API calls).
page=: page number for standard pagination.
cursor=*: deep pagination beyond the first 10,000 records.
select=: reduce payload; nested paths are not allowed in select.
group_by=: aggregate results by a field (e.g. group_by=publication_year, group_by=topics.id).
sample=: random sample of N results (e.g. sample=20). Add seed=42 for reproducibility.
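Cursor pagination is the one block above that needs a loop, so a sketch may help. It assumes that each response carries the next cursor in .meta.next_cursor and that the value is null on the last page; this is taken from the OpenAlex paging documentation and is not stated in this skill. The function names next_cursor and fetch_all_pages are hypothetical.

```shell
#!/usr/bin/env bash
# Print the cursor for the next page, or nothing when there is none.
# Assumption: the response exposes it as .meta.next_cursor.
next_cursor() {
  jq -r '.meta.next_cursor // empty'
}

# fetch_all_pages FILTER: print every result as one compact JSON object per line.
# The curl command stays on a single line, as required above.
fetch_all_pages() {
  local filter="$1" cursor='*' page
  while [ -n "$cursor" ]; do
    page=$(curl -sS --get 'https://api.openalex.org/works' --data-urlencode "filter=$filter" --data-urlencode 'per-page=200' --data-urlencode 'select=display_name,publication_year,cited_by_count,doi' --data-urlencode "cursor=$cursor" --data-urlencode "api_key=$OPENALEX_API_KEY") || return 1
    printf '%s\n' "$page" | jq -c '.results[]?'
    cursor=$(printf '%s\n' "$page" | next_cursor)
  done
}
```

The loop ends as soon as no cursor comes back, which also covers error responses that have no meta object.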

Filter Syntax

Filters are comma-separated AND conditions. The value of a single attribute can carry these operators:

| Logic | Syntax | Example or note |
|-------|--------|-----------------|
| AND (comma) | filter=a:x,b:y | filter=type:article,is_oa:true |
| OR (pipe) | filter=type:article\|book | multiple values for the same field |
| NOT (exclamation) | filter=type:!journal-article | negation |
| Greater than | filter=cited_by_count:>100 | comparison |
| Less than | filter=publication_year:<2020 | comparison |
| Range | filter=publication_year:2020-2023 | inclusive range |
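Because the comma is the AND separator, several conditions can be kept as separate arguments and joined only when the request is built. A small sketch, with join_filters as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# join_filters COND...: join filter conditions with commas (comma means AND).
# The subshell keeps the IFS change local to this function.
join_filters() {
  ( IFS=,; printf '%s\n' "$*" )
}
```

The result is meant to be passed as one value, for example --data-urlencode "filter=$(join_filters 'type:article|book' 'cited_by_count:>100' 'publication_year:2020-2023')", so that curl handles the URL encoding of the pipe and comparison characters.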

Batch Lookup

Combine up to 50 IDs in one request using the pipe operator — avoid sequential calls:

# Batch DOI lookup (up to 50 per request)
curl -sS --get 'https://api.openalex.org/works' --data-urlencode 'filter=doi:https://doi.org/10.1/abc|https://doi.org/10.2/def' --data-urlencode 'per-page=50' --data-urlencode "api_key=$OPENALEX_API_KEY" | jq '.results[] | {title:.display_name, doi}'
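For lists longer than 50 IDs, the values have to be split into batches first. The sketch below assumes one ID per line on standard input; batch_filters is a hypothetical helper name, and the commented loop at the end shows how its output might feed the request above.

```shell
#!/usr/bin/env bash
# batch_filters FIELD [SIZE]: read one ID per line on stdin and print one
# pipe-joined filter value per batch of SIZE IDs (default 50). Blank lines are skipped.
batch_filters() {
  awk -v field="$1" -v size="${2:-50}" '
    NF {
      if (n % size == 0) { batch = $0 } else { batch = batch "|" $0 }
      n++
      if (n % size == 0) { print field ":" batch }
    }
    END { if (n % size != 0) { print field ":" batch } }
  '
}

# Possible use (dois.txt is an assumed input file with one DOI URL per line):
# batch_filters doi 50 < dois.txt | while IFS= read -r f; do curl -sS --get 'https://api.openalex.org/works' --data-urlencode "filter=$f" --data-urlencode 'per-page=50' --data-urlencode 'select=display_name,doi' --data-urlencode "api_key=$OPENALEX_API_KEY" | jq '.results[] | {title:.display_name, doi}'; done
```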

Two-Step Entity Lookup

Names are ambiguous; always resolve to an OpenAlex ID first, then filter.

Step 1 — find the entity ID:

curl -sS --get 'https://api.openalex.org/authors' --data-urlencode 'search=Heather Piwowar' --data-urlencode 'per-page=5' --data-urlencode "api_key=$OPENALEX_API_KEY" | jq '.results[] | {id, display_name}'

Step 2 — use the ID in a filter:

curl -sS --get 'https://api.openalex.org/works' --data-urlencode 'filter=authorships.author.id:A5023888391' --data-urlencode 'per-page=200' --data-urlencode 'select=id,display_name,publication_year,cited_by_count' --data-urlencode "api_key=$OPENALEX_API_KEY" | jq '.results[] | {title:.display_name, year:.publication_year}'

Applies to: authors (authorships.author.id), institutions (authorships.institutions.id), sources/journals (primary_location.source.id). External IDs are also accepted: ORCID, ROR, ISSN, DOI.
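Step 1 returns id as a full OpenAlex URL, while the Step 2 example uses the short form. A sketch of the glue between the two steps, with short_id and author_filter as hypothetical helper names:

```shell
#!/usr/bin/env bash
# short_id URL_OR_ID: strip everything up to the last slash,
# e.g. https://openalex.org/A5023888391 becomes A5023888391.
short_id() {
  printf '%s\n' "${1##*/}"
}

# author_filter URL_OR_ID: build the filter value used in Step 2.
author_filter() {
  printf 'authorships.author.id:%s\n' "$(short_id "$1")"
}
```

After inspecting the Step 1 candidates and picking the right one, the filter can be passed as --data-urlencode "filter=$(author_filter "$author_id")". Picking the first hit blindly would defeat the purpose of the two-step lookup.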

PDF Retrieval

For a work ID:

Fetch work metadata.
Resolve PDF URL in this order:
- .content_urls.pdf
- .best_oa_location.pdf_url
- .primary_location.pdf_url
- first non-null .locations[].pdf_url
Download with the api_key query parameter when the source is content.openalex.org.
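The resolution order can be expressed as a single jq alternative chain. This is a sketch under stated assumptions, not the code of scripts/openalex_download_pdf.sh: the function names are hypothetical, the work metadata is assumed to be already saved to a file, and exit code 2 follows the Definition of Done.

```shell
#!/usr/bin/env bash
# resolve_pdf_url: read work metadata (JSON) on stdin and print the first
# available PDF URL in the order listed above, or nothing if there is none.
resolve_pdf_url() {
  jq -r '.content_urls.pdf // .best_oa_location.pdf_url // .primary_location.pdf_url // first((.locations // [])[] | .pdf_url | select(. != null)) // empty'
}

# download_pdf WORK_JSON_FILE OUT_FILE: return 2 when no PDF URL is found.
download_pdf() {
  local url
  url=$(resolve_pdf_url < "$1")
  if [ -z "$url" ]; then
    echo "no PDF available for this work" >&2
    return 2
  fi
  case "$url" in
    # Only content.openalex.org needs the key; it is never echoed.
    https://content.openalex.org/*) curl -sS -L -f -o "$2" --get "$url" --data-urlencode "api_key=$OPENALEX_API_KEY" || return 1 ;;
    *) curl -sS -L -f -o "$2" "$url" || return 1 ;;
  esac
  echo "saved: $2"
}
```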

Output Format

When displaying results, always show display_name as the title — never use doi or id in its place.

Minimal jq for a results table:

| jq -r '.results[] | [.display_name, .publication_year, .cited_by_count, .doi] | @tsv'

Or as structured objects:

| jq '.results[] | {title: .display_name, year: .publication_year, cited_by: .cited_by_count, doi}'

CSV Export

To save results as a CSV file, use jq with @csv and include a header row:

curl -sS --get 'https://api.openalex.org/works' ... --data-urlencode "api_key=$OPENALEX_API_KEY" | jq -r '["title","year","cited_by","doi"], (.results[] | [.display_name, .publication_year, .cited_by_count, (.doi // "")]) | @csv' > results.csv

Rules:

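Use // "" for fields that may be null (e.g. doi) so the column always holds an explicit empty string. @csv renders a bare null as an empty field and raises an error only for arrays and objects.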
The header array and data array must have the same number of columns.
Use -r (raw output) so @csv produces plain text, not JSON strings.

Error Handling

Implement exponential backoff on 403 (rate limit) and 500 (server error):

attempt 1 → wait 1s → attempt 2 → wait 2s → attempt 3 → wait 4s → attempt 4 → wait 8s

HTTP codes:

200 — success
400 — invalid parameter or filter syntax; fix the query
403 — rate limit exceeded; back off and retry
404 — entity not found
500 — temporary server error; retry with backoff
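The backoff schedule and the retry rules above can be combined in a small wrapper. This is a sketch with hypothetical function names; the default of five attempts is an assumption chosen so that the 1s, 2s, 4s, 8s waits all occur, and only 403 and 500 are retried, as listed above.

```shell
#!/usr/bin/env bash
# backoff_delay N: seconds to wait after attempt N (1, 2, 4, 8, ...).
backoff_delay() {
  echo $(( 1 << ($1 - 1) ))
}

# is_retryable STATUS: succeed only for the codes the list above says to retry.
is_retryable() {
  case "$1" in
    403|500) return 0 ;;
    *) return 1 ;;
  esac
}

# fetch_with_backoff URL OUT_FILE [MAX_ATTEMPTS]: save the body on HTTP 200.
fetch_with_backoff() {
  local url="$1" out="$2" max="${3:-5}" attempt=1 status
  while [ "$attempt" -le "$max" ]; do
    status=$(curl -sS -o "$out" -w '%{http_code}' --get "$url" --data-urlencode "api_key=$OPENALEX_API_KEY")
    [ "$status" = "200" ] && return 0
    is_retryable "$status" || { echo "HTTP $status: not retrying" >&2; return 1; }
    # Sleep only when another attempt will follow.
    [ "$attempt" -lt "$max" ] && sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
  echo "giving up after $max attempts" >&2
  return 1
}
```

A 400 or 404 fails immediately, since repeating an invalid query or a lookup for a missing entity cannot succeed.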

Endpoint Costs

With the free $1/day budget:

| Request type | Cost | Daily limit |
|---|---|---|
| Singleton (/works/W123) | free | unlimited |
| List / filter | $0.0001 | ~10,000 requests |
| Search (full-text or semantic) | $0.001 | ~1,000 requests |
| PDF download (content.openalex.org) | $0.01 | ~100 downloads |

Use select= and per-page=200 to minimize request count.
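To see what per-page=200 saves, the request count and cost of a list/filter extraction can be estimated before running it. A sketch using the $0.0001 list price from the table; estimate_list_cost is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# estimate_list_cost RECORDS [PER_PAGE]: print the number of list/filter
# requests needed and their total cost at $0.0001 per request.
estimate_list_cost() {
  awk -v n="$1" -v per_page="${2:-200}" 'BEGIN {
    requests = int((n + per_page - 1) / per_page)   # ceiling division
    printf "%d requests, $%.4f\n", requests, requests * 0.0001
  }'
}

estimate_list_cost 10000      # prints: 50 requests, $0.0050
estimate_list_cost 10000 25   # prints: 400 requests, $0.0400
```

The two calls show the 8× difference between the default page size and per-page=200 for the same 10,000 records.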

Common Pitfalls

Do not use .id or .doi as the title field in jq output — .id is an OpenAlex URL, .doi is a DOI URL; always use .display_name for human-readable titles.
Do not include id in select= unless you need the OpenAlex URL for follow-up lookups — it is a URL, not a title, and confuses output.
Do not sort by relevance_score without a search query.
Do not use nested fields in select (example: use open_access, then parse .open_access.is_oa with jq).
Do not filter by entity names directly — use the two-step entity lookup to get the ID first.
Do not use sequential calls for batch ID lookups — batch up to 50 with the pipe operator.
Do not use per-page=25 (default) for bulk extraction — always set per-page=200.
Expect some records to have no downloadable PDF.
search= searches full text and can return loosely related results. Use filter=title.search:"your query" when the topic must appear in the title.
Always write curl commands on a single line — multi-line \ continuation breaks argument parsing in agent environments.
title.search is NOT a valid standalone parameter — always pass it inside filter=: filter=title.search:"your query".
Always include api_key=$OPENALEX_API_KEY in every request.
Never expose the actual key value — not in text output, not in echoed commands, not in logs, and not in any other form. Always use the variable reference $OPENALEX_API_KEY.
- To verify it is set: [[ -n "${OPENALEX_API_KEY:-}" ]] && echo "key is set" || echo "ERROR: OPENALEX_API_KEY not set".

Resources

Query recipes and jq snippets: references/query-recipes.md
Generic query helper: scripts/openalex_query.sh
PDF downloader for work IDs: scripts/openalex_download_pdf.sh

Query OpenAlex API from the command line with curl and jq for publication discovery, filtering, sorting, pagination, and PDF availability checks. Use when searching scholarly works/authors/sources, building or debugging OpenAlex queries, extracting results, or downloading available PDFs using OPENALEX_API_KEY.

4 stars

development

Updated Apr 27, 2026

npx skillsauth add ondata/skills openalex

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy · Container and dependency vulnerability scanner · Clean · 95%
Semgrep · Static code analysis for vulnerabilities · Clean · 95%
mcp-scan (Snyk) · Model Context Protocol security validation · Clean · 95%
Snyk (dep) · Open source security scanning · Skipped · 50%
Socket.dev · Supply chain security analysis · Skipped · 50%
VirusTotal · Multi-engine malware detection · Skipped · 50%
CrowdStrike · Advanced threat intelligence · Skipped · 50%
OSV-Scanner · Open Source Vulnerability database check · Skipped · 50%
OWASP Dep-Check · Skipped · 50%

Last scanned: Apr 27, 2026, 9:24 AM · 65.0s · 4 files scanned

SKILL.md

name: openalex
description: Query OpenAlex API from the command line with curl and jq for publication discovery, filtering, sorting, pagination, and PDF availability checks. Use when searching scholarly works/authors/sources, building or debugging OpenAlex queries, extracting results, or downloading available PDFs using OPENALEX_API_KEY.
compatibility: Requires curl, jq, bash, OPENALEX_API_KEY environment variable, and internet access.
license: CC BY-SA 4.0 (Creative Commons Attribution-ShareAlike 4.0 International)
version: 0.2
author: Andrea Borruso
tags: [api, research, scholarly, bibliometrics, open-access, curl, jq, pdf]

Related Skills

ondata/difensore-civico-ti-scrivo

development

Verified · Trusted · Community

Guides users step by step in drafting a formal complaint (segnalazione) to Italy's Digital Civic Defender (Difensore Civico per il Digitale, DCD) at AGID for violations of the CAD (Codice dell'Amministrazione Digitale) or other digitalization norms by public administrations. Use this skill whenever someone wants to: report an Italian PA to AGID; write to the Difensore Civico per il Digitale; complain about open data violations, non-machine-readable public data, inaccessible PA portals, missing or restrictive licenses on public data, captchas blocking automated access, unanswered data reuse requests (D.Lgs. 36/2006 art. 5), failure to publish mandatory High Value Datasets (HVD, Reg. (UE) 2023/138), or a prior DCD complaint that got no response. Trigger even if the user does not name the skill — any Italian digital-rights complaint targeting a PA is a candidate.

5 · SKILL.md · Updated May 30, 2026

ondata/datawrapper

development

Verified · Trusted · Community

Create charts, choropleth maps, and locator maps via the Datawrapper API. Use this skill whenever the user wants to publish a visualization on Datawrapper, create an interactive chart or map from data, generate a PNG/embed from Datawrapper, or use the Datawrapper REST API. Triggers on: "create a map with datawrapper", "publish a chart on datawrapper", "choropleth map", "locator map datawrapper", "export PNG from datawrapper", and any request involving creating or configuring Datawrapper charts/maps programmatically. Also triggers for Italian variants: "mappa coropletica datawrapper", "crea grafico datawrapper", "mappa datawrapper".

4 · SKILL.md · Updated May 5, 2026

ondata/typst-cards

development

Verified · Trusted · Community

Generate PNG images for online communication — social media, carousels, infographics, posts — using Typst. Use this skill whenever the user wants to create slides, cards, visual posts or any digital graphic content, even if they don't explicitly mention Typst. The skill drives an interview about brand materials (logo, palette, fonts, DESIGN.md), proposes the formats best suited to the context (Instagram 1:1, Stories 9:16, LinkedIn 16:9, etc.) and produces ready-to-use PNGs.

4 · SKILL.md · Updated Apr 27, 2026

ondata/open-data-quality

testing

Verified · Trusted · Community

Comprehensive open data quality validator for two audiences: data analysts who need to assess whether a dataset is ready to use, and public administrations who want to self-evaluate their published data. Automatically adapts based on input type: (A) local CSV file only — performs file-level structural and content checks; (B) CKAN/open data portal dataset — adds metadata completeness, resource accessibility, URL reachability, and DCAT-AP compliance (supports all national profiles: DCAT-AP 2.x baseline, IT, BE, NL, DE, FR, UK, ES, and others). Always use this skill when the user mentions: data quality, validate dataset, check CSV, open data compliance, metadata audit, CKAN dataset review, "is this data usable?", or whenever a CSV file or CKAN dataset ID/URL is provided for quality assessment. Produces severity-ranked reports (blocker / major / minor) with concrete fixes, quality score, and a plain-language summary for non-technical stakeholders.

4 · SKILL.md · Updated Apr 27, 2026

Install manually

# Clone the repo
git clone https://github.com/ondata/skills.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy the skill into it
cp -r skills/skills/openalex ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

ondata/skills

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT