skills/openrouter-generations/SKILL.md
Retrieve detailed metadata and stored content for individual OpenRouter generations. Use when the user wants to inspect a specific request — its cost, latency, token usage, provider routing, or the actual prompt/completion text — or is debugging a failed or unexpected generation.
npx skillsauth add openrouterteam/skills openrouter-generations

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
- API key: pass `--api-key <key>` or set the `OPENROUTER_API_KEY` environment variable.
- Generation ID: for example `gen-1234567890` or `gen-aBcDeFgHiJkLmNoPqRsT`.
- Dependencies: `cd <skill-path>/scripts && npm install`
| Endpoint | Method | Purpose |
|----------|--------|---------|
| /api/v1/generation | GET | Request metadata and usage (tokens, cost, latency, model, provider) |
| /api/v1/generation/content | GET | Stored prompt and completion text |
Both take a single query parameter: id (the generation ID).
Full API reference: openrouter.ai/docs/api/api-reference/generations/get-generation
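As a rough illustration of how both endpoints are addressed, the sketch below builds the request URL from a generation ID and fetches it with the built-in `fetch` of Node 18+. The helper names `generationUrl` and `fetchGeneration` are hypothetical and are not taken from the skill's scripts; only the two paths, the `id` query parameter, and the bearer-token header come from this document.

```typescript
// Hypothetical helpers; the skill's own scripts may be structured differently.
const BASE_URL = "https://openrouter.ai";

type GenerationEndpoint = "metadata" | "content";

// Build the URL for either endpoint; both take a single `id` query parameter.
function generationUrl(endpoint: GenerationEndpoint, id: string): string {
  const path =
    endpoint === "metadata"
      ? "/api/v1/generation"
      : "/api/v1/generation/content";
  const url = new URL(path, BASE_URL);
  url.searchParams.set("id", id);
  return url.toString();
}

// Fetch one endpoint and return the parsed JSON body.
async function fetchGeneration(
  endpoint: GenerationEndpoint,
  id: string,
  apiKey: string,
): Promise<unknown> {
  const response = await fetch(generationUrl(endpoint, id), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping URL construction separate from the network call makes it possible to check the request shape without spending an API call.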
Retrieves everything about a generation except the actual prompt/completion text:
cd <skill-path>/scripts && npx tsx get-generation.ts gen-1234567890
npx tsx get-generation.ts --id gen-1234567890 --json
What you get back:
- `model`, `provider_name`, `router`, `service_tier`
- `tokens_prompt`, `tokens_completion`, `native_tokens_reasoning`, `native_tokens_cached`
- `total_cost`, `usage`, `upstream_inference_cost`, `cache_discount`
- `latency`, `generation_time`, `moderation_latency`
- `finish_reason`, `streamed`, `cancelled`, `is_byok`
- `created_at`, `app_id`, `external_user`, `session_id`, `request_id`
- `provider_responses` array showing fallback attempts with per-provider latency and status

Retrieves the stored prompt and completion:
cd <skill-path>/scripts && npx tsx get-generation-content.ts gen-1234567890
npx tsx get-generation-content.ts --id gen-1234567890 --json
What you get back:
- `prompt` (raw text) and/or `messages` (array of `{role, content}`)
- `completion` (the model's response) and `reasoning` (chain-of-thought, if applicable)

Note: Content is only available if the generation was not made with Zero Data Retention (ZDR) enabled. If ZDR was on, this endpoint returns empty/null content.
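Because a ZDR generation comes back with empty or null content, it can help to test for that before printing anything. The following is a minimal sketch, assuming the response follows the `data.input` / `data.output` layout documented here; the exact shape of an empty ZDR response is not specified in this document, so the check is deliberately defensive. `hasStoredContent` is a hypothetical name.

```typescript
// Types mirror the content fields listed in this document; every level is
// treated as optional because the empty (ZDR) shape is an assumption.
interface ChatMessage {
  role: string;
  content: unknown;
}

interface ContentResponse {
  data?: {
    input?: { prompt?: string | null; messages?: ChatMessage[] | null } | null;
    output?: { completion?: string | null; reasoning?: string | null } | null;
  } | null;
}

// Returns true when at least one prompt/completion field is non-empty.
function hasStoredContent(body: ContentResponse): boolean {
  const input = body.data?.input;
  const output = body.data?.output;
  const hasInput =
    Boolean(input?.prompt) || (input?.messages?.length ?? 0) > 0;
  const hasOutput = Boolean(output?.completion) || Boolean(output?.reasoning);
  return hasInput || hasOutput;
}
```

A `false` result is consistent with ZDR having been enabled, but it does not prove it; the metadata endpoint is still worth checking for the same ID.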
curl -G https://openrouter.ai/api/v1/generation \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-d id=gen-1234567890
curl -G https://openrouter.ai/api/v1/generation/content \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-d id=gen-1234567890
Example response (/api/v1/generation):
{
  "data": {
    "id": "gen-3bhGkxlo4XFrqiabUM7NDtwDzWwG",
    "api_type": "completions",
    "model": "openai/gpt-4o",
    "provider_name": "OpenAI",
    "created_at": "2024-07-15T23:33:19.433273+00:00",
    "tokens_prompt": 10,
    "tokens_completion": 25,
    "native_tokens_reasoning": 5,
    "native_tokens_cached": 3,
    "total_cost": 0.0015,
    "usage": 0.0015,
    "upstream_inference_cost": 0.0012,
    "latency": 1250,
    "generation_time": 1200,
    "finish_reason": "stop",
    "streamed": true,
    "is_byok": false,
    "cancelled": false,
    "router": "openrouter/auto",
    "service_tier": "priority",
    "provider_responses": [
      {
        "provider_name": "OpenAI",
        "model_permaslug": "openai/gpt-4o",
        "status": 200,
        "latency": 1200,
        "is_byok": false
      }
    ]
  }
}
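To show how the token and timing fields in a metadata response can be combined, here is a small sketch that derives a token total and a completion throughput figure. The function names are hypothetical, and the throughput formula (completion tokens divided by `generation_time`) is an illustrative assumption rather than a metric OpenRouter defines; both inputs are nullable, so the helper returns `null` when either is missing.

```typescript
// Subset of the metadata fields used below; generation_time is in milliseconds.
interface GenerationUsage {
  tokens_prompt: number | null;
  tokens_completion: number | null;
  generation_time: number | null;
}

// Sum prompt and completion tokens, treating null counts as zero.
function totalTokens(g: GenerationUsage): number {
  return (g.tokens_prompt ?? 0) + (g.tokens_completion ?? 0);
}

// Completion tokens per second of model generation time, or null if unknown.
function completionTokensPerSecond(g: GenerationUsage): number | null {
  if (
    g.tokens_completion === null ||
    g.generation_time === null ||
    g.generation_time <= 0
  ) {
    return null;
  }
  return g.tokens_completion / (g.generation_time / 1000);
}

// Values copied from the example metadata response in this document.
const sample: GenerationUsage = {
  tokens_prompt: 10,
  tokens_completion: 25,
  generation_time: 1200,
};
console.log(totalTokens(sample), completionTokensPerSecond(sample));
```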
Example response (/api/v1/generation/content):
{
  "data": {
    "input": {
      "prompt": "What is the meaning of life?",
      "messages": [
        {
          "content": "What is the meaning of life?",
          "role": "user"
        }
      ]
    },
    "output": {
      "completion": "The meaning of life is a philosophical question...",
      "reasoning": null
    }
  }
}
Check what happened by looking at finish_reason, provider_responses, and cancelled:
cd <skill-path>/scripts && npx tsx get-generation.ts gen-abc123 --json
Look for:
- `finish_reason = "length"` means the model hit max tokens
- `finish_reason = "content_filter"` means content was filtered
- `cancelled = true` means the request was cancelled by the client
- `provider_responses` with multiple entries means fallbacks occurred

cd <skill-path>/scripts && npx tsx get-generation.ts gen-abc123
Check total_cost (what you were charged) vs upstream_inference_cost (what the provider charged OpenRouter).
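The comparison can be sketched as a simple subtraction. This is only an illustration: `costBreakdown` is a hypothetical helper, and it assumes both amounts are in USD as the field reference states. `upstream_inference_cost` is nullable, so the difference is reported as `null` when it is missing.

```typescript
// Subset of the metadata fields used for the cost comparison.
interface GenerationCost {
  total_cost: number;
  upstream_inference_cost: number | null;
}

interface CostBreakdown {
  charged: number; // what you were charged
  upstream: number | null; // what the provider charged OpenRouter
  difference: number | null; // charged minus upstream, when known
}

function costBreakdown(g: GenerationCost): CostBreakdown {
  const upstream = g.upstream_inference_cost;
  return {
    charged: g.total_cost,
    upstream,
    difference: upstream === null ? null : g.total_cost - upstream,
  };
}

// Values copied from the example metadata response in this document.
const breakdown = costBreakdown({
  total_cost: 0.0015,
  upstream_inference_cost: 0.0012,
});
console.log(breakdown);
```

Floating-point subtraction on amounts this small can leave rounding noise, so round before displaying the difference.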
cd <skill-path>/scripts && npx tsx get-generation-content.ts gen-abc123
Useful for debugging unexpected outputs — verify the actual prompt sent and completion received.
If you have a request_id or session_id from one generation, you can find related generations via the analytics query endpoint (see openrouter-analytics skill).
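The debugging checks described earlier (`finish_reason`, `cancelled`, and the length of `provider_responses`) can also be applied programmatically to the `--json` output. The sketch below is one possible way to do that; `diagnose` is a hypothetical helper, and it covers only the four conditions this document lists.

```typescript
// Subset of the metadata fields used for diagnosis.
interface GenerationStatus {
  finish_reason: string | null;
  cancelled: boolean | null;
  provider_responses: { provider_name: string; status: number }[] | null;
}

// Return a human-readable finding for each condition that applies.
function diagnose(g: GenerationStatus): string[] {
  const findings: string[] = [];
  if (g.finish_reason === "length") {
    findings.push("model hit max tokens");
  }
  if (g.finish_reason === "content_filter") {
    findings.push("content was filtered");
  }
  if (g.cancelled === true) {
    findings.push("request was cancelled by the client");
  }
  if ((g.provider_responses?.length ?? 0) > 1) {
    findings.push("fallbacks occurred");
  }
  return findings;
}
```

An empty array means none of the listed conditions matched, which is what the example metadata response in this document would produce.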
| Status | Meaning |
|--------|---------|
| 401 | Invalid or missing API key |
| 403 | You don't have access to this generation (belongs to another user) |
| 404 | Generation ID not found |
| 429 | Rate limited; wait and retry |
| 500 | Server error; retry |
| 502 | Upstream failure; retry |
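The statuses marked for retry (429, 500, 502) can be separated from the ones that need a fix on the caller's side (401, 403, 404). Below is a minimal sketch of that split with exponential backoff. The helper names, the attempt limit, and the delay values are assumptions chosen for illustration; this document does not specify a retry schedule.

```typescript
// Statuses this document says are worth retrying.
const RETRYABLE_STATUSES = new Set([429, 500, 502]);

function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Exponential backoff: baseMs, 2x, 4x, ... capped at maxMs (assumed values).
function retryDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retry only retryable statuses; return the final response either way so the
// caller can inspect its status.
async function fetchWithRetry(
  url: string,
  apiKey: string,
  maxAttempts = 4,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    const lastAttempt = attempt + 1 >= maxAttempts;
    if (response.ok || !isRetryable(response.status) || lastAttempt) {
      return response;
    }
    await new Promise((resolve) => setTimeout(resolve, retryDelayMs(attempt)));
  }
}
```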
| Field | Type | Description |
|-------|------|-------------|
| id | string | Generation ID (gen-...) |
| model | string | Model permaslug (e.g., openai/gpt-4o) |
| provider_name | string \| null | Provider that served the request |
| api_type | string | One of: completions, embeddings, rerank, tts, stt, video |
| tokens_prompt | int \| null | Prompt token count |
| tokens_completion | int \| null | Completion token count |
| native_tokens_reasoning | int \| null | Reasoning/thinking tokens |
| native_tokens_cached | int \| null | Cached input tokens |
| total_cost | number | Total cost in USD |
| usage | number | Usage amount in USD |
| upstream_inference_cost | number \| null | Provider's cost in USD |
| cache_discount | number \| null | Discount from caching |
| latency | number \| null | Total latency in ms |
| generation_time | number \| null | Model generation time in ms |
| moderation_latency | number \| null | Moderation check time in ms |
| finish_reason | string \| null | Why generation stopped (stop, length, content_filter, etc.) |
| native_finish_reason | string \| null | Raw finish reason from provider |
| streamed | bool \| null | Whether response was streamed |
| is_byok | bool | Whether user's own provider key was used |
| cancelled | bool \| null | Whether request was cancelled |
| app_id | int \| null | OAuth app ID |
| external_user | string \| null | External user identifier (X-External-User header) |
| session_id | string \| null | Session grouping ID |
| request_id | string \| null | Request grouping ID (all gens from one API call) |
| router | string \| null | Router used (e.g., openrouter/auto) |
| service_tier | string \| null | Provider service tier |
| web_search_engine | string \| null | Search engine used (e.g., exa, firecrawl) |
| num_search_results | int \| null | Number of search results included |
| provider_responses | array \| null | Provider attempt chain with per-provider latency/status |
| Field | Type | Description |
|-------|------|-------------|
| data.input.prompt | string \| null | Raw prompt text |
| data.input.messages | array \| null | Messages array ([{role, content}]) |
| data.output.completion | string \| null | Model's completion text |
| data.output.reasoning | string \| null | Chain-of-thought reasoning |