skills_all/model-registry-maintainer/SKILL.md
Guide for maintaining the MassGen model and backend registry. This skill should be used when adding new models, updating model information (release dates, pricing, context windows), or ensuring the registry stays current with provider releases. Covers both the capabilities registry and the pricing/token manager.
This skill provides guidance for maintaining MassGen's model registry across two key files:
- massgen/backend/capabilities.py - Models, capabilities, release dates
- massgen/token_manager/token_manager.py - Pricing, context windows
What it contains:
Used by:
--quickstart, --generate-config)
Always update this file for new models.
What it contains:
Used by:
Pricing resolution order:
Only update PROVIDER_PRICING if:
"YYYY-MM"
Add model to the models list and model_release_dates:
# massgen/backend/capabilities.py
"openai": BackendCapabilities(
    # ... existing fields ...
    models=[
        "new-model-name",  # Add here (newest first)
        "gpt-5.1",
        # ... existing models ...
    ],
    model_release_dates={
        "new-model-name": "2025-12",  # Add here
        "gpt-5.1": "2025-11",
        # ... existing dates ...
    },
)
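Because the models list and the release-date mapping are edited by hand in two places, they can drift apart. The sketch below is one way to check them against each other. `check_release_dates` is a hypothetical helper, not part of MassGen, and it assumes release dates are "YYYY-MM" strings as in the example above.

```python
import re

# Hypothetical helper (not part of MassGen): cross-checks a models list
# against its release-date mapping, assuming "YYYY-MM" date strings.
RELEASE_DATE_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])$")


def check_release_dates(models, release_dates):
    """Return a list of human-readable problems (empty if consistent)."""
    problems = []
    for name in models:
        date = release_dates.get(name)
        if date is None:
            problems.append(f"{name}: missing release date")
        elif not RELEASE_DATE_RE.match(date):
            problems.append(f"{name}: bad date format {date!r}")
    for name in release_dates:
        if name not in models:
            problems.append(f"{name}: has a date but is not in models")
    return problems


# Sample data mirroring the snippet above
models = ["new-model-name", "gpt-5.1"]
release_dates = {"new-model-name": "2025-12", "gpt-5.1": "2025-11"}
print(check_release_dates(models, release_dates))
```

An empty list means every model has a well-formed date and no date is orphaned.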
First, check whether the model is already in the LiteLLM database:
import requests
url = "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
# Fail fast on network or HTTP errors instead of parsing an error page as JSON
response = requests.get(url, timeout=30)
response.raise_for_status()
pricing_db = response.json()
if "new-model-name" in pricing_db:
    print("✅ Model found in LiteLLM - no need to update token_manager.py")
    print(f"Pricing: ${pricing_db['new-model-name']['input_cost_per_token']*1000}/1K input")
else:
    print("❌ Model NOT in LiteLLM - need to add to PROVIDER_PRICING")
Only if NOT in LiteLLM, add to PROVIDER_PRICING:
# massgen/token_manager/token_manager.py
PROVIDER_PRICING: Dict[str, Dict[str, ModelPricing]] = {
    "OpenAI": {
        # Format: ModelPricing(input_per_1k, output_per_1k, context_window, max_output)
        "new-model-name": ModelPricing(0.00125, 0.01, 300000, 150000),
        # ... existing models ...
    },
}
Provider name mapping:
- "OpenAI" (not "openai")
- "Anthropic" (not "claude")
- "Google" (not "gemini")
- "xAI" (not "grok")
If the model introduces new capabilities:
supported_capabilities={
    "web_search",
    "code_execution",
    "new_capability",  # Add here
}
Only change if the new model should be the recommended default:
default_model="new-model-name"
# Run capabilities tests
uv run pytest massgen/tests/test_backend_capabilities.py -v
# Test config generation with new model
massgen --generate-config ./test.yaml --config-backend openai --config-model new-model-name
# Verify the config was created successfully
cat ./test.yaml
uv run python docs/scripts/generate_backend_tables.py
cd docs && make html
In capabilities.py:
models=[
    "gpt-5.1",  # 2025-11
    "gpt-5-codex",  # 2025-09
    "gpt-5",  # 2025-08
    "gpt-5-mini",  # 2025-08
    "gpt-5-nano",  # 2025-08
    "gpt-4.1",  # 2025-04
    "gpt-4.1-mini",  # 2025-04
    "gpt-4.1-nano",  # 2025-04
    "gpt-4o",  # 2024-05
    "gpt-4o-mini",  # 2024-07
    "o4-mini",  # 2025-04
]
In token_manager.py (add missing models):
"OpenAI": {
    "gpt-5": ModelPricing(0.00125, 0.01, 400000, 128000),
    "gpt-5-mini": ModelPricing(0.00025, 0.002, 400000, 128000),
    "gpt-5-nano": ModelPricing(0.00005, 0.0004, 400000, 128000),
    "gpt-4o": ModelPricing(0.0025, 0.01, 128000, 16384),
    "gpt-4o-mini": ModelPricing(0.00015, 0.0006, 128000, 16384),
    # Missing: gpt-5.1, gpt-5-codex, gpt-4.1 family, o4-mini
}
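The "Missing" comment above can be derived mechanically rather than by eye. The sketch below compares the two OpenAI listings by exact name. `models_without_pricing` is a hypothetical helper, and the data is copied from the snippets above rather than imported from MassGen.

```python
# Illustrative data copied from the OpenAI listings above.
capability_models = [
    "gpt-5.1", "gpt-5-codex", "gpt-5", "gpt-5-mini", "gpt-5-nano",
    "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano",
    "gpt-4o", "gpt-4o-mini", "o4-mini",
]
pricing_keys = {"gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-4o", "gpt-4o-mini"}


def models_without_pricing(models, keys):
    """Return models that have no exact-name entry in the pricing keys."""
    return [m for m in models if m not in keys]


print(models_without_pricing(capability_models, pricing_keys))
# ['gpt-5.1', 'gpt-5-codex', 'gpt-4.1', 'gpt-4.1-mini', 'gpt-4.1-nano', 'o4-mini']
```

The check is deliberately exact-name: a looser prefix comparison would treat "gpt-5.1" as covered by "gpt-5" and hide the gap.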
In capabilities.py:
models=[
    "claude-haiku-4-5-20251001",  # 2025-10
    "claude-sonnet-4-5-20250929",  # 2025-09
    "claude-opus-4-1-20250805",  # 2025-08
    "claude-sonnet-4-20250514",  # 2025-05
]
In token_manager.py:
"Anthropic": {
    "claude-haiku-4-5": ModelPricing(0.001, 0.005, 200000, 65536),
    "claude-sonnet-4-5": ModelPricing(0.003, 0.015, 200000, 65536),
    "claude-opus-4.1": ModelPricing(0.015, 0.075, 200000, 32768),
    "claude-sonnet-4": ModelPricing(0.003, 0.015, 200000, 8192),
}
In capabilities.py:
models=[
    "gemini-3-pro-preview",  # 2025-11
    "gemini-2.5-flash",  # 2025-06
    "gemini-2.5-pro",  # 2025-06
]
In token_manager.py (missing gemini-2.5 and gemini-3):
"Google": {
    "gemini-1.5-pro": ModelPricing(0.00125, 0.005, 2097152, 8192),
    "gemini-1.5-flash": ModelPricing(0.000075, 0.0003, 1048576, 8192),
    # Missing: gemini-2.5-pro, gemini-2.5-flash, gemini-3-pro-preview
}
In capabilities.py:
models=[
    "grok-4-1-fast-reasoning",  # 2025-11
    "grok-4-1-fast-non-reasoning",  # 2025-11
    "grok-code-fast-1",  # 2025-08
    "grok-4",  # 2025-07
    "grok-4-fast",  # 2025-09
    "grok-3",  # 2025-02
    "grok-3-mini",  # 2025-05
]
In token_manager.py (missing grok-3, grok-4 families):
"xAI": {
    "grok-2-latest": ModelPricing(0.005, 0.015, 131072, 131072),
    "grok-2": ModelPricing(0.005, 0.015, 131072, 131072),
    "grok-2-mini": ModelPricing(0.001, 0.003, 131072, 65536),
    # Missing: grok-3, grok-4, grok-4-1 families
}
Important: The names in PROVIDER_PRICING use simplified patterns:
- "gpt-5" matches gpt-5, gpt-5-preview, gpt-5-*
- "claude-sonnet-4-5" matches claude-sonnet-4-5-* (any date suffix)
- "gemini-2.5-pro" is exact match
The token manager uses prefix matching for flexibility.
capabilities.py models list and release_dates
token_manager.py PROVIDER_PRICING["OpenAI"]
token_manager.py PROVIDER_PRICING
supported_capabilities in capabilities.py
notes explaining when/how capability works
# Test capabilities registry
uv run pytest massgen/tests/test_backend_capabilities.py -v
# Test token manager
uv run pytest massgen/tests/test_token_manager.py -v
# Generate config with new model
massgen --generate-config ./test.yaml --config-backend openai --config-model new-model
# Build docs to verify tables
cd docs && make html
The easiest way to get comprehensive model pricing and context window data:
URL: https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
Coverage: 500+ models across 30+ providers including:
Data Available:
{
  "gpt-4o": {
    "input_cost_per_token": 0.0000025,
    "output_cost_per_token": 0.00001,
    "max_input_tokens": 128000,
    "max_output_tokens": 16384,
    "supports_vision": true,
    "supports_function_calling": true,
    "supports_prompt_caching": true
  }
}
Usage:
import requests
# Fetch latest pricing
url = "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
pricing_db = requests.get(url).json()
# Get info for a model
model_info = pricing_db.get("gpt-4o")
input_per_1k = model_info["input_cost_per_token"] * 1000
output_per_1k = model_info["output_cost_per_token"] * 1000
Update token_manager.py from LiteLLM:
For the most up-to-date model list with live pricing:
Endpoint: https://openrouter.ai/api/v1/models
Data Available:
Usage:
import requests
import os
headers = {"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"}
response = requests.get("https://openrouter.ai/api/v1/models", headers=headers)
models = response.json()["data"]
for model in models:
    print(f"{model['id']}: ${model['pricing']['prompt']} input, ${model['pricing']['completion']} output")
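The prices printed above are per token, while ModelPricing expects per-1K values. The sketch below shows one way to convert them. It assumes OpenRouter reports prices as decimal strings in USD per token, which should be checked against a live response. `per_1k` and the sample record are hypothetical.

```python
from decimal import Decimal


def per_1k(price_per_token):
    """Convert a per-token USD price (string or number) to USD per 1K tokens."""
    # Decimal avoids float noise such as 0.0025000000000000005
    return float(Decimal(str(price_per_token)) * 1000)


# Hypothetical record shaped like one entry of the "data" list above
sample = {
    "id": "example/model",
    "pricing": {"prompt": "0.000003", "completion": "0.000015"},
}

print(per_1k(sample["pricing"]["prompt"]), per_1k(sample["pricing"]["completion"]))
# 0.003 0.015
```

The same helper works for LiteLLM's numeric `input_cost_per_token` values, since the number is converted to its string form before the multiplication.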
| Provider | Models API | Pricing in API? | Recommendation |
|----------|------------|-----------------|----------------|
| OpenAI | https://api.openai.com/v1/models | ❌ No | Use LiteLLM |
| Claude | https://api.anthropic.com/v1/models | ❌ No | Use LiteLLM |
| Gemini | https://generativelanguage.googleapis.com/v1beta/models | ❌ No | API + LiteLLM |
| Grok (xAI) | https://api.x.ai/v1/models | ❌ No | Use LiteLLM |
| Together AI | https://api.together.xyz/v1/models | ✅ Yes | API directly |
| Groq | https://api.groq.com/openai/v1/models | ❌ No | Use LiteLLM |
| Cerebras | https://api.cerebras.ai/v1/models | ❌ No | Use LiteLLM |
| Fireworks | https://api.fireworks.ai/v1/accounts/{id}/models | ❌ No | Use LiteLLM |
| Azure OpenAI | Azure Management API | ❌ Complex | Manual |
| Claude Code | No API | ❌ No | Manual |
Create scripts/update_model_pricing.py to automate updates:
#!/usr/bin/env python3
"""Update token_manager.py pricing from LiteLLM database."""
import requests
# Fetch LiteLLM database (fail fast on network or HTTP errors)
url = "https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
response = requests.get(url, timeout=30)
response.raise_for_status()
pricing_db = response.json()
# Filter by provider
openai_models = {k: v for k, v in pricing_db.items()
                 if v.get("litellm_provider") == "openai"}
anthropic_models = {k: v for k, v in pricing_db.items()
                    if v.get("litellm_provider") == "anthropic"}
# Generate ModelPricing entries
for model_name, info in openai_models.items():
    # Some entries may lack per-token costs; skip them rather than crash
    if "input_cost_per_token" not in info or "output_cost_per_token" not in info:
        continue
    # round() trims floating-point noise from the per-1K conversion
    input_per_1k = round(info["input_cost_per_token"] * 1000, 8)
    output_per_1k = round(info["output_cost_per_token"] * 1000, 8)
    context = info.get("max_input_tokens", 0)
    max_output = info.get("max_output_tokens", 0)
    print(f'    "{model_name}": ModelPricing({input_per_1k}, {output_per_1k}, {context}, {max_output}),')
Run weekly to keep pricing current:
uv run python scripts/update_model_pricing.py
- massgen/backend/capabilities.py
- massgen/token_manager/token_manager.py
- massgen/tests/test_backend_capabilities.py
- massgen/config_builder.py
- docs/scripts/generate_backend_tables.py