skills/litellm-proxy/SKILL.md
Set up LiteLLM AI Gateway proxy with Docker Compose for Claude Code Max subscription. Use when the user wants to route Claude Code through LiteLLM for cost tracking, budget controls, or usage monitoring. Also trigger when user mentions "litellm", "AI gateway", "proxy for Claude Code", "track Claude usage", "Claude Code billing", or wants to set up a local proxy between Claude Code and Anthropic API.
npx skillsauth add razbakov/skills litellm-proxy
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Set up a LiteLLM AI Gateway proxy via Docker Compose so Claude Code Max subscription traffic flows through LiteLLM for cost attribution, budget controls, and usage tracking per user or team.
Reference docs: https://docs.litellm.ai/docs/tutorials/claude_code_max_subscription
Claude Code sends two headers on each request:

- `Authorization: Bearer {oauth_token}` carries the Max subscription OAuth token, which is forwarded to Anthropic.
- `x-litellm-api-key: Bearer {virtual_key}` authenticates with the LiteLLM proxy.

LiteLLM validates the virtual key, logs the request, then forwards everything (including the OAuth token) to Anthropic.
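The two-header scheme above can be sketched in Python. This is an illustration of the idea only, not LiteLLM's implementation: `build_headers` and `extract_credentials` are hypothetical helper names, and the token values in the usage note are placeholders.

```python
# Illustrative sketch of the two-header scheme; not LiteLLM's actual code.

PROXY_KEY_HEADER = "x-litellm-api-key"


def build_headers(oauth_token: str, virtual_key: str) -> dict[str, str]:
    """Headers a client sends through the proxy (hypothetical helper)."""
    return {
        # Passed through to Anthropic unchanged.
        "Authorization": f"Bearer {oauth_token}",
        # Read by the proxy to identify and authenticate the caller.
        PROXY_KEY_HEADER: f"Bearer {virtual_key}",
    }


def extract_credentials(headers: dict[str, str]) -> tuple[str, str]:
    """Return (virtual_key, upstream_authorization) from incoming headers."""
    virtual_key = headers[PROXY_KEY_HEADER].removeprefix("Bearer ").strip()
    upstream_authorization = headers["Authorization"]
    return virtual_key, upstream_authorization
```

For example, `extract_credentials(build_headers("oauth-abc", "sk-123"))` separates the virtual key (checked by the proxy) from the `Authorization` value (sent on to Anthropic).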
mkdir -p ~/Projects/litellm && cd ~/Projects/litellm
docker-compose.yml

    services:
      litellm:
        image: docker.litellm.ai/berriai/litellm:main-stable
        volumes:
          - ./config.yaml:/app/config.yaml
        command:
          - "--config=/app/config.yaml"
        ports:
          - "4000:4000"
        environment:
          DATABASE_URL: "postgresql://llmproxy:dbpassword9090@db:5432/litellm"
          STORE_MODEL_IN_DB: "True"
        env_file:
          - .env
        depends_on:
          - db
        healthcheck:
          test:
            - CMD-SHELL
            - python3 -c "import urllib.request; urllib.request.urlopen('http://localhost:4000/health/liveliness')"
          interval: 30s
          timeout: 10s
          retries: 3
          start_period: 40s
      db:
        image: postgres:16
        restart: always
        container_name: litellm_db
        environment:
          POSTGRES_DB: litellm
          POSTGRES_USER: llmproxy
          POSTGRES_PASSWORD: dbpassword9090
        ports:
          - "5432:5432"
        volumes:
          - postgres_data:/var/lib/postgresql/data
        healthcheck:
          test: ["CMD-SHELL", "pg_isready -d litellm -U llmproxy"]
          interval: 1s
          timeout: 5s
          retries: 10
      prometheus:
        image: prom/prometheus
        volumes:
          - prometheus_data:/prometheus
          - ./prometheus.yml:/etc/prometheus/prometheus.yml
        ports:
          - "9090:9090"
        command:
          - "--config.file=/etc/prometheus/prometheus.yml"
          - "--storage.tsdb.path=/prometheus"
          - "--storage.tsdb.retention.time=15d"
        restart: always
    volumes:
      prometheus_data:
        driver: local
      postgres_data:
        name: litellm_postgres_data
config.yaml

Two critical settings in general_settings:

- `forward_client_headers_to_llm_api: true` forwards the OAuth token to Anthropic.
- `litellm_key_header_name: "x-litellm-api-key"` tells LiteLLM to authenticate via this custom header instead of the Authorization header (which carries the OAuth token).

The complete file:

    model_list:
      - model_name: anthropic-claude
        litellm_params:
          model: anthropic/claude-sonnet-4-6
      - model_name: anthropic-opus
        litellm_params:
          model: anthropic/claude-opus-4-6
      - model_name: anthropic-haiku
        litellm_params:
          model: anthropic/claude-haiku-4-5-20251001
    general_settings:
      master_key: os.environ/LITELLM_MASTER_KEY
      database_url: "postgresql://llmproxy:dbpassword9090@db:5432/litellm"
      forward_client_headers_to_llm_api: true
      litellm_key_header_name: "x-litellm-api-key"
Update model IDs to the latest available at time of setup. Check https://docs.anthropic.com/en/docs/about-claude/models for current model IDs.
.env

Generate secure random keys for production use. The salt key cannot be changed after adding a model.

    LITELLM_MASTER_KEY="sk-change-me-to-a-secure-key"
    LITELLM_SALT_KEY="sk-change-me-to-a-secure-salt"
    UI_USERNAME="admin"
    UI_PASSWORD="admin"
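One way to produce the random keys is Python's standard `secrets` module. The sketch below is a suggestion, not part of the original setup: `generate_key` and `render_env_keys` are hypothetical helpers, and the `sk-` prefix is kept only because the placeholder values above use it.

```python
import secrets


def generate_key(prefix: str = "sk-", nbytes: int = 32) -> str:
    """Return a random URL-safe key with the given prefix (hypothetical helper)."""
    return prefix + secrets.token_urlsafe(nbytes)


def render_env_keys() -> str:
    """Render the two key lines of .env with freshly generated values."""
    return (
        f'LITELLM_MASTER_KEY="{generate_key()}"\n'
        f'LITELLM_SALT_KEY="{generate_key()}"\n'
    )
```

Running `print(render_env_keys())` once and pasting the result into `.env` replaces both placeholders. Store the salt key somewhere safe, since it cannot be changed later.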
prometheus.yml

    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: "litellm"
        static_configs:
          - targets: ["litellm:4000"]
All config files (.env, config.yaml, prometheus.yml) must exist before starting. If a bind-mounted file (config.yaml or prometheus.yml) is missing, Docker creates a directory in its place, and a missing .env makes docker compose fail at startup.
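A quick pre-flight check can catch a missing file before the containers start. This is a minimal sketch; `missing_config_files` is a hypothetical helper, and the directory in the usage note is the `~/Projects/litellm` path used earlier.

```python
from pathlib import Path

# The three files the compose setup expects next to docker-compose.yml.
REQUIRED_FILES = (".env", "config.yaml", "prometheus.yml")


def missing_config_files(directory: str) -> list[str]:
    """Return required names that are absent or are not regular files."""
    base = Path(directory).expanduser()
    # is_file() is False for directories, so a directory that Docker
    # created in place of a missing file is reported here too.
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

`missing_config_files("~/Projects/litellm")` returns an empty list when everything is in place.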
docker compose up -d
Wait for health check to pass:
sleep 15 && curl -s http://localhost:4000/health/liveliness
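A fixed 15-second sleep may be too short on a slow machine or needlessly long on a fast one. As an alternative, the sketch below polls the same liveliness endpoint until it answers or a deadline passes. The function names, the 60-second default timeout, and the 2-second interval are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str) -> bool:
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or reset, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_healthy(
    url: str,
    timeout: float = 60.0,
    interval: float = 2.0,
    check: Callable[[str], bool] = probe,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check(url)` until it succeeds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if check(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

`wait_until_healthy("http://localhost:4000/health/liveliness")` returns `True` as soon as the proxy responds. The `check` and `sleep` parameters exist only so the loop can be exercised without a running container.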
Use the master key to create a virtual key for Claude Code:

    curl -s -X POST 'http://localhost:4000/key/generate' \
      -H 'Content-Type: application/json' \
      -H 'Authorization: Bearer sk-change-me-to-a-secure-key' \
      -d '{"key_name": "claude-code"}' | python3 -m json.tool

Save the returned key value (starts with sk-).
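The same request can be built with Python's standard library, which is convenient when the key is created from a setup script. This is a sketch under assumptions: `build_key_request` and `extract_key` are hypothetical helpers, and `extract_key` assumes the JSON response carries the new key in a field named `key`.

```python
import json
import urllib.request


def build_key_request(base_url: str, master_key: str, key_name: str) -> urllib.request.Request:
    """Build the POST /key/generate request without sending it."""
    body = json.dumps({"key_name": key_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/generate",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {master_key}",
        },
        method="POST",
    )


def extract_key(response_body: bytes) -> str:
    """Pull the virtual key out of the response (assumes a top-level "key" field)."""
    return json.loads(response_body)["key"]
```

To send it, pass the request to `urllib.request.urlopen(...)` and feed the bytes from `.read()` to `extract_key`.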
Add to ~/.zshrc (or ~/.bashrc):

    # LiteLLM Proxy (Claude Code Max subscription)
    export ANTHROPIC_BASE_URL=http://localhost:4000
    export ANTHROPIC_MODEL="anthropic-claude"
    export ANTHROPIC_CUSTOM_HEADERS="x-litellm-api-key: Bearer <YOUR_VIRTUAL_KEY>"

Then run `source ~/.zshrc` and restart Claude Code.

    # Check proxy auth works
    curl -s -H "x-litellm-api-key: Bearer <YOUR_VIRTUAL_KEY>" http://localhost:4000/model/info

    # Test from Claude Code
    source ~/.zshrc && echo "say hi" | claude --print
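If the test fails, a common cause is a variable that was never exported or a placeholder left in the header value. The sketch below checks the three variables. `check_claude_env` is a hypothetical helper, and the `Bearer sk-` test assumes virtual keys keep the `sk-` prefix noted in the key-generation step.

```python
from collections.abc import Mapping


def check_claude_env(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems found in the proxy-related variables."""
    problems = []
    base_url = env.get("ANTHROPIC_BASE_URL", "")
    if not base_url.startswith(("http://", "https://")):
        problems.append("ANTHROPIC_BASE_URL is missing or not an http(s) URL")
    if not env.get("ANTHROPIC_MODEL"):
        problems.append("ANTHROPIC_MODEL is not set")
    # The custom header variable holds a single "name: value" pair.
    name, _, value = env.get("ANTHROPIC_CUSTOM_HEADERS", "").partition(":")
    if name.strip().lower() != "x-litellm-api-key":
        problems.append("ANTHROPIC_CUSTOM_HEADERS does not set x-litellm-api-key")
    elif not value.strip().startswith("Bearer sk-"):
        # Assumption: real virtual keys start with "sk-".
        problems.append("x-litellm-api-key is not 'Bearer sk-...' (placeholder left in?)")
    return problems
```

Calling `check_claude_env(os.environ)` after `import os` returns an empty list when all three variables look right.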
| Service | URL |
|---------|-----|
| LiteLLM Proxy | http://localhost:4000 |
| LiteLLM UI | http://localhost:4000/ui |
| Prometheus | http://localhost:9090 |
| Postgres | localhost:5432 |
The LiteLLM UI (/ui/) authenticates via the Authorization header internally. When litellm_key_header_name is set to a custom header, the UI login may not work with API key auth. Use UI_USERNAME/UI_PASSWORD for UI access instead.

    # Start
    cd ~/Projects/litellm && docker compose up -d

    # Stop
    cd ~/Projects/litellm && docker compose down

    # Restart (after config changes)
    cd ~/Projects/litellm && docker compose restart litellm

    # Logs
    cd ~/Projects/litellm && docker compose logs litellm --tail 50

    # Health check
    curl -s http://localhost:4000/health/liveliness