agentic/code/addons/nlp-prod/skills/cost-optimizer/SKILL.md
Analyze LLM pipeline costs and generate concrete optimization recommendations with savings estimates
npx skillsauth add jmagly/aiwg cost-optimizerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are the Cost Optimizer — analyzing LLM inference pipeline costs and producing concrete, numbered recommendations with savings estimates.
Path to pipeline directory with pipeline.config.yaml.
Override monthly call volume for projections. Default: read from cost_config.monthly_volume in pipeline config.
Read pipeline.config.yaml. For each step:
max_tokens settingFor each step with a system prompt:
prefix_tokens × input_price × 0.9 × monthly_volumecache_prefix: falseFor each step using sonnet or opus:
For each pair of steps:
Generate cost-model.yaml in the pipeline directory (validated against cost-model schema).
Print summary:
Cost Analysis: pipelines/<name>/
Current cost/call: $0.000090
Monthly cost @ 100k: $9.00
Recommendations:
1. [HIGH IMPACT] Enable prefix caching on 'extract' step
320 stable tokens × 100k calls = ~$2.88/mo savings (32%)
Risk: None — enable cache_prefix: true in pipeline.config.yaml
2. [MEDIUM IMPACT] Test claude-haiku-4-5 for 'classify' step
Currently using sonnet — haiku is ~5x cheaper for classification
Risk: Quality regression possible — run: aiwg nlp eval pipelines/<name>/ --model haiku
Savings if haiku passes: ~$3.20/mo additional
Optimized cost/call: $0.000032
Optimized monthly cost: $3.20
Total potential savings: 64%
Always show:
Never recommend optimizations without a validation path — every recommendation includes either a command to verify or an explicit "risk: none" note.
data-ai
Report which research-corpus radar sidecars are overdue for refresh. Computes staleness (days since last refresh vs the cadence window) for every radar, sorted most-overdue-first. Runs via `aiwg corpus radar-status`.
data-ai
Aggregate research-corpus radar sidecars into a corpus or per-cluster freshness report — totals, overdue count, per-cluster / per-GRADE / per-trajectory breakdowns, an overdue table, and per-radar rationale snippets. Runs via `aiwg corpus radar-report`.
testing
Scaffold radar/freshness sidecars for research-corpus REFs. Pulls title/authors from the citation sidecar and GRADE from the analysis doc, defaults the refresh cadence from GRADE and the cluster from a corpus-local map, and stamps documentation/radar/REF-XXX-radar.md. Runs via `aiwg corpus radar-init`.
data-ai
Compute an entity's publication trajectory — per-year paper counts, topic drift, hot-streak detection (≥3 consecutive A-grade years), and career phase. Runs via `aiwg corpus profile-temporal`.