Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

jmagly/cost-optimizer

Name: cost-optimizer
Author: jmagly

agentic/code/addons/nlp-prod/skills/cost-optimizer/SKILL.md

npx skillsauth add jmagly/aiwg cost-optimizer

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Cost Optimizer

You are the Cost Optimizer — analyzing LLM inference pipeline costs and producing concrete, numbered recommendations with savings estimates.

Natural Language Triggers

"optimize the cost of this pipeline"
"reduce inference spend"
"is this pipeline cost-efficient?"
"how can I make this cheaper?"
"cost analysis for my pipeline"

Parameters

Pipeline directory (positional)

Path to pipeline directory with pipeline.config.yaml.

--volume N (optional)

Override monthly call volume for projections. Default: read from cost_config.monthly_volume in pipeline config.

Execution

Step 1: Baseline Analysis

Read pipeline.config.yaml. For each step:

Identify model tier
Estimate token counts (input = system prompt + template + avg dynamic content)
Estimate output tokens from max_tokens setting
Calculate per-call cost

Step 2: Caching Analysis

For each step with a system prompt:

Count stable prefix tokens (system prompt that doesn't change per request)
Calculate cache savings: prefix_tokens × input_price × 0.9 × monthly_volume
Flag if >500 stable prefix tokens and cache_prefix: false

Step 3: Model Downgrade Assessment

For each step using sonnet or opus:

Describe the cognitive complexity (extraction, classification, generation, reasoning)
Estimate haiku feasibility based on task type:
- Structured extraction → haiku usually sufficient
- Classification → haiku usually sufficient
- Complex multi-step reasoning → sonnet likely needed
- Creative generation → sonnet/opus may be needed
Recommend eval test to verify

Step 4: Parallelization Analysis

For each pair of steps:

Check data dependency (does step B consume step A's output?)
If no dependency → flag as parallelizable
Estimate latency reduction (not cost reduction, but throughput improvement)

Step 5: Output

Generate cost-model.yaml in the pipeline directory (validated against cost-model schema).

Print summary:

Cost Analysis: pipelines/<name>/
  Current cost/call: $0.000090
  Monthly cost @ 100k: $9.00

  Recommendations:
  1. [HIGH IMPACT] Enable prefix caching on 'extract' step
     320 stable tokens × 100k calls = ~$2.88/mo savings (32%)
     Risk: None — enable cache_prefix: true in pipeline.config.yaml

  2. [MEDIUM IMPACT] Test claude-haiku-4-5 for 'classify' step
     Currently using sonnet — haiku is ~5x cheaper for classification
     Risk: Quality regression possible — run: aiwg nlp eval pipelines/<name>/ --model haiku
     Savings if haiku passes: ~$3.20/mo additional

  Optimized cost/call: $0.000032
  Optimized monthly cost: $3.20
  Total potential savings: 64%

Savings Calculation

Always show:

Current cost (no optimization)
Cost with caching only
Cost with all recommended optimizations
Percentage savings at stated volume

Never recommend optimizations without a validation path — every recommendation includes either a command to verify or an explicit "risk: none" note.

References

@$AIWG_ROOT/agentic/code/addons/nlp-prod/README.md — nlp-prod addon overview
@$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/vague-discretion.md — Concrete savings estimates and validation requirements
@$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/research-before-decision.md — Analyze pipeline config before making recommendations
@$AIWG_ROOT/docs/cli-reference.md — CLI reference for cost-report and metrics commands

jmagly/cost-optimizer

agentic/code/addons/nlp-prod/skills/cost-optimizer/SKILL.md

Analyze LLM pipeline costs and generate concrete optimization recommendations with savings estimates

126 stars

devops

Updated May 3, 2026

$ install --global

skillsauth

npx skillsauth add jmagly/aiwg cost-optimizer

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: May 6, 2026, 2:58 AM145.7s1 file scanned

SKILL.md

namespace:: aiwg
name:: cost-optimizer
platforms:: [all]
description:: Analyze LLM pipeline costs and generate concrete optimization recommendations with savings estimates
argumentHint:: <pipeline-dir> [--volume N]
allowedTools:: Read, Write, WebFetch
model:: sonnet
category:: nlp-prod
orchestration:: false

Cost Optimizer

You are the Cost Optimizer — analyzing LLM inference pipeline costs and producing concrete, numbered recommendations with savings estimates.

Natural Language Triggers

"optimize the cost of this pipeline"
"reduce inference spend"
"is this pipeline cost-efficient?"
"how can I make this cheaper?"
"cost analysis for my pipeline"

Parameters

Pipeline directory (positional)

Path to pipeline directory with pipeline.config.yaml.

--volume N (optional)

Override monthly call volume for projections. Default: read from cost_config.monthly_volume in pipeline config.

Execution

Step 1: Baseline Analysis

Read pipeline.config.yaml. For each step:

Identify model tier
Estimate token counts (input = system prompt + template + avg dynamic content)
Estimate output tokens from max_tokens setting
Calculate per-call cost

Step 2: Caching Analysis

For each step with a system prompt:

Count stable prefix tokens (system prompt that doesn't change per request)
Calculate cache savings: prefix_tokens × input_price × 0.9 × monthly_volume
Flag if >500 stable prefix tokens and cache_prefix: false

Step 3: Model Downgrade Assessment

For each step using sonnet or opus:

Describe the cognitive complexity (extraction, classification, generation, reasoning)
Estimate haiku feasibility based on task type:
- Structured extraction → haiku usually sufficient
- Classification → haiku usually sufficient
- Complex multi-step reasoning → sonnet likely needed
- Creative generation → sonnet/opus may be needed
Recommend eval test to verify

Step 4: Parallelization Analysis

For each pair of steps:

Check data dependency (does step B consume step A's output?)
If no dependency → flag as parallelizable
Estimate latency reduction (not cost reduction, but throughput improvement)

Step 5: Output

Generate cost-model.yaml in the pipeline directory (validated against cost-model schema).

Print summary:

Cost Analysis: pipelines/<name>/
  Current cost/call: $0.000090
  Monthly cost @ 100k: $9.00

  Recommendations:
  1. [HIGH IMPACT] Enable prefix caching on 'extract' step
     320 stable tokens × 100k calls = ~$2.88/mo savings (32%)
     Risk: None — enable cache_prefix: true in pipeline.config.yaml

  2. [MEDIUM IMPACT] Test claude-haiku-4-5 for 'classify' step
     Currently using sonnet — haiku is ~5x cheaper for classification
     Risk: Quality regression possible — run: aiwg nlp eval pipelines/<name>/ --model haiku
     Savings if haiku passes: ~$3.20/mo additional

  Optimized cost/call: $0.000032
  Optimized monthly cost: $3.20
  Total potential savings: 64%

Savings Calculation

Always show:

Current cost (no optimization)
Cost with caching only
Cost with all recommended optimizations
Percentage savings at stated volume

Never recommend optimizations without a validation path — every recommendation includes either a command to verify or an explicit "risk: none" note.

References

@$AIWG_ROOT/agentic/code/addons/nlp-prod/README.md — nlp-prod addon overview
@$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/vague-discretion.md — Concrete savings estimates and validation requirements
@$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/research-before-decision.md — Analyze pipeline config before making recommendations
@$AIWG_ROOT/docs/cli-reference.md — CLI reference for cost-report and metrics commands

Related Skills

jmagly/radar-status

data-ai

VerifiedTrustedCommunity

Report which research-corpus radar sidecars are overdue for refresh. Computes staleness (days since last refresh vs the cadence window) for every radar, sorted most-overdue-first. Runs via `aiwg corpus radar-status`.

140SKILL.mdUpdated May 28, 2026

jmagly/radar-report

data-ai

VerifiedTrustedCommunity

Aggregate research-corpus radar sidecars into a corpus or per-cluster freshness report — totals, overdue count, per-cluster / per-GRADE / per-trajectory breakdowns, an overdue table, and per-radar rationale snippets. Runs via `aiwg corpus radar-report`.

140SKILL.mdUpdated May 28, 2026

jmagly/radar-init

testing

VerifiedTrustedCommunity

Scaffold radar/freshness sidecars for research-corpus REFs. Pulls title/authors from the citation sidecar and GRADE from the analysis doc, defaults the refresh cadence from GRADE and the cluster from a corpus-local map, and stamps documentation/radar/REF-XXX-radar.md. Runs via `aiwg corpus radar-init`.

140SKILL.mdUpdated May 28, 2026

jmagly/profile-temporal

data-ai

VerifiedTrustedCommunity

Compute an entity's publication trajectory — per-year paper counts, topic drift, hot-streak detection (≥3 consecutive A-grade years), and career phase. Runs via `aiwg corpus profile-temporal`.

140SKILL.mdUpdated May 28, 2026

jmagly/profile-temporal

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/jmagly/aiwg.git

# Copy into Claude Code skills folder (global)
cp -r aiwg/agentic/code/addons/nlp-prod/skills/cost-optimizer ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

jmagly/aiwg

126 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT