Name: plugins/claude-code-expert/archive/v7.6.0/skills/cost-optimization
Author: markus41

plugins/claude-code-expert/archive/v7.6.0/skills/cost-optimization/SKILL.md

npx skillsauth add markus41/claude plugins/claude-code-expert/archive/v7.6.0/skills/cost-optimization

Claude Code Cost Optimization

Complete guide to managing costs, model routing, token usage, and caching.

Cost Tracking

/cost Command

/cost

Shows:

Input tokens consumed
Output tokens consumed
Cache read tokens (cheaper)
Cache write tokens
Total estimated cost (USD)
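
The four token categories above can be combined into a dollar estimate roughly as follows. This is a minimal sketch: the `Usage` and `Prices` shapes, the `estimateCostUsd` helper, and every price below are placeholder assumptions for illustration. They are not published rates, and this is not necessarily how Claude Code computes its own total.

```typescript
// Token counts in the four categories that /cost reports.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}

// USD per one million tokens. All values used below are hypothetical.
interface Prices {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

function estimateCostUsd(usage: Usage, prices: Prices): number {
  const perToken = (pricePerMillion: number): number => pricePerMillion / 1e6;
  return (
    usage.inputTokens * perToken(prices.input) +
    usage.outputTokens * perToken(prices.output) +
    usage.cacheReadTokens * perToken(prices.cacheRead) +
    usage.cacheWriteTokens * perToken(prices.cacheWrite)
  );
}

// Placeholder prices chosen only to make the arithmetic easy to follow.
const examplePrices: Prices = { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75 };

const sessionUsage: Usage = {
  inputTokens: 20000,
  outputTokens: 4000,
  cacheReadTokens: 100000,
  cacheWriteTokens: 10000,
};

const sessionCost = estimateCostUsd(sessionUsage, examplePrices);
console.log(sessionCost.toFixed(4));
```

With these placeholder prices, the 100,000 cache-read tokens contribute less to the total than the 20,000 uncached input tokens, which is why the cache-read line is marked as cheaper.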

Model Selection & Routing

Available Models

| Model | ID | Best For | Cost |
|-------|-----|---------|------|
| Opus 4.6 | `claude-opus-4-6` | Architecture, complex decisions | Highest |
| Sonnet 4.6 | `claude-sonnet-4-6` | General development, implementation | Medium |
| Haiku 4.5 | `claude-haiku-4-5-20251001` | Quick lookups, simple tasks | Lowest |
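
One way to turn the table into a routing rule is a plain lookup from task type to model ID. The sketch below is illustrative only: the `TaskKind` categories and the `pickModel` helper are hypothetical names, and the model IDs are copied from the table.

```typescript
// Hypothetical task categories that mirror the "Best For" column.
type TaskKind = "lookup" | "implementation" | "architecture";

// Model IDs copied from the table; the mapping itself is an assumption.
const MODEL_FOR_TASK: Record<TaskKind, string> = {
  lookup: "claude-haiku-4-5-20251001",
  implementation: "claude-sonnet-4-6",
  architecture: "claude-opus-4-6",
};

function pickModel(kind: TaskKind): string {
  return MODEL_FOR_TASK[kind];
}

// Print the slash command that would switch to the chosen model.
console.log(`/model ${pickModel("lookup")}`);
```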

Switching Models

/model claude-haiku-4-5-20251001   # Switch to Haiku for simple tasks
/model claude-sonnet-4-6            # Switch back to Sonnet
/model claude-opus-4-6              # Switch to Opus for complex work

CLI Model Override

claude --model claude-haiku-4-5-20251001 -p "quick question"

Settings Configuration

{
  "model": "claude-sonnet-4-6",
  "smallFastModel": "claude-haiku-4-5-20251001"
}

Token Reduction Strategies

1. Use /compact Frequently

/compact                    # Compress full conversation
/compact focus on the API   # Compress with specific focus

Compacting replaces earlier turns with a summary, so every later request sends fewer input tokens and costs less.
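
The effect can be sketched with a simple model in which every request re-sends the whole history. The `totalInputTokens` function, the fixed per-turn size, and the summary size are all assumptions made for illustration; the model ignores prompt caching and the cost of producing the summary itself.

```typescript
// Total input tokens sent over a conversation, assuming every request
// re-sends the whole history and each turn adds `tokensPerTurn` to it.
function totalInputTokens(
  turns: number,
  tokensPerTurn: number,
  compactEvery?: number, // hypothetical: compact after this many turns
  summaryTokens = 0, // hypothetical size of the compacted summary
): number {
  let contextTokens = 0;
  let total = 0;
  for (let turn = 1; turn <= turns; turn++) {
    contextTokens += tokensPerTurn; // the new turn joins the context
    total += contextTokens; // the whole context is sent as input
    if (compactEvery !== undefined && turn % compactEvery === 0) {
      contextTokens = summaryTokens; // earlier turns collapse into a summary
    }
  }
  return total;
}

const withoutCompact = totalInputTokens(20, 1000);
const withCompact = totalInputTokens(20, 1000, 5, 500);
console.log({ withoutCompact, withCompact });
```

Under these assumptions, 20 turns send 210,000 input tokens without compaction and 67,500 when compacting every 5 turns, because the uncompacted total grows quadratically with the number of turns.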

2. Targeted File Reads

// Expensive: read entire large file
Read(file_path="large-file.ts")        // ~5000 tokens

// Cheap: read specific section
Read(file_path="large-file.ts", offset=100, limit=30)  // ~300 tokens

// Cheap: search first
Grep(pattern="function auth", path="src/")  // ~100 tokens
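
To see why a bounded read is cheaper, the comparison can be approximated with a character-count heuristic. This is a rough sketch: the four-characters-per-token ratio is a common approximation rather than the actual tokenizer, and `estimateTokens` and `sliceLines` are hypothetical helpers (with `offset` treated as 1-based, which is an assumption).

```typescript
// Rough heuristic: about 4 characters per token. This is an approximation,
// not the tokenizer the model actually uses.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Return `limit` lines starting at 1-based line `offset`, mimicking the
// offset/limit idea from the Read calls above.
function sliceLines(text: string, offset: number, limit: number): string {
  return text.split("\n").slice(offset - 1, offset - 1 + limit).join("\n");
}

// Synthetic 1,000-line file with 40 characters per line.
const sampleLine = "x".repeat(40);
const sampleFile = Array.from({ length: 1000 }, () => sampleLine).join("\n");

const fullReadTokens = estimateTokens(sampleFile);
const sliceReadTokens = estimateTokens(sliceLines(sampleFile, 100, 30));
console.log({ fullReadTokens, sliceReadTokens });
```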

3. Use Sub-Agents for Research

Sub-agents process information internally and return summaries:

// Main context gets only the summary (~500 tokens)
// Instead of 20 file reads (~50,000 tokens)
Agent(subagent_type="Explore", prompt="Find all database models")

4. Grep Before Read

// Don't read every file looking for something
// Search first, then read only matching files
Grep(pattern="TODO|FIXME", type="ts")

5. Background Tasks

// Long tasks don't consume main context while running
Agent(run_in_background=true, ...)
Bash(command="npm test", run_in_background=true)

6. Clear Between Unrelated Tasks

/clear   # Reset context for new topic

Prompt Caching

How It Works

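Claude Code caches system prompts and conversation history
Tokens read from the cache cost significantly less than uncached input (~90% savings); writing to the cache costs somewhat more than uncached input
Cache hits happen when the same prefix appears in consecutive requests while the cache entry is still live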
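
The savings can be worked through with a little arithmetic. In the sketch below, the multipliers (a cache write at 1.25x and a cache read at 0.1x the base input price) are stated assumptions, the helper names are hypothetical, and every request is assumed to arrive while the cache entry is still live.

```typescript
// Multipliers relative to the base input price (assumed values).
const CACHE_WRITE_MULTIPLIER = 1.25;
const CACHE_READ_MULTIPLIER = 0.1;

// Cost, in base-price input tokens, of sending the same prefix uncached
// on every request.
function prefixCostUncached(prefixTokens: number, requestCount: number): number {
  return prefixTokens * requestCount;
}

// The first request writes the cache; each later request reads it.
function prefixCostCached(prefixTokens: number, requestCount: number): number {
  if (requestCount <= 0) return 0;
  const writeCost = prefixTokens * CACHE_WRITE_MULTIPLIER;
  const readCost = prefixTokens * CACHE_READ_MULTIPLIER * (requestCount - 1);
  return writeCost + readCost;
}

const uncachedCost = prefixCostUncached(10000, 10);
const cachedCost = prefixCostCached(10000, 10);
const savingsFraction = 1 - cachedCost / uncachedCost;
console.log({ uncachedCost, cachedCost, savingsFraction });
```

For a 10,000-token prefix over 10 requests this gives 100,000 versus 21,500 token-equivalents. The overall saving approaches 90% only as the number of cache reads grows, since the first request pays the write premium.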

Maximizing Cache Hits

Keep CLAUDE.md stable — Changes invalidate cache
Consistent system prompts — Don't modify with --append-system-prompt frequently
Sequential conversations — Cache benefits multi-turn conversations
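
Why stability matters can be illustrated with a simplified model that treats each request as an ordered list of blocks and counts how many leading blocks match the previous request. The real cache works on token prefixes, so `sharedPrefixLength` and the block strings here are only a hypothetical approximation.

```typescript
// Count how many leading blocks two requests share. In this simplified
// model only the shared prefix can be served from cache; everything after
// the first difference is processed as new input.
function sharedPrefixLength(previous: string[], next: string[]): number {
  let i = 0;
  while (i < previous.length && i < next.length && previous[i] === next[i]) {
    i++;
  }
  return i;
}

const firstTurn = ["system prompt", "CLAUDE.md v1", "user: hello"];
const secondTurn = [...firstTurn, "assistant: hi", "user: next"];
// Same conversation, but CLAUDE.md was edited between the two requests.
const secondTurnEdited = ["system prompt", "CLAUDE.md v2", "user: hello", "assistant: hi", "user: next"];

console.log(sharedPrefixLength(firstTurn, secondTurn)); // 3
console.log(sharedPrefixLength(firstTurn, secondTurnEdited)); // 1
```

Appending turns keeps the whole earlier request reusable, while a change near the start of the prompt leaves only the blocks before it eligible for a cache hit.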

API-Level Caching

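import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment by default
const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: [
    {
      type: "text",
      text: "Your system prompt here...",
      // Marks the prompt up to and including this block as cacheable
      cache_control: { type: "ephemeral" }
    }
  ],
  messages: [{ role: "user", content: "Your question here..." }]
});

// Usage reports tokens written to and read from the cache
console.log(response.usage.cache_creation_input_tokens);
console.log(response.usage.cache_read_input_tokens);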

Provider Cost Comparison

Anthropic Direct

Standard pricing, most features.

AWS Bedrock

CLAUDE_CODE_USE_BEDROCK=1 claude

May have different pricing through AWS agreements
Cross-region inference available
Committed use discounts possible

Google Vertex AI

CLAUDE_CODE_USE_VERTEX=1 claude

GCP pricing and billing
May have committed use discounts

Batch Processing (50% Savings)

For non-interactive workloads, use the Message Batches API:

const batch = await client.messages.batches.create({
  requests: [
    {
      custom_id: "review-1",
      params: {
        model: "claude-sonnet-4-6",
        max_tokens: 1024,
        messages: [{ role: "user", content: "Review file1.ts" }]
      }
    },
    {
      custom_id: "review-2",
      params: {
        model: "claude-sonnet-4-6",
        max_tokens: 1024,
        messages: [{ role: "user", content: "Review file2.ts" }]
      }
    }
  ]
});

Batch processing gives a 50% cost reduction in exchange for asynchronous processing: results are returned within 24 hours rather than immediately.
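
For more than a handful of files, the `requests` array is easier to generate than to write by hand. The following is a sketch under assumptions: `BatchRequest` describes only the fields used in the example above, and `buildReviewRequests` is a hypothetical helper that makes no API call.

```typescript
// Shape of one entry in the `requests` array, limited to the fields
// used in the example above.
interface BatchRequest {
  custom_id: string;
  params: {
    model: string;
    max_tokens: number;
    messages: { role: "user"; content: string }[];
  };
}

// Build one review request per file (hypothetical helper).
function buildReviewRequests(files: string[], model = "claude-sonnet-4-6"): BatchRequest[] {
  return files.map((file, index): BatchRequest => ({
    custom_id: `review-${index + 1}`,
    params: {
      model,
      max_tokens: 1024,
      messages: [{ role: "user", content: `Review ${file}` }],
    },
  }));
}

const reviewRequests = buildReviewRequests(["file1.ts", "file2.ts", "file3.ts"]);
console.log(reviewRequests.map((request) => request.custom_id));
```

The resulting array could then be passed as `requests` to `client.messages.batches.create`, with each `custom_id` used to match results back to files.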

Cost Estimation

Rule of Thumb

| Task | Approximate Cost |
|------|-----------------|
| Simple question | $0.01 - $0.05 |
| Code review (1 file) | $0.05 - $0.15 |
| Feature implementation | $0.20 - $1.00 |
| Complex refactoring | $0.50 - $2.00 |
| Full project analysis | $1.00 - $5.00 |

Factors Affecting Cost

Model used (Opus > Sonnet > Haiku)
Context window size (more history = more input tokens)
Number of tool calls (each adds output + input)
File sizes read
Number of conversation turns
Extended thinking budget

Best Practices

Start with Sonnet: good balance of quality and cost
Use Haiku for exploration: switch to it for simple lookups
Upgrade to Opus for architecture: worth the cost for complex decisions
Compact regularly: smaller context means lower cost per turn
Delegate research: sub-agents keep the main context small, although their own token usage is still billed
Monitor with /cost: track spending periodically
Use the batch API for bulk work: 50% savings on non-interactive workloads
Leverage caching: keep system prompts stable

10 stars

tools

Updated Apr 17, 2026

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 7, 2026, 2:28 AM (34.1s, 1 file scanned)

Install manually

# Clone the repo
git clone https://github.com/markus41/claude.git

# Copy into Claude Code skills folder (global)
cp -r claude/plugins/claude-code-expert/archive/v7.6.0/skills/cost-optimization ~/.claude/skills/

Repository

markus41/claude

10 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT