Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

ViggyV/LLM Optimizer

Name: LLM Optimizer
Author: ViggyV

claude-desktop-skills/llm-optimizer/SKILL.md

npx skillsauth add ViggyV/claude-skills LLM Optimizer

LLM Optimizer

You are an expert at optimizing LLM applications for performance, cost, and quality.

Activation

This skill activates when the user needs help with:

Reducing LLM API costs
Improving response latency
Optimizing prompt efficiency
Model selection and routing
Caching strategies
Fine-tuning decisions

Process

1. Optimization Assessment

Ask about:

Current LLM usage patterns
Monthly API costs
Latency requirements
Quality benchmarks
Use case breakdown

2. Cost Optimization Strategies

Model Selection Matrix:

| Use Case | Recommended Model | Cost/1K tokens |
|----------|------------------|----------------|
| Simple classification | GPT-3.5 / Claude Haiku | $0.0005 |
| General chat | GPT-4o-mini / Claude Sonnet | $0.003 |
| Complex reasoning | GPT-4o / Claude Opus | $0.015 |
| Code generation | Claude Sonnet / GPT-4o | $0.005 |
| Embeddings | text-embedding-3-small | $0.00002 |
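To make the matrix concrete, monthly spend can be estimated from call volume and average tokens per call. The following is a minimal sketch, assuming a single blended per-1K-token price as listed in the matrix; `estimate_monthly_cost` and the example workload are hypothetical, and real pricing usually differs between input and output tokens.

```python
def estimate_monthly_cost(calls_per_month: int, tokens_per_call: int, price_per_1k: float) -> float:
    """Return estimated monthly spend in dollars for one use case."""
    total_tokens = calls_per_month * tokens_per_call
    return total_tokens / 1000 * price_per_1k

# Hypothetical workload: 100,000 calls per month at 800 tokens each.
simple = estimate_monthly_cost(100_000, 800, 0.0005)   # simple-classification price
complex_ = estimate_monthly_cost(100_000, 800, 0.015)  # complex-reasoning price
print(f"simple tier: ${simple:.2f}, complex tier: ${complex_:.2f}")
```

Under these assumed volumes the same traffic costs about 30 times more on the complex-reasoning tier, which is why matching each use case to the smallest adequate model matters.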

Token Reduction:

# Before: Verbose prompt (500 tokens)
prompt = """
Please analyze the following text and provide a detailed
summary. Make sure to capture all the key points and
present them in a clear, organized manner...
"""

# After: Efficient prompt (150 tokens)
prompt = """
Summarize key points:
{text}

Format: bullet points, max 5
"""
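The compact prompt contains a `{text}` placeholder that has to be filled before sending. The sketch below shows one way to do that with `str.format` and computes the relative saving from the token counts quoted in the comments; `PROMPT_TEMPLATE`, `build_prompt`, and `token_reduction` are hypothetical names, and the sample input is made up.

```python
PROMPT_TEMPLATE = """Summarize key points:
{text}

Format: bullet points, max 5
"""

def build_prompt(text: str) -> str:
    """Fill the {text} placeholder in the compact template."""
    return PROMPT_TEMPLATE.format(text=text)

def token_reduction(before_tokens: int, after_tokens: int) -> float:
    """Fraction of prompt tokens saved by the rewrite."""
    return (before_tokens - after_tokens) / before_tokens

prompt = build_prompt("Quarterly revenue rose while support tickets fell.")
saved = token_reduction(500, 150)  # token counts taken from the before/after comments
print(f"{saved:.0%} fewer prompt tokens")
```

The saving applies to the fixed instruction part of the prompt only; the tokens of the inserted text are paid for either way.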

3. Latency Optimization

Streaming:

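# Enable streaming so users see output sooner (lower perceived latency)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
# messages: list of {"role": ..., "content": ...} dicts built by the caller
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    stream=True
)
for chunk in response:
    # Some chunks carry no text (e.g. the final one), so skip empty deltas
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)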
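Streaming does not shorten total generation time; it shortens the wait for the first visible output. A minimal sketch for measuring both figures follows. It uses a hypothetical `fake_stream` generator as a stand-in for a real streaming response so that it runs without an API key; `consume_stream` is likewise an illustrative helper, not part of any SDK.

```python
import time
from typing import Iterable, Iterator, Tuple

def fake_stream(words: Iterable[str], delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming response: yields text pieces with a delay."""
    for word in words:
        time.sleep(delay)
        yield word

def consume_stream(stream: Iterable[str]) -> Tuple[str, float, float]:
    """Collect a stream; return (full_text, seconds_to_first_piece, total_seconds)."""
    start = time.perf_counter()
    first_piece_at = None
    pieces = []
    for piece in stream:
        if first_piece_at is None:
            first_piece_at = time.perf_counter() - start
        pieces.append(piece)
    total = time.perf_counter() - start
    # An empty stream has no first piece; fall back to the total time
    ttft = first_piece_at if first_piece_at is not None else total
    return "".join(pieces), ttft, total

text, ttft, total = consume_stream(fake_stream(["Stream", "ing ", "helps"]))
print(f"first piece after {ttft:.3f}s, complete after {total:.3f}s")
```

Tracking time to first piece separately from total time makes it possible to tell whether a latency complaint is about responsiveness or about overall throughput.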

Parallel Processing:

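import asyncio

async def batch_llm_calls(prompts):
    # call_llm is assumed to be an async function, defined elsewhere,
    # that sends one prompt and returns the model's response
    tasks = [call_llm(p) for p in prompts]
    # gather runs the calls concurrently and returns results in input order
    return await asyncio.gather(*tasks)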
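Launching every call at once can run into provider rate limits. One common variation, sketched here under the assumption that a fixed concurrency cap is acceptable, wraps each call in an `asyncio.Semaphore`. The `call_llm` stub and the name `batch_llm_calls_bounded` are hypothetical and stand in for a real async client call.

```python
import asyncio

async def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real async API call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to: {prompt}"

async def batch_llm_calls_bounded(prompts, max_concurrency: int = 5):
    """Run calls concurrently, but never more than max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt):
        async with semaphore:
            return await call_llm(prompt)

    # return_exceptions=True keeps one failed call from discarding the rest
    return await asyncio.gather(*(guarded(p) for p in prompts), return_exceptions=True)

results = asyncio.run(
    batch_llm_calls_bounded([f"prompt {i}" for i in range(10)], max_concurrency=3)
)
print(len(results), results[0])
```

Results come back in the same order as the input prompts, so they can be zipped with the original list; any exceptions appear in place of the failed response and should be checked by the caller.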

Caching Strategy:

import hashlib
from sentence_transformers import SentenceTransformer, util

def cache_key(prompt, model):
    # Exact-match key: identical (model, prompt) pairs map to the same entry
    return hashlib.md5(f"{model}:{prompt}".encode()).hexdigest()

# Semantic caching for similar queries
embedder = SentenceTransformer('all-MiniLM-L6-v2')

def find_cached_response(query, cache, threshold=0.95):
    # cache: iterable of (cached_query_embedding, response) pairs,
    # where each embedding was produced by embedder.encode()
    query_embedding = embedder.encode(query)
    for cached_embedding, response in cache:
        similarity = util.cos_sim(query_embedding, cached_embedding).item()
        if similarity > threshold:
            return response
    return None
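Cached answers can go stale, so exact-match caches are usually paired with an expiry. The following is a minimal in-memory sketch, assuming a fixed time-to-live is acceptable; `ExactMatchCache` is a hypothetical class, and it uses SHA-256 instead of the MD5 key shown above (either works for cache keys, since the hash is not used for security here).

```python
import hashlib
import time

class ExactMatchCache:
    """In-memory cache keyed by a hash of (model, prompt), with a time-to-live."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, response)

    @staticmethod
    def _key(prompt: str, model: str) -> str:
        return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

    def get(self, prompt: str, model: str):
        key = self._key(prompt, model)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # drop stale entries lazily on lookup
            return None
        return response

    def put(self, prompt: str, model: str, response: str) -> None:
        expires_at = time.monotonic() + self.ttl_seconds
        self._store[self._key(prompt, model)] = (expires_at, response)

cache = ExactMatchCache(ttl_seconds=60)
cache.put("What is caching?", "gpt-4o-mini", "Storing results for reuse.")
hit = cache.get("What is caching?", "gpt-4o-mini")
miss = cache.get("What is caching?", "gpt-4o")  # different model, different key
```

Checking the exact-match cache before the semantic one keeps the common case cheap, because a hash lookup avoids computing an embedding at all.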

4. Quality vs Cost Tradeoffs

Model Routing:

def route_to_model(query, complexity_score):
    if complexity_score < 0.3:
        return "gpt-3.5-turbo"  # Simple queries
    elif complexity_score < 0.7:
        return "gpt-4o-mini"    # Medium complexity
    else:
        return "gpt-4o"         # Complex reasoning

def estimate_complexity(query):
    # Use lightweight classifier or heuristics
    q = query.lower()  # normalize once so every check is case-insensitive
    signals = {
        'length': len(q.split()) > 100,
        'technical': any(t in q for t in ['analyze', 'compare', 'explain why']),
        'multi_step': 'and then' in q or 'step by step' in q
    }
    # Fraction of signals that fired: 0, 1/3, 2/3, or 1
    return sum(signals.values()) / len(signals)
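The payoff of routing depends on how traffic splits across tiers. A rough way to estimate it, sketched below, is to compute a blended price per 1K tokens. The prices are copied from the model selection matrix; mapping its first three rows to "simple", "medium", and "complex" tiers and the 60/30/10 traffic mix are assumptions made for illustration.

```python
# Per-1K-token prices copied from the model selection matrix (illustrative figures)
PRICE_PER_1K = {"simple": 0.0005, "medium": 0.003, "complex": 0.015}

def blended_price(traffic_share: dict) -> float:
    """Average price per 1K tokens for a given routing mix (shares must sum to 1)."""
    if abs(sum(traffic_share.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(PRICE_PER_1K[tier] * share for tier, share in traffic_share.items())

# Hypothetical mix: 60% simple, 30% medium, 10% complex queries
routed = blended_price({"simple": 0.6, "medium": 0.3, "complex": 0.1})
unrouted = PRICE_PER_1K["complex"]  # everything sent to the largest model
savings = 1 - routed / unrouted
print(f"routed: ${routed:.4f}/1K, savings vs. largest model only: {savings:.0%}")
```

The estimate ignores quality loss from misrouted queries, so it is an upper bound on savings that should be checked against the quality benchmarks gathered during assessment.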

5. Fine-tuning Decision Framework

When to fine-tune:

Consistent format requirements
Domain-specific terminology
Reducing prompt size significantly
Improving latency for specific tasks

When NOT to fine-tune:

Rapidly changing requirements
Small dataset (<100 examples)
General-purpose applications
When prompt engineering suffices

Cost comparison:

Prompt Engineering: $0/setup, higher per-call
Fine-tuning: $50-500/setup, lower per-call
Break-even: ~10,000-50,000 calls
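The break-even figure follows from dividing the one-time setup cost by the saving per call. A small sketch of that arithmetic is shown here; `break_even_calls` is a hypothetical helper, the setup costs are the two ends of the range quoted above, and the per-call costs are assumed values chosen for illustration.

```python
import math

def break_even_calls(setup_cost: float, cost_per_call_prompted: float,
                     cost_per_call_finetuned: float) -> int:
    """Number of calls after which fine-tuning becomes cheaper overall."""
    saving_per_call = cost_per_call_prompted - cost_per_call_finetuned
    if saving_per_call <= 0:
        raise ValueError("fine-tuning never pays off without a per-call saving")
    return math.ceil(setup_cost / saving_per_call)

# Assumed per-call costs; only the setup costs come from the comparison above
low = break_even_calls(50.0, 0.010, 0.005)    # $50 setup, $0.005 saved per call
high = break_even_calls(500.0, 0.020, 0.010)  # $500 setup, $0.010 saved per call
print(low, high)
```

With these assumed savings the results land at 10,000 and 50,000 calls, matching the quoted range; a smaller per-call saving pushes the break-even point out proportionally.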

Output Format

Provide:

Current cost/performance analysis
Specific optimization recommendations
Implementation code snippets
Expected savings/improvements
Monitoring metrics to track
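For the monitoring item, a small accumulator is often enough to start with. The sketch below is one possible shape, not a prescribed one: `LLMMetrics` and its fields are hypothetical, cached responses are assumed to cost nothing, and the sample figures are invented.

```python
from dataclasses import dataclass, field
from statistics import quantiles

@dataclass
class LLMMetrics:
    """Accumulates per-call figures for later reporting."""
    latencies_s: list = field(default_factory=list)
    total_tokens: int = 0
    total_cost: float = 0.0
    cache_hits: int = 0

    def record(self, latency_s: float, tokens: int, price_per_1k: float,
               cache_hit: bool = False) -> None:
        self.latencies_s.append(latency_s)
        self.cache_hits += int(cache_hit)
        if not cache_hit:  # cached responses are assumed to incur no API cost
            self.total_tokens += tokens
            self.total_cost += tokens / 1000 * price_per_1k

    def summary(self) -> dict:
        calls = len(self.latencies_s)
        if calls >= 2:
            # quantiles(n=20) returns 19 cut points; the last is the 95th percentile
            p95 = quantiles(self.latencies_s, n=20)[-1]
        else:
            p95 = self.latencies_s[0] if calls else 0.0
        return {
            "calls": calls,
            "cost_per_call": self.total_cost / calls if calls else 0.0,
            "cache_hit_rate": self.cache_hits / calls if calls else 0.0,
            "p95_latency_s": p95,
        }

metrics = LLMMetrics()
metrics.record(0.8, 1200, 0.003)
metrics.record(0.05, 1200, 0.003, cache_hit=True)
metrics.record(1.4, 900, 0.003)
report = metrics.summary()
print(report)
```

Cost per call, cache hit rate, and tail latency map directly onto the cost, caching, and latency sections, so changes in any of them can be traced back to a specific optimization.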

4 stars

testing

Updated Apr 18, 2026

Install this skill globally with one command:

npx skillsauth add ViggyV/claude-skills LLM Optimizer

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report

Trivy: Container and dependency vulnerability scanner. Clean, 95%.
Semgrep: Static code analysis for vulnerabilities. Clean, 95%.
mcp-scan (Snyk): Model Context Protocol security validation. Clean, 95%.
Snyk (dep): Open source security scanning. Skipped, 50%.
Socket.dev: Supply chain security analysis. Skipped, 50%.
VirusTotal: Multi-engine malware detection. Skipped, 50%.
CrowdStrike: Advanced threat intelligence. Skipped, 50%.
OSV-Scanner: Open Source Vulnerability database check. Skipped, 50%.
OWASP Dep-Check: Skipped, 50%.

Last scanned: Apr 18, 2026, 2:30 AM | 31.3s | 1 file scanned

Related Skills

ViggyV/stable-baselines3 (data-ai)
Verified | Trusted | Community
Use this skill for reinforcement learning tasks including training RL agents (PPO, SAC, DQN, TD3, DDPG, A2C, etc.), creating custom Gym environments, implementing callbacks for monitoring and control,
4 | SKILL.md | Updated Apr 18, 2026

ViggyV/SQL Optimizer (testing)
Verified | Trusted | Community
You are an expert at optimizing SQL queries for performance and efficiency.
4 | SKILL.md | Updated Apr 18, 2026

ViggyV/slack-gif-creator (tools)
Verified | Trusted | Community
Knowledge and utilities for creating animated GIFs optimized for Slack. Provides constraints, validation tools, and animation concepts. Use when users request animated GIFs for Slack like "make me a G
4 | SKILL.md | Updated Apr 18, 2026

ViggyV/ios-simulator-skill (tools)
Verified | Trusted | Community
21 production-ready scripts for iOS app testing, building, and automation. Provides semantic UI navigation, build automation, accessibility testing, and simulator lifecycle management. Optimized for A
4 | SKILL.md | Updated Apr 18, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/ViggyV/claude-skills.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r claude-skills/claude-desktop-skills/llm-optimizer ~/.claude/skills/

Repository

ViggyV/claude-skills

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT