Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

scientiacapital/cost-optimized-llm

Name: cost-optimized-llm
Author: scientiacapital

skills/cost-optimized-llm/SKILL.md

npx skillsauth add scientiacapital/scientia-superpowers cost-optimized-llm

Cost-Optimized LLM Routing

Achieve 70-90% cost savings with intelligent model routing. NO OpenAI allowed.

Critical Rule

NEVER use OpenAI models in this ecosystem.

Allowed providers:

Anthropic Claude (Haiku, Sonnet, Opus)
Google Gemini (Flash, Pro)
DeepSeek (via OpenRouter)
Qwen (via OpenRouter)
Cerebras (speed-critical)
Local: Ollama, sentence-transformers

Cost Comparison

| Model | Cost per 1M tokens | Use Case |
|-------|-------------------|----------|
| DeepSeek V3 | $0.14 input / $0.28 output | Simple queries, classification |
| Claude Haiku | $0.25 input / $1.25 output | Moderate complexity |
| Gemini Flash | FREE (limited) | MVP, prototyping |
| Claude Sonnet | $3.00 input / $15.00 output | Complex reasoning |
| Claude Opus | $15.00 input / $75.00 output | Expert tasks only |
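
To make the table concrete, the following sketch converts per-1M-token prices into the cost of a single request. The prices are copied from the table; the function name `request_cost`, the dictionary keys, and the sample token counts are illustrative assumptions, and provider prices change, so treat the figures as placeholders.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
# The short keys are hypothetical labels, not provider model IDs.
PRICES_PER_1M = {
    "deepseek-v3": (0.14, 0.28),
    "claude-haiku": (0.25, 1.25),
    "claude-sonnet": (3.00, 15.00),
    "claude-opus": (15.00, 75.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the table's per-1M prices."""
    input_price, output_price = PRICES_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

if __name__ == "__main__":
    # Hypothetical request: 1,000 input tokens and 500 output tokens.
    for name in PRICES_PER_1M:
        print(f"{name}: ${request_cost(name, 1_000, 500):.6f}")
```

Output tokens dominate the bill for the Claude tiers, so capping `max_tokens` matters as much as picking the tier.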

Tiered Routing Strategy

Tier 1: Simple Tasks → DeepSeek (~$0.0002/1K)

Use for:

Text classification
Simple extractions
Formatting
Basic Q&A
Sentiment analysis

import os

from openai import OpenAI  # OpenRouter exposes an OpenAI-compatible API; the SDK is only the client library

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"]
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=500
)

Tier 2: Moderate Tasks → Claude Haiku ($0.00075/1K)

Use for:

Code review
Summarization
Multi-step reasoning
Data analysis

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
)

Tier 3: Complex Tasks → Claude Sonnet ($0.009/1K)

Use for:

Architecture decisions
Complex code generation
Multi-file refactoring
Nuanced analysis

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}]
)

Automatic Routing Implementation

from enum import Enum
from typing import Optional

class TaskComplexity(Enum):
    SIMPLE = "simple"
    MODERATE = "moderate"
    COMPLEX = "complex"

def route_to_model(complexity: TaskComplexity) -> str:
    """Route to appropriate model based on complexity."""
    routing = {
        TaskComplexity.SIMPLE: "deepseek/deepseek-chat",
        TaskComplexity.MODERATE: "claude-3-5-haiku-20241022",
        TaskComplexity.COMPLEX: "claude-sonnet-4-20250514"
    }
    return routing[complexity]

def estimate_complexity(prompt: str) -> TaskComplexity:
    """Estimate task complexity from prompt characteristics."""
    # Simple heuristics: prompt length, presence of code, analysis keywords
    word_count = len(prompt.split())
    has_code = "```" in prompt or "def " in prompt or "function" in prompt
    has_analysis = any(w in prompt.lower() for w in ["analyze", "compare", "evaluate"])

    if word_count < 50 and not has_code and not has_analysis:
        return TaskComplexity.SIMPLE
    elif word_count < 200 or (has_code and not has_analysis):
        return TaskComplexity.MODERATE
    else:
        return TaskComplexity.COMPLEX

def smart_complete(prompt: str, force_model: Optional[str] = None) -> str:
    """Complete with automatic model routing."""
    if force_model:
        model = force_model
    else:
        complexity = estimate_complexity(prompt)
        model = route_to_model(complexity)

    # Route to appropriate client. call_openrouter and call_anthropic wrap the
    # Tier 1 and Tier 2/3 calls shown earlier; they are not defined in this snippet.
    # Any model ID that does not start with "deepseek" is sent to Anthropic.
    if model.startswith("deepseek"):
        return call_openrouter(model, prompt)
    else:
        return call_anthropic(model, prompt)
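
`smart_complete` relies on `call_openrouter` and `call_anthropic`, which the skill does not define. One possible shape for them is sketched below, reusing the client calls from the tier examples. The optional `client` parameter and the `max_tokens` defaults are assumptions added here so the helpers can be exercised with a stub instead of a live API; the SDKs are imported lazily for the same reason.

```python
import os

def call_openrouter(model: str, prompt: str, client=None, max_tokens: int = 500) -> str:
    """Send a prompt through OpenRouter's OpenAI-compatible endpoint."""
    if client is None:
        # Imported lazily so this module loads without the SDK installed.
        from openai import OpenAI
        client = OpenAI(
            base_url="https://openrouter.ai/api/v1",
            api_key=os.environ["OPENROUTER_API_KEY"],
        )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    # Chat completions return the text on the first choice's message.
    return response.choices[0].message.content

def call_anthropic(model: str, prompt: str, client=None, max_tokens: int = 1024) -> str:
    """Send a prompt to the Anthropic Messages API."""
    if client is None:
        import anthropic
        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    )
    # Messages responses carry a list of content blocks; take the first text block.
    return response.content[0].text
```

Passing a pre-built client also avoids constructing a new one on every request.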

Free Tier Strategy (Gemini Flash)

For MVPs and prototyping, use Gemini Flash (FREE):

import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content(prompt)

Limits:

15 requests/minute
1 million tokens/day
1,500 requests/day
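
Staying inside these limits usually needs a client-side guard. The sketch below is one way to enforce the request quotas listed above; the class name `FreeTierLimiter` is hypothetical, the daily quota is approximated with a rolling 24-hour window (the provider may reset at a fixed time instead), and the token-per-day limit is not tracked.

```python
import time
from collections import deque

class FreeTierLimiter:
    """Client-side guard for per-minute and per-day request quotas."""

    def __init__(self, per_minute: int = 15, per_day: int = 1_500, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_day = per_day
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.recent = deque()   # timestamps of accepted requests in the last 24 hours

    def try_acquire(self) -> bool:
        """Record a request and return True if both quotas allow it."""
        now = self.clock()
        # Drop timestamps that have aged out of the 24-hour window.
        while self.recent and now - self.recent[0] >= 86_400:
            self.recent.popleft()
        last_minute = sum(1 for t in self.recent if now - t < 60)
        if last_minute >= self.per_minute or len(self.recent) >= self.per_day:
            return False
        self.recent.append(now)
        return True
```

A caller would check `limiter.try_acquire()` before each `generate_content` call and fall back to a paid tier, or wait, when it returns `False`.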

Cost Tracking

Track costs per project:

import json
from datetime import datetime, timezone
from pathlib import Path

COST_LOG = Path.home() / ".claude" / "llm_costs.jsonl"

def log_cost(project: str, model: str, input_tokens: int, output_tokens: int) -> float:
    """Log LLM usage for cost tracking."""
    # (input, output) prices in USD per 1M tokens, matching the Cost Comparison table
    costs = {
        "deepseek/deepseek-chat": (0.14, 0.28),
        "claude-3-5-haiku-20241022": (0.25, 1.25),
        "claude-sonnet-4-20250514": (3.00, 15.00),
        "gemini-1.5-flash": (0, 0)  # Free
    }

    # Unknown models fall back to a conservative $10 / $30 per 1M tokens
    input_cost, output_cost = costs.get(model, (10.00, 30.00))
    total = (input_tokens / 1_000_000 * input_cost) + (output_tokens / 1_000_000 * output_cost)

    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "project": project,
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": round(total, 6)
    }

    # Create ~/.claude on first use so the append does not fail
    COST_LOG.parent.mkdir(parents=True, exist_ok=True)
    with open(COST_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

    return total
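
Once entries accumulate, the log can be rolled up per project. The sketch below assumes the JSONL layout written by `log_cost` (one JSON object per line with `project` and `cost_usd` fields); the function name `summarize_costs` is hypothetical and not part of the skill.

```python
import json
from collections import defaultdict
from typing import Iterable

def summarize_costs(lines: Iterable[str]) -> dict:
    """Return total logged USD cost per project from JSONL log lines."""
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        entry = json.loads(line)
        totals[entry["project"]] += entry["cost_usd"]
    return {project: round(total, 6) for project, total in totals.items()}

if __name__ == "__main__":
    # Hypothetical log lines; in practice pass an open file handle for the cost log.
    sample = [
        '{"project": "demo", "model": "deepseek/deepseek-chat", "cost_usd": 0.5}',
        '{"project": "demo", "model": "claude-sonnet-4-20250514", "cost_usd": 0.125}',
    ]
    print(summarize_costs(sample))
```

Because the function accepts any iterable of lines, `with open(COST_LOG) as f: summarize_costs(f)` works without loading the whole file into memory.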

Voice AI Cost Optimization

For voice pipelines (vozlux, solarvoice-ai):

STT (Speech-to-Text)

Deepgram Nova-2: $0.0043/min (recommended)
AssemblyAI: $0.00025/sec (≈ $0.015/min)

TTS (Text-to-Speech)

Cartesia Sonic-3: ~$0.01/1K chars (quality)
AWS Polly: ~$0.004/1K chars (budget)
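
The STT prices are quoted per minute or per second and the TTS prices per 1K characters, so comparing pipelines takes a little arithmetic. The sketch below uses the list prices above; the function names and the sample call (10 minutes of audio, 5,000 synthesized characters) are illustrative assumptions.

```python
def per_second_to_per_minute(price_per_second: float) -> float:
    """Convert a per-second price to a per-minute price."""
    return price_per_second * 60

def call_cost(minutes: float, stt_per_minute: float,
              tts_per_1k_chars: float, tts_chars: int) -> float:
    """Return the USD cost of STT plus TTS for one call (LLM cost excluded)."""
    stt = minutes * stt_per_minute
    tts = tts_chars / 1_000 * tts_per_1k_chars
    return stt + tts

if __name__ == "__main__":
    # Hypothetical 10-minute call that synthesizes 5,000 characters of speech.
    budget = call_cost(10, stt_per_minute=0.0043, tts_per_1k_chars=0.004, tts_chars=5_000)
    quality = call_cost(10, stt_per_minute=0.0043, tts_per_1k_chars=0.01, tts_chars=5_000)
    print(f"Deepgram + Polly: ${budget:.4f}")
    print(f"Deepgram + Cartesia: ${quality:.4f}")
```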

Tier-Based Voice Routing

def get_voice_tier(subscription: str) -> dict:
    tiers = {
        "starter": {
            "tts": "polly",
            "stt": "deepgram-base",
            "llm": "deepseek"
        },
        "pro": {
            "tts": "cartesia",
            "stt": "deepgram-nova",
            "llm": "haiku"
        },
        "enterprise": {
            "tts": "cartesia",
            "stt": "deepgram-nova",
            "llm": "sonnet"
        }
    }
    return tiers.get(subscription, tiers["starter"])

Monthly Budget Estimates

For a typical Scientia project:

| Usage Level | DeepSeek Heavy | Mixed Tier | Sonnet Heavy |
|-------------|----------------|------------|--------------|
| Light (10K queries) | $1.40 | $8 | $90 |
| Medium (100K queries) | $14 | $80 | $900 |
| Heavy (1M queries) | $140 | $800 | $9,000 |

Recommendation: Use Mixed Tier routing for 90%+ of use cases.
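
The budget rows scale linearly with query volume, so each column reduces to a per-query cost. The sketch below derives those per-query figures by dividing the Light row by 10,000 and then compares strategies; the names `PER_QUERY_USD`, `monthly_cost`, and `savings_vs` are hypothetical, and real costs depend on tokens per query.

```python
# USD per query, derived by dividing the table's Light row by 10,000 queries.
PER_QUERY_USD = {
    "deepseek_heavy": 1.40 / 10_000,
    "mixed_tier": 8 / 10_000,
    "sonnet_heavy": 90 / 10_000,
}

def monthly_cost(strategy: str, queries: int) -> float:
    """Return the estimated monthly USD cost for a query volume."""
    return PER_QUERY_USD[strategy] * queries

def savings_vs(baseline: str, strategy: str) -> float:
    """Return the fractional saving of `strategy` relative to `baseline`."""
    return 1 - PER_QUERY_USD[strategy] / PER_QUERY_USD[baseline]

if __name__ == "__main__":
    print(f"Mixed tier at 100K queries: ${monthly_cost('mixed_tier', 100_000):.2f}")
    print(f"Saving vs Sonnet heavy: {savings_vs('sonnet_heavy', 'mixed_tier'):.0%}")
```

By these figures, Mixed Tier costs roughly a tenth of the Sonnet Heavy column at every volume.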

Environment Variables

Required in .env:

# Primary (Anthropic)
ANTHROPIC_API_KEY=sk-ant-...

# Cost optimization (OpenRouter for DeepSeek)
OPENROUTER_API_KEY=sk-or-...

# Free tier (Google)
GOOGLE_API_KEY=AIza...

# NEVER set these:
# OPENAI_API_KEY=  # FORBIDDEN

Validation

lang-core enforces NO OpenAI at runtime:

import os

def validate_environment():
    """Block OpenAI usage."""
    if os.environ.get("OPENAI_API_KEY"):
        raise EnvironmentError(
            "OpenAI is not allowed in Scientia projects. "
            "Use ANTHROPIC_API_KEY or OPENROUTER_API_KEY instead."
        )
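
The same check can be extended to report missing keys as well as forbidden ones. This is a sketch only and not lang-core's implementation: the function name `check_environment` and the two tuples are assumptions, with the key names taken from the `.env` listing above. It accepts any mapping so it can be exercised without touching the real environment.

```python
import os
from typing import Mapping

# Key names taken from the .env section; the tuple names are hypothetical.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "OPENROUTER_API_KEY", "GOOGLE_API_KEY")
FORBIDDEN_KEYS = ("OPENAI_API_KEY",)

def check_environment(env: Mapping[str, str] = os.environ) -> list:
    """Return a list of problems; an empty list means the environment is valid."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not env.get(key)]
    problems += [f"forbidden {key} is set" for key in FORBIDDEN_KEYS if env.get(key)]
    return problems

if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Returning a list rather than raising on the first failure lets a startup script report every problem at once.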

Implement cost-optimized LLM routing with NO OpenAI. Use tiered model selection (DeepSeek, Haiku, Sonnet) to achieve 70-90% cost savings. Triggers on "LLM costs", "model selection", "cost optimization", "which model", "DeepSeek", "Claude pricing", "reduce AI costs".

testing

Updated Apr 13, 2026

npx skillsauth add scientiacapital/scientia-superpowers cost-optimized-llm

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Clean:

Trivy: container and dependency vulnerability scanner (95%)
Semgrep: static code analysis for vulnerabilities (95%)
mcp-scan (Snyk): Model Context Protocol security validation (95%)

Skipped:

Snyk (dep): open source security scanning (50%)
Socket.dev: supply chain security analysis (50%)
VirusTotal: multi-engine malware detection (50%)
CrowdStrike: advanced threat intelligence (50%)
OSV-Scanner: Open Source Vulnerability database check (50%)
OWASP Dep-Check (50%)

Last scanned: Apr 13, 2026, 2:46 AM (39.8s, 1 file scanned)

Install manually

# Clone the repo
git clone https://github.com/scientiacapital/scientia-superpowers.git

# Copy into Claude Code skills folder (global)
cp -r scientia-superpowers/skills/cost-optimized-llm ~/.claude/skills/

Repository: scientiacapital/scientia-superpowers

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT