pyramidheadshark/.claude/skills/multimodal-router

Name: .claude/skills/multimodal-router
Author: pyramidheadshark

.claude/skills/multimodal-router/SKILL.md

npx skillsauth add pyramidheadshark/ml-claude-infra .claude/skills/multimodal-router

Multimodal Router

When to Load This Skill

Load when working with: PDF files, Word documents, Excel spreadsheets, images, audio, video files, or any document exceeding 400k tokens that cannot fit in Claude's standard context.

Model

Model: google/gemini-3-flash-preview
Provider: OpenRouter API
Context window: 1M tokens
Capabilities: text, images, audio, video, PDF — all natively
Thinking levels: minimal / low / medium / high (configurable per task)

Gemini 3 Flash Preview is a thinking model with near-Pro reasoning at Flash latency. Use thinking_level: "low" for document extraction, "medium" or "high" for complex analysis.

When to Use This Skill (Decision Rules)

Use Gemini 3 Flash via this skill when:

- Input is a PDF, image, audio file, or video
- Input document exceeds ~400k tokens (rough estimate: 800+ pages of text, at roughly 500 tokens per page)
- Task requires visual understanding (screenshots, diagrams, scanned docs)
- Client sent .docx, .pdf, .xlsx, .mp4, .wav for initial project analysis

Do NOT use for: writing code, architecture decisions, tests. Those stay with Claude Code.
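The decision rules above can be sketched as a small routing check. This is only an illustration, not part of the skill: the function name, the constant names, and the image extensions (.png, .jpg, .jpeg) are assumptions, while the other extensions and the 400k-token threshold come from the rules above.

```python
from pathlib import Path

# Extensions named in the decision rules, plus a few common image
# extensions added here as an assumption.
MULTIMODAL_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".mp4", ".wav",
    ".png", ".jpg", ".jpeg",
}

# Threshold from the decision rules (~400k tokens).
LARGE_DOCUMENT_TOKENS = 400_000


def should_route_to_gemini(path: Path, estimated_tokens: int = 0) -> bool:
    """Return True when the input matches one of the decision rules."""
    # Rule 1: the file type itself calls for multimodal handling.
    if path.suffix.lower() in MULTIMODAL_EXTENSIONS:
        return True
    # Rule 2: the document is too large for the standard context.
    return estimated_tokens > LARGE_DOCUMENT_TOKENS
```

The check is deliberately conservative: anything that is neither a listed file type nor over the threshold stays with Claude Code.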

OpenRouter Client Pattern

import httpx

from src.project_name.core.config import settings


OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
MULTIMODAL_MODEL = "google/gemini-3-flash-preview"


async def call_gemini_flash(
    prompt: str,
    base64_content: str | None = None,
    media_type: str | None = None,
    thinking_level: str = "low",
) -> str:
    messages: list[dict] = []

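    if base64_content and media_type:
        data_url = f"data:{media_type};base64,{base64_content}"
        if media_type.startswith("image/"):
            media_part = {"type": "image_url", "image_url": {"url": data_url}}
        else:
            # A "file" part needs a "file" object, not "image_url". This shape
            # follows OpenRouter's PDF input format; "document" is a placeholder
            # filename. Audio and video may need other part types, so check the
            # provider docs before sending them.
            media_part = {
                "type": "file",
                "file": {"filename": "document", "file_data": data_url},
            }
        messages.append({
            "role": "user",
            "content": [media_part, {"type": "text", "text": prompt}],
        })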
    else:
        messages.append({"role": "user", "content": prompt})

    payload = {
        "model": MULTIMODAL_MODEL,
        "messages": messages,
        "reasoning": {"effort": thinking_level},
        "max_tokens": 4096,
    }

    async with httpx.AsyncClient(timeout=120.0) as client:
        response = await client.post(
            f"{OPENROUTER_BASE_URL}/chat/completions",
            headers={
                "Authorization": f"Bearer {settings.openrouter_api_key}",
                "HTTP-Referer": "https://github.com/your-org/project",
                "X-Title": "ML Engineering Platform",
            },
            json=payload,
        )
        response.raise_for_status()
        data = response.json()
        return data["choices"][0]["message"]["content"]

PDF Analysis Pattern

import base64
from pathlib import Path


async def analyze_pdf(pdf_path: Path, analysis_prompt: str) -> str:
    pdf_bytes = pdf_path.read_bytes()
    b64 = base64.b64encode(pdf_bytes).decode("utf-8")
    return await call_gemini_flash(
        prompt=analysis_prompt,
        base64_content=b64,
        media_type="application/pdf",
        thinking_level="medium",
    )
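The PDF pattern hard-codes the media type. For other file types, one possible generalization is to guess the media type from the file name with the standard library. The helper name `encode_for_upload` is hypothetical and not part of the skill; it only prepares the two values that `call_gemini_flash` expects.

```python
import base64
import mimetypes
from pathlib import Path


def encode_for_upload(path: Path) -> tuple[str, str]:
    """Return (base64_content, media_type) for a local file."""
    # guess_type looks only at the file name, not at the file contents.
    media_type, _ = mimetypes.guess_type(path.name)
    if media_type is None:
        raise ValueError(f"Cannot determine media type for {path.name}")
    b64 = base64.b64encode(path.read_bytes()).decode("utf-8")
    return b64, media_type
```

The returned pair can be passed as `base64_content` and `media_type`. Failing loudly on an unknown extension avoids sending a request with a malformed data URL.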

Standard Analysis Prompts

For intake phase (analyzing client documents):

INTAKE_SYSTEM_PROMPT = """
You are analyzing a client document to extract structured requirements.
Return a JSON object with these fields:
- business_goal: str
- key_stakeholders: list[str]
- data_sources: list[dict with name, format, volume]
- use_cases: list[str]
- constraints: list[str]
- open_questions: list[str]

Be thorough. Every ambiguity should appear in open_questions.
Return ONLY valid JSON, no markdown fences.
"""

.env Keys Required

OPENROUTER_API_KEY=sk-or-...
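The client pattern reads the key from a project `settings` object that is not shown here. As a stand-in, a minimal sketch that reads the same variable straight from the environment is given below. The function name is hypothetical, and the sketch assumes the `.env` file has already been loaded into the process environment by the shell or by a loader of your choice.

```python
import os


def get_openrouter_api_key() -> str:
    """Return the OpenRouter key from the environment, or fail early."""
    key = os.environ.get("OPENROUTER_API_KEY", "")
    if not key:
        # Failing here gives a clearer error than a 401 from the API later.
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return key
```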

Cost Awareness

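Gemini 3 Flash Preview pricing on OpenRouter: ~$0.0005/1k input tokens, ~$0.003/1k output tokens. A 300-page PDF (≈150k tokens) costs approximately $0.075 in input tokens (150 × $0.0005); output tokens are billed on top, and a full 4,096-token response adds about $0.012. At these rates, cost is rarely a reason to skip the call.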
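The arithmetic behind that estimate can be written as a small helper. The rates are the approximate figures quoted above and may change; the function and constant names are illustrative.

```python
# Approximate rates quoted above, in USD per 1,000 tokens.
INPUT_USD_PER_1K = 0.0005
OUTPUT_USD_PER_1K = 0.003


def estimate_cost_usd(input_tokens: int, output_tokens: int = 0) -> float:
    """Estimate the cost of one call from its token counts."""
    input_cost = input_tokens / 1000 * INPUT_USD_PER_1K
    output_cost = output_tokens / 1000 * OUTPUT_USD_PER_1K
    return input_cost + output_cost


# A 300-page PDF at roughly 150k input tokens, input side only:
print(f"{estimate_cost_usd(150_000):.3f}")  # 0.075
```

Passing `output_tokens` as well gives the full cost of a call; with the 4,096-token cap used in the client pattern, the output side stays small next to the input side for large documents.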

For documents that need Pro-level reasoning (very complex technical analysis): use google/gemini-3-flash-preview with thinking_level: "high" before escalating to Pro.

4 stars

development

Updated Apr 15, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add pyramidheadshark/ml-claude-infra .claude/skills/multimodal-router

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Clean · Trivy · Container and dependency vulnerability scanner · 95%
Clean · Semgrep · Static code analysis for vulnerabilities · 95%
Clean · mcp-scan (Snyk) · Model Context Protocol security validation · 95%
Skipped · Snyk (dep) · Open source security scanning · 50%
Skipped · Socket.dev · Supply chain security analysis · 50%
Skipped · VirusTotal · Multi-engine malware detection · 50%
Skipped · CrowdStrike · Advanced threat intelligence · 50%
Skipped · OSV-Scanner · Open Source Vulnerability database check · 50%
Skipped · OWASP Dep-Check · 50%

Last scanned: Apr 15, 2026, 8:11 AM · 104.3s · 2 files scanned

Related Skills

pyramidheadshark/tests/fixtures/project-with-status/.claude/skills/design-doc-creator

testing

Verified · Trusted · Community

# Design Doc Creator ## When to Load This Skill Load when: design documents, requirements, new project start. Short fixture skill for testing (optional/meta skill).

4 · SKILL.md · Updated Apr 17, 2026

pyramidheadshark/.claude/skills/windows-developer

development

Verified · Trusted · Community

# Windows Developer Guide ## When to Load Automatically loaded on Windows (`platform_trigger: "win32"`). Applies to: `.py`, `.ps1`, `.bat`, `.cmd` files and any Windows-specific workflow. ## Python on Windows ### Encoding (CRITICAL) Windows defaults to `cp1251` / `cp1252` for file I/O. Always specify UTF-8 explicitly: ```python with open("file.txt", "r", encoding="utf-8") as f: content = f.read() Path("file.txt").read_text(encoding="utf-8") Path("file.txt").write_text(content, encodin

4 · SKILL.md · Updated Apr 15, 2026

pyramidheadshark/.claude/skills/test-first-patterns

development

Verified · Trusted · Community

# Test-First Patterns ## When to Load This Skill Load when writing tests, creating `.feature` files, setting up conftest, discussing test strategy, or reviewing coverage. ## Philosophy Tests are written BEFORE code. Always. No exceptions. The order is: Design Doc → BDD Scenarios → Unit Tests → Implementation. BDD scenarios come from the design document's use cases section — they are a direct translation of business requirements into executable specifications. This makes tests the living do

4 · SKILL.md · Updated Apr 15, 2026

pyramidheadshark/.claude/skills/supply-chain-auditor

testing

Verified · Trusted · Community

# Skill: Supply Chain Auditor ## When to Load Auto-load when: adding dependencies, reviewing packages, updating versions, or discussing `requirements.txt`, `pyproject.toml`, `package.json`. Triggers on `dependency`, `install`, `package`, `CVE`, `audit`, `vulnerable` (≥2 keywords). ## Core Rules Every new dependency addition must pass this checklist before merging: 1. **Pinned** — exact version in production (`==1.2.3` for pip, `"1.2.3"` for npm, not `^` or `~`). 2. **Maintained** — last com

4 · SKILL.md · Updated Apr 15, 2026

Install manually

# Clone the repo
git clone https://github.com/pyramidheadshark/ml-claude-infra.git

# Copy into Claude Code skills folder (global)
cp -r ml-claude-infra/.claude/skills/multimodal-router ~/.claude/skills/

For the skills folder location, see the official Claude Code Skills documentation.

Repository

pyramidheadshark/ml-claude-infra

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT