Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

glebis/vision-bench

Name: vision-bench
Author: glebis

vision-bench/SKILL.md

npx skillsauth add glebis/claude-skills vision-bench

Vision Bench — LLM Image Evaluation

Compare images by scoring them with one or more vision LLM judges against structured rubric criteria.

Quick Start

# Install dependencies
pip install pyyaml openai anthropic mistralai

# Score a single image
python bench.py image.png --criteria photorealism --judge gemini-2.5-flash

# Compare two AI-generated images
python bench.py img_a.png img_b.png \
  --criteria text_to_image \
  --prompt "a fox in a snowy forest" \
  --judge gpt-4o

# Multi-judge consensus
python bench.py img.png \
  --criteria portrait \
  --judges gpt-4o gemini-2.5-flash claude-opus-4-5-20251101

# OpenRouter models (any vision-capable model)
python bench.py img_a.png img_b.png \
  --criteria artistic_style \
  --judges "openrouter/meta-llama/llama-4-maverick" "openrouter/mistralai/pixtral-large-2411"

# List all presets
python bench.py --list-presets

# Save report to file
python bench.py img.png --criteria chart_analysis --save report.md
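The multi-judge example above implies that per-judge scores are combined somehow. The source does not say how, so the following is only a minimal sketch that assumes a plain per-criterion mean; the function name `consensus`, the criterion names, and the scores are all hypothetical.

```python
from statistics import mean


def consensus(scores_by_judge: dict) -> dict:
    """Average each criterion's score across every judge that reported it.

    Assumption: a simple mean. The real tool may weight judges or use
    another aggregation entirely.
    """
    collected = {}
    for judge_scores in scores_by_judge.values():
        for criterion, score in judge_scores.items():
            collected.setdefault(criterion, []).append(score)
    return {criterion: mean(values) for criterion, values in collected.items()}


# Hypothetical scores, not real judge output.
example = {
    "gpt-4o": {"lighting": 8, "anatomy": 6},
    "gemini-2.5-flash": {"lighting": 7, "anatomy": 7},
}
print(consensus(example))
```

A mean keeps the sketch short; a median would be less sensitive to one outlier judge.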

Presets

| Preset | Use Case |
|--------|----------|
| text_to_image | Compare AI image generators (Midjourney, DALL-E, Flux) |
| photorealism | How convincingly an image looks like a photo |
| artistic_style | Style consistency, composition, color harmony |
| portrait | AI-generated portrait quality and realism |
| product_photo | E-commerce product image quality |
| document_ocr | Document text extraction and layout understanding |
| chart_analysis | Chart and data visualization comprehension |
| invoice | Financial document field extraction accuracy |
| ui_screenshot | App/web screenshot understanding |
| scientific | Scientific/medical image accuracy |
| alt_text | Accessibility image description quality |

Custom criteria: pass any .yaml file as --criteria path/to/my.yaml.
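The schema of a criteria file is not shown here, so the sketch below assumes a hypothetical layout: a top-level `criteria` list whose items carry a `name` and an optional `weight`. It checks a document of that shape (as `yaml.safe_load` might return it) and normalizes the weights. Inspect one of the bundled presets in `criteria/` for the real field names before writing your own file.

```python
def normalize_criteria(doc: dict) -> list:
    """Validate a hypothetical criteria document and normalize its weights.

    Assumed shape (not confirmed by the source):
        {"criteria": [{"name": str, "weight": number}, ...]}
    """
    criteria = doc.get("criteria")
    if not isinstance(criteria, list) or not criteria:
        raise ValueError("expected a non-empty 'criteria' list")

    weights = []
    for item in criteria:
        if not isinstance(item, dict) or "name" not in item:
            raise ValueError(f"criterion is missing 'name': {item!r}")
        weight = float(item.get("weight", 1.0))
        if weight <= 0:
            raise ValueError(f"weight must be positive: {item!r}")
        weights.append(weight)

    total = sum(weights)
    return [
        {"name": item["name"], "weight": weight / total}
        for item, weight in zip(criteria, weights)
    ]


# Hypothetical custom rubric, written as the dict a YAML loader would produce.
my_rubric = {
    "criteria": [
        {"name": "legibility", "weight": 2},
        {"name": "contrast", "weight": 1},
        {"name": "layout", "weight": 1},
    ]
}
print(normalize_criteria(my_rubric))
```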

Judge Providers

| Prefix | Provider | Example |
|--------|----------|---------|
| gpt-, o1, o3, o4 | OpenAI | gpt-4o |
| claude- | Anthropic | claude-sonnet-4-5-20250929 |
| gemini- | Google Gemini | gemini-2.5-flash |
| pixtral-, mistral-, ministral- | Mistral | pixtral-12b-2409 |
| openrouter/ | OpenRouter (any model) | openrouter/meta-llama/llama-4-maverick |
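The prefix rules above amount to a first-match lookup on the model name. The sketch below shows one way that routing could work; it is not the code in `judge.py`, and the `provider_for` name and the lowercase provider labels are assumptions made for illustration.

```python
# Prefixes are taken from the Judge Providers table; the provider labels
# are hypothetical identifiers chosen for this sketch.
PROVIDER_PREFIXES = [
    ("openrouter/", "openrouter"),
    ("gpt-", "openai"),
    ("o1", "openai"),
    ("o3", "openai"),
    ("o4", "openai"),
    ("claude-", "anthropic"),
    ("gemini-", "gemini"),
    ("pixtral-", "mistral"),
    ("mistral-", "mistral"),
    ("ministral-", "mistral"),
]


def provider_for(model: str) -> str:
    """Return the provider whose prefix matches the start of the model name."""
    for prefix, provider in PROVIDER_PREFIXES:
        if model.startswith(prefix):
            return provider
    raise ValueError(f"no known provider prefix for model: {model!r}")


for name in ("gpt-4o", "gemini-2.5-flash", "openrouter/meta-llama/llama-4-maverick"):
    print(name, "->", provider_for(name))
```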

API Keys

Keys are loaded from secrets.enc.yaml (SOPS + age encrypted) with fallback to environment variables.

Supported keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENROUTER_API_KEY

To encrypt your own keys:

sops --config .sops.yaml --encrypt --input-type yaml --output-type yaml secrets.yaml > secrets.enc.yaml
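The lookup order described in this section (encrypted file first, environment variables second) can be sketched as follows. This is not the code in `vault.py`: the `load_key` function is hypothetical, and it assumes the key names sit at the top level of the decrypted file. It asks `sops` for JSON output so that only the standard library is needed for parsing.

```python
import json
import os
import shutil
import subprocess
from typing import Optional


def load_key(name: str, secrets_path: str = "secrets.enc.yaml") -> Optional[str]:
    """Try the SOPS-encrypted file first, then fall back to the environment."""
    if os.path.exists(secrets_path) and shutil.which("sops"):
        result = subprocess.run(
            ["sops", "--decrypt", "--output-type", "json", secrets_path],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            try:
                secrets = json.loads(result.stdout)
            except json.JSONDecodeError:
                secrets = {}
            # Assumption: keys such as OPENAI_API_KEY are top-level entries.
            if isinstance(secrets, dict) and secrets.get(name):
                return secrets[name]
    # Fallback: plain environment variable, or None if it is unset.
    return os.environ.get(name)


key = load_key("OPENAI_API_KEY")
print("OPENAI_API_KEY found" if key else "OPENAI_API_KEY not set")
```

Decryption failures are swallowed here so that the environment fallback still applies; a stricter version might log why `sops` failed.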

Output Formats

--output markdown (default) · --output json · --output table

Files

bench.py — CLI entry point
judge.py — Multi-provider LLM judge logic
report.py — Report generation
vault.py — SOPS secrets decryption
criteria/ — 11 YAML preset files
.sops.yaml — Age key config for encryption
secrets.enc.yaml — Encrypted API keys

Score and compare images using vision LLMs as judges. YAML-defined criteria presets for 11 use cases (text-to-image, photorealism, document OCR, charts, UI, portrait, product, scientific, invoice, alt-text, artistic style). Supports OpenAI, Anthropic, Gemini, Mistral, and OpenRouter as judge providers. Keys auto-decrypted via SOPS + age.

125 stars · testing · Updated Apr 24, 2026

npx skillsauth add glebis/claude-skills vision-bench

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Clean · Trivy · Container and dependency vulnerability scanner · 95%
Clean · Semgrep · Static code analysis for vulnerabilities · 95%
Clean · mcp-scan (Snyk) · Model Context Protocol security validation · 95%
Skipped · Snyk (dep) · Open source security scanning · 50%
Skipped · Socket.dev · Supply chain security analysis · 50%
Skipped · VirusTotal · Multi-engine malware detection · 50%
Skipped · CrowdStrike · Advanced threat intelligence · 50%
Skipped · OSV-Scanner · Open Source Vulnerability database check · 50%
Skipped · OWASP Dep-Check · 50%

Last scanned: Apr 24, 2026, 8:08 PM · 232.1s · 17 files scanned

SKILL.md

name: vision-bench
description: Score and compare images using vision LLMs as judges. YAML-defined criteria presets for 11 use cases (text-to-image, photorealism, document OCR, charts, UI, portrait, product, scientific, invoice, alt-text, artistic style). Supports OpenAI, Anthropic, Gemini, Mistral, and OpenRouter as judge providers. Keys auto-decrypted via SOPS + age.

Install manually

# Clone the repo
git clone https://github.com/glebis/claude-skills.git

# Copy into Claude Code skills folder (global)
cp -r claude-skills/vision-bench ~/.claude/skills/

Repository

glebis/claude-skills

125 stars

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT