Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

escapeboy/cache-inspector

Name: cache-inspector
Author: escapeboy

01-global-optimization/skills/cache-inspector/SKILL.md

npx skillsauth add escapeboy/ai-prompts cache-inspector

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

/cache-inspector — Prompt Cache Monitor

Inspects the Claude prompt caching system to report hit rates, cost savings, and optimization opportunities. Prompt caching saves 90% on re-reads of large content (system prompts, memories, tool definitions).

Usage

/cache-inspector [action]

Quick Examples

/cache-inspector             # Show current status (default)
/cache-inspector status      # Detailed cache status
/cache-inspector analyze     # Performance analysis with trends
/cache-inspector optimize    # Get actionable recommendations
/cache-inspector report      # Full report (save to file)
/cache-inspector clear       # Clear cache (for testing only)

Actions

`status` (default) — Current cache status

Shows the current state of the prompt cache.

/cache-inspector status

Output:

## Prompt Cache Status

### Active Cache Entries
| Content | Size | Status | TTL |
|---------|------|--------|-----|
| System prompt (global-optimization.md) | 3.2K tokens | CACHED ✅ | 58 min |
| Tool definitions (Serena MCP) | 8.4K tokens | CACHED ✅ | 55 min |
| memory: architecture.md | 2.1K tokens | CACHED ✅ | 52 min |
| memory: codebase-conventions.md | 1.8K tokens | CACHED ✅ | 52 min |
| memory: testing-strategy.md | 1.2K tokens | WARMING 🟡 | — |

### Session Stats
Cache hits: 47 / 54 reads (87%)
Tokens saved this session: 412K
Estimated cost saved: $1.24

### Cache Health: ✅ Excellent

`analyze` — Performance analysis

Deep analysis of cache performance with trends and comparison to baselines.

/cache-inspector analyze

Output:

## Cache Performance Analysis

### Hit Rate Trend
Session 1: 45% (warming)
Session 2: 71% (good)
Session 3: 87% (excellent) ← current
Target: >80% ✅

### Cost Impact
Without caching (estimated): $3.82
With caching (actual): $0.58
Savings: $3.24 (85%)

### Most Cached Content
1. Tool definitions (Serena): 8.4K — 12 reads, 11 cache hits (92%)
2. System prompt: 3.2K — 8 reads, 8 cache hits (100%)
3. architecture.md: 2.1K — 15 reads, 13 cache hits (87%)

### Optimization Opportunities
⚠️ testing-strategy.md: loaded 4 times but never cached (too small at 800 tokens)
   → Recommend expanding to >1024 tokens or combining with codebase-conventions.md

⚠️ 3 file reads bypassed caching (used Read tool directly)
   → Use /context load at session start to ensure memories are pre-cached

`optimize` — Get actionable recommendations

Analyzes current usage and provides specific recommendations to improve cache hit rate.

/cache-inspector optimize

Output:

## Cache Optimization Recommendations

### Priority 1: Pre-warm Cache at Session Start
Current: Cache warms gradually (first 3-4 messages cost full tokens)
Fix: Always run /context load before starting work

Expected improvement: +15% hit rate, -30K tokens per session

### Priority 2: Expand Small Memories
testing-strategy.md (800 tokens) is below the 1024-token cache threshold
Fix: Add more detail or merge with codebase-conventions.md

Expected improvement: +8% hit rate on test-related tasks

### Priority 3: Cache Constitution File
.claude/settings/constitution.json is read 6 times without caching
Fix: Move architectural rules to a memory file via /context save constitution

Expected improvement: +5% hit rate on all tasks

### Current Score: 72/100
After applying recommendations: 89/100 (estimated)

`report` — Full detailed report

Generates a complete cache performance report and saves it to .claude/learnings/cache-performance.md.

/cache-inspector report
/cache-inspector report --save    # Also saves to file

The report includes:

Session-by-session hit rate history
Cost breakdown (cached vs uncached reads)
Content inventory with sizes
TTL expiry timeline
Optimization score
All recommendations

`clear` — Clear cache entries

Warning: Clears all cached content. Only use for testing or when cache is stale.

/cache-inspector clear

Claude will ask for confirmation before clearing. After clearing, the next session will re-warm the cache (costs full tokens once).

How Prompt Caching Works

Claude's prompt caching (Anthropic API feature) stores frequently-read content server-side for 10 minutes (ephemeral) or 1 hour (with explicit cache control).

What gets cached: | Content | Size | Cache benefit | |---------|------|---------------| | System prompts | 2-5K tokens | 90% cost reduction on re-reads | | MCP tool definitions | 5-15K tokens | 90% cost reduction | | Serena memories | 1-3K each | 90% cost reduction | | Constitution files | 0.5-2K | 90% cost reduction | | Large spec documents | 5-20K | 90% cost reduction during impl |

Minimum size: Content must be ≥1024 tokens to be eligible for caching.

Cache TTL: Two tiers via cache_control: { type: "ephemeral" } — ttl: "5m" (default) or ttl: "1h" (opt-in). Pricing: 5m write 1.25× base input, 1h write 2.00×, cache hit 0.10×. Mixed TTLs in the same request are reported separately as ephemeral_5m_input_tokens / ephemeral_1h_input_tokens.

Cache Configuration

The cache is configured in ~/.claude/settings/prompt-caching.json. Key settings:

{
  "cache_control": {
    "type": "ephemeral",
    "auto_enable": true
  },
  "caching_rules": {
    "system_prompts": { "enabled": true, "min_tokens": 1024 },
    "tool_definitions": { "enabled": true, "min_tokens": 1024 },
    "memories": { "enabled": true, "min_tokens": 1024 }
  }
}

To modify: edit ~/.claude/settings/prompt-caching.json and reload Claude Code.

Target Metrics

| Metric | Poor | Good | Excellent | |--------|------|------|-----------| | Cache hit rate | <50% | 60-80% | >80% | | Token savings | <30% | 50-70% | >80% | | Session cost | >$3 | $0.75-$1.50 | <$0.75 | | Warmup time | >5 messages | 2-4 messages | 1-2 messages |

Troubleshooting

"Cache hit rate is low (<50%)"

Common causes:

Not loading memories first — run /context load before starting work
Sessions too short — cache needs 2-3 messages to warm up
Content below threshold — memories under 1024 tokens won't cache
Different content each time — variable prompts can't be cached

Run /cache-inspector optimize for specific recommendations.

"Cache entries expire quickly"

Cache TTL is 10-60 minutes. For long sessions:

Keep the conversation active (don't idle for >10 min)
Re-run /context load if you've been away

"No cache data available"

Cache metrics are only available when Serena MCP is connected and prompt-caching.json is configured. Run /optimize status to check configuration.

escapeboy/cache-inspector

01-global-optimization/skills/cache-inspector/SKILL.md

Analyze prompt cache hit rates, estimate cost savings from cached system prompts and memories, and recommend improvements to caching strategy. Use when checking cache performance, investigating high token costs, optimizing cache hit rates, or diagnosing slow cache warmup.

67 stars

testing

Updated Apr 17, 2026

$ install --global

skillsauth

npx skillsauth add escapeboy/ai-prompts cache-inspector

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 17, 2026, 12:17 PM4.7s1 file scanned

SKILL.md

name:: cache-inspector
description:: Analyze prompt cache hit rates, estimate cost savings from cached system prompts and memories, and recommend improvements to caching strategy. Use when checking cache performance, investigating high token costs, optimizing cache hit rates, or diagnosing slow cache warmup.
version:: 1.0.0

/cache-inspector — Prompt Cache Monitor

Usage

/cache-inspector [action]

Quick Examples

/cache-inspector             # Show current status (default)
/cache-inspector status      # Detailed cache status
/cache-inspector analyze     # Performance analysis with trends
/cache-inspector optimize    # Get actionable recommendations
/cache-inspector report      # Full report (save to file)
/cache-inspector clear       # Clear cache (for testing only)

Actions

`status` (default) — Current cache status

Shows the current state of the prompt cache.

/cache-inspector status

Output:

## Prompt Cache Status

### Active Cache Entries
| Content | Size | Status | TTL |
|---------|------|--------|-----|
| System prompt (global-optimization.md) | 3.2K tokens | CACHED ✅ | 58 min |
| Tool definitions (Serena MCP) | 8.4K tokens | CACHED ✅ | 55 min |
| memory: architecture.md | 2.1K tokens | CACHED ✅ | 52 min |
| memory: codebase-conventions.md | 1.8K tokens | CACHED ✅ | 52 min |
| memory: testing-strategy.md | 1.2K tokens | WARMING 🟡 | — |

### Session Stats
Cache hits: 47 / 54 reads (87%)
Tokens saved this session: 412K
Estimated cost saved: $1.24

### Cache Health: ✅ Excellent

`analyze` — Performance analysis

Deep analysis of cache performance with trends and comparison to baselines.

/cache-inspector analyze

Output:

## Cache Performance Analysis

### Hit Rate Trend
Session 1: 45% (warming)
Session 2: 71% (good)
Session 3: 87% (excellent) ← current
Target: >80% ✅

### Cost Impact
Without caching (estimated): $3.82
With caching (actual): $0.58
Savings: $3.24 (85%)

### Most Cached Content
1. Tool definitions (Serena): 8.4K — 12 reads, 11 cache hits (92%)
2. System prompt: 3.2K — 8 reads, 8 cache hits (100%)
3. architecture.md: 2.1K — 15 reads, 13 cache hits (87%)

### Optimization Opportunities
⚠️ testing-strategy.md: loaded 4 times but never cached (too small at 800 tokens)
   → Recommend expanding to >1024 tokens or combining with codebase-conventions.md

⚠️ 3 file reads bypassed caching (used Read tool directly)
   → Use /context load at session start to ensure memories are pre-cached

`optimize` — Get actionable recommendations

Analyzes current usage and provides specific recommendations to improve cache hit rate.

/cache-inspector optimize

Output:

## Cache Optimization Recommendations

### Priority 1: Pre-warm Cache at Session Start
Current: Cache warms gradually (first 3-4 messages cost full tokens)
Fix: Always run /context load before starting work

Expected improvement: +15% hit rate, -30K tokens per session

### Priority 2: Expand Small Memories
testing-strategy.md (800 tokens) is below the 1024-token cache threshold
Fix: Add more detail or merge with codebase-conventions.md

Expected improvement: +8% hit rate on test-related tasks

### Priority 3: Cache Constitution File
.claude/settings/constitution.json is read 6 times without caching
Fix: Move architectural rules to a memory file via /context save constitution

Expected improvement: +5% hit rate on all tasks

### Current Score: 72/100
After applying recommendations: 89/100 (estimated)

`report` — Full detailed report

Generates a complete cache performance report and saves it to .claude/learnings/cache-performance.md.

/cache-inspector report
/cache-inspector report --save    # Also saves to file

The report includes:

Session-by-session hit rate history
Cost breakdown (cached vs uncached reads)
Content inventory with sizes
TTL expiry timeline
Optimization score
All recommendations

`clear` — Clear cache entries

Warning: Clears all cached content. Only use for testing or when cache is stale.

/cache-inspector clear

Claude will ask for confirmation before clearing. After clearing, the next session will re-warm the cache (costs full tokens once).

How Prompt Caching Works

Claude's prompt caching (Anthropic API feature) stores frequently-read content server-side for 10 minutes (ephemeral) or 1 hour (with explicit cache control).

Minimum size: Content must be ≥1024 tokens to be eligible for caching.

Cache Configuration

The cache is configured in ~/.claude/settings/prompt-caching.json. Key settings:

{
  "cache_control": {
    "type": "ephemeral",
    "auto_enable": true
  },
  "caching_rules": {
    "system_prompts": { "enabled": true, "min_tokens": 1024 },
    "tool_definitions": { "enabled": true, "min_tokens": 1024 },
    "memories": { "enabled": true, "min_tokens": 1024 }
  }
}

To modify: edit ~/.claude/settings/prompt-caching.json and reload Claude Code.

Target Metrics

Troubleshooting

"Cache hit rate is low (<50%)"

Common causes:

Not loading memories first — run /context load before starting work
Sessions too short — cache needs 2-3 messages to warm up
Content below threshold — memories under 1024 tokens won't cache
Different content each time — variable prompts can't be cached

Run /cache-inspector optimize for specific recommendations.

"Cache entries expire quickly"

Cache TTL is 10-60 minutes. For long sessions:

Keep the conversation active (don't idle for >10 min)
Re-run /context load if you've been away

"No cache data available"

Cache metrics are only available when Serena MCP is connected and prompt-caching.json is configured. Run /optimize status to check configuration.

Related Skills

escapeboy/onepassword-integrate

development

VerifiedTrustedCommunity

Audit or install maximum-depth 1Password integration in the current project — fetches fresh 1Password developer docs first, detects existing integration, and either reviews/improves it or greenfield-installs (Service Account secret resolution + site-compat autocomplete/well-known). Stack-aware (Laravel, Node/Next, Python, Ruby/Rails, Go). Use when the user says "integrate 1Password", "make this site 1Password-friendly", "audit our 1P integration", or invokes /onepassword-integrate.

78SKILL.mdUpdated Jun 11, 2026

escapeboy/onepassword-integrate

escapeboy/image-optimize

development

VerifiedTrustedCommunity

Optimize PNG and JPEG images locally using pngquant and mozjpeg/jpegtran — TinyPNG-level compression without API keys.

78SKILL.mdUpdated Jun 11, 2026

escapeboy/image-optimize

escapeboy/git-sync-branches

development

VerifiedTrustedCommunity

Merges all feature branches into develop, syncs master/main with develop, commits any uncommitted changes, and deletes all feature branches (local and remote). Handles git submodules automatically. Use when you want to clean up branches and leave only develop and master/main in sync.

78SKILL.mdUpdated Jun 11, 2026

escapeboy/git-sync-branches

escapeboy/fix-bug

testing

VerifiedTrustedCommunity

Three-phase autonomous bug fix — investigate all occurrences, fix with full coverage, validate with regression test. Prevents partial fixes (the

78SKILL.mdUpdated Jun 11, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/escapeboy/ai-prompts.git

# Copy into Claude Code skills folder (global)
cp -r ai-prompts/01-global-optimization/skills/cache-inspector ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

escapeboy/ai-prompts

67 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT

Adoption

escapeboy/cache-inspector

$ install --global

Security Scan Results

SKILL.md

/cache-inspector — Prompt Cache Monitor

Usage

Quick Examples

Actions

status (default) — Current cache status

analyze — Performance analysis

optimize — Get actionable recommendations

report — Full detailed report

clear — Clear cache entries

How Prompt Caching Works

Cache Configuration

Target Metrics

Troubleshooting

"Cache hit rate is low (<50%)"

"Cache entries expire quickly"

"No cache data available"

See Also

Related Skills

escapeboy/onepassword-integrate

escapeboy/image-optimize

escapeboy/git-sync-branches

escapeboy/fix-bug

escapeboy/cache-inspector

$ install --global

Security Scan Results

SKILL.md

/cache-inspector — Prompt Cache Monitor

Usage

Quick Examples

Actions

status (default) — Current cache status

analyze — Performance analysis

optimize — Get actionable recommendations

report — Full detailed report

clear — Clear cache entries

How Prompt Caching Works

Cache Configuration

Target Metrics

Troubleshooting

"Cache hit rate is low (<50%)"

"Cache entries expire quickly"

"No cache data available"

See Also

Related Skills

escapeboy/onepassword-integrate

escapeboy/image-optimize

escapeboy/git-sync-branches

escapeboy/fix-bug

`status` (default) — Current cache status

`analyze` — Performance analysis

`optimize` — Get actionable recommendations

`report` — Full detailed report

`clear` — Clear cache entries

`status` (default) — Current cache status

`analyze` — Performance analysis

`optimize` — Get actionable recommendations

`report` — Full detailed report

`clear` — Clear cache entries