
redis/redis-semantic-cache

Name: redis-semantic-cache
Author: redis

skills/redis-semantic-cache/SKILL.md

npx skillsauth add redis/agent-skills redis-semantic-cache


Redis Semantic Cache

Semantic caching for LLM responses with Redis Cloud's LangCache service. LangCache stores prompts as embeddings, so a later prompt that is semantically similar returns the cached response without calling the model again.

LangCache is currently in preview on Redis Cloud. Features and behavior may change.

When to apply

- Wrapping an LLM call (OpenAI, Anthropic, etc.) with a cache layer to cut cost and latency.
- Caching RAG answers, classification outputs, or any deterministic LLM workload.
- Tuning the precision/hit-rate trade-off for a semantic cache.
- Splitting one application's LLM workloads across multiple cache instances.

1. The cache-aside flow

LangCache fits in front of any LLM call as a standard cache-aside pattern:

1. Send the user's prompt to LangCache's search.
2. Cache hit: return the stored response directly.
3. Cache miss: call the LLM, then set the response so future similar prompts hit.

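import os

from langcache import LangCache

# Connection details are read from the environment.
lang_cache = LangCache(
    server_url=f"https://{os.getenv('HOST')}",
    cache_id=os.getenv("CACHE_ID"),
    api_key=os.getenv("API_KEY"),
)

prompt = "What is Redis?"

# Step 1: look for a semantically similar cached prompt.
result = lang_cache.search(prompt=prompt, similarity_threshold=0.9)
if result:
    # Step 2, cache hit: reuse the stored response.
    response = result[0]["response"]
else:
    # Step 3, cache miss: call the model, then store the answer.
    # `llm` is a placeholder for whatever LLM client the application uses.
    response = llm.generate(prompt)
    lang_cache.set(prompt=prompt, response=response)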

The same operations are available via REST (POST /v1/caches/{cacheId}/entries/search and POST /v1/caches/{cacheId}/entries) when an SDK isn't an option.
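As a rough illustration of the REST route, the sketch below only builds the two requests with the standard library and does not send them. The endpoint paths come from the text above; the `Authorization: Bearer` header, the JSON field names `prompt` and `response`, and the helper names `build_search_request` and `build_set_request` are assumptions made for this example and should be checked against the LangCache API reference.

```python
import json
import urllib.request


def _post(url, api_key, payload):
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: bearer-token auth with the cache's API key.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def build_search_request(host, cache_id, api_key, prompt):
    """Request for POST /v1/caches/{cacheId}/entries/search."""
    url = f"https://{host}/v1/caches/{cache_id}/entries/search"
    return _post(url, api_key, {"prompt": prompt})  # assumed body shape


def build_set_request(host, cache_id, api_key, prompt, response):
    """Request for POST /v1/caches/{cacheId}/entries."""
    url = f"https://{host}/v1/caches/{cache_id}/entries"
    return _post(url, api_key, {"prompt": prompt, "response": response})


# Placeholder values; nothing is sent over the network here.
search_req = build_search_request("example.invalid", "my-cache", "secret", "What is Redis?")
set_req = build_set_request("example.invalid", "my-cache", "secret", "What is Redis?", "An in-memory data store.")
```

Passing either request to `urllib.request.urlopen` would perform the call; the response format is not shown because it is not described in this document.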

See references/langcache-usage.md for full SDK + REST samples and attribute-based storage.
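The cache-aside flow in this section can also be wrapped in one reusable function. The following is a minimal sketch, not part of the LangCache SDK: `cached_completion`, `ExactMatchCache`, and `fake_llm` are hypothetical names, and the stand-in cache matches exact strings only so that the example runs without a Redis Cloud account. It assumes, as the snippet above does, that `search` returns a list of entries carrying a `"response"` key.

```python
def cached_completion(cache, generate, prompt, threshold=0.9):
    """Return (response, was_hit) following the cache-aside flow."""
    hits = cache.search(prompt=prompt, similarity_threshold=threshold)
    if hits:
        # Cache hit: reuse the stored response, skip the model call.
        return hits[0]["response"], True
    # Cache miss: call the model, then store the answer for next time.
    response = generate(prompt)
    cache.set(prompt=prompt, response=response)
    return response, False


class ExactMatchCache:
    """Hypothetical in-memory stand-in for a LangCache client.

    It ignores the threshold and matches identical prompts only.
    """

    def __init__(self):
        self._entries = {}

    def search(self, prompt, similarity_threshold=0.9):
        if prompt in self._entries:
            return [{"prompt": prompt, "response": self._entries[prompt]}]
        return []

    def set(self, prompt, response):
        self._entries[prompt] = response


calls = []


def fake_llm(prompt):
    """Stand-in for a real model call; records each invocation."""
    calls.append(prompt)
    return f"answer to: {prompt}"


cache = ExactMatchCache()
first = cached_completion(cache, fake_llm, "What is Redis?")   # miss, calls the model
second = cached_completion(cache, fake_llm, "What is Redis?")  # hit, model not called
```

Returning the hit flag alongside the response makes it easy to count hits and misses at the call site.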

2. Tune the similarity threshold

The threshold controls how similar (by embedding cosine similarity) a new prompt must be to a cached one to count as a hit. A higher threshold means a stricter match and fewer false positives. A lower threshold means more hits but a greater risk of returning an off-topic answer.

| Threshold | Behavior | Use when |
|---|---|---|
| 0.95+ | Near-exact match required | Customer-facing answers where wrong responses are costly |
| 0.9 | Balanced default | Most workloads; start here |
| 0.8 | Loose semantic match | Internal tools, exploratory queries, FAQ deduplication |
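To make the effect of the threshold concrete, here is a small sketch that compares toy two-dimensional vectors by cosine similarity. It illustrates the general idea only: the vectors are made up, real embeddings have hundreds of dimensions, and the `>=` comparison in the hypothetical `is_hit` helper is an assumption about how a threshold is applied, not a description of LangCache internals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def is_hit(query_vec, cached_vec, threshold):
    """Assumed rule: a hit when similarity reaches the threshold."""
    return cosine_similarity(query_vec, cached_vec) >= threshold


cached = [1.0, 0.0]   # embedding of a cached prompt (toy values)
close = [0.9, 0.1]    # near-paraphrase, similarity about 0.99
loose = [0.8, 0.5]    # related topic, similarity about 0.85
far = [0.6, 0.8]      # unrelated prompt, similarity about 0.6

for name, vec in (("close", close), ("loose", loose), ("far", far)):
    print(name, [is_hit(vec, cached, t) for t in (0.95, 0.9, 0.8)])
```

Only the near-paraphrase passes at 0.95 and 0.9, the related prompt starts to hit at 0.8, and the unrelated prompt misses at every setting in the table.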

# Stricter — fewer false positives
result = lang_cache.search(prompt="What is Redis?", similarity_threshold=0.95)

# Looser — higher hit rate
result = lang_cache.search(prompt="What is Redis?", similarity_threshold=0.8)

Adjust by watching the actual cache-hit rate and spot-checking that returned answers are still relevant.

See references/best-practices.md.
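One way to watch the hit rate while tuning is a small in-process counter. This is a sketch under stated assumptions: `HitRateTracker` is a hypothetical helper, not part of LangCache, and a production setup would more likely export these counts to an existing metrics system.

```python
from dataclasses import dataclass, field


@dataclass
class HitRateTracker:
    """Counts cache hits and misses and keeps hit samples for review."""

    hits: int = 0
    misses: int = 0
    samples: list = field(default_factory=list)

    def record(self, prompt, was_hit, response=None):
        if was_hit:
            self.hits += 1
            # Keep (prompt, response) pairs for manual relevance checks.
            self.samples.append((prompt, response))
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


tracker = HitRateTracker()
tracker.record("What is Redis?", False)
tracker.record("What's Redis?", True, "An in-memory data store.")
tracker.record("Explain Redis", True, "An in-memory data store.")
tracker.record("Reset my password", False)
```

A rising hit rate after lowering the threshold is only good news if the sampled pairs still look relevant, which is why the sketch keeps them.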

3. Separate caches per task type

Different LLM workloads should not share one cache. A code question is semantically close to other code questions but unrelated to a password-reset support query, and a shared cache can return an answer from the wrong domain.

support_cache = LangCache(server_url=..., cache_id="support-cache-id", api_key=...)
code_cache    = LangCache(server_url=..., cache_id="code-cache-id",    api_key=...)

Create distinct cache IDs in Redis Cloud per task, and route each call to the right one. As a finer-grained alternative, store and search with custom attributes (e.g. {"category": "database"}) to keep tasks in the same cache but isolated by attribute filter — useful when the same prompt format spans subtopics.
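Routing each call to the right cache can be as simple as a dictionary lookup keyed by task type. The sketch below uses a hypothetical `StubCache` in place of real `LangCache` clients so that it runs on its own; `CACHES`, `cache_for`, and the task-type names are illustrative choices, not part of the SDK.

```python
class StubCache:
    """Hypothetical stand-in for a LangCache client bound to one cache ID."""

    def __init__(self, cache_id):
        self.cache_id = cache_id
        self.searched = []

    def search(self, prompt, similarity_threshold=0.9):
        self.searched.append(prompt)
        return []  # always a miss in this stub


# One client per task type; in real code these would be LangCache instances.
CACHES = {
    "support": StubCache("support-cache-id"),
    "code": StubCache("code-cache-id"),
}


def cache_for(task_type):
    """Return the cache for a task type, failing loudly on unknown types."""
    try:
        return CACHES[task_type]
    except KeyError:
        raise ValueError(f"no cache configured for task type {task_type!r}") from None


cache_for("support").search(prompt="How do I reset my password?")
cache_for("code").search(prompt="How do I reverse a list in Python?")
```

Raising on an unknown task type, rather than falling back to a default cache, avoids silently mixing workloads.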

References

LangCache documentation


60 stars

development

Updated May 28, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add redis/agent-skills redis-semantic-cache

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Purpose | Status | Reported % |
|---|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 28, 2026, 5:39 AM (113.9 s, 3 files scanned)

SKILL.md

name: redis-semantic-cache
description: Redis LangCache guidance for semantic caching of LLM responses on Redis Cloud: calling search/set via the SDK or REST API, tuning the similarity threshold, separating caches per task type, and filtering with custom attributes. Use when caching LLM completions or RAG answers to cut API cost and latency, building a cache-aside layer in front of OpenAI / Anthropic / etc., tuning hit rate vs precision, or splitting one app's LLM workloads into multiple LangCache caches.
license: MIT
author: Redis, Inc.
version: 0.1.0


Related Skills

redis/redis-vector-search

development

Verified, Trusted, Community

Redis vector search guidance covering HNSW vs FLAT algorithm choice, vector index configuration (dims, distance metric, datatype), filtered hybrid search combining vector similarity with TAG or NUMERIC filters, and the RAG retrieval pattern with RedisVL. Use when defining a VECTOR field in FT.CREATE, integrating embeddings (OpenAI, Cohere, sentence-transformers), tuning HNSW parameters (M, EF_CONSTRUCTION, EF_RUNTIME), building a retrieval-augmented generation pipeline, or filtering vector results by attribute.

60 stars, SKILL.md, updated May 28, 2026

redis/redis-security

testing

Verified, Trusted, Community

Redis security guidance covering authentication (requirepass and ACL users), TLS, ACL-based least-privilege access control, restricting network exposure via bind and protected-mode, firewall rules, and disabling dangerous commands. Use when deploying Redis to production, defining ACL users for an application, configuring TLS connections, locking down a Redis instance behind a firewall, or auditing a Redis deployment for security hardening.

60 stars, SKILL.md, updated May 28, 2026

redis/redis-query-engine

testing

Verified, Trusted, Community

Redis Query Engine (RQE) guidance covering FT.CREATE schema design, field type selection (TEXT, TAG, NUMERIC, GEO, GEOSHAPE, VECTOR), DIALECT 2 query syntax, efficient FT.SEARCH and FT.AGGREGATE queries, zero-downtime index updates via aliases, and the SKIPINITIALSCAN option. Use when defining a search index on Hash or JSON documents, picking between TEXT and TAG for filtering, writing FT.SEARCH queries with filters and SORTBY, managing or swapping indexes in production, or troubleshooting slow searches with FT.PROFILE.

60 stars, SKILL.md, updated May 28, 2026

redis/redis-observability

tools

Verified, Trusted, Community

Redis observability guidance — which metrics to monitor (memory, connections, hit ratio, ops/sec, rejected connections), which built-in commands to reach for during incident triage (SLOWLOG, INFO, MEMORY DOCTOR, CLIENT LIST, FT.PROFILE), and when to use the Redis Insight GUI. Use when setting up monitoring or alerts for a Redis instance, diagnosing a performance regression, profiling a slow FT.SEARCH query, or wiring Redis metrics into Prometheus, Datadog, or similar.

60 stars, SKILL.md, updated May 28, 2026

Install manually

# Clone the repo
git clone https://github.com/redis/agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r agent-skills/skills/redis-semantic-cache ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

redis/agent-skills

60 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT