redis/redis-vector-search

Name: redis-vector-search
Author: redis

skills/redis-vector-search/SKILL.md

npx skillsauth add redis/agent-skills redis-vector-search

Redis Vector Search

Guidance for storing and searching embeddings in Redis. Covers index configuration, algorithm selection, hybrid filtering, and the RAG retrieval pattern with RedisVL.

When to apply

Defining a VECTOR field in FT.CREATE (raw RQE) or a RedisVL IndexSchema.
Choosing HNSW vs FLAT and tuning HNSW parameters.
Adding category, date, or tenant filters to a vector query.
Building a retrieval-augmented generation (RAG) pipeline on top of Redis.

This skill builds on the redis-query-engine skill — vector fields live inside RQE indexes and share the same FT.CREATE / FT.SEARCH machinery.

1. Configure the vector index properly

Three settings must match the embedding model:

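DIM: the model's output dimensionality (e.g. 1536 for OpenAI text-embedding-3-small). A mismatch is easy to miss: Redis skips hashes whose vector blob has the wrong byte length instead of rejecting the write (FT.INFO counts them under hash_indexing_failures), so the affected documents simply never appear in results.
DISTANCE_METRIC: COSINE for text embeddings (the common case; it compares direction and ignores magnitude), IP for models meant to be scored by inner product, L2 for Euclidean distance.
TYPE / datatype: usually FLOAT32. Use FLOAT16 or quantized variants only when memory cost is a hard constraint.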
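To make the metric choice concrete, here is a small sketch in plain Python (no Redis calls; the helper names are illustrative). It shows that once vectors are scaled to unit length, inner product and cosine similarity coincide, so COSINE and IP would rank such vectors the same way, while the raw inner product still depends on magnitude.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Inner product divided by both vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

query = [1.0, 2.0, 2.0]
doc = [2.0, 0.0, 1.0]

raw_ip = dot(query, doc)                          # depends on vector magnitude
cos = cosine_similarity(query, doc)               # magnitude-independent
unit_ip = dot(normalize(query), normalize(doc))   # matches cos after normalization

print(raw_ip, round(cos, 4), round(unit_ip, 4))
```

If the embedding model already returns unit-length vectors, the two metrics should therefore give the same ordering; check the model's documentation rather than assuming it.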

Raw RQE:

FT.CREATE idx:docs ON HASH PREFIX 1 doc:
    SCHEMA
        content TEXT
        embedding VECTOR HNSW 6
            TYPE FLOAT32
            DIM 1536
            DISTANCE_METRIC COSINE

RedisVL:

from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {"name": "idx:docs", "prefix": "doc:"},
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "embedding", "type": "vector", "attrs": {
            "dims": 1536, "algorithm": "HNSW",
            "datatype": "FLOAT32", "distance_metric": "COSINE",
        }},
    ]
})

See references/index-creation.md for redis-py and RedisVL variants.
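Because a hash stores the vector as raw bytes, a FLOAT32 field with DIM d expects a blob of exactly 4 × d bytes. The sketch below uses only the standard library to pack a vector and refuse a dimension mismatch before anything is written. The helper name `to_float32_blob` is hypothetical, and little-endian byte order is an assumption that matches what NumPy produces on common platforms.

```python
import struct

FLOAT32_BYTES = 4

def to_float32_blob(vector, dim):
    """Pack a vector as little-endian FLOAT32 bytes, refusing a DIM mismatch."""
    if len(vector) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(vector)}")
    return struct.pack(f"<{dim}f", *vector)

blob = to_float32_blob([0.1, 0.2, 0.3, 0.4], dim=4)
print(len(blob))  # 16

try:
    to_float32_blob([0.1, 0.2, 0.3], dim=4)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Failing early on the client side is a design choice: it turns a document that would otherwise be quietly left out of the index into an immediate error.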

2. HNSW vs FLAT

| Algorithm | Speed | Accuracy | Memory | Best for |
|---|---|---|---|---|
| HNSW | Fast (approximate) | ~95%+ recall (tunable) | Higher | Large datasets (>10k vectors), latency-sensitive |
| FLAT | Slow (exact) | 100% | Lower | Small datasets (<10k), accuracy-critical |

Default to HNSW for any production-scale workload. Tuning levers:

M — connections per node (16–64). Higher = better recall, more memory.
EF_CONSTRUCTION — build-time graph quality (100–500). Higher = better index, slower build.
EF_RUNTIME — query-time candidate-list size. Higher = better recall, slower queries.
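The tuning levers above are passed as extra attribute pairs in the VECTOR clause, and the number after HNSW counts every token that follows (two per pair). A minimal sketch that builds that clause as a token list; the function name is hypothetical, and the default values in the signature are assumed to mirror Redis's documented defaults, so verify them against your version.

```python
def hnsw_vector_args(field, dim, metric="COSINE", m=16,
                     ef_construction=200, ef_runtime=10):
    """Build the VECTOR clause tokens for FT.CREATE with HNSW tuning attributes."""
    attrs = [
        "TYPE", "FLOAT32",
        "DIM", str(dim),
        "DISTANCE_METRIC", metric,
        "M", str(m),
        "EF_CONSTRUCTION", str(ef_construction),
        "EF_RUNTIME", str(ef_runtime),
    ]
    # The count is the number of tokens after it, not the number of pairs.
    return [field, "VECTOR", "HNSW", str(len(attrs)), *attrs]

args = hnsw_vector_args("embedding", dim=1536, m=32,
                        ef_construction=400, ef_runtime=50)
print(" ".join(args))
```

With only TYPE, DIM, and DISTANCE_METRIC the count is 6, which is where the `HNSW 6` in the raw RQE example comes from.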

Use FLAT when the corpus is small and you need exact results (e.g. semantic dedup over a few thousand items).

See references/algorithm-choice.md.
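One way to judge whether an HNSW configuration is good enough is to compare its results with exact results for the same queries, for example from a FLAT index over a sample of the data. The sketch below computes recall at K from two ID lists; the document IDs are made up for illustration.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-K that the approximate search also returned."""
    exact = set(exact_ids)
    if not exact:
        return 0.0
    return len(exact.intersection(approx_ids)) / len(exact)

exact_top5 = ["doc:1", "doc:7", "doc:3", "doc:9", "doc:4"]  # e.g. from a FLAT index
hnsw_top5 = ["doc:1", "doc:3", "doc:7", "doc:4", "doc:8"]   # e.g. from an HNSW index

print(recall_at_k(hnsw_top5, exact_top5))  # 0.8
```

Averaging this over a representative set of queries gives a number to watch while raising or lowering EF_RUNTIME.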

3. Hybrid search — filter before vector

Apply attribute filters (TAG / NUMERIC) so the engine narrows the search space before the vector comparison. Don't fetch a wide result set and then filter client-side — that's slower and less accurate.
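A toy illustration of why client-side filtering is less accurate, in plain Python with a brute-force search standing in for the engine (the data and helper names are invented for the example). Taking the top K first and filtering afterwards can leave fewer than K matches, or none, even though matching documents exist.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def top_k(candidates, query, k):
    """Brute-force nearest neighbours, closest first."""
    return sorted(candidates, key=lambda d: squared_l2(d["vec"], query))[:k]

docs = [
    {"id": "a", "category": "sports", "vec": [0.0, 0.1]},
    {"id": "b", "category": "sports", "vec": [0.1, 0.0]},
    {"id": "c", "category": "technology", "vec": [0.5, 0.5]},
    {"id": "d", "category": "technology", "vec": [0.9, 0.9]},
]
query, k = [0.0, 0.0], 2

# Post-filter: search everything, then drop non-matching hits client-side.
post_filtered = [d["id"] for d in top_k(docs, query, k)
                 if d["category"] == "technology"]

# Pre-filter: narrow the candidates first, then search.
tech_docs = [d for d in docs if d["category"] == "technology"]
pre_filtered = [d["id"] for d in top_k(tech_docs, query, k)]

print(post_filtered)  # []
print(pre_filtered)   # ['c', 'd']
```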

from redisvl.query import VectorQuery
from redisvl.query.filter import Num, Tag

filters = (Tag("category") == "technology") & (Num("date") >= 2024)

query = VectorQuery(
    vector=query_embedding,
    vector_field_name="embedding",
    return_fields=["content", "category", "date"],
    num_results=10,
    filter_expression=filters,
)
results = index.query(query)
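For readers working with raw RQE, a filtered KNN query puts the filter in parentheses ahead of the `=>[KNN ...]` clause. The sketch below only assembles that query string; the helper name, the `vec` parameter name, and the `vector_distance` alias are illustrative choices, and the exact string RedisVL generates for the query above may differ in detail.

```python
def filtered_knn_query(filter_clause, k, vector_field,
                       param="vec", score_alias="vector_distance"):
    """Assemble a DIALECT 2 query string: pre-filter, then KNN over the matches."""
    return f"({filter_clause})=>[KNN {k} @{vector_field} ${param} AS {score_alias}]"

query_string = filtered_knn_query(
    "@category:{technology} @date:[2024 +inf]", k=10, vector_field="embedding"
)
print(query_string)
```

Such a string would be sent with FT.SEARCH together with `PARAMS 2 vec <blob>` and `DIALECT 2`, where `<blob>` is the query embedding as bytes.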

For text + vector fusion (BM25-weighted text scoring combined with vector similarity), use HybridQuery on Redis ≥ 8.4 with redis-py ≥ 7.1, or AggregateHybridQuery on older Redis. That's a different "hybrid" from filtered vector search above.
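To show what fusion means in contrast to filtering, here is a sketch of one common approach: a weighted linear blend of a text score and a vector similarity per document. This is an illustration only, not a description of how HybridQuery or AggregateHybridQuery compute their scores; the function name, the `alpha` weight, and the assumption that both scores are already scaled to the 0 to 1 range are mine.

```python
def linear_fusion(text_scores, vector_scores, alpha=0.7):
    """Blend per-document scores: alpha weights vector similarity, the rest text."""
    doc_ids = set(text_scores) | set(vector_scores)
    fused = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1 - alpha) * text_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

text_scores = {"doc:1": 0.9, "doc:2": 0.2}
vector_scores = {"doc:2": 0.8, "doc:3": 0.6}

ranked = linear_fusion(text_scores, vector_scores, alpha=0.5)
print([doc_id for doc_id, _ in ranked])  # ['doc:2', 'doc:1', 'doc:3']
```

A document found by only one of the two searches still gets a score here, which is the behavioral difference from a filter: nothing is excluded, only reordered.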

See references/hybrid-search.md.

4. RAG pattern

Standard pipeline: embed the user query → vector search Redis → pass top-K context to the LLM.

# Index documents with embeddings
records = [{"content": doc.content,
            "embedding": embed_model.encode(doc.content).tolist(),
            "source": doc.source}
           for doc in documents]
index.load(records)

# Retrieve relevant context for a user question
q_emb = embed_model.encode(user_question).tolist()  # list of floats, same form as at index time
results = index.query(VectorQuery(
    vector=q_emb,
    vector_field_name="embedding",
    return_fields=["content", "source"],
    num_results=5,
))

# Generate with retrieved context
context = "\n".join(r["content"] for r in results)
response = llm.generate(f"Context: {context}\n\nQuestion: {user_question}")
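The generation step above joins every retrieved chunk into the prompt. A slightly more defensive variant, sketched below, labels each chunk with its source and stops adding chunks once a size budget is reached. The function name and the character-based budget are assumptions for illustration; a real pipeline would more likely count tokens with the model's tokenizer.

```python
def build_prompt(results, question, max_chars=2000):
    """Join retrieved chunks, tagged by source, until the budget is used up."""
    parts, used = [], 0
    for result in results:
        snippet = f"[{result['source']}] {result['content']}"
        if used + len(snippet) > max_chars:
            break  # results are ordered by relevance, so drop the tail
        parts.append(snippet)
        used += len(snippet)
    context = "\n".join(parts)
    return f"Context:\n{context}\n\nQuestion: {question}"

sample_results = [
    {"content": "Redis supports HNSW.", "source": "a.md"},
    {"content": "FLAT is exact.", "source": "b.md"},
]
prompt = build_prompt(sample_results, "What is HNSW?", max_chars=30)
print(prompt)
```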

Practical tips:

Match metric to model. Most modern text embedding models pair best with COSINE.
Chunk long documents before indexing — retrieval over 200–500-token chunks usually beats indexing whole pages.
Batch inserts with index.load([...]) instead of one call per record.
Pre-filter with attributes (tenant, recency, document type) before the vector search.
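The chunking tip can be sketched with a simple word-window splitter. Whitespace-separated words stand in for tokens here, and the overlap between neighbouring chunks is an addition of this sketch rather than something the tips prescribe; both the function name and the default sizes are illustrative.

```python
def chunk_words(text, chunk_size=300, overlap=30):
    """Split text into overlapping word windows (words approximate tokens)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
print(chunks)  # ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```

Each chunk would then become one record in the list passed to index.load, so a long page contributes several independently retrievable entries.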

See references/rag-pattern.md.

References

Redis: Vectors
Redis: RAG quickstart
RedisVL documentation

60 stars

development

Updated May 28, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add redis/agent-skills redis-vector-search

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Checks | Status | Reported % |
|---|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 28, 2026, 5:38 AM (48.1s, 5 files scanned)

SKILL.md

name: redis-vector-search
description: Redis vector search guidance covering HNSW vs FLAT algorithm choice, vector index configuration (dims, distance metric, datatype), filtered hybrid search combining vector similarity with TAG or NUMERIC filters, and the RAG retrieval pattern with RedisVL. Use when defining a VECTOR field in FT.CREATE, integrating embeddings (OpenAI, Cohere, sentence-transformers), tuning HNSW parameters (M, EF_CONSTRUCTION, EF_RUNTIME), building a retrieval-augmented generation pipeline, or filtering vector results by attribute.
license: MIT
author: Redis, Inc.
version: 0.1.0

Related Skills

redis/redis-semantic-cache

development

Verified / Trusted / Community

Redis LangCache guidance for semantic caching of LLM responses on Redis Cloud: calling search/set via the SDK or REST API, tuning the similarity threshold, separating caches per task type, and filtering with custom attributes. Use when caching LLM completions or RAG answers to cut API cost and latency, building a cache-aside layer in front of OpenAI / Anthropic / etc., tuning hit rate vs precision, or splitting one app's LLM workloads into multiple LangCache caches.

60 stars · SKILL.md · Updated May 28, 2026

redis/redis-security

testing

Verified / Trusted / Community

Redis security guidance covering authentication (requirepass and ACL users), TLS, ACL-based least-privilege access control, restricting network exposure via bind and protected-mode, firewall rules, and disabling dangerous commands. Use when deploying Redis to production, defining ACL users for an application, configuring TLS connections, locking down a Redis instance behind a firewall, or auditing a Redis deployment for security hardening.

60 stars · SKILL.md · Updated May 28, 2026

redis/redis-query-engine

testing

Verified / Trusted / Community

Redis Query Engine (RQE) guidance covering FT.CREATE schema design, field type selection (TEXT, TAG, NUMERIC, GEO, GEOSHAPE, VECTOR), DIALECT 2 query syntax, efficient FT.SEARCH and FT.AGGREGATE queries, zero-downtime index updates via aliases, and the SKIPINITIALSCAN option. Use when defining a search index on Hash or JSON documents, picking between TEXT and TAG for filtering, writing FT.SEARCH queries with filters and SORTBY, managing or swapping indexes in production, or troubleshooting slow searches with FT.PROFILE.

60 stars · SKILL.md · Updated May 28, 2026

redis/redis-observability

tools

Verified / Trusted / Community

Redis observability guidance: which metrics to monitor (memory, connections, hit ratio, ops/sec, rejected connections), which built-in commands to reach for during incident triage (SLOWLOG, INFO, MEMORY DOCTOR, CLIENT LIST, FT.PROFILE), and when to use the Redis Insight GUI. Use when setting up monitoring or alerts for a Redis instance, diagnosing a performance regression, profiling a slow FT.SEARCH query, or wiring Redis metrics into Prometheus, Datadog, or similar.

60 stars · SKILL.md · Updated May 28, 2026

Install manually

# Clone the repo
git clone https://github.com/redis/agent-skills.git

# Copy into the Claude Code skills folder (global), creating it if needed
mkdir -p ~/.claude/skills
cp -r agent-skills/skills/redis-vector-search ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

redis/agent-skills

60 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT