Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

latestaiagents/hybrid-retrieval

Name: hybrid-retrieval
Author: latestaiagents

plugins/rag-architect/skills/hybrid-retrieval/SKILL.md

npx skillsauth add latestaiagents/agent-skills hybrid-retrieval

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Hybrid Retrieval for RAG

Combine dense (semantic) and sparse (keyword) retrieval for superior results.

When to Use

Vector search misses exact keyword matches
Domain-specific terminology needs exact matching
Users search with both natural language and specific terms
Need to balance semantic understanding with precision

The Problem with Vector-Only Search

Query: "Error code E-4521 troubleshooting"

Vector search returns:
- "Common error handling patterns" (semantically similar)
- "Debugging techniques for applications" (related topic)

Missing:
- "E-4521: Database connection timeout" (exact match needed!)

Hybrid Architecture

┌─────────────────────────────────────────────────┐
│                   User Query                     │
└─────────────────────┬───────────────────────────┘
                      │
         ┌────────────┴────────────┐
         │                         │
         ▼                         ▼
┌─────────────────┐      ┌─────────────────┐
│  Dense Search   │      │  Sparse Search  │
│  (Embeddings)   │      │  (BM25/TF-IDF)  │
└────────┬────────┘      └────────┬────────┘
         │                         │
         └────────────┬────────────┘
                      │
                      ▼
              ┌───────────────┐
              │    Fusion     │
              │  (RRF/Linear) │
              └───────┬───────┘
                      │
                      ▼
              ┌───────────────┐
              │   Reranker    │
              │  (Optional)   │
              └───────┬───────┘
                      │
                      ▼
              ┌───────────────┐
              │ Final Results │
              └───────────────┘

Implementation

Basic Hybrid with LangChain

from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma

# Dense retriever (vector search)
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 10

# Combine with ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[dense_retriever, bm25_retriever],
    weights=[0.5, 0.5]  # Adjust based on your data
)

results = hybrid_retriever.invoke("Error code E-4521")

Reciprocal Rank Fusion (RRF)

def reciprocal_rank_fusion(results_list: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k=60 is the standard constant from the original paper.
    """
    fused_scores = {}

    for results in results_list:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", hash(doc.page_content))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    reranked = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )
    return [item["doc"] for item in reranked]

# Usage
dense_results = dense_retriever.invoke(query)
sparse_results = bm25_retriever.invoke(query)
final_results = reciprocal_rank_fusion([dense_results, sparse_results])

With Pinecone (Native Hybrid)

from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Initialize
pc = Pinecone(api_key="...")
index = pc.Index("hybrid-index")

# Sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)

# Query with both dense and sparse
def hybrid_query(query: str, alpha: float = 0.5):
    # Dense vector
    dense_vec = embeddings.embed_query(query)

    # Sparse vector
    sparse_vec = bm25.encode_queries([query])[0]

    # Hybrid search
    results = index.query(
        vector=dense_vec,
        sparse_vector=sparse_vec,
        top_k=10,
        alpha=alpha,  # 0 = sparse only, 1 = dense only
        include_metadata=True
    )
    return results

With Weaviate (Native Hybrid)

import weaviate

client = weaviate.Client("http://localhost:8080")

result = client.query.get(
    "Document",
    ["content", "title"]
).with_hybrid(
    query="Error code E-4521",
    alpha=0.5,  # Balance between vector and keyword
    fusion_type="rankedFusion"
).with_limit(10).do()

Adding a Reranker

from sentence_transformers import CrossEncoder

# Load reranker model
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

def rerank_results(query: str, docs: list, top_k: int = 5) -> list:
    """Rerank documents using cross-encoder."""
    pairs = [[query, doc.page_content] for doc in docs]
    scores = reranker.predict(pairs)

    # Sort by reranker scores
    scored_docs = list(zip(docs, scores))
    scored_docs.sort(key=lambda x: x[1], reverse=True)

    return [doc for doc, score in scored_docs[:top_k]]

# Full pipeline
hybrid_results = hybrid_retriever.invoke(query)  # Get 20 results
final_results = rerank_results(query, hybrid_results, top_k=5)  # Rerank to top 5

Weight Tuning Guidelines

| Data Type | Dense Weight | Sparse Weight | Notes | |-----------|--------------|---------------|-------| | General text | 0.5 | 0.5 | Balanced default | | Technical docs | 0.4 | 0.6 | Keywords matter more | | Conversational | 0.7 | 0.3 | Semantic matters more | | Code/APIs | 0.3 | 0.7 | Exact matches critical | | Legal/Medical | 0.4 | 0.6 | Terminology precision |

Evaluation

def evaluate_retrieval(queries: list, ground_truth: dict, retriever) -> dict:
    """Calculate retrieval metrics."""
    metrics = {"mrr": 0, "recall@5": 0, "precision@5": 0}

    for query in queries:
        results = retriever.invoke(query)
        result_ids = [doc.metadata["id"] for doc in results[:5]]
        relevant_ids = ground_truth[query]

        # MRR
        for i, rid in enumerate(result_ids):
            if rid in relevant_ids:
                metrics["mrr"] += 1 / (i + 1)
                break

        # Recall & Precision
        hits = len(set(result_ids) & set(relevant_ids))
        metrics["recall@5"] += hits / len(relevant_ids)
        metrics["precision@5"] += hits / 5

    # Average
    n = len(queries)
    return {k: v/n for k, v in metrics.items()}

Best Practices

Start with 50/50 weights - then tune based on evaluation
Always add a reranker - significant quality improvement
Index sparse vectors - BM25 on raw text, not chunks
Use native hybrid - when available (Pinecone, Weaviate, Qdrant)
Monitor both paths - log which retriever contributed to final results

latestaiagents/hybrid-retrieval

plugins/rag-architect/skills/hybrid-retrieval/SKILL.md

Implement hybrid search combining dense vectors and sparse retrieval for optimal RAG results. Use this skill when vector search alone isn't providing accurate results. Activate when: hybrid search, BM25, keyword search, sparse retrieval, dense retrieval, reranking, ensemble retrieval.

2 stars

data-ai

Updated Apr 23, 2026

$ install --global

skillsauth

npx skillsauth add latestaiagents/agent-skills hybrid-retrieval

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 24, 2026, 2:55 AM13.4s1 file scanned

SKILL.md

name:: hybrid-retrieval
description:: |
Activate when:: hybrid search, BM25, keyword search, sparse retrieval, dense retrieval, reranking, ensemble retrieval.

Hybrid Retrieval for RAG

Combine dense (semantic) and sparse (keyword) retrieval for superior results.

When to Use

Vector search misses exact keyword matches
Domain-specific terminology needs exact matching
Users search with both natural language and specific terms
Need to balance semantic understanding with precision

The Problem with Vector-Only Search

Query: "Error code E-4521 troubleshooting"

Vector search returns:
- "Common error handling patterns" (semantically similar)
- "Debugging techniques for applications" (related topic)

Missing:
- "E-4521: Database connection timeout" (exact match needed!)

Hybrid Architecture

┌─────────────────────────────────────────────────┐
│                   User Query                     │
└─────────────────────┬───────────────────────────┘
                      │
         ┌────────────┴────────────┐
         │                         │
         ▼                         ▼
┌─────────────────┐      ┌─────────────────┐
│  Dense Search   │      │  Sparse Search  │
│  (Embeddings)   │      │  (BM25/TF-IDF)  │
└────────┬────────┘      └────────┬────────┘
         │                         │
         └────────────┬────────────┘
                      │
                      ▼
              ┌───────────────┐
              │    Fusion     │
              │  (RRF/Linear) │
              └───────┬───────┘
                      │
                      ▼
              ┌───────────────┐
              │   Reranker    │
              │  (Optional)   │
              └───────┬───────┘
                      │
                      ▼
              ┌───────────────┐
              │ Final Results │
              └───────────────┘

Implementation

Basic Hybrid with LangChain

from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma

# Dense retriever (vector search)
vectorstore = Chroma.from_documents(docs, embeddings)
dense_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# Sparse retriever (BM25)
bm25_retriever = BM25Retriever.from_documents(docs)
bm25_retriever.k = 10

# Combine with ensemble
hybrid_retriever = EnsembleRetriever(
    retrievers=[dense_retriever, bm25_retriever],
    weights=[0.5, 0.5]  # Adjust based on your data
)

results = hybrid_retriever.invoke("Error code E-4521")

Reciprocal Rank Fusion (RRF)

def reciprocal_rank_fusion(results_list: list, k: int = 60) -> list:
    """
    Combine multiple ranked lists using RRF.
    k=60 is the standard constant from the original paper.
    """
    fused_scores = {}

    for results in results_list:
        for rank, doc in enumerate(results):
            doc_id = doc.metadata.get("id", hash(doc.page_content))
            if doc_id not in fused_scores:
                fused_scores[doc_id] = {"doc": doc, "score": 0}
            fused_scores[doc_id]["score"] += 1 / (k + rank + 1)

    # Sort by fused score
    reranked = sorted(
        fused_scores.values(),
        key=lambda x: x["score"],
        reverse=True
    )
    return [item["doc"] for item in reranked]

# Usage
dense_results = dense_retriever.invoke(query)
sparse_results = bm25_retriever.invoke(query)
final_results = reciprocal_rank_fusion([dense_results, sparse_results])

With Pinecone (Native Hybrid)

from pinecone import Pinecone
from pinecone_text.sparse import BM25Encoder

# Initialize
pc = Pinecone(api_key="...")
index = pc.Index("hybrid-index")

# Sparse encoder
bm25 = BM25Encoder()
bm25.fit(corpus)

# Query with both dense and sparse
def hybrid_query(query: str, alpha: float = 0.5):
    # Dense vector
    dense_vec = embeddings.embed_query(query)

    # Sparse vector
    sparse_vec = bm25.encode_queries([query])[0]

    # Hybrid search
    results = index.query(
        vector=dense_vec,
        sparse_vector=sparse_vec,
        top_k=10,
        alpha=alpha,  # 0 = sparse only, 1 = dense only
        include_metadata=True
    )
    return results

With Weaviate (Native Hybrid)

import weaviate

client = weaviate.Client("http://localhost:8080")

result = client.query.get(
    "Document",
    ["content", "title"]
).with_hybrid(
    query="Error code E-4521",
    alpha=0.5,  # Balance between vector and keyword
    fusion_type="rankedFusion"
).with_limit(10).do()

Adding a Reranker

from sentence_transformers import CrossEncoder

# Load reranker model
reranker = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')

def rerank_results(query: str, docs: list, top_k: int = 5) -> list:
    """Rerank documents using cross-encoder."""
    pairs = [[query, doc.page_content] for doc in docs]
    scores = reranker.predict(pairs)

    # Sort by reranker scores
    scored_docs = list(zip(docs, scores))
    scored_docs.sort(key=lambda x: x[1], reverse=True)

    return [doc for doc, score in scored_docs[:top_k]]

# Full pipeline
hybrid_results = hybrid_retriever.invoke(query)  # Get 20 results
final_results = rerank_results(query, hybrid_results, top_k=5)  # Rerank to top 5

Weight Tuning Guidelines

Evaluation

def evaluate_retrieval(queries: list, ground_truth: dict, retriever) -> dict:
    """Calculate retrieval metrics."""
    metrics = {"mrr": 0, "recall@5": 0, "precision@5": 0}

    for query in queries:
        results = retriever.invoke(query)
        result_ids = [doc.metadata["id"] for doc in results[:5]]
        relevant_ids = ground_truth[query]

        # MRR
        for i, rid in enumerate(result_ids):
            if rid in relevant_ids:
                metrics["mrr"] += 1 / (i + 1)
                break

        # Recall & Precision
        hits = len(set(result_ids) & set(relevant_ids))
        metrics["recall@5"] += hits / len(relevant_ids)
        metrics["precision@5"] += hits / 5

    # Average
    n = len(queries)
    return {k: v/n for k, v in metrics.items()}

Best Practices

Start with 50/50 weights - then tune based on evaluation
Always add a reranker - significant quality improvement
Index sparse vectors - BM25 on raw text, not chunks
Use native hybrid - when available (Pinecone, Weaviate, Qdrant)
Monitor both paths - log which retriever contributed to final results

Related Skills

latestaiagents/skill-testing

development

VerifiedTrustedCommunity

Test skills for correct activation, content quality, and regression — both automated checks (frontmatter validity, lint) and manual verification (query-suite activation testing). Covers CI integration and how to catch skill regressions before users do. Use this skill when adding skills to a repo, setting up CI for a skill library, or debugging "the skill exists but doesn't work". Activate when: test skills, validate skills, skill CI, skill linting, skill activation test, skill regression.

2SKILL.mdUpdated Apr 23, 2026

latestaiagents/skill-testing

latestaiagents/skill-frontmatter

documentation

VerifiedTrustedCommunity

Write the YAML frontmatter for a SKILL.md file so it activates reliably — name, description, and activation keywords that the model matches against. Covers length, tone, and the most common frontmatter mistakes. Use this skill when authoring a new skill, fixing a skill that isn't auto-activating, or reviewing skills for publication. Activate when: SKILL.md frontmatter, skill description, skill activation, skill YAML, write a skill, author a skill.

2SKILL.mdUpdated Apr 23, 2026

latestaiagents/skill-frontmatter

latestaiagents/skill-activation-patterns

development

VerifiedTrustedCommunity

Design skills that fire at the right moment — neither over-eager (noise) nor under-eager (silent). Covers activation specificity, trigger phrases, disambiguation between overlapping skills, and debugging activation. Use this skill when multiple skills could fire on the same query, a skill never fires, or a skill fires too often. Activate when: skill won't activate, skill over-activates, overlapping skills, skill triggers, skill selection, skill disambiguation.

2SKILL.mdUpdated Apr 23, 2026

latestaiagents/skill-activation-patterns

latestaiagents/progressive-disclosure

development

VerifiedTrustedCommunity

Structure SKILL.md content so the model reads just enough — concise summary up front, progressively deeper detail, examples on demand. Covers section ordering, length budgets, when to split into multiple skills. Use this skill when writing or refactoring a skill body, one skill has grown too long, or a skill is wordy but not useful. Activate when: SKILL.md structure, skill content, skill too long, split skill, progressive disclosure, skill body.

2SKILL.mdUpdated Apr 23, 2026

latestaiagents/progressive-disclosure

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/latestaiagents/agent-skills.git

# Copy into Claude Code skills folder (global)
cp -r agent-skills/plugins/rag-architect/skills/hybrid-retrieval ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

latestaiagents/agent-skills

2 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT