skills/rag/SKILL.md
Manage the RAG MCP server — index codebases, search semantically, configure backends (ChromaDB/Redis/Qdrant)
Unified interface for the RAG MCP server. Index codebases, search semantically, find similar code, and configure the vector database backend.
/rag init → First-time setup wizard (backend, install, configure, teach Claude Code)
/rag index [path] → Index the current directory (or a specific path)
/rag search <query> → Semantic search across indexed code
/rag similar <snippet> → Find code similar to a snippet
/rag context <task> → Get relevant context for a task
/rag collections → List all indexed collections
/rag stats <collection> → Show stats for a collection
/rag delete <collection> → Delete a collection
/rag config → Show current RAG configuration
/rag config <backend> → Configure backend (chromadb|redis|qdrant)
/rag hello → Quick greeting
/rag hello ID → Full profile
Config file: ~/.claude/rag-config.json
This file is the single source of truth for RAG settings. It persists across sessions.
On every invocation of /rag, read ~/.claude/rag-config.json first. If it exists, use its values as the current configuration context. If it does not exist, assume defaults:
{
"backend": "chromadb",
"host": "localhost",
"port": 8000,
"embeddingType": "local",
"modelVariant": "default",
"defaultCollection": "codebase",
"persistence": {
"enabled": false,
"mode": "none",
"dataDir": null
},
"updatedAt": null
}
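The lookup described above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation; the names `load_config`, `CONFIG_PATH`, and `DEFAULTS` are hypothetical, and the file is returned as stored, without merging it with the defaults.

```python
import copy
import json
from pathlib import Path

# Path and default values are the ones documented above.
CONFIG_PATH = Path.home() / ".claude" / "rag-config.json"

DEFAULTS = {
    "backend": "chromadb",
    "host": "localhost",
    "port": 8000,
    "embeddingType": "local",
    "modelVariant": "default",
    "defaultCollection": "codebase",
    "persistence": {"enabled": False, "mode": "none", "dataDir": None},
    "updatedAt": None,
}


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Return the stored config, or a fresh copy of the defaults if the file is absent."""
    if not path.exists():
        # Deep copy so callers cannot mutate the shared defaults.
        return copy.deepcopy(DEFAULTS)
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Returning a deep copy keeps the nested `persistence` object in `DEFAULTS` untouched when a caller edits the result.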
When any config-changing action occurs (config <backend>, index, delete), update ~/.claude/rag-config.json to reflect the new state. For example:
- index /path/to/foo → set "defaultCollection": "foo" and add "foo" to a "collections" array
- config redis → set "backend": "redis", "port": 6379, "updatedAt": "<now>"
- delete <name> → remove from "collections" array
The config file schema:
{
"backend": "chromadb | redis | qdrant",
"host": "localhost",
"port": 8000,
"embeddingType": "local | openai",
"modelVariant": "default | quantized",
"defaultCollection": "codebase",
"collections": ["codebase", "my-project"],
"persistence": {
"enabled": true,
"mode": "aof | rdb | both | none",
"dataDir": "~/.claude/rag-data"
},
"updatedAt": "2026-02-21T10:30:00Z"
}
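As a rough sketch of the collection bookkeeping for index and delete, assuming the config is held as a plain dict: the helper names `record_index` and `record_delete` are hypothetical. Refreshing `updatedAt` on every change is an assumption, since the source only states it for `config`. The source also does not say what happens to `defaultCollection` when that collection is deleted, so the sketch leaves it unchanged.

```python
from datetime import datetime, timezone
from pathlib import Path


def _now() -> str:
    """UTC timestamp in the same shape as the schema example."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def record_index(config: dict, root_path: str) -> dict:
    """Mirror `index /path/to/foo`: the collection is named after the directory."""
    name = Path(root_path).name
    updated = dict(config)
    collections = list(updated.get("collections", []))
    if name not in collections:
        collections.append(name)
    updated["collections"] = collections
    updated["defaultCollection"] = name
    updated["updatedAt"] = _now()  # assumption: every change refreshes the timestamp
    return updated


def record_delete(config: dict, name: str) -> dict:
    """Mirror `delete <name>`: drop the collection from the known list."""
    updated = dict(config)
    updated["collections"] = [c for c in updated.get("collections", []) if c != name]
    updated["updatedAt"] = _now()  # assumption, as above
    return updated
```

Both helpers return a new dict rather than editing the input, so the caller decides when to write the result back to `~/.claude/rag-config.json`.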
Indexed data persists across Claude Code sessions. When using Redis or Qdrant, the vector database runs as a separate process and retains all indexed collections between sessions. You don't need to re-index every time.
Persistence directory: ~/.claude/rag-data/
This directory stores persistent vector data. When using Docker, mount it as a volume so data survives container restarts.
Redis (recommended for persistence):
- aof: Append-Only File, every write is logged, most durable
- rdb: Periodic snapshots, good balance of performance and safety
- both: AOF + RDB combined (safest)
docker run -d -p 6379:6379 \
-v ~/.claude/rag-data:/data \
redis/redis-stack-server \
--appendonly yes
Qdrant:
docker run -d -p 6333:6333 \
-v ~/.claude/rag-data/qdrant:/qdrant/storage \
qdrant/qdrant
ChromaDB:
docker run -d -p 8000:8000 \
-v ~/.claude/rag-data/chroma:/chroma/chroma \
chromadb/chroma
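- First run: /rag index, which takes time to index the full codebase
- Later sessions: /rag search immediately, with no re-indexing needed
- After code changes: /rag index again to re-index (overwrites the existing collection)
- After a container restart: data is preserved by the -v volume mount above
RAG uses two layers of CLAUDE.md hints so Claude Code knows RAG is available:
Global hint (~/.claude/CLAUDE.md)
Written by /rag init. Tells every Claude Code session that RAG exists: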
## RAG MCP
The RAG MCP server is installed and provides semantic codebase search.
When a project's CLAUDE.md contains a `## RAG Index` section, use
mcp__rag__semantic_search with the specified collection name to find
relevant code before answering architecture questions or making changes.
Each project has its own collection. Use /rag to manage indexing and configuration.
Per-project hint (<project>/.claude/CLAUDE.md)
Written by /rag index. Tells sessions in that specific project which collection to use:
## RAG Index
This project is indexed in the RAG vector database (collection: "<name>").
When exploring unfamiliar code, answering architecture questions, or making changes,
use mcp__rag__semantic_search with collection "<name>" to find relevant code context first.
Last indexed: <date>
- /rag init writes the global ## RAG MCP section to ~/.claude/CLAUDE.md
- /rag index writes the per-project ## RAG Index section to <project>/.claude/CLAUDE.md
- Create the .claude/ directory if it doesn't exist
- Create .claude/CLAUDE.md if it doesn't exist (with just the RAG section)
- If the section already exists, replace it in place (from its heading to the next ## or end of file)
- On /rag delete <collection>, remove the ## RAG Index section from that project's CLAUDE.md if the deleted collection matches
- On /rag init (reconfigure), update the global section and never duplicate it
When the user types just /rag with no command, present an interactive menu using AskUserQuestion so they can choose what to do:
First, check if ~/.claude/rag-config.json exists. If it does NOT exist (first time), automatically redirect to init instead of showing the menu.
If config exists, show the menu:
question: "What would you like to do with RAG?"
header: "RAG Action"
options:
- label: "Index codebase"
description: "Index the current project for semantic search"
- label: "Search code"
description: "Search indexed code with natural language"
- label: "View collections"
description: "List all indexed collections and stats"
- label: "Configure backend"
description: "Switch between ChromaDB, Redis, or Qdrant"
After the user selects an option:
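- Index codebase → follow the index instructions below
- Search code → follow the search instructions
- View collections → follow the collections instructions
- Configure backend → follow the config instructions
init
First-time setup wizard. Guides the user through choosing a backend, installing it, configuring the MCP server, and teaching Claude Code that RAG is available.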
If ~/.claude/rag-config.json already exists, show the current config and ask if they want to reconfigure.
Display:
## RAG Setup Wizard
RAG (Retrieval-Augmented Generation) gives Claude Code semantic search
over your codebases. Instead of grepping files, Claude can find relevant
code by meaning — "how does authentication work?" returns the actual auth
code, not just files containing the word "auth".
How it works:
1. You index a project → code is chunked and embedded into vectors
2. Vectors are stored in a database that persists across sessions
3. Claude Code searches by meaning when you ask questions or make changes
4. Multiple projects can be indexed simultaneously — each gets its own collection
Let's set it up.
Use AskUserQuestion:
question: "Which vector database backend would you like to use?"
header: "Backend"
options:
- label: "Redis (Recommended)"
description: "Fast, mature, great persistence. Best all-around choice."
markdown: |
## Redis with RediSearch
**Pros:**
- Extremely fast — sub-millisecond vector search
- Mature and battle-tested (millions of production deployments)
- Excellent persistence options (AOF, RDB, or both)
- Multi-repo: single Redis instance serves all your projects
- Rich data structures beyond vectors (caching, queues, etc.)
- Low memory overhead per vector
**Cons:**
- Requires the RediSearch module (comes with redis-stack)
- Needs local embedding generation (included, ~90 MB model)
**Best for:** Most users. Especially if you work on multiple projects.
- label: "Qdrant"
description: "Purpose-built vector DB. Best filtering and scalability."
markdown: |
## Qdrant
**Pros:**
- Purpose-built for vector search — optimized from the ground up
- Advanced filtering (combine vector search with metadata filters)
- Excellent for very large codebases (100K+ files)
- Built-in persistence to disk by default
- Multi-repo: single instance serves all projects
- REST API and gRPC support
**Cons:**
- Higher memory usage than Redis for small codebases
- Needs local embedding generation (included, ~90 MB model)
- Less ecosystem tooling compared to Redis
**Best for:** Large codebases, advanced filtering needs, or dedicated vector search.
- label: "ChromaDB"
description: "Simplest setup. Built-in embeddings, no extras needed."
markdown: |
## ChromaDB
**Pros:**
- Simplest to set up — just run the container
- Built-in embedding generation (no separate model needed)
- Good documentation and Python ecosystem
- Multi-repo: single instance serves all projects
**Cons:**
- Slower than Redis/Qdrant for large codebases
- Less mature persistence story
- Limited filtering capabilities
- Higher memory usage per embedding
**Best for:** Quick experiments, small projects, or if you want zero config.
After backend choice, use AskUserQuestion:
question: "How would you like to install <backend>?"
header: "Install"
options:
- label: "Docker (Recommended)"
description: "Isolated container with persistent storage. One command."
- label: "Local install"
description: "Install natively on your system."
- label: "Already running"
description: "I already have <backend> running."
If Docker:
Run the appropriate Docker command via Bash. Always use persistent volumes and name the container for easy management:
Redis:
mkdir -p ~/.claude/rag-data
docker run -d \
--name claude-rag-redis \
--restart unless-stopped \
-p 6379:6379 \
-v ~/.claude/rag-data:/data \
redis/redis-stack-server \
--appendonly yes
Qdrant:
mkdir -p ~/.claude/rag-data/qdrant
docker run -d \
--name claude-rag-qdrant \
--restart unless-stopped \
-p 6333:6333 \
-v ~/.claude/rag-data/qdrant:/qdrant/storage \
qdrant/qdrant
ChromaDB:
mkdir -p ~/.claude/rag-data/chroma
docker run -d \
--name claude-rag-chroma \
--restart unless-stopped \
-p 8000:8000 \
-v ~/.claude/rag-data/chroma:/chroma/chroma \
chromadb/chroma
Note: --restart unless-stopped restarts the container automatically whenever the Docker daemon starts (for example after a reboot), unless you stopped the container manually.
If Local install:
Show install instructions and run them:
Redis:
## Linux (Ubuntu/Debian)
curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list
sudo apt-get update
sudo apt-get install redis-stack-server
## macOS
brew tap redis-stack/redis-stack
brew install redis-stack-server
After install, show how to enable the service:
# Linux: enable and start
sudo systemctl enable redis-stack-server
sudo systemctl start redis-stack-server
# macOS: start with brew
brew services start redis-stack-server
Qdrant:
## Using pre-built binary
curl -LO https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
./qdrant --storage-path ~/.claude/rag-data/qdrant
## macOS
brew install qdrant/tap/qdrant
qdrant --storage-path ~/.claude/rag-data/qdrant
ChromaDB:
pip install chromadb
chroma run --path ~/.claude/rag-data/chroma
If Already running:
Skip installation, proceed to verification.
Run a connectivity check via Bash:
- Redis: redis-cli -h localhost -p 6379 ping → expect PONG
- Qdrant: curl -s http://localhost:6333/healthz → expect ok or JSON
- ChromaDB: curl -s http://localhost:8000/api/v1/heartbeat → expect JSON
If the check fails:
If the check succeeds, show: <backend> is running and reachable.
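Where redis-cli or curl is not available, a coarser check is to see whether the backend's port accepts TCP connections. The sketch below assumes the default ports listed in this document. It only confirms that something is listening, not that it is the expected service. The helper names are hypothetical.

```python
import socket

# Default ports as listed in this document.
DEFAULT_PORTS = {"chromadb": 8000, "redis": 6379, "qdrant": 6333}


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_backend(backend: str, host: str = "localhost") -> str:
    """Return a one-line status message for the chosen backend."""
    port = DEFAULT_PORTS[backend]
    if is_reachable(host, port):
        return f"{backend} is running and reachable."
    return f"{backend} is not reachable at {host}:{port}."
```

The protocol-level checks above (PONG, healthz, heartbeat) remain the stronger signal; this is only a fallback.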
Use AskUserQuestion:
question: "Which embedding provider would you like to use?"
header: "Embeddings"
options:
- label: "Local (Recommended)"
description: "Free, private, no API key. Uses all-MiniLM-L6-v2 (~90 MB download on first use)."
- label: "OpenAI"
description: "Higher quality embeddings. Requires OPENAI_API_KEY and costs per request."
If OpenAI: check if OPENAI_API_KEY is set. If not, warn and ask the user to set it before proceeding.
Find the rag-mcp build path. Check in order:
- claude mcp list: if rag is already registered, extract the existing node path
- ~/.claude/mcp-servers/rag-mcp/build/index.js
- mcp-servers/rag-mcp/build/index.js (if cloned from claude-code-helper)
Then register:
# Remove old registration if it exists
claude mcp remove rag 2>/dev/null
# Add with new config
claude mcp add rag \
-e VECTOR_DB_TYPE=<backend> \
-e VECTOR_DB_HOST=<host> \
-e VECTOR_DB_PORT=<port> \
-e EMBEDDING_TYPE=<embedding_type> \
-e MODEL_VARIANT=default \
-- node <path-to-build/index.js>
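One way to avoid quoting mistakes when filling in the placeholders is to assemble the command as an argument list from the config values. The sketch below mirrors the flags shown above. The function name `build_mcp_add_command` is hypothetical, and the path passed in is whatever the lookup order above produced.

```python
def build_mcp_add_command(config: dict, index_js_path: str) -> list[str]:
    """Build the `claude mcp add rag ...` argument list from a config dict."""
    env = {
        "VECTOR_DB_TYPE": config["backend"],
        "VECTOR_DB_HOST": config["host"],
        "VECTOR_DB_PORT": str(config["port"]),
        "EMBEDDING_TYPE": config["embeddingType"],
        "MODEL_VARIANT": config.get("modelVariant", "default"),
    }
    command = ["claude", "mcp", "add", "rag"]
    for key, value in env.items():
        command += ["-e", f"{key}={value}"]
    # Everything after "--" is the server launch command.
    return command + ["--", "node", index_js_path]
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which sidesteps shell escaping entirely.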
Write ~/.claude/rag-config.json:
{
"backend": "<backend>",
"host": "localhost",
"port": <port>,
"embeddingType": "<local|openai>",
"modelVariant": "default",
"defaultCollection": "codebase",
"collections": [],
"persistence": {
"enabled": true,
"mode": "<aof for redis | disk for qdrant | disk for chromadb>",
"dataDir": "~/.claude/rag-data"
},
"installedAt": "<ISO timestamp>",
"installMethod": "<docker|local|existing>",
"updatedAt": "<ISO timestamp>"
}
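To make the placeholders concrete, here is a hedged sketch that fills the template from the wizard's answers. The port and persistence-mode mappings come from this document. The helper name `build_init_config` is hypothetical, and using a single timestamp for both `installedAt` and `updatedAt` is an assumption.

```python
import json
from datetime import datetime, timezone

# Ports and persistence modes as stated in this document.
DEFAULT_PORTS = {"chromadb": 8000, "redis": 6379, "qdrant": 6333}
PERSISTENCE_MODES = {"redis": "aof", "qdrant": "disk", "chromadb": "disk"}


def build_init_config(backend: str, embedding_type: str, install_method: str) -> dict:
    """Assemble the config written at the end of the init wizard."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "backend": backend,
        "host": "localhost",
        "port": DEFAULT_PORTS[backend],
        "embeddingType": embedding_type,
        "modelVariant": "default",
        "defaultCollection": "codebase",
        "collections": [],
        "persistence": {
            "enabled": True,
            "mode": PERSISTENCE_MODES[backend],
            "dataDir": "~/.claude/rag-data",
        },
        "installedAt": now,  # assumption: same instant as updatedAt on first write
        "installMethod": install_method,
        "updatedAt": now,
    }


if __name__ == "__main__":
    print(json.dumps(build_init_config("redis", "local", "docker"), indent=2))
```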
Append a ## RAG MCP section to ~/.claude/CLAUDE.md (global) — so every Claude Code session is aware RAG is available:
## RAG MCP
The RAG MCP server is installed and provides semantic codebase search.
**How to use:**
- When a project's CLAUDE.md contains a `## RAG Index` section, use `mcp__rag__semantic_search` with the specified collection name to find relevant code before answering architecture questions or making changes.
- Each project has its own collection (named after the project directory).
- Use `/rag` to manage indexing, search, and configuration.
- The vector database runs as a persistent background service — indexed data survives across sessions.
Rules:
- If ## RAG MCP already exists in ~/.claude/CLAUDE.md, replace it
Use AskUserQuestion:
question: "Would you like to index the current project now?"
header: "Index"
options:
- label: "Yes, index now"
description: "Index <current-directory-name> for semantic search"
- label: "No, I'll do it later"
description: "You can run /rag index anytime"
If Yes: follow the index instructions below (which will also write the per-project CLAUDE.md hint).
If No: show a summary and remind them they can run /rag index later.
Display a completion summary:
## RAG Setup Complete
Backend: <backend> (<docker|local|existing>)
Host: localhost:<port>
Embeddings: <local|openai>
Persistence: ~/.claude/rag-data/
Config: ~/.claude/rag-config.json
Claude Code awareness:
Global: ~/.claude/CLAUDE.md → ## RAG MCP section added
<if indexed: "Project: .claude/CLAUDE.md → ## RAG Index section added">
Next steps:
/rag index → Index a project for semantic search
/rag search "query" → Search indexed code
/rag collections → View all indexed projects
/rag config → View or change configuration
Restart Claude Code for the MCP server registration to take effect.
index or index [path]
Index a codebase for semantic search.
- Determine the target path (the given path, or the current directory if none is given)
- Derive the collection name from the directory name (e.g. /home/user/my-project → my-project)
- Call mcp__rag__index_codebase with:
  - rootPath: the target path
  - collectionName: derived name
  - excludePatterns: ["node_modules/**", "build/**", "dist/**", ".git/**", "*.lock", "coverage/**", ".next/**", "__pycache__/**", "venv/**", ".venv/**"]
- Call mcp__rag__get_collection_stats to show the collection size
- Find the project root (rootPath, or its parent if rootPath is a subdirectory)
- Write the hint to <project-root>/.claude/CLAUDE.md (create .claude/ dir and file if needed)
- If a ## RAG Index section exists, replace it; otherwise append it
## RAG Index
This project is indexed in the RAG vector database (collection: "<name>").
When exploring unfamiliar code, answering architecture questions, or making changes,
use mcp__rag__semantic_search with collection "<name>" to find relevant code context first.
Last indexed: <YYYY-MM-DD>
- Update ~/.claude/rag-config.json: set defaultCollection to the new collection name, add to collections array
- Report the result:
Indexed [X] files into collection "[name]"
Collection stats: [X] chunks
RAG hint added to .claude/CLAUDE.md
You can now search with: /rag search "your query"
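The replace-or-append rule for the ## RAG Index section can be sketched as a small text transformation. This is an illustration under simplifying assumptions, not the skill's actual code. The function name `upsert_section` is hypothetical, a section is taken to run from its heading to the next line starting with "## " or the end of the file, and headings inside fenced code blocks are not treated specially.

```python
def upsert_section(markdown: str, heading: str, section: str) -> str:
    """Replace the section that starts at `heading`, or append it if absent."""
    lines = markdown.splitlines()
    new_block = section.strip("\n").splitlines()
    start = next((i for i, line in enumerate(lines) if line.strip() == heading), None)
    if start is None:
        # No existing section: append, separated by one blank line.
        if lines and lines[-1].strip():
            lines.append("")
        return "\n".join(lines + new_block) + "\n"
    # The section ends at the next level-2 heading or at the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    replaced = lines[:start] + new_block
    if end < len(lines):
        replaced.append("")  # keep one blank line before the following section
    return "\n".join(replaced + lines[end:]) + "\n"
```

Running it twice with the same section yields the same file, which matches the rule that the hint is updated rather than duplicated.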
search <query>
Search the codebase using natural language.
- Call mcp__rag__semantic_search with:
  - query: the user's query
  - collectionName: use "codebase" as default, or ask if multiple collections exist
  - nResults: 10
similar <snippet>
Find code similar to a provided snippet.
- Call mcp__rag__find_similar_code with:
  - codeSnippet: the user's snippet
  - nResults: 5
context <task>
Get relevant code context for a specific task.
- Call mcp__rag__get_relevant_context with:
  - task: the user's task description
  - maxTokens: 4000
collections
List all indexed collections.
- Call mcp__rag__list_collections
- For each collection, call mcp__rag__get_collection_stats
- Display the results as a table:
## RAG Collections
| Collection | Chunks |
|------------|--------|
| my-project | 1,200 |
| other-repo | 640 |
- If no collections exist, tell the user to run /rag index to index a project.
stats <collection>
Show detailed stats for a specific collection.
- Call mcp__rag__get_collection_stats with the collection name
delete <collection>
Delete an indexed collection.
- Call mcp__rag__delete_collection with the collection name
- Update ~/.claude/rag-config.json: remove the collection from the collections array
- If the project has a .claude/CLAUDE.md with a ## RAG Index section referencing this collection, remove that section
config (no argument)
Show current RAG MCP configuration from ~/.claude/rag-config.json.
- Read ~/.claude/rag-config.json using the Read tool
- Display the configuration, for example:
## RAG Configuration
Backend: redis
Host: localhost:6379
Embeddings: local (all-MiniLM-L6-v2, 384 dim)
Model variant: default (90.4 MB full precision)
Persistence: aof (data dir: ~/.claude/rag-data)
Default collection: codebase
Known collections: codebase, my-project
Last updated: 2026-02-21T10:30:00Z
Supported backends: chromadb, redis, qdrant
Run: /rag config <backend> → switch backend
config <backend>
Switch the RAG MCP server to a different vector database backend.
Supported backends:
- chromadb: Default. ChromaDB with built-in embeddings. Port 8000.
- redis: Redis with RediSearch module. Requires local embeddings. Port 6379.
- qdrant: Qdrant vector database. Requires local embeddings. Port 6333.
Additional config options (can be appended):
- config redis --host <host> --port <port>: Custom host/port
- config <backend> --embeddings openai: Use OpenAI embeddings (requires OPENAI_API_KEY)
- config <backend> --model quantized: Use quantized local model (23 MB vs 90.4 MB)
Steps:
Read current config from ~/.claude/rag-config.json (or use defaults if missing)
Determine the new backend and options from the user's input
Map backend to defaults:
- chromadb: port 8000
- redis: port 6379
- qdrant: port 6333
Merge user-provided overrides (--host, --port, --embeddings, --model) with defaults
If embeddings = openai, remind user to set OPENAI_API_KEY
Write config to ~/.claude/rag-config.json (this is the persistent store):
{
"backend": "redis",
"host": "localhost",
"port": 6379,
"embeddingType": "local",
"modelVariant": "default",
"defaultCollection": "codebase",
"collections": [],
"updatedAt": "2026-02-21T10:30:00Z"
}
Preserve existing collections and defaultCollection from the old config.
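The merge in the steps above might look like the following sketch. The port table and the rule of preserving collections and defaultCollection come from this document. The function name `switch_backend` is hypothetical, and keeping the previous embeddingType and modelVariant when no override is given is an assumption the source does not spell out.

```python
from datetime import datetime, timezone

# Default port per backend, as listed in the steps above.
BACKEND_PORTS = {"chromadb": 8000, "redis": 6379, "qdrant": 6333}


def switch_backend(old, backend, host=None, port=None, embeddings=None, model=None, now=None):
    """Return a new config for `backend`, applying overrides on top of defaults."""
    if backend not in BACKEND_PORTS:
        raise ValueError(f"unsupported backend: {backend}")
    new = dict(old)  # carries over collections and defaultCollection unchanged
    new["backend"] = backend
    new["host"] = host or "localhost"
    new["port"] = port if port is not None else BACKEND_PORTS[backend]
    # Assumption: without an override, keep whatever the old config used.
    new["embeddingType"] = embeddings or old.get("embeddingType", "local")
    new["modelVariant"] = model or old.get("modelVariant", "default")
    new["updatedAt"] = now or datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return new
```

For example, `switch_backend(cfg, "qdrant", port=7000, model="quantized")` keeps the known collections while moving to port 7000 and the quantized model.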
Update the MCP server registration so it picks up the new env vars:
claude mcp remove rag
claude mcp add rag \
-e VECTOR_DB_TYPE=<backend> \
-e VECTOR_DB_HOST=<host> \
-e VECTOR_DB_PORT=<port> \
-e EMBEDDING_TYPE=<type> \
-e MODEL_VARIANT=<variant> \
-- node /path/to/rag-mcp/build/index.js
To find the node path, run claude mcp list first to extract the existing path.
Output:
RAG backend switched to: redis
Host: localhost:6379
Embeddings: local (all-MiniLM-L6-v2)
Config saved to: ~/.claude/rag-config.json
Restart Claude Code for changes to take effect.
Make sure Redis is running with the RediSearch module:
docker run -p 6379:6379 redis/redis-stack-server
Show backend-specific setup instructions with persistent storage:
docker run -d -p 8000:8000 -v ~/.claude/rag-data/chroma:/chroma/chroma chromadb/chroma
docker run -d -p 6379:6379 -v ~/.claude/rag-data:/data redis/redis-stack-server --appendonly yes
docker run -d -p 6333:6333 -v ~/.claude/rag-data/qdrant:/qdrant/storage qdrant/qdrant
Always include the -v volume mount so indexed data survives container restarts.
hello
Respond with:
Hello! I'm RAG v2.0.0. I manage semantic codebase search: init, index, search, configure backends. Use /rag hello ID for the full guide.
hello ID
Respond with complete skill information:
- Usage: /rag <command>
- Commands:
  - init: First-time setup wizard (choose backend, install, configure, teach Claude Code)
  - index [path]: Index the current directory or a specific path
  - search <query>: Semantic natural language search
  - similar <snippet>: Find similar code
  - context <task>: Get relevant context for a task
  - collections: List all indexed collections
  - stats <name>: Show collection statistics
  - delete <name>: Delete a collection
  - config: Show current configuration
  - config <backend>: Switch backend (chromadb/redis/qdrant)
  - hello: Quick greeting
  - hello ID: This full profile
- Requirements: the RAG MCP server (rag-mcp) must be configured via claude mcp add or /rag init