skills/rag-pipeline-builder/SKILL.md
Build retrieval-augmented generation systems that ground LLM responses in your data
The RAG Pipeline Builder skill guides you through designing and implementing Retrieval-Augmented Generation systems that enhance LLM responses with relevant context from your own data. RAG combines the power of large language models with the precision of information retrieval, reducing hallucinations and enabling AI to work with private, current, or domain-specific knowledge.
This skill covers the complete RAG stack: document ingestion, chunking strategies, embedding generation, vector storage, retrieval optimization, context injection, and response generation. It helps you make informed decisions at each stage based on your specific requirements for accuracy, latency, cost, and scale.
Whether you are building a documentation Q&A bot, a customer support system, or an enterprise knowledge assistant, this skill helps your RAG implementation follow production best practices.
Ingestion: Documents → Loader → Chunker → Embedder → Vector DB
Retrieval: Query → Embedder → Vector Search (over the Vector DB) → Reranker → Context
Generation: Context + Query → LLM → Response
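The flow above can be sketched end to end in a few lines. This is a toy illustration, not a production design: the word-count `embed` function stands in for a real embedding model, `InMemoryIndex` is a hypothetical stand-in for a vector database, and the reranker stage is omitted. All names are assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedder: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical vector store: a list of (chunk, vector) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, chunks):
        # Ingestion path: chunk -> embed -> store
        for chunk in chunks:
            self.entries.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        # Query path: embed the query, then rank stored chunks by similarity
        qvec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(qvec, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Context + Query -> the prompt text that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


index = InMemoryIndex()
index.add([
    "Refunds are processed within five business days.",
    "The API rate limit is 100 requests per minute.",
    "Support is available by email on weekdays.",
])
top = index.search("What is the API rate limit?", k=1)
prompt = build_prompt("What is the API rate limit?", top)
```

In a real system each stage would be swapped for a stronger component, but the ingestion path and the query path keep this same shape.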
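
    def smart_chunk(doc, chunk_size=500, overlap=50):
        """Split a document into chunks, keeping short sections whole."""
        # Respect document structure: split on section boundaries first.
        # extract_sections, sliding_window, and add_metadata are helpers
        # the caller is expected to supply.
        sections = extract_sections(doc)
        chunks = []
        for section in sections:
            if len(section) > chunk_size:
                # Oversized section: fall back to overlapping windows
                chunks.extend(sliding_window(section, chunk_size, overlap))
            else:
                chunks.append(section)
        return add_metadata(chunks, doc)
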
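`smart_chunk` leaves its three helpers undefined. One possible way to fill them in is sketched below, so the function can be run as-is. The helper behavior is an assumption: sections are taken to be blank-line-separated paragraphs, sizes are measured in characters, and the metadata fields (`chunk_index`, `source`, `doc_chars`) are illustrative rather than prescribed by the skill.

```python
def extract_sections(doc):
    """Assumed behavior: treat each blank-line-separated paragraph as a section."""
    return [s.strip() for s in doc.split("\n\n") if s.strip()]


def sliding_window(text, chunk_size, overlap):
    """Cut text into chunk_size-character windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    windows = []
    for start in range(0, len(text), step):
        windows.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return windows


def add_metadata(chunks, doc, source="unknown"):
    """Attach position and source so chunks can be filtered and cited later."""
    return [
        {"text": c, "chunk_index": i, "source": source, "doc_chars": len(doc)}
        for i, c in enumerate(chunks)
    ]


def smart_chunk(doc, chunk_size=500, overlap=50):
    chunks = []
    for section in extract_sections(doc):
        if len(section) > chunk_size:
            chunks.extend(sliding_window(section, chunk_size, overlap))
        else:
            chunks.append(section)
    return add_metadata(chunks, doc)


# A short paragraph stays whole; the long one is windowed with overlap.
sample = "Short intro.\n\n" + "x" * 45
result = smart_chunk(sample, chunk_size=20, overlap=5)
```

Character counts are a rough proxy. If the embedding model has a token limit, measuring `chunk_size` in tokens with that model's tokenizer is usually the safer choice.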
| Action | Command/Trigger |
|--------|-----------------|
| Design RAG system | "Help me design a RAG pipeline for [use case]" |
| Choose vector DB | "Which vector database for RAG" |
| Optimize chunking | "Best chunking strategy for [content type]" |
| Improve retrieval | "My RAG has poor retrieval quality" |
| Reduce hallucinations | "RAG still hallucinating, help fix" |
| Scale pipeline | "Scale RAG to [X] documents" |
Chunk at Semantic Boundaries: Preserve meaning in chunks
Include Rich Metadata: Enable filtering and context
Use Hybrid Search: Combine semantic and keyword search
Rerank for Quality: Two-stage retrieval improves precision
Show Your Work: Include citations and sources
Handle Edge Cases: What happens when retrieval fails?
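For the hybrid search practice, one common way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF). The skill does not name a fusion method, so RRF here is a suggestion rather than the skill's prescribed approach. The document IDs and the two input rankings are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by document ID for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))


semantic_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search results
keyword_hits = ["doc_b", "doc_d", "doc_a"]   # hypothetical keyword-search results
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

RRF only needs ranks, not comparable scores, which makes it convenient when the two retrievers score on different scales. The fused list can then be passed to a reranker for the second stage.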
Use different indexes for different content types:
Index 1: FAQs (short, self-contained)
Index 2: Documentation (long-form, structured)
Index 3: Conversations (temporal, contextual)
Route each query to the appropriate index based on its intent.
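A minimal sketch of intent-based routing across the three indexes might look like the following. The keyword rules and the index names (`faqs`, `documentation`, `conversations`) are assumptions chosen for illustration. A production router would more likely use a trained intent classifier or an LLM call.

```python
import re


def route_query(query):
    """Pick an index name for a query using simple keyword rules (illustrative only)."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    # References to earlier exchanges suggest the conversation history index
    if words & {"earlier", "previously", "yesterday", "said", "discussed"}:
        return "conversations"
    # How-to and configuration questions usually need long-form documentation
    if words & {"how", "configure", "install", "setup", "guide", "tutorial"}:
        return "documentation"
    # Short, self-contained questions fall through to the FAQ index
    return "faqs"
```

Falling back to the FAQ index is a design choice for this sketch. Another reasonable fallback is to search every index and merge the results.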
Improve retrieval with query processing:
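
    def transform_query(query):
        """Return a list of search queries derived from the user's query."""
        # classify_query, extract_entities, generate_keyword_queries, and
        # generate_semantic_queries are helpers assumed to exist elsewhere.
        # Step 1: Classify query type
        query_type = classify_query(query)
        # Step 2: Extract entities
        entities = extract_entities(query)
        # Step 3: Generate search queries
        if query_type == "factual":
            return generate_keyword_queries(query, entities)
        elif query_type == "conceptual":
            return generate_semantic_queries(query)
        else:
            return [query]  # Use as-is
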
Reduce noise in retrieved context:
Retrieved chunks (verbose) → LLM compressor → Relevant excerpts only
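To show where compression sits in the pipeline, here is a small sketch that filters retrieved chunks down to relevant sentences. It swaps the LLM compressor for a plain keyword-overlap filter so the example runs without a model. The function names and the `min_overlap` threshold are assumptions, not part of the skill.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def compress_context(query, chunks, min_overlap=2):
    """Keep only sentences that share at least `min_overlap` words with the query.

    This keyword filter is a stand-in for the LLM compressor: it shows the
    input and output shape, not how an LLM would judge relevance.
    """
    query_terms = tokenize(query)
    excerpts = []
    for chunk in chunks:
        # Naive sentence split on ., !, or ? followed by whitespace
        for sentence in re.split(r"(?<=[.!?])\s+", chunk.strip()):
            if len(tokenize(sentence) & query_terms) >= min_overlap:
                excerpts.append(sentence)
    return excerpts


chunks = [
    "Our company was founded in 2010. "
    "The API rate limit is 100 requests per minute. "
    "We value our customers."
]
excerpts = compress_context("What is the API rate limit?", chunks)
```

The filter counts common words such as "the" and "is" as matches, so a real implementation would at least remove stop words, and an LLM-based compressor would judge relevance by meaning rather than word overlap.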
Let the LLM control retrieval:

    def agentic_rag(query):
        # llm and retriever are assumed to be client objects defined elsewhere.
        # LLM decides what to search for
        search_plan = llm.plan_searches(query)
        # Execute searches
        results = []
        for search in search_plan:
            results.extend(retriever.search(search.query, filters=search.filters))
        # LLM synthesizes answer
        return llm.synthesize(query, results)

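`agentic_rag` depends on `llm` and `retriever` objects whose interfaces are not specified. The sketch below uses hypothetical stubs (`StubLLM`, `StubRetriever`, `Search`) to make the control flow runnable. It also adds two safeguards that are suggestions rather than part of the original: a cap on the number of planned searches and de-duplication of hits.

```python
from dataclasses import dataclass, field


@dataclass
class Search:
    """One planned search: a query string plus optional metadata filters."""
    query: str
    filters: dict = field(default_factory=dict)


class StubLLM:
    """Hypothetical stand-in for a model client; returns canned plans and answers."""

    def plan_searches(self, query):
        # A real planner would ask the model to decompose the question
        return [Search(part.strip()) for part in query.split(" and ") if part.strip()]

    def synthesize(self, query, results):
        if not results:
            return "No supporting context found."
        return f"{query} -> " + "; ".join(results)


class StubRetriever:
    """Hypothetical retriever: matches documents that share a word with the query."""

    def __init__(self, docs):
        self.docs = docs

    def search(self, query, filters=None):
        terms = set(query.lower().split())
        return [d for d in self.docs if terms & set(d.lower().split())]


def agentic_rag(query, llm, retriever, max_searches=3):
    results = []
    for search in llm.plan_searches(query)[:max_searches]:  # cap the plan size
        for hit in retriever.search(search.query, filters=search.filters):
            if hit not in results:  # drop duplicates across searches
                results.append(hit)
    return llm.synthesize(query, results)


retriever = StubRetriever(["pricing starts at $10", "refunds take five days"])
answer = agentic_rag("pricing and refunds", StubLLM(), retriever)
```

Passing `llm` and `retriever` as arguments, rather than reading them from module scope, makes the function easier to test with stubs like these.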
Continuously measure RAG quality:
Metrics:
- Retrieval: Precision@k, Recall@k, MRR
- Generation: Faithfulness, Answer Relevance, Context Utilization
- End-to-end: Task Success Rate, User Satisfaction
Tools: Ragas, TruLens, LangSmith
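The retrieval metrics listed above are simple enough to compute by hand. Here is a minimal sketch using their standard definitions, with made-up document IDs. It does not reproduce the API of Ragas, TruLens, or LangSmith, and the generation metrics are out of scope because they need a judge model or human labels.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in relevant) / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    return sum(1 for d in retrieved[:k] if d in relevant) / len(relevant)


def mean_reciprocal_rank(runs):
    """Average of 1/rank of the first relevant hit, over (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    total = 0.0
    for retrieved, relevant in runs:
        for rank, d in enumerate(retrieved, start=1):
            if d in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(runs)


retrieved = ["d1", "d2", "d3", "d4"]  # hypothetical ranked results for one query
relevant = {"d2", "d4", "d9"}         # hypothetical ground-truth labels
p_at_2 = precision_at_k(retrieved, relevant, 2)
r_at_2 = recall_at_k(retrieved, relevant, 2)
mrr = mean_reciprocal_rank([(retrieved, relevant), (["d5", "d6"], {"d5"})])
```

Tracking these on a fixed set of labeled queries makes it possible to tell whether a change to chunking, embedding, or reranking actually improved retrieval.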