.claude/skills/rag-vector-db/SKILL.md
# RAG & Vector DB Patterns
## When to Load This Skill

Load when working with: Qdrant, pgvector, embeddings, chunking, retrieval-augmented generation, semantic search, knowledge bases, document ingestion pipelines.
## Vector DB Choice

| Option | When to Use |
|---|---|
| **Qdrant** | Default choice. Standalone service, excellent filtering, production-ready, Docker-friendly |
| **pgvector** | Already have PostgreSQL, simple use case, don't want extra service |
| **In-memory (numpy)** | Prototyping only, < 10k documents |
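For the prototyping row, a brute-force cosine search is often all that is needed. The sketch below is a minimal illustration and not part of the skill's resources: `InMemoryVectorStore` and its methods are hypothetical names, and it uses plain Python lists rather than numpy so it runs without dependencies (with numpy the scoring loop would become one matrix-vector product).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical prototype store: keeps every vector in a list and scans all of them."""

    def __init__(self) -> None:
        self._items: list[tuple[list[float], dict]] = []

    def add(self, vector: list[float], payload: dict) -> None:
        self._items.append((vector, payload))

    def search(self, query: list[float], top_k: int = 5) -> list[dict]:
        # O(n) scan per query, which is acceptable for small prototype corpora.
        scored = [(cosine_similarity(query, vec), payload) for vec, payload in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [{**payload, "score": score} for score, payload in scored[:top_k]]


store = InMemoryVectorStore()
store.add([1.0, 0.0], {"text": "a"})
store.add([0.0, 1.0], {"text": "b"})
store.add([1.0, 1.0], {"text": "c"})
results = store.search([1.0, 0.0], top_k=2)
```

Because every query scans the whole list, the cost grows linearly with the corpus, which is consistent with the table's advice to keep this option for small prototypes.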
```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
    volumes:
      - qdrant_data:/qdrant/storage

volumes:
  qdrant_data:
```
Full adapter implementation: resources/qdrant-adapter.md
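Once the container is up, port 6333 serves Qdrant's REST API, where a collection is created with `PUT /collections/{name}`. The sketch below only builds that request with the standard library so the payload shape is visible. The helper name, the collection name `docs`, and the vector size of 768 (chosen to match `multilingual-e5-base`) are illustrative assumptions, and the skill's own adapter in resources/qdrant-adapter.md may take a different route, such as the official client.

```python
import json
import urllib.request


def build_create_collection_request(
    base_url: str,
    collection: str,
    vector_size: int,
    distance: str = "Cosine",
) -> urllib.request.Request:
    """Build (but do not send) a Qdrant 'create collection' REST request."""
    body = {"vectors": {"size": vector_size, "distance": distance}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/collections/{collection}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# "docs" and 768 are placeholder values for illustration.
request = build_create_collection_request("http://localhost:6333", "docs", 768)

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The vector size has to match the embedding model's output dimension, so it is worth deriving it from the embedder rather than hard-coding it in real code.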
Two options:

- `text-embedding-3-small`: API-based, no local GPU required
- `multilingual-e5-base` (768 dim, ~280MB): local, free, good for Russian

Full implementations: resources/embeddings.md
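One detail worth keeping in mind with the E5 family: the model card asks for every input to start with `query: ` or `passage: `, and retrieval quality tends to drop without those prefixes. The helpers below are a hypothetical sketch of that convention (the function names are not taken from resources/embeddings.md). They only prepare strings and do not load a model.

```python
def as_query(text: str) -> str:
    """Prefix a search query the way E5 models expect."""
    return f"query: {text.strip()}"


def as_passages(texts: list[str]) -> list[str]:
    """Prefix documents or chunks before they are embedded for indexing."""
    return [f"passage: {t.strip()}" for t in texts]


query_input = as_query("  Как настроить Qdrant?  ")
passage_inputs = as_passages(["Qdrant listens on port 6333.", "pgvector is a PostgreSQL extension."])
```

Applying the prefix inside the embedding adapter, rather than at each call site, keeps ingestion and query paths from drifting apart.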
Chunking is the most critical RAG quality parameter. Default: paragraph-based, 512 tokens, 1-sentence overlap.
Full strategy + Chunk dataclass: resources/chunking-strategies.md
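The default above (paragraph-based, 512 tokens, 1-sentence overlap) can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation from resources/chunking-strategies.md: the `Chunk` fields are hypothetical, tokens are approximated by whitespace-separated words, and sentences are split with a naive regex.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical shape; the real Chunk dataclass is in resources/chunking-strategies.md.
    text: str
    index: int


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for model tokens.
    return len(text.split())


def last_sentence(text: str) -> str:
    # Naive split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return sentences[-1] if sentences else ""


def chunk_paragraphs(text: str, max_tokens: int = 512) -> list[Chunk]:
    """Greedily pack paragraphs into chunks, carrying one sentence of overlap forward."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[Chunk] = []
    current: list[str] = []
    current_tokens = 0
    for para in paragraphs:
        para_tokens = count_tokens(para)
        if current and current_tokens + para_tokens > max_tokens:
            body = "\n\n".join(current)
            chunks.append(Chunk(text=body, index=len(chunks)))
            # Start the next chunk with the previous chunk's final sentence.
            overlap = last_sentence(body)
            current = [overlap]
            current_tokens = count_tokens(overlap)
        current.append(para)
        current_tokens += para_tokens
    if current:
        chunks.append(Chunk(text="\n\n".join(current), index=len(chunks)))
    return chunks


sample = "A one. A two.\n\nB one. B two.\n\nC one."
chunks = chunk_paragraphs(sample, max_tokens=5)
```

In this sketch a chunk can exceed `max_tokens`, either because a single paragraph is longer than the limit or because the carried-over sentence pushes it past. A production version would split oversized paragraphs at sentence boundaries and count tokens with the embedding model's tokenizer.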
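```python
from __future__ import annotations  # keeps the adapter type hints lazy

# QdrantAdapter and LocalEmbeddingAdapter come from the project's adapter
# modules (see the resources/ files referenced in this skill).


class RAGService:
    """Retrieve context from the vector DB and ask the LLM to answer from it."""

    def __init__(
        self,
        vector_db: QdrantAdapter,
        embedder: LocalEmbeddingAdapter,
        llm_adapter,
    ) -> None:
        self._db = vector_db
        self._embedder = embedder
        self._llm = llm_adapter

    async def answer(self, question: str, top_k: int = 5) -> dict:
        # Embed the question and fetch the top_k nearest chunks.
        query_embedding = self._embedder.embed([question])[0]
        retrieved = await self._db.search(query_embedding, top_k=top_k)

        if not retrieved:
            # Same keys as the normal return so callers can rely on the shape.
            return {
                "answer": "No information found in knowledge base.",
                "sources": [],
                "retrieved_count": 0,
            }

        # Join chunks with a visible separator and deduplicate source names.
        context = "\n\n---\n\n".join(r["text"] for r in retrieved)
        sources = list({r["source"] for r in retrieved})

        answer = await self._llm.invoke(
            system="Answer based only on the provided context. If the answer is not in the context, say so explicitly.",
            messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}],
        )
        return {"answer": answer, "sources": sources, "retrieved_count": len(retrieved)}
```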
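The service above only depends on three calls: `embed(texts)`, an async `search(vector, top_k=...)` returning dicts with `text` and `source`, and an async `invoke(system=..., messages=...)`. A sketch of those implied interfaces with in-memory fakes, useful for unit tests, might look like this. The Protocol and Fake class names are hypothetical and the toy embedding is for illustration only.

```python
import asyncio
from typing import Protocol


class Embedder(Protocol):
    def embed(self, texts: list[str]) -> list[list[float]]: ...


class VectorDB(Protocol):
    async def search(self, vector: list[float], top_k: int) -> list[dict]: ...


class LLM(Protocol):
    async def invoke(self, system: str, messages: list[dict]) -> str: ...


class FakeEmbedder:
    def embed(self, texts: list[str]) -> list[list[float]]:
        # Toy 2-dim "embedding": character count and word count.
        return [[float(len(t)), float(len(t.split()))] for t in texts]


class FakeVectorDB:
    def __init__(self, hits: list[dict]) -> None:
        self._hits = hits

    async def search(self, vector: list[float], top_k: int) -> list[dict]:
        # Ignores the vector and returns canned hits, truncated to top_k.
        return self._hits[:top_k]


class FakeLLM:
    async def invoke(self, system: str, messages: list[dict]) -> str:
        return f"stub answer for {len(messages)} message(s)"


hits = [
    {"text": "Qdrant listens on 6333.", "source": "ops.md"},
    {"text": "pgvector runs inside PostgreSQL.", "source": "db.md"},
]
vector = FakeEmbedder().embed(["what port?"])[0]
retrieved = asyncio.run(FakeVectorDB(hits).search(vector, top_k=1))
reply = asyncio.run(FakeLLM().invoke(system="s", messages=[{"role": "user", "content": "q"}]))
```

Passing the three fakes to `RAGService(...)` and awaiting `answer(...)` should exercise the retrieval and prompt-assembly path without a running Qdrant instance or a model download.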
Full IngestionService (PDF + DOCX): resources/ingestion-pipeline.md
SQL setup + search function: resources/pgvector-alternative.md
- resources/reranking.md: cross-encoder reranking for precision improvement
- resources/eval-ragas.md: RAG quality evaluation with RAGAS framework