skills/llm/wikipedia-rag-retrieval/SKILL.md
Dense retrieval over a FAISS-indexed Wikipedia corpus to provide grounding context for LLM question answering.
npx skillsauth add wenmin-wu/ds-skills llm-wikipedia-rag-retrieval
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
For knowledge-intensive QA tasks, retrieve relevant Wikipedia passages using dense embeddings (e.g., sentence-transformers) indexed in FAISS, then prepend the retrieved passages as context to the LLM prompt. This grounds the model in factual content and typically improves accuracy over closed-book inference, where the model answers from its parameters alone.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
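# Build index (offline). `passages` is a list of Wikipedia passage strings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
# Unit-normalize embeddings so that inner product equals cosine similarity.
embeddings = encoder.encode(passages, batch_size=256, show_progress_bar=True, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # exact inner-product search
index.add(np.asarray(embeddings, dtype=np.float32))
faiss.write_index(index, "wiki.index")
# Retrieve at inference. `questions` is a list of question strings.
index = faiss.read_index("wiki.index")
query_embs = encoder.encode(questions, normalize_embeddings=True)
scores, indices = index.search(np.asarray(query_embs, dtype=np.float32), k=5)
# FAISS pads with -1 when fewer than k passages are indexed, so skip those ids.
contexts = ["\n".join(passages[i] for i in idx if i >= 0) for idx in indices]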
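The overview says to prepend the retrieved passages to the LLM prompt but does not show that step. Below is a minimal sketch of one way to do it. The helper name `build_prompt`, the prompt wording, and the `max_chars` budget are all assumptions for illustration and are not part of the original skill.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 2000) -> str:
    """Prepend retrieved passages to a question as grounding context.

    Hypothetical helper: the template and the character budget are
    illustrative choices, not something prescribed by the skill.
    """
    kept, used = [], 0
    for rank, passage in enumerate(passages, start=1):
        entry = f"[{rank}] {passage.strip()}"
        # Stop once the context budget is exhausted; passages arrive
        # best-first, so the lowest-ranked ones are dropped.
        if used + len(entry) > max_chars:
            break
        kept.append(entry)
        used += len(entry)
    context = "\n".join(kept)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


example = build_prompt(
    "What is the capital of France?",
    [
        "Paris is the capital and largest city of France.",
        "France is a country in Western Europe.",
    ],
)
print(example)
```

With the retrieval code above, the passages for one question could be passed as `[passages[i] for i in idx]`, keeping them as a list so the helper can number and truncate them. A character budget is only a rough stand-in for a token limit; a tokenizer-based count would be more accurate for a specific model.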