skills/nlp/two-stage-retrieve-rerank/SKILL.md
Two-stage pipeline where an unsupervised bi-encoder retrieves KNN candidates and a supervised cross-encoder reranks them with sigmoid thresholding
npx skillsauth add wenmin-wu/ds-skills nlp-two-stage-retrieve-rerank
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
For large-scale matching tasks (content recommendation, question–document pairing), scoring every (query, document) pair with a cross-encoder is infeasible. Stage 1 uses a fast bi-encoder to embed queries and documents separately, then retrieves the top-N candidates per query with KNN search. Stage 2 passes each (query, candidate) pair through a cross-encoder for precise relevance scoring. The pipeline combines the recall of dense retrieval with the precision of cross-attention.
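import cupy as cp
import torch
from cuml.neighbors import NearestNeighbors

# encode, build_pairs and cross_encoder_inference are user-defined helpers (not shown here).

# Stage 1: bi-encoder KNN retrieval
query_emb = encode(queries, bi_encoder)   # (Q, D)
doc_emb = encode(documents, bi_encoder)   # (N, D)
knn = NearestNeighbors(n_neighbors=50, metric='cosine')
knn.fit(cp.asarray(doc_emb))
indices = knn.kneighbors(cp.asarray(query_emb), return_distance=False)  # (Q, 50)
indices = cp.asnumpy(indices)  # move neighbor indices to host memory

# Stage 2: cross-encoder reranking
pairs = build_pairs(queries, documents, indices)  # Q*50 pairs
# The separator string must match the cross-encoder tokenizer's separator token.
pairs['text'] = pairs['query_title'] + '[SEP]' + pairs['doc_title']
logits = cross_encoder_inference(pairs)  # assumed: torch.Tensor with one logit per pair
pairs['score'] = torch.sigmoid(logits).detach().cpu().numpy()
# threshold is not defined in this snippet; set it beforehand (e.g. tuned on validation data).
matches = pairs[pairs['score'] > threshold]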
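The snippet above calls `build_pairs` without defining it. One possible shape for that helper is sketched below, followed by the sigmoid-and-threshold step on toy values. This is only an illustration under assumptions: the `id` and `title` column names, the extra `rank` column, the toy logits, and the 0.5 threshold are all hypothetical and do not come from the original skill.

```python
import numpy as np
import pandas as pd


def build_pairs(queries, documents, indices):
    """Expand a (Q, K) array of neighbor indices into Q*K (query, doc) rows.

    Assumes `queries` and `documents` are DataFrames with `id` and `title`
    columns (hypothetical names) and that `indices` is already in host memory.
    """
    indices = np.asarray(indices)
    n_queries, k = indices.shape
    query_pos = np.repeat(np.arange(n_queries), k)  # 0,0,...,1,1,...
    doc_pos = indices.ravel()                       # row-major, aligned with query_pos
    return pd.DataFrame({
        "query_id": queries["id"].to_numpy()[query_pos],
        "doc_id": documents["id"].to_numpy()[doc_pos],
        "query_title": queries["title"].to_numpy()[query_pos],
        "doc_title": documents["title"].to_numpy()[doc_pos],
        "rank": np.tile(np.arange(k), n_queries),   # position in the KNN list
    })


def sigmoid(x):
    """Map raw logits to (0, 1) scores."""
    return 1.0 / (1.0 + np.exp(-x))


# Toy inputs, for illustration only.
queries = pd.DataFrame({"id": ["q0", "q1"],
                        "title": ["fractions", "photosynthesis"]})
documents = pd.DataFrame({"id": ["d0", "d1", "d2"],
                          "title": ["adding fractions", "plant cells", "light reactions"]})
indices = np.array([[0, 1], [2, 1]])  # two neighbors per query

pairs = build_pairs(queries, documents, indices)

# Placeholder logits standing in for cross-encoder output (not real model scores).
toy_logits = np.array([2.0, -1.0, 0.5, -3.0])
pairs["score"] = sigmoid(toy_logits)

threshold = 0.5  # assumed value; in practice tune it on held-out labeled pairs
matches = pairs[pairs["score"] > threshold]
```

Keeping the pairs in one long DataFrame makes the thresholding a single boolean filter, and the `rank` column preserves the Stage 1 ordering in case a fallback (such as keeping the top-ranked candidate when nothing passes the threshold) is wanted.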
Use cuML `NearestNeighbors` or FAISS for corpora > 100K documents.
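For small corpora, or as a correctness check on the output of a GPU or approximate index, Stage 1 can also be sketched as exact brute-force cosine search in plain NumPy. The function name `cosine_topk` and the `chunk_size` parameter below are hypothetical, and the sketch assumes every embedding has a non-zero norm. It is a reference baseline, not a replacement for cuML or FAISS at scale.

```python
import numpy as np


def cosine_topk(query_emb, doc_emb, k, chunk_size=1024):
    """Exact cosine top-k by brute force.

    Returns a (Q, k) array of document indices, most similar first.
    Requires k <= number of documents and non-zero embedding norms.
    """
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    out = np.empty((q.shape[0], k), dtype=np.int64)
    # Process queries in chunks so the (chunk, N) similarity matrix stays small.
    for start in range(0, q.shape[0], chunk_size):
        sims = q[start:start + chunk_size] @ d.T              # cosine similarities
        top = np.argpartition(-sims, k - 1, axis=1)[:, :k]    # unordered top-k
        top_sims = np.take_along_axis(sims, top, axis=1)
        order = np.argsort(-top_sims, axis=1)                 # sort the k winners
        out[start:start + chunk_size] = np.take_along_axis(top, order, axis=1)
    return out


# Toy embeddings, for illustration only.
doc_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query_emb = np.array([[1.0, 0.1], [0.0, 2.0]])
indices = cosine_topk(query_emb, doc_emb, k=2)
```

The returned array has the same (Q, k) layout as the neighbor indices produced in Stage 1, so it can be fed to the same pair-building step. Memory and time grow with Q × N, which is why a dedicated KNN library becomes necessary for large corpora.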