skills/nlp/triplet-loss-biencoder-finetuning/SKILL.md
Fine-tunes a bi-encoder with triplet loss using retrieval-mined hard negatives for dense similarity search.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add wenmin-wu/ds-skills nlp-triplet-loss-biencoder-finetuning
Off-the-shelf embedding models often underperform on domain-specific retrieval. Fine-tune a bi-encoder (for example a BGE model loaded through SentenceTransformer) with triplet loss. Each training example is a triplet: an anchor (the query), a positive (the correct match), and a negative (a hard negative mined from retrieval). Hard negatives are items that score high on similarity but are incorrect, so they force the model to learn fine-grained distinctions.
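To make the role of hard negatives concrete, here is a minimal sketch of the triplet objective in plain Python. It assumes Euclidean distance and a toy margin of 1.0; the vectors and the `triplet_loss` helper are illustrative, not taken from the skill or the library.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Toy 2-D embeddings (assumed values, for illustration only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]       # distance 1.0 from the anchor
easy_negative = [0.0, 5.0]  # distance 5.0: already far away
hard_negative = [1.5, 0.0]  # distance 1.5: almost as close as the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)  # 1.0 - 5.0 + 1.0 < 0, clipped to 0
hard_loss = triplet_loss(anchor, positive, hard_negative)  # 1.0 - 1.5 + 1.0 = 0.5
print(easy_loss, hard_loss)
```

The easy negative already sits outside the margin, so it contributes zero loss and no gradient. Only the hard negative produces a training signal, which is why mining them matters.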
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import TripletLoss
from sentence_transformers.training_args import (
    BatchSamplers,
    SentenceTransformerTrainingArguments,
)

model = SentenceTransformer("BAAI/bge-large-en-v1.5")
loss = TripletLoss(model)

# Dataset columns, in this order: anchor, positive, negative.
# Mine hard negatives: encode all docs, retrieve top-K per query,
# and keep the retrieved docs that are not the correct match.
# Placeholder row for illustration only; replace with your mined triplets.
triplets = Dataset.from_dict({
    "anchor": ["how do I reset my password?"],
    "positive": ["Open Settings > Account and choose Reset password."],
    "negative": ["Open Settings > Account and choose Change email."],
})

args = SentenceTransformerTrainingArguments(
    output_dir="./finetuned-bge",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    fp16=True,  # requires a CUDA GPU; drop this on CPU
    batch_sampler=BatchSamplers.NO_DUPLICATES,
    lr_scheduler_type="cosine_with_restarts",
)
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=triplets,
    loss=loss,
)
trainer.train()
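The mining step described in the comments (encode all docs, retrieve top-K, take the non-matching results as negatives) can be sketched as follows. This is a hedged, dependency-free illustration: `mine_hard_negatives`, `positive_ids`, `top_k`, and `per_query` are hypothetical names, and the tiny 2-D vectors stand in for real `model.encode(...)` output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_hard_negatives(query_vecs, doc_vecs, positive_ids, top_k=3, per_query=1):
    """Return (query_idx, positive_doc_idx, negative_doc_idx) triplets.

    positive_ids[i] is the set of doc indices that correctly match query i.
    """
    triplets = []
    for q_idx, q_vec in enumerate(query_vecs):
        # Rank every document by similarity to the query, most similar first.
        ranked = sorted(
            range(len(doc_vecs)),
            key=lambda d: cosine(q_vec, doc_vecs[d]),
            reverse=True,
        )
        # Retrieved-but-incorrect documents are the hard negatives.
        negatives = [d for d in ranked[:top_k] if d not in positive_ids[q_idx]]
        for pos in sorted(positive_ids[q_idx]):
            for neg in negatives[:per_query]:
                triplets.append((q_idx, pos, neg))
    return triplets


# Toy embeddings (assumed values): docs 0 and 1 are near-duplicates, as are 2 and 3.
doc_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
query_vecs = [[1.0, 0.05], [0.05, 1.0]]
positive_ids = [{0}, {2}]

mined = mine_hard_negatives(query_vecs, doc_vecs, positive_ids, top_k=2)
print(mined)
```

Mapping each index triple back to its text yields the anchor, positive, and negative columns the trainer expects. True matches that are missing from `positive_ids` would be mined as false negatives, so the positive labels should be as complete as possible.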
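Key choices: TripletLoss with the NO_DUPLICATES batch sampler.
NO_DUPLICATES keeps the same text from appearing twice in a batch, so batches stay diverse. This matters most for losses that use in-batch negatives; TripletLoss itself compares each anchor only against its explicit negative.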