skills/nlp/pearson-correlation-metric/SKILL.md
Use the Pearson correlation coefficient as the evaluation metric for semantic similarity regression tasks, selecting the best checkpoint by correlation rather than loss.
npx skillsauth add wenmin-wu/ds-skills nlp-pearson-correlation-metric
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
For regression tasks where the target is a similarity or relatedness score, Pearson correlation measures linear agreement between predictions and labels and is invariant to positive rescaling and shifting of the predictions. This often makes it a better model-selection criterion than MSE: a model whose predictions track the labels linearly but sit on the wrong scale scores poorly on MSE yet perfectly on Pearson r. It is a standard metric for STS benchmarks and similarity competitions.
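To make the scale-and-shift claim concrete, here is a minimal pure-Python sketch. The `pearson_r` and `mse` helpers and the toy numbers are illustrative assumptions, not part of the skill. It compares predictions that are an exact linear transform of the labels against predictions that are monotone but nonlinear.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mse(x, y):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

labels = [0.0, 0.25, 0.5, 0.75, 1.0]

# Hypothetical predictions: correct linear trend, wrong scale and offset.
rescaled = [10.0 * v + 3.0 for v in labels]
# Hypothetical predictions: correct ordering, but a nonlinear relationship.
cubed = [v ** 3 for v in labels]

r_rescaled = pearson_r(labels, rescaled)   # 1.0 up to floating-point rounding
mse_rescaled = mse(labels, rescaled)       # large, because the scale is off
r_cubed = pearson_r(labels, cubed)         # below 1: ranking alone is not enough
```

The second case is a reminder that Pearson r rewards linear agreement specifically; if only the ordering matters, a rank correlation such as Spearman is the closer fit.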
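import numpy as np
from scipy import stats
from transformers import Trainer

def pearson_score(y_true, y_pred):
    # pearsonr returns (r, p-value); keep only the coefficient
    return stats.pearsonr(y_true, y_pred)[0]

# HuggingFace Trainer integration
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    # Flatten (n, 1) regression outputs so they line up with the labels
    predictions = predictions.reshape(-1)
    return {'pearson': np.corrcoef(predictions, labels)[0][1]}

# model, args, train_ds and val_ds are assumed to be defined elsewhere
trainer = Trainer(
    model=model, args=args,
    train_dataset=train_ds, eval_dataset=val_ds,
    compute_metrics=compute_metrics,
)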
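Passing `compute_metrics` only reports the score; for the Trainer itself to pick the best checkpoint by correlation, the training arguments also need to name the metric. One possible configuration is sketched below as a plain dictionary of keyword arguments for `transformers.TrainingArguments`. The variable name `best_by_pearson` is hypothetical, and the remaining arguments (output directory, evaluation and save schedule) are left out because they depend on your setup and library version.

```python
# Keyword arguments meant to be splatted into transformers.TrainingArguments.
# This is a sketch: the evaluation and save strategies must also be set,
# and must match each other, for load_best_model_at_end to work.
best_by_pearson = dict(
    load_best_model_at_end=True,      # reload the best checkpoint when training ends
    metric_for_best_model="pearson",  # the key returned by compute_metrics
    greater_is_better=True,           # a higher correlation is better
)

# Intended use (not executed here):
# args = TrainingArguments(output_dir=..., **best_by_pearson)
```

Setting `greater_is_better=True` explicitly guards against the metric being treated like a loss, where lower values win.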
- Define compute_metrics returning Pearson r from predictions and labels
- Pass it to Trainer, or compute the score manually in a training loop
- Save only on improvement: if score > best_score: save_model()
- scipy.stats.pearsonr also returns a p-value, which is useful for small validation sets
- np.corrcoef(x, y)[0][1] is faster than scipy for large arrays
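For a manual training loop, the "save when the score improves" step can be sketched as a small tracker. The class name `BestByPearson` and the per-epoch scores below are hypothetical; in practice the score would come from evaluating on the validation set, and a `True` return would trigger whatever save routine you use.

```python
import math

class BestByPearson:
    """Track the best validation Pearson r seen so far."""

    def __init__(self):
        self.best_score = -math.inf
        self.best_epoch = None

    def update(self, epoch, score):
        """Return True when score beats the best so far (caller should save)."""
        # r is undefined (NaN) when predictions are constant; never save on it
        if math.isnan(score):
            return False
        if score > self.best_score:
            self.best_score, self.best_epoch = score, epoch
            return True
        return False

tracker = BestByPearson()
# Hypothetical validation scores, one per epoch
history = [0.71, 0.78, float("nan"), 0.76, 0.81]
saved_epochs = [e for e, r in enumerate(history) if tracker.update(e, r)]
```

Starting from negative infinity rather than zero means a model with a negative correlation in the first epoch still gets a checkpoint saved.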