skills/nlp/uniform-stride-sampling/SKILL.md
Samples N items uniformly by stride from a variable-length list, always preserving the first and last elements, to fit long sequences into a fixed token budget.
npx skillsauth add wenmin-wu/ds-skills nlp-uniform-stride-sampling
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
When a document has many segments (code cells, paragraphs, passages) but you can only fit N into a model's context window, random sampling misses structure and consecutive sampling biases toward the start. Uniform stride sampling picks items at evenly spaced intervals, always including the first and last, giving the model a representative sketch of the full document regardless of length.
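To make the contrast concrete, the sketch below compares which positions each strategy would pick from a hypothetical 100-segment document when only 5 segments fit. The helper names (`head_indices`, `random_indices`, `stride_indices`) are illustrative and not part of the skill, and the stride helper uses plain integer arithmetic rather than the skill's own function.

```python
import random


def head_indices(length, n):
    # Consecutive sampling: the first n positions only
    return list(range(min(n, length)))


def random_indices(length, n, seed=0):
    # Random sampling: n distinct positions, sorted to keep document order
    rng = random.Random(seed)
    return sorted(rng.sample(range(length), min(n, length)))


def stride_indices(length, n):
    # Evenly spaced positions that start at 0 and end at length - 1
    if n >= length:
        return list(range(length))
    if n == 1:
        return [0]
    return [i * (length - 1) // (n - 1) for i in range(n)]


length, n = 100, 5
print(head_indices(length, n))    # [0, 1, 2, 3, 4]
print(stride_indices(length, n))  # [0, 24, 49, 74, 99]
print(random_indices(length, n))  # depends on the seed; gaps are uneven
```

The consecutive pick never leaves the opening of the document, while the stride pick touches the start, the end, and three points in between.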
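import numpy as np


def sample_uniform(items, n):
    """Return n items taken at evenly spaced positions, ending on the last item."""
    if n <= 0:
        # Guard: a zero or negative budget would otherwise divide by zero or loop
        return []
    if n >= len(items):
        # Nothing to drop: the whole list already fits the budget
        return items
    result = []
    step = len(items) / n  # fractional stride, greater than 1 on this path
    idx = 0.0
    while int(np.round(idx)) < len(items):
        result.append(items[int(np.round(idx))])
        idx += step
    # Ensure the last item is always included by overwriting the final sample
    # (note: with n == 1 this leaves only the last item)
    result[-1] = items[-1]
    return result
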
# Sample 20 code cells from a notebook with 100+ cells
sampled = sample_uniform(all_code_cells, n=20)
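As an alternative sketch, the same idea can be expressed by computing the positions up front with `np.linspace`, which spaces n indices evenly between 0 and `len(items) - 1` so that both endpoints are included by construction. The name `sample_uniform_linspace` and the `cells` list are hypothetical, and this is a variant under those assumptions, not the skill's own implementation.

```python
import numpy as np


def sample_uniform_linspace(items, n):
    """Hypothetical variant: pick n evenly spaced items, endpoints included."""
    if n <= 0:
        return []
    if n >= len(items):
        return list(items)
    if n == 1:
        # Only one slot: keep the first item (an arbitrary choice for this sketch)
        return [items[0]]
    # n positions from 0 to len(items) - 1 inclusive, rounded to integers
    positions = np.linspace(0, len(items) - 1, num=n).round().astype(int)
    return [items[i] for i in positions]


cells = [f"cell_{i}" for i in range(100)]
picked = sample_uniform_linspace(cells, 5)
print(picked[0], picked[-1], len(picked))
```

Compared with `sample_uniform`, which overwrites its final sample with the last item, this variant spreads the gaps evenly across the whole list, at the cost of a slightly different set of interior positions.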
The stride is len(items) / N.