skills/llm/turn-based-prompt-accumulation/SKILL.md
Rebuild multi-turn conversation context by interleaving user/assistant turns with chat template tokens into a single prompt each call
npx skillsauth add wenmin-wu/ds-skills llm-turn-based-prompt-accumulation
Causal LLMs without native chat APIs need conversation history manually injected into the prompt. Each turn, rebuild the full prompt by interleaving all previous user and assistant messages with the model's special tokens (e.g., <|start_header_id|>, <|eot_id|> for Llama). This gives the model full context of the conversation without maintaining KV cache across calls.
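def build_chat_prompt(system_prompt, questions, answers, model_format="llama3"):
    """Rebuild the full prompt from all previous turns (Llama 3 token format)."""
    if model_format != "llama3":
        raise ValueError(f"Unsupported model_format: {model_format}")
    prompt = (f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
              f"{system_prompt}<|eot_id|>")
    # Questions are tagged as assistant turns and answers as user turns,
    # i.e. the model is the one asking.
    for q, a in zip(questions, answers):
        prompt += (f"<|start_header_id|>assistant<|end_header_id|>\n\n"
                   f"{q}<|eot_id|>")
        prompt += (f"<|start_header_id|>user<|end_header_id|>\n\n"
                   f"{a}<|eot_id|>")
    # Open the next assistant turn so generation continues from here
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt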
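prompt = build_chat_prompt(sys_prompt, obs.questions, obs.answers)
# The prompt already contains <|begin_of_text|>, so skip the tokenizer's own BOS token.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")
output = model.generate(**inputs, max_new_tokens=50)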
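To make the per-turn rebuild concrete, here is a minimal sketch of a turn loop. The names `render_prompt`, `play_turns`, `generate_fn`, and `answer_fn` are hypothetical and not part of the skill; `generate_fn` stands in for whatever call wraps the tokenizer and `model.generate`, and `answer_fn` stands in for the other side of the conversation. The only state kept between calls is the two lists of strings.

```python
from typing import Callable, List, Tuple


def render_prompt(system_prompt: str, questions: List[str], answers: List[str]) -> str:
    """Rebuild the whole Llama 3 style prompt from the stored turns."""
    prompt = ("<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
              f"{system_prompt}<|eot_id|>")
    for q, a in zip(questions, answers):
        prompt += f"<|start_header_id|>assistant<|end_header_id|>\n\n{q}<|eot_id|>"
        prompt += f"<|start_header_id|>user<|end_header_id|>\n\n{a}<|eot_id|>"
    # Leave an open assistant header for the model to complete.
    return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n"


def play_turns(
    system_prompt: str,
    generate_fn: Callable[[str], str],  # hypothetical: prompt -> new assistant text
    answer_fn: Callable[[str], str],    # hypothetical: question -> user reply
    num_turns: int,
) -> Tuple[List[str], List[str], List[str]]:
    """Run num_turns turns, rebuilding the prompt from scratch on every call."""
    questions: List[str] = []
    answers: List[str] = []
    prompts: List[str] = []
    for _ in range(num_turns):
        prompt = render_prompt(system_prompt, questions, answers)
        prompts.append(prompt)
        question = generate_fn(prompt)
        questions.append(question)
        answers.append(answer_fn(question))
    return questions, answers, prompts
```

Because every call re-encodes the full history, the prompt grows with each turn; the trade-off is extra prefill compute in exchange for not having to carry any model state between calls.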
Chat template tokens differ by model: Llama 3 uses <|start_header_id|>role<|end_header_id|>; Gemma uses <start_of_turn>model for the assistant role.
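As an illustration of the Gemma variant, the sketch below renders the same kind of history with Gemma's turn markers. The function name `build_gemma_prompt` and the `(role, content)` tuple layout are assumptions made for this example. Gemma's template has no separate system role, so this sketch accepts only `user` and `assistant` turns and expects any system instructions to be folded into the first user message by the caller.

```python
from typing import List, Tuple


def build_gemma_prompt(turns: List[Tuple[str, str]]) -> str:
    """Render (role, content) turns with Gemma turn markers (hypothetical helper)."""
    prompt = "<bos>"
    for role, content in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"Unsupported role for this sketch: {role}")
        # Gemma names the assistant side "model".
        gemma_role = "model" if role == "assistant" else "user"
        prompt += f"<start_of_turn>{gemma_role}\n{content}<end_of_turn>\n"
    # Open the next model turn so generation continues from here.
    return prompt + "<start_of_turn>model\n"


history = [("user", "Is it an animal?"), ("assistant", "Yes."), ("user", "Is it a pet?")]
gemma_prompt = build_gemma_prompt(history)
```

Since the string already starts with `<bos>`, the tokenizer should be called with `add_special_tokens=False` (or the literal `<bos>` dropped) to avoid a doubled BOS token.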