skills/skillxiv-v0.0.2-claude-opus-4.6/finece-confidence/SKILL.md
Provide continuous confidence scores throughout LLM text generation via supervised learning and backward confidence integration, enabling real-time uncertainty awareness
npx skillsauth add ADu2021/skillXiv finece-confidenceInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
FineCE enables language models to assess confidence in their own text generation in real-time. The approach trains a supervised confidence predictor that can leverage future context (backward confidence integration) to improve estimates for current sequences. Unlike post-hoc confidence estimation, FineCE provides granular confidence at each position during generation, enabling applications like beam search biasing, early stopping, and uncertainty-aware retrieval.
Create a model to estimate generation confidence.
# Supervised confidence prediction
import torch
from torch import nn
from typing import Tuple, List
class ConfidencePredictor(nn.Module):
"""Predict confidence scores for generated text"""
def __init__(
self,
model_dim: int = 4096,
hidden_dim: int = 1024,
confidence_dim: int = 128
):
super().__init__()
self.model_dim = model_dim
self.hidden_dim = hidden_dim
# Input projection (from LM hidden state)
self.input_proj = nn.Linear(model_dim, hidden_dim)
# Confidence estimation backbone
self.confidence_backbone = nn.Sequential(
nn.Linear(hidden_dim, hidden_dim),
nn.ReLU(),
nn.Dropout(0.1),
nn.Linear(hidden_dim, hidden_dim),
nn.ReLU(),
nn.Dropout(0.1)
)
# Output: confidence score per position
self.confidence_head = nn.Linear(hidden_dim, 1)
def forward(
self,
hidden_state: torch.Tensor, # [batch, seq_len, model_dim]
mask: torch.Tensor = None # [batch, seq_len]
) -> torch.Tensor:
"""
Predict confidence for each position.
Returns:
confidence: [batch, seq_len, 1] normalized to [0, 1]
"""
# Project input
projected = self.input_proj(hidden_state)
# Compute confidence features
features = self.confidence_backbone(projected)
# Output confidence score
logit = self.confidence_head(features)
# Normalize to [0, 1]
confidence = torch.sigmoid(logit)
# Apply mask if provided
if mask is not None:
confidence = confidence * mask.unsqueeze(-1)
return confidence
class ConfidenceTrainer:
"""Train confidence prediction model"""
def __init__(self, predictor: ConfidencePredictor, lr: float = 1e-4):
self.predictor = predictor
self.optimizer = torch.optim.Adam(predictor.parameters(), lr=lr)
def prepare_training_data(
self,
generated_texts: List[str],
ground_truth: List[str],
hidden_states: List[torch.Tensor]
) -> List[Tuple]:
"""
Create training pairs: (hidden_state, position_correctness).
For each position in generated text:
- Label = 1 if position matches ground truth (or continues on correct path)
- Label = 0 if diverges from ground truth
"""
training_data = []
for gen_text, gt_text, hidden_state in zip(
generated_texts, ground_truth, hidden_states
):
# Token-level correctness
gen_tokens = gen_text.split()
gt_tokens = gt_text.split()
for pos, gen_token in enumerate(gen_tokens):
if pos < len(gt_tokens):
is_correct = gen_token == gt_tokens[pos]
else:
is_correct = False
label = float(is_correct)
training_data.append({
"hidden_state": hidden_state[pos].unsqueeze(0),
"label": label,
"position": pos,
"text": gen_text
})
return training_data
def train_step(self, batch: List[Dict]) -> float:
"""Single training step"""
total_loss = 0
for example in batch:
hidden_state = example["hidden_state"]
label = torch.tensor([example["label"]], dtype=torch.float32)
# Predict confidence
confidence = self.predictor(hidden_state)
# Loss: binary cross-entropy between confidence and label
loss = torch.nn.functional.binary_cross_entropy(
confidence,
label.unsqueeze(-1)
)
total_loss += loss.item()
# Backward
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
return total_loss / len(batch)
Use future tokens to improve current position confidence.
# Backward Confidence Integration
class BackwardConfidenceIntegrator:
"""Integrate future context to improve confidence estimates"""
def __init__(self, predictor: ConfidencePredictor):
self.predictor = predictor
def forward_confidence(
self,
hidden_states: torch.Tensor, # [seq_len, model_dim]
mask: torch.Tensor = None
) -> torch.Tensor:
"""Standard forward confidence: left-to-right"""
confidence = self.predictor(hidden_states.unsqueeze(0), mask)
return confidence.squeeze(0)
def backward_integrated_confidence(
self,
hidden_states: torch.Tensor, # [seq_len, model_dim]
mask: torch.Tensor = None
) -> torch.Tensor:
"""
BCI: Use future context to improve confidence.
Key insight: A token's correctness often becomes clear from
subsequent context. Use bidirectional information.
"""
seq_len = hidden_states.shape[0]
confidence_scores = []
for pos in range(seq_len):
# Current position hidden state
current_hidden = hidden_states[pos:pos+1]
# Future context (next 5 tokens or until end)
future_start = min(pos + 1, seq_len)
future_end = min(pos + 6, seq_len)
future_hidden = hidden_states[future_start:future_end]
if future_hidden.shape[0] > 0:
# Combine current + future for confidence estimate
# Strategy: concatenate and re-evaluate
combined_hidden = torch.cat([
current_hidden,
future_hidden.mean(dim=0, keepdim=True)
], dim=-1)
# Project combined to model dim
combined_hidden = self._project_to_model_dim(combined_hidden)
confidence = self.predictor(combined_hidden)
else:
# No future context, use forward only
confidence = self.predictor(current_hidden)
confidence_scores.append(confidence.squeeze())
# Stack scores
bci_confidence = torch.stack(confidence_scores)
return bci_confidence
def _project_to_model_dim(self, tensor: torch.Tensor) -> torch.Tensor:
"""Project concatenated tensor back to model dim"""
if tensor.shape[-1] != self.predictor.model_dim:
# Linear projection
proj = nn.Linear(tensor.shape[-1], self.predictor.model_dim)
tensor = proj(tensor)
return tensor
Offer different approaches for measuring confidence.
# Multiple confidence estimation strategies
class ConfidenceStrategies:
"""Different strategies for estimating confidence"""
@staticmethod
def strategy_1_current_only(
predictor: ConfidencePredictor,
hidden_states: torch.Tensor
) -> torch.Tensor:
"""
Strategy 1: Confidence from current position only
(Standard approach, left-to-right)
"""
confidence = predictor(hidden_states.unsqueeze(0))
return confidence.squeeze(0)
@staticmethod
def strategy_2_with_context(
predictor: ConfidencePredictor,
hidden_states: torch.Tensor,
context_window: int = 3
) -> torch.Tensor:
"""
Strategy 2: Use surrounding context (window of ±N positions)
"""
seq_len = hidden_states.shape[0]
confidence_scores = []
for pos in range(seq_len):
# Get context window
context_start = max(0, pos - context_window)
context_end = min(seq_len, pos + context_window + 1)
context_hidden = hidden_states[context_start:context_end]
# Mean pool or attention over context
context_mean = context_hidden.mean(dim=0, keepdim=True)
# Predict confidence
confidence = predictor(context_mean)
confidence_scores.append(confidence.squeeze())
return torch.stack(confidence_scores)
@staticmethod
def strategy_3_backward_integration(
predictor: ConfidencePredictor,
hidden_states: torch.Tensor
) -> torch.Tensor:
"""
Strategy 3: BCI - use future context
"""
integrator = BackwardConfidenceIntegrator(predictor)
return integrator.backward_integrated_confidence(hidden_states)
Compute confidence scores during decoding.
# Real-time confidence during generation
class OnlineConfidenceScorer:
"""Compute confidence scores during LLM generation"""
def __init__(
self,
language_model,
confidence_predictor: ConfidencePredictor,
strategy: str = "strategy_3_backward_integration"
):
self.lm = language_model
self.predictor = confidence_predictor
self.strategy = strategy
def generate_with_confidence(
self,
prompt: str,
max_tokens: int = 256,
temperature: float = 0.7,
confidence_threshold: float = 0.5
) -> Dict:
"""
Generate text and track confidence at each step.
Returns:
- generated_text: model output
- confidence_scores: per-token confidence
- divergence_points: where confidence drops below threshold
"""
token_ids = self.lm.tokenize(prompt)
generated_tokens = []
confidence_scores = []
hidden_states_list = []
for step in range(max_tokens):
# Get model hidden states
with torch.no_grad():
outputs = self.lm.forward(
input_ids=torch.tensor(token_ids + generated_tokens).unsqueeze(0)
)
hidden_state = outputs.hidden_states[-1] # Last layer
hidden_states_list.append(hidden_state[0, -1])
# Get next token prediction
logits = outputs.logits[0, -1, :]
# Sample next token
next_token = self._sample_token(logits, temperature)
generated_tokens.append(next_token)
# Compute confidence for this step
if len(hidden_states_list) > 1:
# Use strategy to compute confidence
all_hidden = torch.stack(hidden_states_list)
if self.strategy == "strategy_1_current_only":
confidence = ConfidenceStrategies.strategy_1_current_only(
self.predictor, all_hidden
)
elif self.strategy == "strategy_2_with_context":
confidence = ConfidenceStrategies.strategy_2_with_context(
self.predictor, all_hidden
)
else: # strategy_3
confidence = ConfidenceStrategies.strategy_3_backward_integration(
self.predictor, all_hidden
)
current_confidence = confidence[-1].item()
else:
current_confidence = 0.5 # Initial uncertainty
confidence_scores.append(current_confidence)
# Early stopping if confidence too low
if current_confidence < confidence_threshold and step > 10:
break
# Decode tokens
generated_text = self.lm.tokenizer.decode(generated_tokens)
return {
"generated_text": generated_text,
"confidence_scores": confidence_scores,
"avg_confidence": sum(confidence_scores) / len(confidence_scores),
"min_confidence": min(confidence_scores) if confidence_scores else 0,
"num_tokens": len(generated_tokens)
}
def _sample_token(self, logits: torch.Tensor, temperature: float) -> int:
"""Sample token from logits"""
probs = torch.nn.functional.softmax(logits / temperature, dim=-1)
return torch.multinomial(probs, 1).item()
Leverage confidence for downstream applications.
# Applications leveraging confidence scores
class ConfidenceApplications:
"""Use confidence scores for improved generation"""
def __init__(self, scorer: OnlineConfidenceScorer):
self.scorer = scorer
def biased_beam_search(
self,
prompt: str,
num_beams: int = 4,
length_penalty: float = 1.0
) -> List[Dict]:
"""
Beam search biased by confidence scores.
Hypothesis with higher average confidence are preferred.
"""
beams = [
{"text": "", "logprob": 0, "avg_confidence": 1.0}
]
for step in range(256):
candidates = []
for beam in beams:
# Generate next tokens
completions = self.scorer.generate_with_confidence(
prompt + beam["text"],
max_tokens=3,
confidence_threshold=0.3
)
# Score by logprob + confidence
avg_confidence = completions["avg_confidence"]
logprob = 0 # Placeholder
score = logprob + 0.3 * avg_confidence
candidates.append({
"text": beam["text"] + " " + completions["generated_text"],
"logprob": score,
"avg_confidence": avg_confidence
})
# Keep top-k beams
beams = sorted(candidates, key=lambda x: x["logprob"], reverse=True)[:num_beams]
return beams
def uncertainty_aware_retrieval(
self,
prompt: str,
knowledge_base: List[str],
confidence_weight: float = 0.3
) -> List[Tuple]:
"""
Retrieve knowledge when model confidence is low.
For positions with low confidence, augment with retrieval.
"""
# Generate with confidence
result = self.scorer.generate_with_confidence(prompt)
# Identify low-confidence regions
low_conf_threshold = 0.5
low_conf_positions = [
i for i, conf in enumerate(result["confidence_scores"])
if conf < low_conf_threshold
]
# Retrieve for low-confidence positions
retrieved = []
for pos in low_conf_positions:
# What is the model uncertain about?
tokens_at_pos = result["generated_text"].split()[pos:pos+3]
query = " ".join(tokens_at_pos)
# Retrieve from knowledge base
best_docs = self._semantic_search(query, knowledge_base, top_k=3)
retrieved.append((pos, query, best_docs))
return retrieved
def _semantic_search(self, query: str, docs: List[str], top_k: int = 3):
"""Simple semantic search (placeholder)"""
return docs[:top_k]
| Strategy | Pros | Cons | Use Case | |----------|------|------|----------| | Current Only | Fast, simple | Limited context | Real-time systems | | With Context | Moderate accuracy | Moderate overhead | Batch processing | | BCI | Best accuracy | More computation | Offline generation |
Mind the Generation Process: Fine-Grained Confidence during LLM Generation. arXiv:2508.12040
testing
Uses flow maps as look-ahead operators to enable principled reward-guided diffusion by predicting trajectory endpoints at any denoising step. Deploy when applying rewards or preferences to diffusion trajectories with meaningful gradients throughout generation.
testing
Train language models where each expert learns independently on closed datasets, enabling flexible inference with selective data inclusion or exclusion. 41% performance improvement while allowing users to opt out of specific data sources without retraining.
data-ai
Understand how token generation flexibility in diffusion LMs paradoxically constrains reasoning, as models exploit ordering flexibility to avoid uncertain tokens, and apply simplified approaches that preserve parallel decoding benefits. Use when optimizing diffusion-based language models for reasoning tasks.
devops
Enable LLM agents to improve continuously during deployment by constructing structured experience libraries through self-reflection on successes and failures—achieving 23% improvement on reasoning without gradient-based parameter updates or external training.