skills-experimental/distribution-search/SKILL.md
Guidance for finding probability distributions that satisfy specific statistical constraints such as KL divergence targets, entropy requirements, or moment conditions. This skill should be used when tasks involve constructing discrete or continuous probability distributions with specified divergence measures, entropy values, or other distributional properties through numerical optimization.
npx skillsauth add bianhaifeng789-hue/openclaw-config distribution-search
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill provides systematic approaches for finding probability distributions that meet specific statistical constraints. Common tasks include constructing distributions with target KL divergence values (forward or backward), specified entropy, moment constraints, or combinations thereof. The approach emphasizes mathematical analysis before implementation, efficient parameterization, modular code structure, and rigorous verification.
Before writing any code, thoroughly analyze the mathematical constraints:
1. Constraint Feasibility
2. Degrees of Freedom Analysis
3. Analytical Derivations
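As one concrete feasibility check, assuming the reference distribution Q is uniform over V outcomes (the case used in the closed-form formulas in this document) and natural logarithms, the forward KL is bounded while the backward KL is not:

```latex
\begin{aligned}
D_{\mathrm{KL}}(P \,\|\, Q) &= \sum_i P(i)\,\log\frac{P(i)}{1/V} = \log V - H(P), \\
0 &\le D_{\mathrm{KL}}(P \,\|\, Q) \le \log V, \\
D_{\mathrm{KL}}(Q \,\|\, P) &= -\log V - \frac{1}{V}\sum_i \log P(i) \;\in\; [0, \infty).
\end{aligned}
```

Under these assumptions, a forward-KL target above log V (about 11.92 nats for V = 150,000) cannot be met by any parameterization, and both divergences are zero only when P = Q.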
Start Simple, Plan for Complexity
Two-group distributions: Divide elements into high-probability and low-probability groups
Multi-group distributions: If two groups are insufficient, add more groups
Continuous parameterizations: For smooth optimization landscapes
Computational Efficiency for Large Vocabularies
For large vocabulary sizes (e.g., V = 150,000):
Forward KL = k * p_high * log(p_high * V) + (V - k) * p_low * log(p_low * V)
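The closed form above can be evaluated without allocating a length-V array. The following is a minimal sketch, assuming Q is uniform and P is a two-group distribution parameterized by `log_ratio = log(p_high / p_low)`. The function names match the helpers called in the Implementation Pattern code in this document, but their bodies here are assumptions for illustration, not the skill's own implementation.

```python
import math

def compute_probs(k, log_ratio, vocab_size):
    """Two-group probabilities with p_high / p_low = exp(log_ratio) (assumed parameterization)."""
    ratio = math.exp(log_ratio)
    # Normalization: k * p_high + (V - k) * p_low = 1
    p_low = 1.0 / (k * ratio + (vocab_size - k))
    return ratio * p_low, p_low

def compute_forward_kl(k, p_high, p_low, vocab_size):
    """D_KL(P || Q) for uniform Q, evaluated in O(1) from the closed form."""
    V = vocab_size
    return k * p_high * math.log(p_high * V) + (V - k) * p_low * math.log(p_low * V)

def compute_backward_kl(k, p_high, p_low, vocab_size):
    """D_KL(Q || P) for uniform Q, evaluated in O(1) from the closed form."""
    V = vocab_size
    return -(k * math.log(V * p_high) + (V - k) * math.log(V * p_low)) / V

# Example values (chosen arbitrarily for illustration)
V = 150_000
p_high, p_low = compute_probs(k=100, log_ratio=5.0, vocab_size=V)
fwd = compute_forward_kl(100, p_high, p_low, V)
bwd = compute_backward_kl(100, p_high, p_low, V)
print(f"forward KL = {fwd:.6f}, backward KL = {bwd:.6f}")
```

Each evaluation costs a handful of floating-point operations regardless of V, so these functions are cheap enough to call inside an optimizer loop.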
Choose Appropriate Methods
Implementation Pattern
def objective(params, target_forward_kl, target_backward_kl, vocab_size):
    # Extract parameters
    k, log_ratio = params
    k = int(round(k))
    # Compute probabilities
    p_high, p_low = compute_probs(k, log_ratio, vocab_size)
    # Validate probabilities
    if p_high <= 0 or p_low <= 0 or p_high > 1 or p_low > 1:
        return [1e10, 1e10]  # Infeasible
    # Compute KL divergences using closed-form formulas
    forward_kl = compute_forward_kl(k, p_high, p_low, vocab_size)
    backward_kl = compute_backward_kl(k, p_high, p_low, vocab_size)
    return [forward_kl - target_forward_kl, backward_kl - target_backward_kl]
Grid Search for Discrete Parameters
best_solution = None
best_error = float('inf')
for k in range(1, vocab_size):
    # Optimize continuous parameters for this k
    result = optimize_continuous_params(k, targets, vocab_size)
    if result.error < best_error:
        best_error = result.error
        best_solution = result
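The loop above leaves the inner continuous step unspecified. One possible stand-in, sketched here for the two-group case with uniform Q, relies on the fact that for a fixed k the forward KL increases monotonically with `log_ratio` when `log_ratio >= 0`. Bisection can therefore match the forward target, and the outer grid search keeps the k with the smallest remaining error. All function names, the search bound `hi=50.0`, and the small example vocabulary are assumptions for illustration.

```python
import math

def two_group_kls(k, log_ratio, V):
    """Closed-form (forward, backward) KL of a two-group P against uniform Q."""
    r = math.exp(log_ratio)
    z = k * r + (V - k)              # normalizer: p_low = 1/z, p_high = r/z
    p_high, p_low = r / z, 1.0 / z
    fwd = k * p_high * math.log(p_high * V) + (V - k) * p_low * math.log(p_low * V)
    bwd = -(k * math.log(V * p_high) + (V - k) * math.log(V * p_low)) / V
    return fwd, bwd

def solve_log_ratio(k, target_fwd, V, hi=50.0, iters=100):
    """Bisect log_ratio in [0, hi] so the forward KL matches target_fwd.

    Returns None when the target is out of reach for this k.
    """
    if two_group_kls(k, hi, V)[0] < target_fwd:
        return None
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if two_group_kls(k, mid, V)[0] < target_fwd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def grid_search(target_fwd, target_bwd, V):
    """Try every k; return (error, k, log_ratio) with the smallest total error."""
    best = None
    for k in range(1, V):
        log_ratio = solve_log_ratio(k, target_fwd, V)
        if log_ratio is None:
            continue
        fwd, bwd = two_group_kls(k, log_ratio, V)
        error = abs(fwd - target_fwd) + abs(bwd - target_bwd)
        if best is None or error < best[0]:
            best = (error, k, log_ratio)
    return best

# Targets generated from a known two-group distribution, so a solution exists.
V = 1000
target_fwd, target_bwd = two_group_kls(10, 3.0, V)
error, k, log_ratio = grid_search(target_fwd, target_bwd, V)
print(f"k = {k}, log_ratio = {log_ratio:.6f}, total error = {error:.2e}")
```

With only one continuous parameter per k, arbitrary targets will generally leave a nonzero backward-KL error, which is the signal to move to a parameterization with more groups.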
Modular Structure to Prevent Inconsistencies
Create separate, reusable functions for core computations:
# Core computation functions - define ONCE, use everywhere
import numpy as np


def forward_kl(p, q, mask=None):
    """Compute D_KL(P || Q) = sum_i p_i * log(p_i / q_i)"""
    if mask is None:
        mask = p > 1e-30  # terms with p_i = 0 contribute nothing
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))


def backward_kl(p, q, mask=None):
    """Compute D_KL(Q || P) = sum_i q_i * log(q_i / p_i)"""
    if mask is None:
        # Same mask as forward_kl for consistency. Note that the true value is
        # infinite if any q_i > 0 has p_i = 0; the mask skips such terms.
        mask = p > 1e-30
    return np.sum(q[mask] * np.log(q[mask] / p[mask]))


def entropy(p, mask=None):
    """Compute H(P) = -sum_i p_i * log(p_i)"""
    if mask is None:
        mask = p > 1e-30
    return -np.sum(p[mask] * np.log(p[mask]))
Import in All Scripts
# In optimization script
from kl_utils import forward_kl, backward_kl
# In verification script - use SAME functions
from kl_utils import forward_kl, backward_kl
Verification Checklist
For the final solution, verify:
Distribution Properties:
[ ] All probabilities are positive
[ ] All probabilities are <= 1
[ ] Sum of probabilities equals 1.0 (within floating-point tolerance)
[ ] No NaN or Inf values
Constraint Satisfaction:
[ ] Forward KL divergence within tolerance
[ ] Backward KL divergence within tolerance
[ ] Other constraints (entropy, moments) within tolerance
Numerical Precision:
[ ] Tolerance requirements are met (e.g., |error| < 1e-6)
[ ] Floating-point sum is acceptably close to 1.0
Verification Script Structure
import numpy as np
from kl_utils import forward_kl, backward_kl


def verify_distribution(p, q, target_forward, target_backward, tol=1e-6):
    # Basic distribution properties
    print(f"Sum of probabilities: {np.sum(p)}")
    print(f"Min probability: {np.min(p)}")
    print(f"Max probability: {np.max(p)}")
    print(f"Any NaN: {np.any(np.isnan(p))}")
    print(f"Any Inf: {np.any(np.isinf(p))}")

    # Constraint values, computed with the shared functions
    fwd = forward_kl(p, q)
    bwd = backward_kl(p, q)
    print(f"\nForward KL: {fwd:.10f} (target: {target_forward}, error: {abs(fwd - target_forward):.2e})")
    print(f"Backward KL: {bwd:.10f} (target: {target_backward}, error: {abs(bwd - target_backward):.2e})")

    # Tolerance checks
    fwd_ok = abs(fwd - target_forward) < tol
    bwd_ok = abs(bwd - target_backward) < tol
    print(f"\nForward KL within tolerance: {'PASS' if fwd_ok else 'FAIL'}")
    print(f"Backward KL within tolerance: {'PASS' if bwd_ok else 'FAIL'}")
    return fwd_ok and bwd_ok
Problem: Creating arrays of size V = 150,000 elements causes memory issues and timeouts.
Solution: Use closed-form formulas for group-based distributions; only create full arrays for final verification.

Problem: Different scripts implement KL divergence formulas differently, leading to discrepancies.
Solution: Define core computation functions once and import them everywhere.

Problem: Masking logic differs between forward and backward KL, or the mask sum is incorrectly used.
Solution: Use consistent masking (p > 1e-30) and sum over the masked elements rather than multiplying by the mask count.

Problem: Simple parameterizations cannot satisfy all constraints simultaneously.
Solution: Analyze degrees of freedom before implementation; plan for more flexible parameterizations.

Problem: File writes are truncated, leaving incomplete code.
Solution: Verify file content after every write by reading it back or attempting to import/execute it.

Problem: Attempting optimization without verifying that a solution exists.
Solution: Mathematically analyze constraints to establish feasibility before coding.

Problem: Optimization finds a local minimum that doesn't satisfy the constraints.
Solution: Try multiple initializations; use grid search over discrete parameters; verify the final solution.

Problem: Probability sum is not exactly 1.0 due to floating-point arithmetic.
Solution: Use appropriate tolerances; normalize probabilities after construction; verify that precision is acceptable for the task.
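For the floating-point sum issue, a minimal sketch of normalizing and then checking with a compensated sum is shown here. It uses `math.fsum` from the Python standard library; the helper names and the tolerance are assumptions chosen for illustration.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so they sum to 1.0 as closely as floats allow."""
    total = math.fsum(weights)       # compensated sum limits accumulated rounding error
    return [w / total for w in weights]

def sums_to_one(probs, tol=1e-12):
    """Check the probability sum against 1.0 within an explicit tolerance."""
    return abs(math.fsum(probs) - 1.0) <= tol

# Unnormalized two-group weights: 10 heavy elements, 990 light ones.
weights = [20.0] * 10 + [1.0] * 990
probs = normalize(weights)
print(sums_to_one(probs), min(probs), max(probs))
```

The tolerance should follow the task's stated precision requirement rather than being fixed in advance.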
Forward KL (information projection):
D_KL(P || Q) = sum_i P(i) * log(P(i) / Q(i))
Backward KL (moment projection):
D_KL(Q || P) = sum_i Q(i) * log(Q(i) / P(i))
For P with k elements at probability p_high and (V-k) elements at probability p_low, with Q uniform:
D_KL(P || Q) = k * p_high * log(p_high * V) + (V - k) * p_low * log(p_low * V)
D_KL(Q || P) = (1/V) * [k * log(1 / (V * p_high)) + (V - k) * log(1 / (V * p_low))]
= (1/V) * [-k * log(V * p_high) - (V - k) * log(V * p_low)]
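Writing the two-group distribution in terms of a ratio r = p_high / p_low (an assumed parameterization, introduced here for illustration) shortens both expressions. With Z = k r + (V - k), normalization gives p_low = 1/Z and p_high = r/Z, and substituting into the two formulas above yields:

```latex
\begin{aligned}
D_{\mathrm{KL}}(P \,\|\, Q)
  &= \frac{k r}{Z}\log\frac{rV}{Z} + \frac{V-k}{Z}\log\frac{V}{Z}
   = \frac{k r}{Z}\log r + \log\frac{V}{Z}, \\
D_{\mathrm{KL}}(Q \,\|\, P)
  &= -\frac{1}{V}\left[k\log\frac{rV}{Z} + (V-k)\log\frac{V}{Z}\right]
   = \log\frac{Z}{V} - \frac{k}{V}\log r .
\end{aligned}
```

Both expressions vanish at r = 1, where P is uniform, which gives a quick sanity check for any implementation.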
When initial approaches fail:
Avoid completely rewriting from scratch each time; instead, modularly modify specific components.