.agents/skills/doppler-debug/SKILL.md
Diagnose inference regressions with Doppler's shared browser/Node command contract, runtime profiles, and report artifacts. (project)
Use this skill when generation fails, outputs drift, or Node/browser parity breaks.
Use this skill with doppler-bench when investigating performance regressions.
Read these before non-trivial debug-flow, parity, or harness-contract changes:
- docs/style/general-style-guide.md
- docs/style/javascript-style-guide.md
- docs/style/config-style-guide.md
- docs/style/command-interface-design-guide.md
- docs/style/harness-style-guide.md

When debugging turns into extension work, also open:
- docs/developer-guides/README.md

Common routes:
- docs/developer-guides/07-manifest-runtime-field.md
- docs/developer-guides/12-command-surface.md
- docs/developer-guides/11-wgsl-kernel.md
- docs/developer-guides/13-attention-variant.md
- docs/developer-guides/15-kvcache-layout.md
- docs/developer-guides/composite-model-family.md or docs/developer-guides/composite-pipeline-family.md

runtime + harness contract); do not substitute behavior in-place.

Use this order for inference failures that load successfully but generate bad output:
Once token IDs or embeddings match, stop changing prompt wrappers or harness formatting until later evidence requires it.
For quantized failures, run one F16 or source-precision control before changing quantized kernels.
Reference workflow: docs/debug-playbook.md
Reusable report template: docs/debug-investigation-template.md
Canonical protocol source: docs/agents/debug-protocol.md
See also: docs/agents/conversion-protocol.md
Do not report conversion success unless all of these are true:
- manifest.json exists

If shards exist without a manifest, classify the output as interrupted/incomplete and clean or overwrite it before retrying. Do not treat it as a reference artifact.
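The manifest rule above can be checked mechanically before a retry. The following is a minimal sketch, not Doppler tooling: `classifyConversionOutput` is a hypothetical helper, it covers only the manifest.json condition, and it assumes that any other file in the output directory may be a shard.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: classify a conversion output directory.
// Assumption: every file other than manifest.json counts as a possible shard.
function classifyConversionOutput(outputDir) {
  if (!fs.existsSync(outputDir)) return "missing";
  const entries = fs.readdirSync(outputDir);
  if (entries.includes("manifest.json")) return "has-manifest";
  // Files without a manifest: treat as interrupted, never as a reference artifact.
  return entries.length > 0 ? "interrupted" : "empty";
}

module.exports = { classifyConversionOutput };
```

A result of `"has-manifest"` does not establish conversion success on its own; it only rules out the interrupted case.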
# Primary debug run (auto surface = node-first transport; browser fallback only when node transport is unavailable)
npm run debug -- --config '{"request":{"modelId":"MODEL_ID","runtimeProfile":"profiles/verbose-trace"},"run":{"surface":"auto"}}' --json
# Verify pass/fail with inference suite
npm run verify:model -- --config '{"request":{"workload":"inference","modelId":"MODEL_ID","runtimeProfile":"profiles/verbose-trace"},"run":{"surface":"auto"}}' --json
# Force browser relay for mobile/WebGPU parity checks
npm run debug -- --config '{"request":{"modelId":"MODEL_ID","runtimeProfile":"diagnostics/debug-logits"},"run":{"surface":"browser","browser":{"channel":"chrome","console":true}}}' --json
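Hand-quoting the `--config` JSON is easy to get wrong, so one option is to build the argument list from a plain object. This is a sketch under stated assumptions: `buildDebugArgs` is a hypothetical helper, and it uses only the `request` and `run` fields shown in the commands above.

```javascript
// Hypothetical helper: build the argv for `npm run debug` from a plain object.
// Only fields that appear in the commands above are used.
function buildDebugArgs({ modelId, runtimeProfile, surface = "auto", browser }) {
  const request = { modelId };
  if (runtimeProfile) request.runtimeProfile = runtimeProfile;
  const run = { surface };
  if (browser) run.browser = browser;
  // JSON.stringify handles escaping, so no manual quoting is needed.
  return ["run", "debug", "--", "--config", JSON.stringify({ request, run }), "--json"];
}

const primaryArgs = buildDebugArgs({
  modelId: "MODEL_ID",
  runtimeProfile: "profiles/verbose-trace",
});
console.log(primaryArgs[4]);
```

Passing the array to `child_process.spawnSync("npm", primaryArgs)` rather than joining it into a string keeps the JSON intact without shell quoting.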
Use runtime JSON patches instead of ad-hoc flags:
npm run debug -- \
--config '{"request":{"modelId":"MODEL_ID"},"run":{"surface":"auto"}}' \
--runtime-config '{"shared":{"debug":{"trace":{"enabled":true,"categories":["attn","ffn"],"maxDecodeSteps":2}}},"inference":{"generation":{"maxTokens":8},"sampling":{"temperature":0,"topK":1,"topP":1,"repetitionPenalty":1,"greedyThreshold":0},"session":{"decodeLoop":{"batchSize":1,"stopCheckMode":"batch","readbackInterval":1}}}}' \
--json
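The runtime patch above can also be kept as an object and serialized when needed. The sketch below copies the deterministic decode settings from that command and adds a guard for the two key placements this section warns against. `findMisplacedKeys` is a hypothetical helper, not part of Doppler.

```javascript
// Deterministic decode settings copied from the --runtime-config patch above.
const deterministicPatch = {
  inference: {
    generation: { maxTokens: 8 },
    sampling: { temperature: 0, topK: 1, topP: 1, repetitionPenalty: 1, greedyThreshold: 0 },
    session: { decodeLoop: { batchSize: 1, stopCheckMode: "batch", readbackInterval: 1 } },
  },
};

// Hypothetical guard: list keys that sit where the patch above does not put them.
function findMisplacedKeys(patch) {
  const problems = [];
  const inference = (patch && patch.inference) || {};
  if ("batching" in inference) {
    problems.push("inference.batching (use inference.session.decodeLoop)");
  }
  if (inference.sampling && "maxTokens" in inference.sampling) {
    problems.push("inference.sampling.maxTokens (use inference.generation.maxTokens)");
  }
  return problems;
}

// Value to pass after --runtime-config.
const runtimeConfigArg = JSON.stringify(deterministicPatch);
```

An empty array from `findMisplacedKeys` means neither misplacement was found; it says nothing about whether the rest of the patch is valid.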
Notes:
- Decode-loop settings belong under runtime.inference.session.decodeLoop, not runtime.inference.batching.
- The token limit is runtime.inference.generation.maxTokens, not sampling.maxTokens.

# Broad trace-heavy debug run
npm run debug -- --config '{"request":{"modelId":"MODEL_ID","runtimeProfile":"profiles/verbose-trace"},"run":{"surface":"auto"}}' --json
# Logit-focused browser relay run
npm run debug -- --config '{"request":{"modelId":"MODEL_ID","runtimeProfile":"diagnostics/debug-logits"},"run":{"surface":"browser","browser":{"channel":"chrome","console":true}}}' --json
# Minimal deterministic decode probe
npm run debug -- \
--config '{"request":{"modelId":"MODEL_ID"},"run":{"surface":"auto"}}' \
--runtime-config '{"inference":{"generation":{"maxTokens":16},"sampling":{"temperature":0,"topK":1,"topP":1,"repetitionPenalty":1,"greedyThreshold":0},"session":{"decodeLoop":{"batchSize":1,"stopCheckMode":"batch","readbackInterval":1}}}}' \
--json
Notes:
- Use debug for trace/probe work and verify:model for pass/fail gates.
- For performance work, use doppler-perf; do not overload debug runs with benchmark methodology.

# Cold browser run (wipe OPFS cache before launch)
npm run debug -- --config '{"request":{"modelId":"MODEL_ID","cacheMode":"cold"},"run":{"surface":"browser"}}' --json
# Warm browser run (reuse OPFS cache)
npm run debug -- --config '{"request":{"modelId":"MODEL_ID","cacheMode":"warm"},"run":{"surface":"browser"}}' --json
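A cold run and a warm run are most useful side by side. The sketch below compares result.metrics.modelLoadMs and result.metrics.firstTokenMs from two reports. It assumes each `--json` output was saved and parsed into an object with a top-level `result` key. `readMetric` and `compareColdWarm` are hypothetical helpers.

```javascript
// Read a numeric metric from a parsed report, or null when it is absent.
// Assumption: the report object has the shape { result: { metrics: { ... } } }.
function readMetric(report, name) {
  const value = report?.result?.metrics?.[name];
  return typeof value === "number" ? value : null;
}

// Compare cold and warm reports metric by metric.
// deltaMs is cold minus warm, or null when either side is missing.
function compareColdWarm(cold, warm, names = ["modelLoadMs", "firstTokenMs"]) {
  return names.map((name) => {
    const coldValue = readMetric(cold, name);
    const warmValue = readMetric(warm, name);
    const deltaMs =
      coldValue !== null && warmValue !== null ? coldValue - warmValue : null;
    return { name, cold: coldValue, warm: warmValue, deltaMs };
  });
}
```

Returning null for a missing metric, rather than 0, keeps an absent field from looking like a measured value.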
Result fields to inspect:
- result.metrics.modelLoadMs, result.metrics.firstTokenMs
- result.metrics.prefillTokensPerSecTtft (preferred) and result.metrics.prefillTokensPerSec
- result.metrics.decodeTokensPerSec
- result.metrics.gpu (if available)
- result.memoryStats
- result.deviceInfo
- result.reportInfo (report backend/path)

Key files:
- src/cli/doppler-cli.js
- src/tooling/command-api.js
- src/tooling/node-command-runner.js
- src/tooling/node-browser-command-runner.js
- src/inference/browser-harness.js
- src/config/runtime/profiles/verbose-trace.json
- docs/developer-guides/README.md

Related skills:
- doppler-bench for perf regression quantification
- doppler-convert when conversion integrity is suspected