plugins/leyline/skills/utility/SKILL.md
Scores agent actions by expected gain, cost, uncertainty, and redundancy. Use when deciding whether to dispatch an agent or invoke a tool.
A decision framework for agent orchestration based on Liu et al., "Utility-Guided Agent Orchestration for Efficient LLM Tool Use" (arXiv:2603.19896). Each candidate action is scored by subtracting weighted costs from expected gain, producing a single utility value that guides action selection. The framework prevents over-calling tools and premature stopping by making both errors costly. Utility range is [-2.3, 1.0].
A = {respond, retrieve, tool_call, verify, delegate, stop}
| Action    | Description                                          |
|-----------|------------------------------------------------------|
| respond   | Emit a final answer from current context             |
| retrieve  | Fetch additional information (search, read, lookup)  |
| tool_call | Execute a tool (code runner, API, file write)        |
| verify    | Check a prior result for correctness or completeness |
| delegate  | Spawn a sub-agent or hand off to a specialist        |
| stop      | Terminate the loop and return current state          |
U(a | s_t) = Gain(a | s_t)
- λ₁ · StepCost(a | s_t)
- λ₂ · Uncertainty(a | s_t)
- λ₃ · Redundancy(a | s_t)
| Parameter | Default | Rationale                                          |
|-----------|---------|----------------------------------------------------|
| λ₁        | 1.0     | Cost baseline; all other weights relative to this  |
| λ₂        | 0.5     | Weak empirical correlation with outcome (r=0.0131) |
| λ₃        | 0.8     | Redundancy pruning yields ~10% token savings       |
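Utility range: [-2.3, 1.0] with the default weights. This follows if each component lies in [0, 1]: the minimum is 0 - (1.0 + 0.5 + 0.8) = -2.3 and the maximum is 1 - 0 = 1.0. Positive values indicate the action is worth taking. Values below the floor (default: -0.5) indicate the action should be skipped.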
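The scoring formula and default weights above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's own implementation: the `Components` class and the `utility` function are hypothetical names, and the sketch assumes each component is already normalized to [0, 1].

```python
from dataclasses import dataclass

# Default weights and floor, taken from the parameter table and text above.
LAMBDA_COST = 1.0
LAMBDA_UNCERTAINTY = 0.5
LAMBDA_REDUNDANCY = 0.8
FLOOR = -0.5


@dataclass(frozen=True)
class Components:
    """Hypothetical container; each field is assumed to lie in [0, 1]."""
    gain: float
    step_cost: float
    uncertainty: float
    redundancy: float


def utility(c: Components) -> float:
    """U = Gain - λ1*StepCost - λ2*Uncertainty - λ3*Redundancy."""
    return (c.gain
            - LAMBDA_COST * c.step_cost
            - LAMBDA_UNCERTAINTY * c.uncertainty
            - LAMBDA_REDUNDANCY * c.redundancy)


# Extremes of the range, plus one made-up mid-range action.
best = utility(Components(gain=1.0, step_cost=0.0, uncertainty=0.0, redundancy=0.0))
worst = utility(Components(gain=0.0, step_cost=1.0, uncertainty=1.0, redundancy=1.0))
example = utility(Components(gain=0.6, step_cost=0.2, uncertainty=0.4, redundancy=0.1))
```

Here `best` and `worst` land on the two ends of the stated range, and `example` works out to roughly 0.12, which is positive and above the floor, so the action would be worth taking.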
Stop the loop when any of the following is true:
- The selected action is `stop`
- All non-`stop` actions score below the floor (default: -0.5)

High-gain override: If Gain >= 0.7 for any action, condition (c) may be overridden. Document the override and the gain value in your reasoning trace.
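A minimal sketch of the floor check and the high-gain override might look like the following. It covers only those two rules, and it assumes the override applies to the below-floor condition, which is how this sketch reads condition (c). The function name `should_stop` and its dictionary arguments are hypothetical.

```python
FLOOR = -0.5      # default utility floor from the text
HIGH_GAIN = 0.7   # high-gain override threshold from the text


def should_stop(utilities: dict[str, float], gains: dict[str, float]) -> tuple[bool, str]:
    """Return (stop?, reason) for one step of the loop.

    Assumption: the override cancels the below-floor stop condition.
    """
    non_stop = {a: u for a, u in utilities.items() if a != "stop"}
    if any(u >= FLOOR for u in non_stop.values()):
        return False, "at least one action clears the floor"
    # Every non-stop action is below the floor; look for a high-gain override.
    high = {a: g for a, g in gains.items() if g >= HIGH_GAIN}
    if high:
        action = max(high, key=high.get)
        return False, f"high-gain override: {action} (Gain={high[action]:.2f})"
    return True, "all non-stop actions score below the floor"


utilities = {"retrieve": -0.6, "tool_call": -0.9, "verify": -0.7}
overridden = should_stop(utilities, {"retrieve": 0.75, "tool_call": 0.3, "verify": 0.2})
stopped = should_stop(utilities, {"retrieve": 0.4, "tool_call": 0.3, "verify": 0.2})
```

Returning the reason string alongside the decision gives the caller something to paste into the reasoning trace, including the gain value that triggered the override.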
Minimal 4-step advisory pattern:
1. Build the state `s_t` as described in `modules/state-builder.md`.
2. Score each action in `A` per `modules/action-selector.md`.
3. Select the action that maximizes `U(a | s_t)`, subject to termination conditions.

- `modules/state-builder.md`: how to populate s_t from task context
- `modules/gain.md`: estimating expected information or progress gain
- `modules/step-cost.md`: token, latency, and monetary cost tables
- `modules/uncertainty.md`: confidence estimation and calibration
- `modules/redundancy.md`: detecting duplicate or low-delta actions
- `modules/action-selector.md`: scoring loop and tie-breaking rules
- `modules/integration.md`: wiring utility scoring into existing orchestration loops
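The selection step of the advisory pattern can be sketched as an argmax over already-scored actions. This is a sketch under stated assumptions: `select_action` is a hypothetical name, and breaking ties by position in the action set is an assumption made here for determinism. The actual tie-breaking rules are in `modules/action-selector.md` and are not reproduced.

```python
# Action set A, in the order given in the document.
ACTIONS = ("respond", "retrieve", "tool_call", "verify", "delegate", "stop")
FLOOR = -0.5  # default utility floor


def select_action(utilities: dict[str, float], floor: float = FLOOR) -> str:
    """Pick the highest-utility action at or above the floor, else 'stop'.

    Assumption: ties go to the action listed first in ACTIONS.
    """
    eligible = [a for a in ACTIONS if a in utilities and utilities[a] >= floor]
    if not eligible:
        return "stop"
    # max() returns the first maximal item, so ACTIONS order breaks ties.
    return max(eligible, key=utilities.__getitem__)


choice = select_action({"respond": 0.2, "retrieve": 0.35, "verify": 0.35})
fallback = select_action({"retrieve": -0.8, "tool_call": -1.2})
```

In this example `choice` is `"retrieve"` because it ties with `verify` but appears earlier in the action set, and `fallback` is `"stop"` because every scored action sits below the floor.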