skills/mlb-decision-logger/SKILL.md
Appends structured decision entries to the yahoo-mlb decision log (tracker/decisions-log.md) on behalf of any agent in the MLB team. Validates entries against the authoritative schema, serializes concurrent writes from parallel agents, and runs the Monday calibration pass to fill in outcomes and update the variant scoreboard. Use when any MLB agent needs to record a decision, when the coach requests "log decision", "append to decision log", "record variant outcome", or runs the "calibration pass".
npx skillsauth add lyndonkl/claude mlb-decision-loggerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Scenario: Three agents fire in a single morning brief on 2026-04-17. Each agent hands a decision payload to this skill. On Monday 2026-04-21, the coach runs the calibration pass over Friday's lineup decision.
Sequence:
mlb-lineup-optimizer submits a start/sit decision for Junior Caminero.mlb-waiver-analyst submits a FAAB bid on a two-start streamer.mlb-streaming-strategist submits a stream pickup.Skill behavior:
tracker/decisions-log.md.decision_id by counting same-day, same-type entries and incrementing (2026-04-17-lineup-01, 2026-04-17-waiver-01, 2026-04-17-stream-01).context/frameworks/decision-log-format.md.decision_id to the calling agent.Monday calibration (2026-04-21):
will_verify_on <= 2026-04-21 and outcome_recorded_on is empty.tracker/variant-scoreboard.md.tilt per agent using the advocate/critic counts over the most recent 20 verified decisions.Worked entry appended during step 1:
### 2026-04-17T08:30:00Z | lineup | mlb-lineup-optimizer
- **decision_id:** 2026-04-17-lineup-01
- **recommendation:** START Caminero at 3B
- **signals_in:** caminero.daily_quality=78 (matchup=85, form=72, opportunity=80)
- **variants:**
- advocate -> "Start: strong matchup vs RHP, hitter-friendly park, batting 3rd"
- critic -> "Sit: 0-for-12 last 3 games, BABIP-driven regression due"
- **dialectical_synthesis:** Advocate wins -- matchup+opportunity outweigh slump. Confidence 0.72.
- **red_team_findings:**
- severity: 2, likelihood: 3, score: 6, note: "MIA rain watch", mitigation: "Check 1pm forecast"
- **confidence:** 0.72
- **will_verify_on:** 2026-04-18
- **outcome_recorded_on:**
- **outcome:**
- **variant_that_was_right:**
After Monday calibration the last three fields are filled in; the scoreboard row for mlb-lineup-optimizer increments Total decisions, Advocate correct (if 1-for-4 with an RBI counted as success), and Synthesis correct; Tilt is recomputed.
Copy this checklist and track progress for each invocation:
Decision Logger Progress:
- [ ] Step 1: Determine mode (append vs calibrate)
- [ ] Step 2: Read current log tail
- [ ] Step 3: Validate payload against schema
- [ ] Step 4: Serialize write (append or in-place edit)
- [ ] Step 5: Update scoreboard (calibration mode only)
- [ ] Step 6: Return decision_id and confirmation
Step 1: Determine mode
Two supported modes. Exactly one fires per invocation.
append -- calling agent passes a complete decision payload. Skill creates a new entry.calibrate -- calling agent passes an existing decision_id plus outcome fields. Skill edits that entry in place and updates the scoreboard.See resources/methodology.md for the full decision tree.
Step 2: Read current log tail
Always re-read the log immediately before any write. This is how concurrent writes from parallel agents are serialized without a real file lock.
tracker/decisions-log.md.decision_id.decision_id numbers used today.See resources/methodology.md for the full serialization protocol.
Step 3: Validate payload against schema
Check every required field. Reject (do not write) if any are missing or malformed. The authoritative schema is in context/frameworks/decision-log-format.md; resources/template.md mirrors it verbatim.
timestamp_iso8601 is UTC ISO 8601 (YYYY-MM-DDTHH:MM:SSZ).decision_type in enum: lineup | waiver | stream | trade | category-plan | playoff-push | add-drop | ad-hoc.emitted_by is a known agent name.recommendation starts with an action verb (START, SIT, ADD, DROP, BID $X, ACCEPT, COUNTER, REJECT, STREAM, HOLD).signals_in references at least one signal by name.variants has both advocate and critic entries (or explicit n/a for bootstrap).confidence in [0.00, 1.00].will_verify_on is a date, end of week N, or end of season.calibrate mode: outcome in {happened, did not happen, partial}, variant_that_was_right in {advocate, critic, both, neither}.Step 4: Serialize write
append: assign decision_id using format {YYYY-MM-DD}-{decision_type}-{NN} where NN is last_same_type_same_day_index + 1, zero-padded to 2 digits. Append the formatted entry plus a trailing separator line.calibrate: locate the target entry by decision_id, replace only the three outcome fields, preserve everything else byte-for-byte.See resources/methodology.md.
Step 5: Update scoreboard (calibration mode only)
tracker/variant-scoreboard.md.emitted_by agent.Total decisions by 1.Advocate correct, Critic correct, or both (for variant_that_was_right = both). Increment Synthesis correct when the synthesized recommendation matched reality.Tilt per the rules in resources/methodology.md.Step 6: Return decision_id and confirmation
decision_id (append) or confirmation of calibrated fields (calibrate).Pattern 1: Parallel agent append (morning brief)
Coach launches lineup-optimizer, waiver-analyst, and streaming-strategist in one run. Each completes its variant pair, synthesizes, and calls this skill. Skill processes them serially in arrival order, re-reading the log tail between each write, assigning unique decision_ids per type.
Pattern 2: Monday calibration pass
Coach opens the log, filters for entries where will_verify_on <= today and outcome_recorded_on is empty, and for each one: web-searches the outcome, builds the calibration payload, calls this skill in calibrate mode. Each call edits one entry and updates one scoreboard row.
Pattern 3: Trade decision (on-demand, deferred verification)
mlb-trade-analyzer fires when a trade offer arrives. The decision is REJECT, ACCEPT, or COUNTER. will_verify_on is typically end of week N or end of season because trade value plays out over weeks. Skill logs normally; the trade stays "open" on the calibration queue until its verify date.
Pattern 4: Bootstrap / ad-hoc entry
For team initialization or meta-decisions (changing a weighting, amending a framework), decision_type = ad-hoc or bootstrap, variants = n/a, variant_that_was_right = neither. Skill accepts this as a special shape.
Append only. Never rewrite the log. Never reorder entries. The only edit allowed is filling in the three outcome fields on an existing entry during calibration. Any other mutation is a bug.
Re-read before every write. Parallel agents share one file. Reading the tail immediately before writing is the serialization primitive. Do not cache the tail from earlier in the run.
Validate before writing. A malformed entry poisons downstream calibration. Reject with a structured error; do not attempt partial writes or auto-fix missing fields.
decision_id is deterministic. Same day, same type, sequential NN. If another agent has already used 2026-04-17-lineup-01, this agent gets 02. Never reuse an id.
Outcome-editing is surgical. When filling outcome fields, replace only the three target lines. Preserve timestamps, decision_id, recommendation, variants, synthesis, red-team, confidence, and will_verify_on byte-for-byte. If you have to reformat to write, you are doing it wrong.
Scoreboard updates are idempotent per decision. Each decision_id contributes exactly one row increment. If a calibration call is retried, re-check whether the entry already has outcome_recorded_on populated; if so, do not double-count.
Tilt needs sample size. Report neutral when Total decisions < 10 for the agent. Only switch to advocate+ / critic+ once both sample size and margin are met. See resources/methodology.md for exact thresholds.
Agents never write to the log directly. Every MLB agent routes through this skill. If the coach finds a malformed entry not produced here, treat it as a bug and log it under tracker/calibration-review.md.
Mode inputs:
append needs: timestamp_iso8601, decision_type, emitted_by, recommendation, signals_in, variants (advocate + critic), dialectical_synthesis, red_team_findings, confidence, will_verify_on.calibrate needs: decision_id, outcome_recorded_on, outcome, variant_that_was_right.decision_id format:
{YYYY-MM-DD}-{decision_type}-{NN}
e.g., 2026-04-17-lineup-01
Tilt thresholds (per agent, over last 20 verified decisions):
| Condition | Tilt |
|---|---|
| Total < 10 verified | neutral |
| Advocate correct > Critic correct by 10 to 20 pct points | advocate+ |
| Advocate correct > Critic correct by 20+ pct points | advocate++ |
| Critic correct > Advocate correct by 10 to 20 pct points | critic+ |
| Critic correct > Advocate correct by 20+ pct points | critic++ |
| Within 10 pct points | neutral |
Files touched:
~/Documents/Projects/yahoo-mlb/tracker/decisions-log.md (append, or surgical edit of outcome fields)~/Documents/Projects/yahoo-mlb/tracker/variant-scoreboard.md (row increments, tilt recompute)~/Documents/Projects/yahoo-mlb/tracker/calibration-review.md (written only when high-confidence decision was wrong)Files read:
~/Documents/Projects/yahoo-mlb/context/frameworks/decision-log-format.md (schema of record)~/Documents/Projects/yahoo-mlb/tracker/decisions-log.md (tail, for serialization and id assignment)~/Documents/Projects/yahoo-mlb/tracker/variant-scoreboard.md (for calibration updates)Key resources:
decision-log-format.md.Inputs required:
decision_id and verified outcome fields.Outputs produced:
decision_id, confirmation that entry is in the log.testing
--- name: advisory-edit description: A strict advisory-only editing discipline for a writer who dictates ("speaks out") essays and wants help WITHOUT having their voice changed. The editor directs structure, flags grammar, and suggests strategic language — but never modifies the writer's text unless the writer explicitly says "apply" / "make that change" / "rewrite this." Produces a line-referenced, suggestion-only critique where every item is marked the writer's call. Four passes: structural, l
testing
Provides the house style for analyst-grade strategist writing — third-person register with sparing first-person, no em dashes, no "not X, not Y, not Z" negation cascades, numbered footnote citations rather than inline source parentheticals, specific opinion-signaling phrases, and topic-forward paragraph structure modeled on voice patterns observed in Damodaran's Musings on Markets and Thompson's Stratechery. Use when consolidating working notes into a finished long-form strategist or analyst report that must read as written by a senior human analyst rather than an AI assistant.
testing
Renders a markdown report to a PDF using pandoc with xelatex (11pt serif body, 1-inch margins, numbered footnotes, formal heading hierarchy). Requires a one-time install of pandoc and a LaTeX engine on the user's machine — basictex on macOS or texlive-xetex on Linux. Does not attempt automatic install. Fails loudly with the exact install commands if pandoc or xelatex is missing on the user's PATH. Use when producing a finished strategist or analyst report PDF from a polished markdown source.
testing
Produces step-by-step computational walkthroughs of vector and matrix operations as a sequence of numbered "frames", showing the explicit state at each step. The text-equivalent of a 3Blue1Brown animation — each frame shows what changed and why, so the learner can re-trace the operation by hand. Use when the learner needs to *see* a computation unfold (eigenvalue computation, attention with 3 tokens, gradient descent step, SVD on a 2×2, layer norm on a 3-vector, softmax of a small input), when an explanation has been given but the learner needs to ground it in a worked example, or when introducing an operation that's intimidating in symbol form but trivial in pencil-and-paper form.