MLB Category State Analyzer

Example
Workflow
Common Patterns
Guardrails
Quick Reference

Example

Scenario: Week 3, K L's Boomers (Team 5) vs. Los Doyers. Wednesday AM (mid-week). 4 scoring days remain.

Raw matchup scores (pulled from Yahoo matchup page):

| Cat | Us | Opp | Margin | Games left (us/opp) | |---|---|---|---|---| | R | 28 | 31 | -3 | 26 / 22 | | HR | 9 | 7 | +2 | 26 / 22 | | RBI | 30 | 29 | +1 | 26 / 22 | | SB | 4 | 6 | -2 | 26 / 22 | | OBP | .342 (82 PA) | .336 (78 PA) | +.006 | 26 / 22 GP | | K | 42 | 38 | +4 | 9 SP starts / 7 SP starts | | ERA | 3.80 (21 IP) | 4.12 (19 IP) | -0.32 (better) | 9 / 7 | | WHIP | 1.18 (21 IP) | 1.25 (19 IP) | -0.07 (better) | 9 / 7 | | QS | 2 | 1 | +1 | 9 / 7 | | SV | 3 | 5 | -2 | ~8 RP days / ~8 RP days |

Projections built for the sim (rest-of-week {mean, stddev} — see resources/methodology.md):

| Cat | Our projection | Opp projection | |---|---|---| | R | final 52 ± 9 | final 57 ± 8 | | HR | final 15 ± 3.5 | final 13 ± 3.2 | | RBI | final 52 ± 9 | final 55 ± 8 | | SB | final 6 ± 2.3 | final 10 ± 2.5 | | OBP | .346 ± .015 | .341 ± .014 | | K | final 96 ± 11 | final 85 ± 10 | | ERA | 3.88 ± 0.40 | 4.05 ± 0.45 | | WHIP | 1.20 ± 0.07 | 1.25 ± 0.08 | | QS | final 6.1 ± 1.5 | final 3.8 ± 1.4 | | SV | final 4.8 ± 1.4 | final 7.7 ± 1.5 |

Delegate to matchup-win-probability-sim with:

cat_list = [R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV]
cat_inverse_list = [ERA, WHIP]
cat_win_threshold = 6
our_per_cat_projection / opp_per_cat_projection from the table above
sim_mode = "monte_carlo", n_simulations = 10000, random_seed = 42

Sim output (consumed by this skill):

matchup_win_probability = 0.58
per_cat_win_probability: R 0.36, HR 0.65, RBI 0.42, SB 0.16, OBP 0.60, K 0.74, ERA 0.62, WHIP 0.68, QS 0.85, SV 0.10
expected_cats_won = 5.18

Per-cat signals (derived here from sim output + baseball state — see resources/methodology.md):

| Cat | Position (from state) | Pressure (state + pace) | Reachability (= round(100 × p_cat)) | Punt Score (= f(1 − p_cat) + volatility) | Verdict | |---|---|---|---|---|---| | R | losing | 72 | 36 | 44 | push (contested) | | HR | winning | 48 | 65 | 21 | maintain | | RBI | winning (thin) | 65 | 42 | 35 | push | | SB | losing | 55 | 16 | 58 | evaluate punt | | OBP | winning (thin) | 70 | 60 | 24 | push | | K | winning | 55 | 74 | 16 | push | | ERA | winning | 62 | 62 | 23 | push | | WHIP | winning | 60 | 68 | 19 | maintain | | QS | winning | 78 | 85 | 9 | push hard | | SV | losing | 38 | 10 | 84 | punt |

Overall recommendation: Push 6, maintain 2, punt 2. Matchup win prob 58% (neutral favorite).

Push (6): HR, OBP, K, ERA, QS — each has per_cat_win_probability ≥ 0.60. Plus RBI as the contested-but-reachable 6th.
Maintain (2): WHIP (locked-ish), R (reachability lowish but not a true punt).
Punt (2): SB (p = 0.16, low reach) and SV (p = 0.10 + volatility bonus → punt score 84).

Downstream implications for other agents:

Lineup optimizer: matchup_win_probability = 0.58 → neutral-to-favorite, standard daily_quality optimization (no variance tilt).
Waiver analyst: prioritize SP (QS, K, ERA), OBP-heavy bats; not closers or speed specialists.
Streaming strategist: every QS-capable SP starts; skip any 5-inning risk arm.

Workflow

Copy this checklist and track progress:

MLB Category State Analysis Progress:
- [ ] Step 1: Pull current matchup scores from Yahoo
- [ ] Step 2: Count remaining games/PAs/IP for both rosters
- [ ] Step 3: Build per-cat projection dicts ({mean, stddev}) for both rosters
- [ ] Step 4: Delegate to matchup-win-probability-sim (pass cat_list, projections, threshold=6, inverse=[ERA,WHIP])
- [ ] Step 5: Derive cat_position (from state), cat_pressure, cat_reachability, cat_punt_score from sim output
- [ ] Step 6: Rank cats and emit push/maintain/punt plan
- [ ] Step 7: Write signal file with YAML frontmatter (include matchup_win_probability from sim)

Step 1: Pull current matchup scores

Web-fetch the Yahoo matchup page: https://baseball.fantasysports.yahoo.com/b1/23756/5/matchup?week=N. Extract current totals for both teams in each of the 10 cats. For ratio cats (OBP, ERA, WHIP), also capture the denominator (PAs for OBP, IP for ERA/WHIP). This is required — you cannot build a ratio-cat projection without the volume underlying the ratio.

[ ] 5 batting cats: R, HR, RBI, SB, OBP (+ at-bats / plate-appearances)
[ ] 5 pitching cats: K, ERA, WHIP, QS, SV (+ innings pitched)
[ ] Source URL cited in signal file

See resources/methodology.md for scrape procedure and fallback if Yahoo is unreachable.

Step 2: Count remaining games/PAs/IP

For each roster, count the number of MLB games its players will play for the rest of the scoring period, and project PAs (hitters) and IP (pitchers).

[ ] Hitter games remaining: sum of (each rostered hitter's team games × probability they start)
[ ] Pitcher starts remaining: number of scheduled SP starts for the rest of the week per roster
[ ] Reliever days remaining: days × eligible RPs (for SV projection)
[ ] Volume imbalance: if one team has meaningfully more games, that will show up directly in the projection means (and so in per_cat_win_probability)

Use MLB.com schedules + probable pitcher grids. See resources/methodology.md.

Step 3: Build per-cat projection dicts

For each team, build a dict {cat: {mean, stddev}} where mean is the projected final (or remaining, consistently used across both teams — pick one convention) and stddev reflects uncertainty given remaining volume.

[ ] Counting cats (R, HR, RBI, SB, K, QS, SV): mean = current_total + Σ(per-player per-game rate × games remaining × daily_quality). stddev ≈ 0.35 × expected_remaining as a default CV.
[ ] Ratio cats (OBP, ERA, WHIP): mean = (current_ratio × current_volume + projected_remaining_ratio × remaining_volume) / total_volume. stddev ≈ σ_per_obs / sqrt(total_volume) — shrinks as total IP/PA grows.
[ ] Both dicts have identical keys and the exact league cat_list.
[ ] Use OBP (not AVG) and qs_probability (not W) from upstream mlb-player-analyzer signals — see Guardrails.

See resources/methodology.md.

Step 4: Delegate to matchup-win-probability-sim

Invoke the sibling skill with a well-formed input payload:

inputs to matchup-win-probability-sim:
  cat_list:           [R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV]
  cat_inverse_list:   [ERA, WHIP]
  cat_win_threshold:  6
  our_per_cat_projection:  <dict from Step 3>
  opp_per_cat_projection:  <dict from Step 3>
  sim_mode:           "monte_carlo"
  n_simulations:      10000
  random_seed:        42
  tie_rule:           "half"

outputs consumed:
  matchup_win_probability  (float in [0,1])
  per_cat_win_probability  (dict[cat, float])
  expected_cats_won        (float)
  variance_estimate        (float)

[ ] All 10 cats present in both projection dicts
[ ] cat_inverse_list = [ERA, WHIP] (lower-is-better)
[ ] cat_win_threshold = 6 (Yahoo 10-cat majority)
[ ] Seed passed for reproducibility
[ ] Sim output fields captured and stored for Step 5

Step 5: Derive per-cat signals from sim output + state

Apply the formulas in resources/methodology.md. The sim owns the probability math; this skill owns the baseball-state interpretation.

[ ] cat_position ∈ {winning, tied, losing} — computed locally from current totals (not sim). Ratio-cat direction handled (OBP higher = winning; ERA/WHIP lower = winning).
[ ] cat_pressure (0–100) — simple arithmetic from position + close-margin + volume-edge + locked-in flags. See pressure formula in Quick Reference.
[ ] cat_reachability (0–100) — now = round(100 × per_cat_win_probability[cat]), taken directly from the sim.
[ ] cat_punt_score (0–100) — (100 × (1 − per_cat_win_probability[cat])) × 0.6 + 30 × is_volatile + 20 × below_min_threshold − 10 × has_spillover, clamped.

Step 6: Rank and emit plan

Rank all 10 cats by cat_pressure × cat_reachability / 100:

[ ] Top 6: push — mark these as priority for waivers, streams, starts
[ ] Middle 2: maintain — hold position, don't overspend
[ ] Bottom 2: evaluate punt — if cat_punt_score > 60, confirm punt; otherwise hold

Goal in H2H Cats is 6-of-10. A defensible plan is "push 6, concede up to 4." See resources/template.md for the output signal format.

Step 7: Write signal file

Write to signals/YYYY-MM-DD-cat-state.md with YAML frontmatter (type: cat-state). Include matchup_win_probability from the sim as a top-level field. Validate with mlb-signal-emitter before persisting.

[ ] All 10 cats present with all 4 signals each
[ ] matchup_win_probability and expected_cats_won recorded in frontmatter
[ ] sim_meta block (sim_mode, n_simulations, random_seed) recorded for reproducibility
[ ] Confidence reflects data quality (lower if Yahoo scrape was partial)
[ ] source_urls includes Yahoo matchup page + MLB.com schedule pages + a reference to the sim skill
[ ] Red-team findings noted (e.g., "Opp has a two-start ace coming that could flip K + ERA + WHIP all at once")

Validate output using resources/evaluators/rubric_mlb_category_state_analyzer.json. Minimum: average score of 3.5 or above.

Common Patterns

Pattern 1: Balanced mid-week state

Typical Wednesday AM state: 3-4 cats already locked, 3-4 close, 2-3 volatile.
Action: push the close cats hardest, coast the locked wins, ignore locked losses.
The sim's per-cat probs already reflect this — cats with p ∈ [0.40, 0.65] are the contested ones.

Pattern 2: Volume-imbalanced matchup

We have 30 hitter games left, opp has 22. Our counting-cat projection means rise; sim's per_cat_win_probability for R/HR/RBI/SB rises accordingly.
Action: stack the lineup (fewer off-days, prefer teams playing doubleheaders), bid on streamers. Pressure boost comes from the volume-edge flag, reachability boost comes automatically from the sim.

Pattern 3: Two-start ace incoming (us or them)

One pitcher's two-start week can swing K, ERA, WHIP, QS simultaneously.
Encode this in the projection dict: their expected IP and K rise, ERA/WHIP means improve (toward their ERA/WHIP), QS mean rises by ~0.45 per expected QS-quality start.
The sim then shows 4 pitching cats moving together in per_cat_win_probability deltas.

Pattern 4: Save-category volatility

SVs are low-frequency; one walkoff blown save flips the category.
In the projection dict, use a low mean (≤ 2.5/week per locked closer) and moderate stddev (≥ 1.2). The sim will naturally report per_cat_win_probability near 0.1–0.25 when behind by 2+.
The +30 volatility bonus in cat_punt_score (applied here, not in the sim) pushes SV to punt when sim reachability agrees.

Pattern 5: Ratio-cat "freeze"

Late in the week, if opp is far below the IP/PA minimum (e.g., has 9 IP on Friday with no more starts), their ratio cats are locked at whatever they have.
Encode by setting opp ratio-cat stddev near zero and their mean at a punitive-or-forfeited value. The sim then returns per_cat_win_probability ≈ 1.0 for those cats.

Guardrails

Never compute OBP/ERA/WHIP from rates alone — always include volume (PA/IP). A .400 OBP in 10 PAs is not better than .342 in 82 PAs. The projection-dict mean/stddev for ratio cats must come from the weighted-average formula; the sim takes those as truth.
QS is the #1 category, not Wins. This league uses Quality Starts (6+ IP, ≤3 ER). A 5-inning outing scores zero. When projecting remaining QS, multiply each SP start by its QS probability (from mlb-player-analyzer's qs_probability signal) — don't just count scheduled starts.
OBP is the #5 category, not AVG. Walks count. When projecting OBP contribution, use players' OBP (not AVG). A high-BB, low-AVG player like Juan Soto is worth more in this league than his raw hit rate suggests.
SV is volatile — trust the punt when signals agree. Unlike counting batting cats, a 2-save deficit with 3 days left has low per_cat_win_probability regardless of roster. Don't fight for saves if the closer role on your roster isn't locked (check save_role_certainty < 70 → automatic punt candidate). The volatility bonus in cat_punt_score is applied here, not in the sim — the sim returns raw probability.
cat_reachability comes from the sim — don't recompute. This is a delegation. If the sim returns per_cat_win_probability[R] = 0.36, then cat_reachability[R] = 36. Do not apply z-score shortcuts or best/worst-case buckets here — those lived in the old heuristic and are now owned by the sim skill.
Locked-in cats get pressure adjustments, not zero. A locked-in win still has cat_pressure ≈ 40 (it's banked). A locked-in loss still has cat_pressure ≈ 20 (stop investing). Don't set them to zero — downstream agents use non-zero values to decide bench vs. drop.
Ratio cats need the minimum-IP/PA rule. Yahoo enforces minimums for pitcher ratio cats (usually 20 IP for the week). If either roster is tracking below the minimum late in the week, the ratio cat may auto-loss. Encode this in the projection dict (stddev → 0, mean → punitive) before calling the sim, AND add +20 below_min_threshold to cat_punt_score.
Never re-derive upstream signals. qs_probability, sb_opportunity, obp_contribution, save_role_certainty come from mlb-player-analyzer. Read them from the signal directory; do not recompute.
Always pass a random_seed to the sim. Without it, two runs of this skill produce slightly different cat_reachability values, which will confuse downstream agents doing diff comparisons. Default seed: 42.

Quick Reference

Where the math lives now:

| Signal | Owner | Formula | |---|---|---| | cat_position | this skill | enum from current totals (ratio-direction aware) | | cat_pressure | this skill | baseline 50 + 20 × close + 15 × vol-edge − 10 × locked_win − 30 × locked_loss | | cat_reachability | delegated to matchup-win-probability-sim | = round(100 × per_cat_win_probability[cat]) | | cat_punt_score | this skill (uses sim output) | (100 × (1 − p_cat)) × 0.6 + 30 × volatile + 20 × below_min − 10 × spillover | | matchup_win_probability | delegated to matchup-win-probability-sim | Monte Carlo P(cats_won ≥ 6) |

cat_pressure =
    50                                         # neutral baseline
  + 20 × (is_close_margin: deficit/lead ≤ 10% of total)
  + 15 × (opponent_volume_exhausted: we have more games left)
  - 10 × (locked_in_win)
  - 30 × (locked_in_loss)
  clamp(0, 100)

cat_reachability = round(100 × per_cat_win_probability[cat])   # from sim

cat_punt_score =
    (100 - cat_reachability) × 0.6                      # base: if we can't reach, consider punting
  + 30 × (cat is traditionally volatile: SV)
  + 20 × (below min-PA/IP threshold)
  - 10 × (cat has spillover: K→QS, OBP→R, HR→R+RBI)
  clamp(0, 100)

League constants (from context/league-config.md):

10 cats: R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV
Inverse cats: ERA, WHIP (lower-is-better — passed as cat_inverse_list to the sim)
OBP (not AVG) — walks matter
QS (not W) — 6+ IP with ≤3 ER
H2H Cats, goal = win 6+ of 10 each week (cat_win_threshold = 6)
Daily lineup lock; weekly matchup rolls Mon-Sun

Signal file output schema (from context/frameworks/signal-framework.md):

---
type: cat-state
date: YYYY-MM-DD
emitted_by: mlb-category-state-analyzer
week: N
matchup_opponent: <team name>
scoring_days_remaining: N
matchup_win_probability: 0.58          # from matchup-win-probability-sim
expected_cats_won: 5.18                # from matchup-win-probability-sim
sim_meta:
  sim_mode: monte_carlo
  n_simulations: 10000
  random_seed: 42
synthesis_confidence: 0.0-1.0
source_urls:
  - https://baseball.fantasysports.yahoo.com/b1/23756/5/matchup?week=N
---

Body: per-cat table + overall push/maintain/punt recommendation + red-team findings.

Thresholds used downstream:

| Agent | Threshold | Effect | |---|---|---| | Waiver analyst | cat_pressure ≥ 60 | Prioritize targets that fill that cat | | Streaming strategist | cat_pressure (ERA/WHIP) < 30 | Allow riskier streamers (we're punting) | | Lineup optimizer | matchup_win_probability < 0.4 / > 0.6 | Variance-seek as underdog / damp as favorite | | Trade analyzer | weights trade_cat_delta | Multiplied by cat_pressure / 50 |

Key resources:

resources/template.md: Output signal file format, per-cat table, sim-integration worked example
resources/methodology.md: Yahoo scrape procedure, remaining-games projection, building per-cat projection dicts, sim-integration formulas
resources/evaluators/rubric_mlb_category_state_analyzer.json: Evaluator rubric
Sibling skill: matchup-win-probability-sim — owns per-cat and matchup-level win-probability math via Monte Carlo / Poisson-binomial

Inputs required:

Current matchup scores (10 cats, both teams, with volume for ratio cats)
Roster IDs for both teams
Remaining MLB schedule through Sunday
Upstream signals: qs_probability, save_role_certainty, obp_contribution, sb_opportunity, daily_quality
League config (cats list, min-IP/PA thresholds, cat_win_threshold)

Outputs produced:

signals/YYYY-MM-DD-cat-state.md — signal file with 10-cat table, overall plan, matchup_win_probability, confidence, source URLs
cat_position, cat_pressure, cat_reachability, cat_punt_score per cat
Overall "push N, maintain M, punt P" recommendation (N + M + P = 10, target N ≥ 6)
matchup_win_probability (from sim delegate) for lineup-optimizer variance decisions

MLB Category State Analyzer

Example
Workflow
Common Patterns
Guardrails
Quick Reference

Example

Scenario: Week 3, K L's Boomers (Team 5) vs. Los Doyers. Wednesday AM (mid-week). 4 scoring days remain.

Raw matchup scores (pulled from Yahoo matchup page):

Projections built for the sim (rest-of-week {mean, stddev} — see resources/methodology.md):

Delegate to matchup-win-probability-sim with:

cat_list = [R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV]
cat_inverse_list = [ERA, WHIP]
cat_win_threshold = 6
our_per_cat_projection / opp_per_cat_projection from the table above
sim_mode = "monte_carlo", n_simulations = 10000, random_seed = 42

Sim output (consumed by this skill):

matchup_win_probability = 0.58
per_cat_win_probability: R 0.36, HR 0.65, RBI 0.42, SB 0.16, OBP 0.60, K 0.74, ERA 0.62, WHIP 0.68, QS 0.85, SV 0.10
expected_cats_won = 5.18

Per-cat signals (derived here from sim output + baseball state — see resources/methodology.md):

Overall recommendation: Push 6, maintain 2, punt 2. Matchup win prob 58% (neutral favorite).

Push (6): HR, OBP, K, ERA, QS — each has per_cat_win_probability ≥ 0.60. Plus RBI as the contested-but-reachable 6th.
Maintain (2): WHIP (locked-ish), R (reachability lowish but not a true punt).
Punt (2): SB (p = 0.16, low reach) and SV (p = 0.10 + volatility bonus → punt score 84).

Downstream implications for other agents:

Lineup optimizer: matchup_win_probability = 0.58 → neutral-to-favorite, standard daily_quality optimization (no variance tilt).
Waiver analyst: prioritize SP (QS, K, ERA), OBP-heavy bats; not closers or speed specialists.
Streaming strategist: every QS-capable SP starts; skip any 5-inning risk arm.

Workflow

Copy this checklist and track progress:

MLB Category State Analysis Progress:
- [ ] Step 1: Pull current matchup scores from Yahoo
- [ ] Step 2: Count remaining games/PAs/IP for both rosters
- [ ] Step 3: Build per-cat projection dicts ({mean, stddev}) for both rosters
- [ ] Step 4: Delegate to matchup-win-probability-sim (pass cat_list, projections, threshold=6, inverse=[ERA,WHIP])
- [ ] Step 5: Derive cat_position (from state), cat_pressure, cat_reachability, cat_punt_score from sim output
- [ ] Step 6: Rank cats and emit push/maintain/punt plan
- [ ] Step 7: Write signal file with YAML frontmatter (include matchup_win_probability from sim)

Step 1: Pull current matchup scores

[ ] 5 batting cats: R, HR, RBI, SB, OBP (+ at-bats / plate-appearances)
[ ] 5 pitching cats: K, ERA, WHIP, QS, SV (+ innings pitched)
[ ] Source URL cited in signal file

See resources/methodology.md for scrape procedure and fallback if Yahoo is unreachable.

Step 2: Count remaining games/PAs/IP

For each roster, count the number of MLB games its players will play for the rest of the scoring period, and project PAs (hitters) and IP (pitchers).

[ ] Hitter games remaining: sum of (each rostered hitter's team games × probability they start)
[ ] Pitcher starts remaining: number of scheduled SP starts for the rest of the week per roster
[ ] Reliever days remaining: days × eligible RPs (for SV projection)
[ ] Volume imbalance: if one team has meaningfully more games, that will show up directly in the projection means (and so in per_cat_win_probability)

Use MLB.com schedules + probable pitcher grids. See resources/methodology.md.

Step 3: Build per-cat projection dicts

[ ] Counting cats (R, HR, RBI, SB, K, QS, SV): mean = current_total + Σ(per-player per-game rate × games remaining × daily_quality). stddev ≈ 0.35 × expected_remaining as a default CV.
[ ] Ratio cats (OBP, ERA, WHIP): mean = (current_ratio × current_volume + projected_remaining_ratio × remaining_volume) / total_volume. stddev ≈ σ_per_obs / sqrt(total_volume) — shrinks as total IP/PA grows.
[ ] Both dicts have identical keys and the exact league cat_list.
[ ] Use OBP (not AVG) and qs_probability (not W) from upstream mlb-player-analyzer signals — see Guardrails.

See resources/methodology.md.

Step 4: Delegate to matchup-win-probability-sim

Invoke the sibling skill with a well-formed input payload:

inputs to matchup-win-probability-sim:
  cat_list:           [R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV]
  cat_inverse_list:   [ERA, WHIP]
  cat_win_threshold:  6
  our_per_cat_projection:  <dict from Step 3>
  opp_per_cat_projection:  <dict from Step 3>
  sim_mode:           "monte_carlo"
  n_simulations:      10000
  random_seed:        42
  tie_rule:           "half"

outputs consumed:
  matchup_win_probability  (float in [0,1])
  per_cat_win_probability  (dict[cat, float])
  expected_cats_won        (float)
  variance_estimate        (float)

[ ] All 10 cats present in both projection dicts
[ ] cat_inverse_list = [ERA, WHIP] (lower-is-better)
[ ] cat_win_threshold = 6 (Yahoo 10-cat majority)
[ ] Seed passed for reproducibility
[ ] Sim output fields captured and stored for Step 5

Step 5: Derive per-cat signals from sim output + state

Apply the formulas in resources/methodology.md. The sim owns the probability math; this skill owns the baseball-state interpretation.

[ ] cat_position ∈ {winning, tied, losing} — computed locally from current totals (not sim). Ratio-cat direction handled (OBP higher = winning; ERA/WHIP lower = winning).
[ ] cat_pressure (0–100) — simple arithmetic from position + close-margin + volume-edge + locked-in flags. See pressure formula in Quick Reference.
[ ] cat_reachability (0–100) — now = round(100 × per_cat_win_probability[cat]), taken directly from the sim.
[ ] cat_punt_score (0–100) — (100 × (1 − per_cat_win_probability[cat])) × 0.6 + 30 × is_volatile + 20 × below_min_threshold − 10 × has_spillover, clamped.

Step 6: Rank and emit plan

Rank all 10 cats by cat_pressure × cat_reachability / 100:

[ ] Top 6: push — mark these as priority for waivers, streams, starts
[ ] Middle 2: maintain — hold position, don't overspend
[ ] Bottom 2: evaluate punt — if cat_punt_score > 60, confirm punt; otherwise hold

Goal in H2H Cats is 6-of-10. A defensible plan is "push 6, concede up to 4." See resources/template.md for the output signal format.

Step 7: Write signal file

[ ] All 10 cats present with all 4 signals each
[ ] matchup_win_probability and expected_cats_won recorded in frontmatter
[ ] sim_meta block (sim_mode, n_simulations, random_seed) recorded for reproducibility
[ ] Confidence reflects data quality (lower if Yahoo scrape was partial)
[ ] source_urls includes Yahoo matchup page + MLB.com schedule pages + a reference to the sim skill
[ ] Red-team findings noted (e.g., "Opp has a two-start ace coming that could flip K + ERA + WHIP all at once")

Validate output using resources/evaluators/rubric_mlb_category_state_analyzer.json. Minimum: average score of 3.5 or above.

Common Patterns

Pattern 1: Balanced mid-week state

Typical Wednesday AM state: 3-4 cats already locked, 3-4 close, 2-3 volatile.
Action: push the close cats hardest, coast the locked wins, ignore locked losses.
The sim's per-cat probs already reflect this — cats with p ∈ [0.40, 0.65] are the contested ones.

Pattern 2: Volume-imbalanced matchup

We have 30 hitter games left, opp has 22. Our counting-cat projection means rise; sim's per_cat_win_probability for R/HR/RBI/SB rises accordingly.
Action: stack the lineup (fewer off-days, prefer teams playing doubleheaders), bid on streamers. Pressure boost comes from the volume-edge flag, reachability boost comes automatically from the sim.

Pattern 3: Two-start ace incoming (us or them)

One pitcher's two-start week can swing K, ERA, WHIP, QS simultaneously.
Encode this in the projection dict: their expected IP and K rise, ERA/WHIP means improve (toward their ERA/WHIP), QS mean rises by ~0.45 per expected QS-quality start.
The sim then shows 4 pitching cats moving together in per_cat_win_probability deltas.

Pattern 4: Save-category volatility

SVs are low-frequency; one walkoff blown save flips the category.
In the projection dict, use a low mean (≤ 2.5/week per locked closer) and moderate stddev (≥ 1.2). The sim will naturally report per_cat_win_probability near 0.1–0.25 when behind by 2+.
The +30 volatility bonus in cat_punt_score (applied here, not in the sim) pushes SV to punt when sim reachability agrees.

Pattern 5: Ratio-cat "freeze"

Late in the week, if opp is far below the IP/PA minimum (e.g., has 9 IP on Friday with no more starts), their ratio cats are locked at whatever they have.
Encode by setting opp ratio-cat stddev near zero and their mean at a punitive-or-forfeited value. The sim then returns per_cat_win_probability ≈ 1.0 for those cats.

Guardrails

Never compute OBP/ERA/WHIP from rates alone — always include volume (PA/IP). A .400 OBP in 10 PAs is not better than .342 in 82 PAs. The projection-dict mean/stddev for ratio cats must come from the weighted-average formula; the sim takes those as truth.
QS is the #1 category, not Wins. This league uses Quality Starts (6+ IP, ≤3 ER). A 5-inning outing scores zero. When projecting remaining QS, multiply each SP start by its QS probability (from mlb-player-analyzer's qs_probability signal) — don't just count scheduled starts.
OBP is the #5 category, not AVG. Walks count. When projecting OBP contribution, use players' OBP (not AVG). A high-BB, low-AVG player like Juan Soto is worth more in this league than his raw hit rate suggests.
SV is volatile — trust the punt when signals agree. Unlike counting batting cats, a 2-save deficit with 3 days left has low per_cat_win_probability regardless of roster. Don't fight for saves if the closer role on your roster isn't locked (check save_role_certainty < 70 → automatic punt candidate). The volatility bonus in cat_punt_score is applied here, not in the sim — the sim returns raw probability.
cat_reachability comes from the sim — don't recompute. This is a delegation. If the sim returns per_cat_win_probability[R] = 0.36, then cat_reachability[R] = 36. Do not apply z-score shortcuts or best/worst-case buckets here — those lived in the old heuristic and are now owned by the sim skill.
Locked-in cats get pressure adjustments, not zero. A locked-in win still has cat_pressure ≈ 40 (it's banked). A locked-in loss still has cat_pressure ≈ 20 (stop investing). Don't set them to zero — downstream agents use non-zero values to decide bench vs. drop.
Ratio cats need the minimum-IP/PA rule. Yahoo enforces minimums for pitcher ratio cats (usually 20 IP for the week). If either roster is tracking below the minimum late in the week, the ratio cat may auto-loss. Encode this in the projection dict (stddev → 0, mean → punitive) before calling the sim, AND add +20 below_min_threshold to cat_punt_score.
Never re-derive upstream signals. qs_probability, sb_opportunity, obp_contribution, save_role_certainty come from mlb-player-analyzer. Read them from the signal directory; do not recompute.
Always pass a random_seed to the sim. Without it, two runs of this skill produce slightly different cat_reachability values, which will confuse downstream agents doing diff comparisons. Default seed: 42.

Quick Reference

Where the math lives now:

cat_pressure =
    50                                         # neutral baseline
  + 20 × (is_close_margin: deficit/lead ≤ 10% of total)
  + 15 × (opponent_volume_exhausted: we have more games left)
  - 10 × (locked_in_win)
  - 30 × (locked_in_loss)
  clamp(0, 100)

cat_reachability = round(100 × per_cat_win_probability[cat])   # from sim

cat_punt_score =
    (100 - cat_reachability) × 0.6                      # base: if we can't reach, consider punting
  + 30 × (cat is traditionally volatile: SV)
  + 20 × (below min-PA/IP threshold)
  - 10 × (cat has spillover: K→QS, OBP→R, HR→R+RBI)
  clamp(0, 100)

League constants (from context/league-config.md):

10 cats: R, HR, RBI, SB, OBP, K, ERA, WHIP, QS, SV
Inverse cats: ERA, WHIP (lower-is-better — passed as cat_inverse_list to the sim)
OBP (not AVG) — walks matter
QS (not W) — 6+ IP with ≤3 ER
H2H Cats, goal = win 6+ of 10 each week (cat_win_threshold = 6)
Daily lineup lock; weekly matchup rolls Mon-Sun

Signal file output schema (from context/frameworks/signal-framework.md):

---
type: cat-state
date: YYYY-MM-DD
emitted_by: mlb-category-state-analyzer
week: N
matchup_opponent: <team name>
scoring_days_remaining: N
matchup_win_probability: 0.58          # from matchup-win-probability-sim
expected_cats_won: 5.18                # from matchup-win-probability-sim
sim_meta:
  sim_mode: monte_carlo
  n_simulations: 10000
  random_seed: 42
synthesis_confidence: 0.0-1.0
source_urls:
  - https://baseball.fantasysports.yahoo.com/b1/23756/5/matchup?week=N
---

Body: per-cat table + overall push/maintain/punt recommendation + red-team findings.

Thresholds used downstream:

Key resources:

resources/template.md: Output signal file format, per-cat table, sim-integration worked example
resources/methodology.md: Yahoo scrape procedure, remaining-games projection, building per-cat projection dicts, sim-integration formulas
resources/evaluators/rubric_mlb_category_state_analyzer.json: Evaluator rubric
Sibling skill: matchup-win-probability-sim — owns per-cat and matchup-level win-probability math via Monte Carlo / Poisson-binomial

Inputs required:

Current matchup scores (10 cats, both teams, with volume for ratio cats)
Roster IDs for both teams
Remaining MLB schedule through Sunday
Upstream signals: qs_probability, save_role_certainty, obp_contribution, sb_opportunity, daily_quality
League config (cats list, min-IP/PA thresholds, cat_win_threshold)

Outputs produced:

signals/YYYY-MM-DD-cat-state.md — signal file with 10-cat table, overall plan, matchup_win_probability, confidence, source URLs
cat_position, cat_pressure, cat_reachability, cat_punt_score per cat
Overall "push N, maintain M, punt P" recommendation (N + M + P = 10, target N ≥ 6)
matchup_win_probability (from sim delegate) for lineup-optimizer variance decisions

Adoption

lyndonkl/mlb-category-state-analyzer

$ install --global

Security Scan Results

SKILL.md

MLB Category State Analyzer

Table of Contents

Example

Workflow

Common Patterns

Guardrails

Quick Reference

Related Skills

lyndonkl/conf-theme-clustering

lyndonkl/conf-schedule-optimization

lyndonkl/conf-program-extraction

lyndonkl/conf-preference-elicitation

lyndonkl/mlb-category-state-analyzer

$ install --global

Security Scan Results

SKILL.md

MLB Category State Analyzer

Table of Contents

Example

Workflow

Common Patterns

Guardrails

Quick Reference

Related Skills

lyndonkl/conf-theme-clustering

lyndonkl/conf-schedule-optimization

lyndonkl/conf-program-extraction

lyndonkl/conf-preference-elicitation