skills/opponent-archetype-classifier/SKILL.md
Classifies an opposing player, manager, or agent into one of a configurable archetype set using Bayesian inference over observed behavior (roster composition, transaction pattern, lineup moves, trade activity). Domain-neutral scaffold -- callers supply the archetype taxonomy (names, priors, characteristic feature distributions) and observed features; the skill returns a normalized posterior, MAP archetype, classification confidence, feature-contribution breakdown, and best-response hints. Use when modeling opponents, classifying player types, performing Bayesian archetype inference, producing opponent posteriors, or when user mentions opponent archetype, classify opponent, Bayesian archetype inference, player type classification, opponent modeling, or archetype posterior.
Scenario: Fantasy baseball league, Week 5. Classify opponent "Manager A" into one of six archetypes -- balanced, stars_and_scrubs, punt_sv, punt_sb, punt_wins_qs, hitter_heavy.
Inputs (abbreviated; full taxonomy in resources/template.md):
archetype_taxonomy:
  balanced: {prior: 0.30, feature_distributions: {sp_roster_share: {mean: 0.40, std: 0.06}, closer_count: {mean: 2.0, std: 0.6}, sb_speed_count: {mean: 3.0, std: 1.0}, moves_per_week: {mean: 2.5, std: 1.0}, bid_aggression: {low: 0.5, high: 0.5}}}
  stars_and_scrubs: {prior: 0.15, ...}
  punt_sv: {prior: 0.15, feature_distributions: {closer_count: {mean: 0.3, std: 0.4}, ...}}
  punt_sb: {prior: 0.15, feature_distributions: {sb_speed_count: {mean: 0.8, std: 0.7}, ...}}
  punt_wins_qs: {prior: 0.10, feature_distributions: {sp_roster_share: {mean: 0.20, std: 0.05}, closer_count: {mean: 3.0, std: 0.8}, moves_per_week: {mean: 4.0, std: 1.2}, ...}}
  hitter_heavy: {prior: 0.15, feature_distributions: {sp_roster_share: {mean: 0.28, std: 0.05}, ...}}
observed_features:
  sp_roster_share: 0.22 # low -- thin on starters
  closer_count: 3 # heavy RP
  sb_speed_count: 2 # moderate speed
  moves_per_week: 4.2 # high activity
  bid_aggression: high # active FAAB bidder
observation_weight: 0.7 # ~4 weeks of data
Computation (per-feature log-likelihood, summed, exponentiated, multiplied by prior, normalized):
| Archetype | log L(features) | L * prior | Normalized Posterior |
|-----------|-----------------|-----------|----------------------|
| balanced | -14.8 | 1.1e-7 | 0.04 |
| stars_and_scrubs | -12.5 | 5.6e-7 | 0.21 |
| punt_sv | -18.2 | 1.8e-9 | 0.00 |
| punt_sb | -22.6 | 2.3e-11 | 0.00 |
| punt_wins_qs | -10.1 | 1.8e-6 | 0.68 |
| hitter_heavy | -13.2 | 4.2e-7 | 0.16 |
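The per-feature log-likelihood step summarized above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the function names `gaussian_log_likelihood` and `joint_log_likelihood` are hypothetical, the convention that a spec with `mean`/`std` is Gaussian and any other spec is a category-to-probability map is an assumption, and the toy archetype uses made-up numbers because the abbreviated taxonomy above elides most distributions.

```python
import math


def gaussian_log_likelihood(x, mean, std):
    """Log of the Gaussian density at x for the given mean and std."""
    z = (x - mean) / std
    return -math.log(std) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def joint_log_likelihood(observed, distributions, epsilon=0.01):
    """Sum per-feature log-likelihoods for one archetype.

    Assumes conditional independence across features. Returns
    (log_l, missing), where missing lists taxonomy features that have no
    observation; those are skipped rather than imputed.
    """
    log_l = 0.0
    missing = []
    for name, spec in distributions.items():
        if name not in observed:
            missing.append(name)
            continue
        value = observed[name]
        if "mean" in spec:
            # Numeric feature: Gaussian density (assumed spec shape).
            log_l += gaussian_log_likelihood(value, spec["mean"], spec["std"])
        else:
            # Categorical feature: add-epsilon smoothing so a zero
            # probability cannot rule the archetype out on its own.
            p = (spec.get(value, 0.0) + epsilon) / (
                sum(spec.values()) + epsilon * len(spec)
            )
            log_l += math.log(p)
    return log_l, missing


# Hypothetical archetype and observation, for illustration only.
toy_distributions = {
    "closer_count": {"mean": 3.0, "std": 1.0},
    "moves_per_week": {"mean": 4.0, "std": 1.0},
    "bid_aggression": {"low": 0.5, "high": 0.5},
}
toy_observed = {"closer_count": 3.0, "bid_aggression": "high"}
log_l, missing = joint_log_likelihood(toy_observed, toy_distributions)
print(round(log_l, 3), missing)  # prints -1.612 ['moves_per_week']
```

Returning the `missing` list alongside the score is one way to feed `assumptions_flagged`; observed features that the taxonomy does not define are simply ignored in this sketch.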
Outputs:
posterior:
  balanced: 0.04
  stars_and_scrubs: 0.21
  punt_sv: 0.00
  punt_sb: 0.00
  punt_wins_qs: 0.68
  hitter_heavy: 0.16
map_archetype: punt_wins_qs
classification_confidence: 47.6 # = max_posterior (0.68) * observation_weight (0.7) * 100
best_response_hints:
  - "Concede K and QS; lock 6 of remaining 8 cats"
  - "Don't stream starting pitchers against them"
  - "They will dominate SV and ratios via all-RP staff; push hitting cats hard"
feature_contribution_breakdown:
  sp_roster_share: {map_likelihood: 0.92, alternative_max: 0.15, likelihood_ratio: 6.1} # low SP share -- strong punt_wins_qs signal
  closer_count: {map_likelihood: 0.52, alternative_max: 0.46, likelihood_ratio: 1.1} # 3 closers fits several archetypes
  sb_speed_count: {map_likelihood: 0.38, alternative_max: 0.41, likelihood_ratio: 0.9} # not discriminating
  moves_per_week: {map_likelihood: 0.33, alternative_max: 0.28, likelihood_ratio: 1.2}
  bid_aggression: {map_likelihood: 0.70, alternative_max: 0.60, likelihood_ratio: 1.2}
assumptions_flagged:
  - "Conditional independence across features assumed -- sp_roster_share and moves_per_week may be correlated (active punt strategy drives both)"
  - "Feature distributions are SME priors, not empirically fit -- refresh after season 1 with posterior data"
Note on confidence: the posterior peaks at 0.68, but observation_weight = 0.7 (only 4 weeks of data) dampens confidence to 47.6. That is above the 40 threshold, so the MAP is reported, but the caller is advised that another 2-3 weeks of observation will sharpen the call.
Copy this checklist and track progress:
Opponent Archetype Classification Progress:
- [ ] Step 1: Load archetype taxonomy (names, priors, feature distributions)
- [ ] Step 2: Collect observed features for the target opponent
- [ ] Step 3: Compute per-feature likelihood under each archetype
- [ ] Step 4: Combine likelihoods (assume conditional independence; flag it)
- [ ] Step 5: Apply Bayes rule, normalize posterior
- [ ] Step 6: Select MAP archetype; compute confidence
- [ ] Step 7: Check inconclusive threshold; report or defer
- [ ] Step 8: Produce feature-contribution breakdown and best-response hints
Step 1: Load archetype taxonomy
The caller supplies the taxonomy. See resources/template.md for required fields.
- best_response string array
Step 2: Collect observed features
Observed features must match feature names in the taxonomy. Missing features are dropped (not imputed) and flagged.
- observation_weight (0-1) supplied; rises with sample size -- see methodology.md
Step 3: Compute per-feature likelihood
For each (archetype, feature) pair compute P(feature_value | archetype). See methodology.md.
- Gaussian (numeric) features: L = (1/(std*sqrt(2*pi))) * exp(-0.5 * ((x - mean)/std)^2)
- Categorical features: L = P_archetype[category], with Laplace smoothing if zero
- Work in log-space (log L) to avoid underflow when combining 5+ features
Step 4: Combine likelihoods (conditional independence)
First-approximation assumption: features are conditionally independent given archetype. This is rarely exactly true; flag it.
- Flag feature pairs that are likely correlated (e.g., sp_roster_share and moves_per_week in fantasy baseball)
Step 5: Apply Bayes rule, normalize
posterior_unnorm[a] = exp(sum_log_L[a]) * prior[a]
posterior[a] = posterior_unnorm[a] / sum_a(posterior_unnorm[a])
- sum(posterior) == 1.0 within rounding
Step 6: Select MAP archetype; compute confidence
map_archetype = argmax(posterior)
classification_confidence = max(posterior) * observation_weight * 100
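The two formulas above translate almost directly into code. The sketch below is illustrative only; the function name `map_and_confidence` is hypothetical, and the range check on `observation_weight` is an added assumption based on the documented 0-1 range.

```python
def map_and_confidence(posterior, observation_weight):
    """Return (map_archetype, classification_confidence on a 0-100 scale)."""
    if not 0.0 <= observation_weight <= 1.0:
        raise ValueError("observation_weight must be in [0, 1]")
    # argmax over the posterior dict
    map_archetype = max(posterior, key=posterior.get)
    confidence = posterior[map_archetype] * observation_weight * 100
    return map_archetype, confidence


# Posterior and observation_weight taken from the worked example.
example_posterior = {
    "balanced": 0.04,
    "stars_and_scrubs": 0.21,
    "punt_sv": 0.00,
    "punt_sb": 0.00,
    "punt_wins_qs": 0.68,
    "hitter_heavy": 0.16,
}
name, confidence = map_and_confidence(example_posterior, 0.7)
print(name, round(confidence, 1))  # prints punt_wins_qs 47.6
```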
Step 7: Inconclusive threshold
If classification_confidence < 40:
- map_archetype: "inconclusive"
Step 8: Feature-contribution breakdown + best-response hints
- Per-feature likelihood ratio: L(feature | MAP) / max_{a != MAP} L(feature | a)
- best_response_hints from the MAP archetype's documented string array
Pattern 1: Fantasy sports (baseball, basketball, hockey) manager archetypes
- balanced, punt_<cat>, stars_and_scrubs, inactive, covering category-league strategy.
Pattern 2: Poker opponent archetypes
- tight_aggressive (TAG), loose_aggressive (LAG), tight_passive (rock), loose_passive (calling station), maniac.
Pattern 3: DFS lineup-construction archetypes
- cash_game_optimizer, GPP_ceiling_chaser, contrarian_pivot, chalk_herding.
Pattern 4: M&A / auction bidder archetypes
- strategic_premium_bidder, financial_disciplined_bidder, fishing_expedition, structured_earnout_preferer.
Conditional independence is almost never exactly true. Always flag it. If two features are strongly correlated (|r| > 0.6), merge them into a single composite feature or down-weight one by 50%. Otherwise the confident archetype gets credited twice for the same underlying signal.
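One possible reading of the down-weighting rule is sketched below. It is an assumption, not something the skill specifies: "down-weight one by 50%" is interpreted here as scaling that feature's log-likelihood contribution by 0.5, and the function names, the `(kept, discounted)` pair ordering, and the example correlation value are all hypothetical.

```python
def feature_weights(features, correlations, threshold=0.6, penalty=0.5):
    """Assign each feature a weight in the joint log-likelihood.

    correlations maps (kept_feature, discounted_feature) -> Pearson r.
    When |r| exceeds the threshold, the second feature of the pair is
    down-weighted (assumed interpretation of the 50% rule).
    """
    weights = {name: 1.0 for name in features}
    for (_kept, discounted), r in correlations.items():
        if abs(r) > threshold:
            weights[discounted] = min(weights[discounted], penalty)
    return weights


def weighted_log_likelihood(per_feature_log_l, weights):
    """Weighted sum of per-feature log-likelihoods for one archetype."""
    return sum(weights.get(f, 1.0) * ll for f, ll in per_feature_log_l.items())


# Hypothetical correlation estimate between two features.
weights = feature_weights(
    ["sp_roster_share", "moves_per_week", "closer_count"],
    {("sp_roster_share", "moves_per_week"): 0.7},
)
print(weights)
```

Whichever feature of a correlated pair is discounted, the same weights should be applied to every archetype so the comparison stays fair, and the adjustment belongs in `assumptions_flagged`.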
Priors matter when data is thin. Uniform priors are a choice, not a neutral default. Use domain-informed priors when the population distribution is known (e.g., in a 12-team fantasy league where ~1-2 managers are typically inactive, the inactive prior should be roughly 0.08-0.17, regardless of what a uniform split across archetypes would assign).
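Numeric stability: work in log-space. A product of many small likelihoods can underflow to 0 in float64, and the risk grows with the number of features and with repeated sequential updates. Sum log-likelihoods, add the log-prior, subtract the max log-posterior before exponentiating, then normalize.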
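The log-space recipe in this guardrail is the usual max-subtraction (log-sum-exp) trick. A minimal sketch follows, assuming strictly positive priors; the function name `normalize_log_posterior` is hypothetical.

```python
import math


def normalize_log_posterior(log_likelihood, prior):
    """Turn per-archetype log-likelihoods and priors into a posterior.

    Assumes every prior is > 0 (math.log would fail on a zero prior).
    """
    log_unnorm = {a: log_likelihood[a] + math.log(prior[a]) for a in log_likelihood}
    # Subtract the max so the largest term exponentiates to exactly 1.0.
    peak = max(log_unnorm.values())
    shifted = {a: math.exp(v - peak) for a, v in log_unnorm.items()}
    total = sum(shifted.values())
    return {a: v / total for a, v in shifted.items()}


# Extreme log-likelihoods: the naive exp() underflows, the shifted form does not.
print(math.exp(-1000.0))  # prints 0.0
print(normalize_log_posterior({"a": -1000.0, "b": -1001.0}, {"a": 0.5, "b": 0.5}))
```

With the naive form both unnormalized terms would be 0.0 and normalization would divide by zero; after the shift, only the relative differences between archetypes matter.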
Laplace smoothing for categoricals. If an archetype has P(category=X) = 0 in its distribution and the observation is X, the posterior for that archetype becomes 0 -- permanently ruling it out on one data point. Apply add-epsilon smoothing (epsilon = 0.01 is typical).
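The effect of add-epsilon smoothing can be illustrated with a small sketch. The function name `smoothed_categorical` and the example counts are hypothetical; the formula is the add-epsilon form with epsilon = 0.01.

```python
def smoothed_categorical(counts, epsilon=0.01):
    """Add-epsilon smoothing over a dict of category -> count (or weight)."""
    total = sum(counts.values()) + epsilon * len(counts)
    return {c: (n + epsilon) / total for c, n in counts.items()}


# Hypothetical archetype that has never been seen bidding "low".
raw = {"low": 0, "high": 10}
smoothed = smoothed_categorical(raw)
print(smoothed)
```

After smoothing, an observed `low` bid makes this archetype very unlikely but not impossible, so later observations can still recover it.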
Inconclusive is a feature, not a failure. A low-confidence classification is valuable information -- it tells the caller to gather more data before committing. Don't force a MAP when confidence is below 40.
Best-response hints come from the taxonomy, not the classifier. The skill should not invent strategy; it should retrieve what the taxonomy author documented for that archetype. If the taxonomy's best_response is empty, return an empty array and note that the taxonomy needs enrichment.
Posterior is a distribution, not a point estimate. When downstream agents consume the output, they should ideally consume the full posterior (and make expected-value decisions over it) rather than collapsing to MAP. Expose both.
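As a rough illustration of why a downstream consumer might prefer the full posterior, the sketch below compares an expected-value decision with one that would follow from the MAP alone. The payoff table, action names, and function names are entirely hypothetical; the skill itself does not define payoffs.

```python
def expected_values(posterior, payoffs):
    """payoffs[action][archetype] -> payoff if the opponent is that archetype."""
    return {
        action: sum(posterior[a] * by_archetype[a] for a in posterior)
        for action, by_archetype in payoffs.items()
    }


def best_action(posterior, payoffs):
    """Pick the action with the highest expected value over the posterior."""
    ev = expected_values(posterior, payoffs)
    return max(ev, key=ev.get), ev


# Hypothetical two-archetype posterior and payoff table.
posterior = {"punt_wins_qs": 0.6, "stars_and_scrubs": 0.4}
payoffs = {
    "commit_to_map": {"punt_wins_qs": 1.0, "stars_and_scrubs": -3.0},
    "hedge": {"punt_wins_qs": 0.5, "stars_and_scrubs": 0.5},
}
action, ev = best_action(posterior, payoffs)
print(action)  # prints hedge
```

Here the MAP archetype alone would favor `commit_to_map`, but the 0.4 mass on the alternative makes hedging the better expected-value choice under these made-up payoffs.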
Update sequentially as new observations arrive. The current week's posterior becomes next week's prior. See methodology.md for the recursive formula.
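A minimal sketch of the posterior-becomes-prior loop is shown below. It does not reproduce the recursive formula in methodology.md; it assumes each week contributes a fresh per-archetype log-likelihood from new observations (re-using old observations would double count them), and the function name `sequential_update` is hypothetical.

```python
import math


def sequential_update(prior, weekly_log_likelihoods):
    """Fold in one week of evidence at a time.

    prior: dict archetype -> probability (all > 0).
    weekly_log_likelihoods: list of dicts archetype -> log L(new features).
    Returns the posterior after each week.
    """
    log_post = {a: math.log(p) for a, p in prior.items()}
    history = []
    for log_l in weekly_log_likelihoods:
        # Last week's posterior acts as this week's prior.
        log_post = {a: lp + log_l[a] for a, lp in log_post.items()}
        # Normalize in log-space to avoid underflow.
        peak = max(log_post.values())
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in log_post.values()))
        log_post = {a: v - log_norm for a, v in log_post.items()}
        history.append({a: math.exp(v) for a, v in log_post.items()})
    return history


# Two weeks of identical, hypothetical evidence favoring archetype "a".
week = {"a": math.log(0.8), "b": math.log(0.2)}
for posterior in sequential_update({"a": 0.5, "b": 0.5}, [week, week]):
    print({k: round(v, 3) for k, v in posterior.items()})
```

Keeping the running state as a log-posterior means an archetype whose probability becomes tiny is never rounded to exactly zero between weeks.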
Feature contribution breakdown guards against overfitting. If one feature's likelihood ratio is > 10, that single feature is driving the classification -- verify the feature was measured correctly before trusting the result.
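The likelihood-ratio check described here could be sketched as follows. The function name `feature_contributions` and the extra `dominant` field are hypothetical additions for illustration; the three documented output fields are `map_likelihood`, `alternative_max`, and `likelihood_ratio`.

```python
import math


def feature_contributions(likelihoods, map_archetype, flag_above=10.0):
    """likelihoods[feature][archetype] -> L(feature | archetype)."""
    breakdown = {}
    for feature, by_archetype in likelihoods.items():
        map_l = by_archetype[map_archetype]
        alt_max = max(l for a, l in by_archetype.items() if a != map_archetype)
        ratio = map_l / alt_max if alt_max > 0 else math.inf
        breakdown[feature] = {
            "map_likelihood": map_l,
            "alternative_max": alt_max,
            "likelihood_ratio": ratio,
            # Hypothetical flag: one feature is driving the call.
            "dominant": ratio > flag_above,
        }
    return breakdown


# Hypothetical per-feature likelihoods for a MAP archetype "m".
result = feature_contributions(
    {
        "sp_roster_share": {"m": 0.92, "x": 0.15, "y": 0.10},
        "closer_count": {"m": 0.90, "x": 0.05, "y": 0.01},
    },
    "m",
)
for feature, row in result.items():
    print(feature, round(row["likelihood_ratio"], 1), row["dominant"])
```

A `dominant` feature is a prompt to re-check how that feature was measured, not a reason to discard the classification automatically.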
Refresh feature distributions with empirical data once available. Initial distributions are SME priors. After N opponents have been labelled and observed, fit the distributions empirically and replace the SME priors.
Key formulas:
Gaussian likelihood:
L(x | a) = (1 / (std_a * sqrt(2*pi))) * exp(-0.5 * ((x - mean_a) / std_a)^2)
Categorical likelihood (with Laplace smoothing, epsilon = 0.01):
L(x = c | a) = (count_a[c] + epsilon) / (sum_c' count_a[c'] + epsilon * num_categories)
Joint likelihood (conditional independence assumption):
L(features | a) = prod_f L(feature_f | a)
Log-space joint likelihood:
log L(features | a) = sum_f log L(feature_f | a)
Bayes posterior:
posterior(a) proportional to L(features | a) * prior(a)
posterior(a) = posterior_unnorm(a) / sum_a' posterior_unnorm(a')
MAP selection:
map_archetype = argmax_a posterior(a)
Classification confidence:
confidence = max_a posterior(a) * observation_weight * 100
if confidence < 40:
map_archetype = "inconclusive"
Sequential update (week t):
prior_t(a) = posterior_{t-1}(a)
posterior_t(a) proportional to L(new_features_t | a) * prior_t(a)
Feature contribution (likelihood ratio):
LR(feature_f) = L(feature_f | MAP) / max_{a != MAP} L(feature_f | a)
Confidence bands:
| Confidence | Interpretation | Action |
|------------|----------------|--------|
| 0-39 | Inconclusive | Gather more data |
| 40-59 | Weak MAP | Treat MAP as tentative; hedge downstream decisions |
| 60-79 | Solid MAP | Act on MAP but keep top-2 in mind |
| 80-100 | Confident MAP | Commit to MAP |
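A caller could map a confidence score onto these bands with a small lookup like the one below. This is a sketch: the function name `confidence_band` is hypothetical, and treating the bands as half-open intervals (so 39.9 is still inconclusive, matching the `confidence < 40` rule) is an assumption about non-integer scores.

```python
def confidence_band(confidence):
    """Return (interpretation, action) for a 0-100 confidence score."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be in [0, 100]")
    if confidence < 40:
        return "Inconclusive", "Gather more data"
    if confidence < 60:
        return "Weak MAP", "Treat MAP as tentative; hedge downstream decisions"
    if confidence < 80:
        return "Solid MAP", "Act on MAP but keep top-2 in mind"
    return "Confident MAP", "Commit to MAP"


print(confidence_band(47.6))
# prints ('Weak MAP', 'Treat MAP as tentative; hedge downstream decisions')
```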
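Key resources:
- resources/template.md: taxonomy template with required fields and the full example taxonomy
- methodology.md: observation_weight guidance, per-feature likelihood details, and the recursive formula for sequential updates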
Inputs required:
- archetype_taxonomy: dict of archetype_name -> {prior, feature_distributions, best_response}
- observed_features: dict of feature_name -> observed value
- observation_weight: 0-1, how much to trust the observation vs the prior
- archetype_prior (optional): dict of archetype_name -> prior; defaults to taxonomy priors (or uniform)
Outputs produced:
- posterior: dict<archetype, probability>, sums to 1
- map_archetype: string, most likely archetype (or "inconclusive")
- classification_confidence: 0-100
- best_response_hints: string[], pulled from the MAP archetype's documented best-response
- feature_contribution_breakdown: dict<feature, {map_likelihood, alternative_max, likelihood_ratio}>
- assumptions_flagged: string[], e.g., correlated features, smoothing applied, priors forced uniform