skills/transaction-categorizer/SKILL.md
Assigns a category and subcategory to a financial transaction by matching its raw description against a configurable taxonomy and rules table, falling back to LLM inference when no rule matches. Emits a normalized merchant name, category path, recurring flag, and confidence score, and proposes new rules from confirmed classifications. Use when categorizing bank, credit-card, or brokerage transactions, building or refining a category taxonomy, or when the user mentions transaction categorization, merchant normalization, expense classification, or category rules.
A transaction is just a description_raw string and a signed amount_cents. This skill turns that into a clean merchant, a category + subcategory, an is_recurring boolean candidate, and a confidence. It applies rules first (cheap, deterministic) and only falls back to LLM inference for the residual.
It also produces learned rules: when a confident classification matches a clear merchant pattern, it proposes a new rule for the rule table so future identical transactions match for free.
The caller provides:
- transactions: array of {id, description_raw, amount_cents, account_id, date, account_type}.
- taxonomy: the categories.json taxonomy block (top-level → subcategory list).
- rules: array of existing rules (see Rule format).
- account_type_hints (optional): when known, helps disambiguate (e.g., a deposit on a brokerage account is more likely dividends than salary).

Categorization Progress:
- [ ] Step 1: Normalize description_raw
- [ ] Step 2: Apply rules in priority order
- [ ] Step 3: Classify residual via LLM with taxonomy guard
- [ ] Step 4: Detect recurring candidates
- [ ] Step 5: Score confidence
- [ ] Step 6: Propose new rules from high-confidence matches
Build a description_normalized for matching only — never overwrite description_raw.
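- Strip processor and channel prefixes (SQ *, TST*, PAYPAL *, CKCD, POS DEBIT, ACH DEBIT).
- Strip trailing location, phone, and store-number tokens (PORTLAND OR, 800-555-1234 CA, #1234).
- Example: SQ *TRADER JOES #123 PORTLAND OR → TRADER JOES.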
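The normalization step can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `normalize_description`, the prefix tuple, and the abbreviated `STATE_CODES` set are assumptions built from the examples in the text.

```python
import re

# Assumed prefix list, copied from the examples in the text; extend as needed.
PROCESSOR_PREFIXES = ("SQ *", "TST*", "PAYPAL *", "CKCD", "POS DEBIT", "ACH DEBIT")
# Assumed and abbreviated: a trailing token counts as a state only if listed here.
STATE_CODES = {"OR", "CA", "WA", "NY", "TX"}


def normalize_description(description_raw: str) -> str:
    """Return a matching key; the caller keeps description_raw unchanged."""
    text = " ".join(description_raw.upper().split())
    # Strip leading processor prefixes, repeating in case several are stacked.
    stripped = True
    while stripped:
        stripped = False
        for prefix in PROCESSOR_PREFIXES:
            if text.startswith(prefix):
                text = text[len(prefix):].lstrip()
                stripped = True
    # Drop phone numbers (800-555-1234) and store numbers (#1234).
    text = re.sub(r"\b\d{3}-\d{3}-\d{4}\b", " ", text)
    text = re.sub(r"#\s*\d+", " ", text)
    tokens = text.split()
    # Drop a trailing "CITY ST" pair, or a lone trailing state code.
    if len(tokens) >= 3 and tokens[-1] in STATE_CODES:
        tokens = tokens[:-2]
    elif len(tokens) >= 2 and tokens[-1] in STATE_CODES:
        tokens = tokens[:-1]
    return " ".join(tokens)


print(normalize_description("SQ *TRADER JOES #123 PORTLAND OR"))  # TRADER JOES
```

The sketch only handles single-word city names; a multi-word city such as SAN FRANCISCO would leave a stray token and needs a richer location list.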
For each transaction, walk rules in priority order. First rule whose match substring (case-insensitive) is a substring of description_normalized wins. Apply its merchant, category, subcategory, and is_recurring (if set).
If multiple rules match, the most specific (longest match) wins.
If a rule matches, set source: "rule" and confidence: 1.0 (0.92 when the match string is only 4–7 characters long, per the confidence table).
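A possible shape for the rule lookup is sketched below. The text mentions priority order, longest match, and priority as a tiebreaker; this sketch assumes the longest match wins and that a higher priority breaks ties between equally long matches. The function name `match_rule` and the sample rules are hypothetical.

```python
from typing import Optional


def match_rule(description_normalized: str, rules: list[dict]) -> Optional[dict]:
    """Return the winning rule for a normalized description, or None."""
    haystack = description_normalized.upper()
    # Case-insensitive substring test; empty match strings are ignored.
    candidates = [r for r in rules if r["match"] and r["match"].upper() in haystack]
    if not candidates:
        return None
    # Assumed ordering: longest match first, then higher priority.
    return max(candidates, key=lambda r: (len(r["match"]), r.get("priority", 100)))


# Hypothetical rules for illustration only.
rules = [
    {"match": "TRADER", "merchant": "Trader (generic)", "category": "food",
     "subcategory": "groceries", "priority": 200},
    {"match": "TRADER JOE", "merchant": "Trader Joe's", "category": "food",
     "subcategory": "groceries", "priority": 100},
]
winner = match_rule("TRADER JOES", rules)
print(winner["merchant"])  # Trader Joe's
```

A transaction for which `match_rule` returns None is part of the residual that goes on to LLM classification.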
For unmatched transactions, classify via LLM:
- Choose only from the supplied taxonomy; never invent a category.
- Return the result as category.subcategory (e.g., food.groceries).
- Return a clean merchant name.
- For positive amounts, consider the income.* branch first.
- If account_type is brokerage or 401k and amount is positive, prefer income.dividends, income.interest_earned, or savings_investment.*.
- If description_raw looks like an internal transfer between two of the user's accounts, classify as financial.transfers_internal.

Set source: "llm" and confidence: 0.6–0.9 based on signal strength.
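One way to enforce the taxonomy guard is to validate whatever the LLM returns before accepting it. The sketch below assumes the LLM result is already parsed into a dict with category and subcategory keys; the function name `guard_llm_result` is hypothetical, and the 0.30 fallback confidence is borrowed from the confidence table.

```python
def guard_llm_result(result: dict, taxonomy: dict[str, list[str]]) -> tuple[dict, list[dict]]:
    """Accept an LLM classification only if it names a real taxonomy leaf."""
    warnings = []
    category = result.get("category")
    subcategory = result.get("subcategory")
    if subcategory in taxonomy.get(category, []):
        guarded = {**result, "source": "llm"}
    else:
        # Invented or misspelled category: fall back and record a taxonomy gap.
        guarded = {**result, "category": "uncategorized", "subcategory": "unknown",
                   "source": "uncategorized", "confidence": 0.30}
        warnings.append({"tx_id": result.get("id"), "type": "taxonomy_gap"})
    return guarded, warnings


# Abbreviated taxonomy for illustration.
taxonomy = {"food": ["groceries", "coffee"], "uncategorized": ["unknown"]}
ok, ok_warnings = guard_llm_result(
    {"id": "t1", "category": "food", "subcategory": "coffee", "confidence": 0.85}, taxonomy)
bad, bad_warnings = guard_llm_result(
    {"id": "t2", "category": "pets", "subcategory": "vet", "confidence": 0.80}, taxonomy)
```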
Set is_recurring: true candidate if:
- … is_recurring: true.

This is a candidate only; promotion to recurring.json is the recurring-charge-detector skill's job.
| Source | Default confidence |
|---|---|
| Rule match (substring length ≥ 8) | 1.00 |
| Rule match (substring length 4–7) | 0.92 |
| LLM with strong taxonomic signal (e.g., "NETFLIX" → entertainment.streaming) | 0.85 |
| LLM with weak signal | 0.65 |
| Cannot classify above uncategorized | 0.30 |
If confidence < 0.5, mark category: "uncategorized.unknown" and flag for review.
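The table and the review threshold can be expressed as two small functions. This is a sketch under assumptions: the table does not say what to do with match strings shorter than 4 characters, so they are scored like the 4–7 bucket here, and `needs_review` is a hypothetical field name for the review flag.

```python
def score_confidence(source: str, match_length: int = 0, strong_signal: bool = False) -> float:
    """Map a classification source to the default confidence from the table."""
    if source == "rule":
        # Assumption: anything shorter than 8 characters gets the 4-7 bucket score.
        return 1.00 if match_length >= 8 else 0.92
    if source == "llm":
        return 0.85 if strong_signal else 0.65
    return 0.30


def apply_review_threshold(tx: dict) -> dict:
    """Downgrade low-confidence results to uncategorized.unknown and flag them."""
    if tx["confidence"] < 0.5:
        return {**tx, "category": "uncategorized", "subcategory": "unknown",
                "needs_review": True}  # hypothetical flag name
    return tx


print(score_confidence("rule", match_length=len("TRADER JOE")))  # 1.0
```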
After classification, scan high-confidence LLM matches (confidence ≥ 0.85) where the same description_normalized substring covers ≥ 3 transactions in the input set. For each, propose a new rule and append it to rules_proposed[] in the output. The bookkeeper agent confirms these before they merge into categories.json.
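Rule proposal amounts to grouping confident LLM results and keeping the groups with enough evidence. The sketch below simplifies "same substring" to "same full description_normalized value" and assumes each classified transaction still carries that field; `propose_rules` and the sample data are hypothetical.

```python
from collections import defaultdict


def propose_rules(categorized: list[dict], min_confidence: float = 0.85,
                  min_evidence: int = 3) -> list[dict]:
    """Group confident LLM classifications and propose a rule per frequent group."""
    groups = defaultdict(list)
    for tx in categorized:
        if tx["source"] == "llm" and tx["confidence"] >= min_confidence:
            key = (tx["description_normalized"], tx["merchant"],
                   tx["category"], tx["subcategory"])
            groups[key].append(tx["id"])
    return [
        {"match": match, "merchant": merchant, "category": category,
         "subcategory": subcategory, "evidence_count": len(tx_ids),
         "evidence_tx_ids": tx_ids}
        for (match, merchant, category, subcategory), tx_ids in groups.items()
        if len(tx_ids) >= min_evidence
    ]


def _tx(tx_id: str, desc: str, merchant: str, confidence: float) -> dict:
    # Helper for building illustrative input only.
    return {"id": tx_id, "description_normalized": desc, "merchant": merchant,
            "category": "food", "subcategory": "coffee",
            "source": "llm", "confidence": confidence}


sample = [
    _tx("tx1", "BLUE BOTTLE", "Blue Bottle Coffee", 0.85),
    _tx("tx2", "BLUE BOTTLE", "Blue Bottle Coffee", 0.85),
    _tx("tx3", "BLUE BOTTLE", "Blue Bottle Coffee", 0.85),
    _tx("tx4", "CORNER CAFE", "Corner Cafe", 0.65),
]
proposals = propose_rules(sample)
```

Proposals are only suggestions; nothing in the sketch writes them into the rule table.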
The skill respects the taxonomy supplied by the caller. The default taxonomy used by the household-finance team is:
housing → mortgage, rent, property_tax, hoa, home_insurance, home_maintenance,
  utilities_electric, utilities_gas, utilities_water, utilities_internet
food → groceries, restaurants, coffee, alcohol
transportation → gas, auto_insurance, auto_maintenance, public_transit, rideshare,
  parking, tolls
health → medical_copay, prescriptions, dental, vision, mental_health, gym
personal → clothing, haircare, subscriptions_personal
kids → childcare, school, activities, kids_clothing
entertainment → streaming, events, hobbies, books
travel → flights, lodging, travel_food, travel_other
financial → fees, interest_paid, transfers_internal
income → salary, bonus, interest_earned, dividends, capital_gains, refund, other_income
savings_investment → 401k_contribution, ira_contribution, hsa_contribution,
  brokerage_deposit, savings_deposit
uncategorized → unknown
Never invent a category. If a transaction does not fit, use uncategorized.unknown and emit a taxonomy_gap warning.
{
  "match": "TRADER JOE",
  "merchant": "Trader Joe's",
  "category": "food",
  "subcategory": "groceries",
  "is_recurring": false,
  "priority": 100,
  "added_on": "2026-01-20",
  "source": "user_confirmed | learned"
}
Higher priority values win ties. Rules added by humans default to priority 200; rules learned by this skill default to 100.
Every output transaction carries:
- category and subcategory: must be in taxonomy.
- merchant: clean display name.
- confidence: [0.0, 1.0].
- source: rule | llm | uncategorized.
- matched_rule_id (if source: rule).

Never overwrite description_raw; always preserve it for re-classification.
{
  "categorized": [
    {
      "id": "tx_20260115_001",
      "merchant": "Trader Joe's",
      "category": "food",
      "subcategory": "groceries",
      "is_recurring_candidate": false,
      "confidence": 1.0,
      "source": "rule",
      "matched_rule_id": "rule_trader_joes"
    }
  ],
  "rules_proposed": [
    {
      "match": "BLUE BOTTLE",
      "merchant": "Blue Bottle Coffee",
      "category": "food",
      "subcategory": "coffee",
      "evidence_count": 4,
      "evidence_tx_ids": ["tx_20260103_004", "tx_20260110_002", "tx_20260117_007", "tx_20260124_001"]
    }
  ],
  "warnings": [
    { "tx_id": "tx_20260118_009", "type": "taxonomy_gap", "description_raw": "ZELLE TO M COPPENS" }
  ],
  "summary": {
    "total": 142,
    "rule_matched": 118,
    "llm_classified": 22,
    "uncategorized": 2,
    "uncategorized_pct": 1.4
  }
}
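The summary block can be derived from the categorized list rather than tracked separately. A minimal sketch, assuming every categorized entry carries a source of rule, llm, or uncategorized; the function name `summarize` is hypothetical.

```python
from collections import Counter


def summarize(categorized: list[dict]) -> dict:
    """Build the summary block by counting transactions per source."""
    counts = Counter(tx["source"] for tx in categorized)
    total = len(categorized)
    uncategorized = counts["uncategorized"]
    return {
        "total": total,
        "rule_matched": counts["rule"],
        "llm_classified": counts["llm"],
        "uncategorized": uncategorized,
        # Percentage rounded to one decimal place; 0.0 for an empty batch.
        "uncategorized_pct": round(100 * uncategorized / total, 1) if total else 0.0,
    }


# Same proportions as the example output: 118 rule, 22 LLM, 2 uncategorized.
batch = ([{"source": "rule"}] * 118 + [{"source": "llm"}] * 22
         + [{"source": "uncategorized"}] * 2)
print(summarize(batch)["uncategorized_pct"])  # 1.4
```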
- Preserve description_raw byte-for-byte. Normalization is for matching only.
- If financial.transfers_internal is classified on one side, the matching opposite-sign transaction on the other account should also be transfers_internal; flag if not.
- A proposed rule such as match: "ZELLE TO JOHN SMITH" exposes a name; redact or skip such proposals.
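The transfer-pairing check can be sketched as a scan for internal transfers that lack a correctly classified counterpart. The 3-day matching window, the ISO date format, and the function name `find_unpaired_transfers` are assumptions; the text does not specify how the opposite-sign transaction is located.

```python
from datetime import date


def find_unpaired_transfers(txs: list[dict], window_days: int = 3) -> list[str]:
    """Return ids of internal transfers with no matching transfers_internal counterpart."""
    flagged = []
    for tx in txs:
        if tx["subcategory"] != "transfers_internal":
            continue
        tx_date = date.fromisoformat(tx["date"])
        # Counterpart: other account, opposite sign, same size, close in time,
        # and itself classified as an internal transfer.
        counterpart_ok = any(
            other["account_id"] != tx["account_id"]
            and other["amount_cents"] == -tx["amount_cents"]
            and abs((date.fromisoformat(other["date"]) - tx_date).days) <= window_days
            and other["subcategory"] == "transfers_internal"
            for other in txs
        )
        if not counterpart_ok:
            flagged.append(tx["id"])
    return flagged


# Illustrative data: a/b pair up correctly, c's counterpart d is misclassified.
txs = [
    {"id": "a", "account_id": "chk", "amount_cents": -50000, "date": "2026-01-15",
     "subcategory": "transfers_internal"},
    {"id": "b", "account_id": "sav", "amount_cents": 50000, "date": "2026-01-16",
     "subcategory": "transfers_internal"},
    {"id": "c", "account_id": "chk", "amount_cents": -20000, "date": "2026-01-20",
     "subcategory": "transfers_internal"},
    {"id": "d", "account_id": "sav", "amount_cents": 20000, "date": "2026-01-20",
     "subcategory": "savings_deposit"},
]
print(find_unpaired_transfers(txs))  # ['c']
```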