skills/functional-area-resolver/SKILL.md
Compress an agent's routing file (RESOLVER.md or AGENTS.md) by converting granular skill-per-row tables into functional-area dispatchers. Each area lists sub-skills in a "(dispatcher for: ...)" clause. The LLM reads one area entry and routes to the correct sub-skill. Proven via held-out A/B eval: dispatcher pattern outperforms naive pipe-table compression.
`npx skillsauth add garrytan/gbrain functional-area-resolver`

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Routing files (RESOLVER.md, AGENTS.md) grow as skills are added. Each skill gets its own row (trigger -> skill path). At ~200+ skills this hits 25-30KB, eating context budget that should go to actual work.
Replace N rows per area with one entry per functional area. Each entry
lists all sub-skills it can dispatch to in a (dispatcher for: ...) clause.
- Creating/enriching a person or company page -> `enrich`
- Fix broken citations in brain pages -> `citation-fixer`
- Publish/share a brain page as link -> `brain-publish`
- Generate PDF from brain page -> `brain-pdf`
- Read a book through lens of a problem -> `strategic-reading`
- Personalized book analysis -> `book-mirror`
- Brain integrity -> `brain-librarian`
...
- **Brain & knowledge**: create/enrich/search/export brain pages, filing,
citations, publishing, book analysis, strategic reading, concept synthesis,
archive mining -> `brain-ops` (dispatcher for: enrich, query, brain-pdf,
brain-publish, brain-export, brain-librarian, citation-fixer, book-mirror,
strategic-reading, concept-synthesis, archive-crawler, ...)
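The before/after above can be sketched as a small rendering step. This is a minimal illustration, not the skill's actual compressor: the function name `render_area_entry` and the shortened trigger list are assumptions made for the example.

```python
def render_area_entry(area, triggers, dispatcher, sub_skills):
    """Render one functional-area entry in the dispatcher format (hypothetical helper)."""
    trigger_text = ", ".join(triggers)
    skills_text = ", ".join(sub_skills)
    return (
        f"- **{area}**: {trigger_text} -> `{dispatcher}` "
        f"(dispatcher for: {skills_text})"
    )


# Three of the granular rows from the "before" listing: (trigger, skill).
rows = [
    ("Creating/enriching a person or company page", "enrich"),
    ("Fix broken citations in brain pages", "citation-fixer"),
    ("Publish/share a brain page as link", "brain-publish"),
]

# Collapse them into a single area entry; the trigger phrases are illustrative.
entry = render_area_entry(
    "Brain & knowledge",
    ["create/enrich brain pages", "citations", "publishing"],
    "brain-ops",
    [skill for _, skill in rows],
)
print(entry)
```

Every sub-skill name survives in the `(dispatcher for: ...)` clause, so the row count shrinks without losing the routing signal.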
The LLM doesn't need one row per sub-skill. It needs:
- The `(dispatcher for: ...)` list, which shows what's available.
- The area skill itself: once it reads `brain-ops/SKILL.md`, it has full routing detail.

This is a two-layer dispatch: the routing file routes to the area, and the area skill routes to the specific sub-skill. Each layer does one job well.
Three resolver architectures were tested across three Anthropic frontier models (Opus 4.7, Sonnet 4.6, Haiku 4.5) on real production AGENTS.md content: 20 hand-authored training fixtures plus 5 held-out blind fixtures, with n=3 seeded repeats per (fixture, variant). Two scoring rules were used: STRICT (predicted slug exactly equals expected) and LENIENT (predicted is in the same dispatcher area as expected). Both matter. LENIENT asks, in effect, whether the prediction lands in the expected area or anywhere in its `(dispatcher for: ...)` list. This is closer to production behavior: an agent that lands in `gmail` for an email intent succeeds even if the resolver entry said `executive-assistant`.

Training fixtures (LENIENT scoring):

| Variant | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 | Size |
|---|---|---|---|---|
| baseline (270 bullet rows) | 81.7% ± 7.2% | 86.7% ± 7.2% | 73.3% ± 7.2% | 25KB |
| functional-areas (this pattern) | 98.3% ± 7.2% | 100% ± 0% | 88.3% ± 7.2% | 13KB |
| resolver-of-resolvers (no dispatcher clause) | 63.3% ± 14.3% | 41.7% ± 7.2% | 65.0% ± 12.4% | 10KB |

Held-out fixtures:

| Variant | Opus 4.7 | Sonnet 4.6 | Haiku 4.5 |
|---|---|---|---|
| baseline | 100% ± 0% | 100% ± 0% | 100% ± 0% |
| functional-areas | 100% ± 0% | 100% ± 0% | 100% ± 0% |
| resolver-of-resolvers | 100% ± 0% | 73.3% ± 28.7% | 100% ± 0% |
Functional-areas BEATS baseline on training across all three models (+13 to +17pp) at roughly half the size (13KB vs 25KB). Held-out is saturated at 100% for both, so the two are within margin of error there.
The (dispatcher for: ...) clause is the load-bearing signal. resolver-of-resolvers strips that clause and collapses to 41.7% on Sonnet — the catastrophic failure case the original PR predicted, now observed.
The pattern works because the LLM can drill into the dispatcher list. Most "STRICT failures" are the LLM picking a more-specific sub-skill (gmail instead of executive-assistant). That's the pattern working as designed. STRICT scoring under-counts; LENIENT scoring reflects production agent behavior.
The pattern's value holds across model tiers. Compression gain (functional-areas vs baseline, training, LENIENT) is +17pp on Opus, +13pp on Sonnet, +15pp on Haiku. Sonnet shows the cleanest separation between functional-areas and resolver-of-resolvers (100% vs 41.7%), which suggests the model affects how much the dispatcher signal matters.
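The difference between the two scoring rules can be sketched in a few lines. This is an illustration under stated assumptions, not the harness's scorer: the `AREAS` map is hypothetical (it places `gmail` under `executive-assistant`, following the example in the text), and the function names are invented for the sketch.

```python
# Hypothetical area map: dispatcher slug -> sub-skills it dispatches to.
AREAS = {
    "executive-assistant": {"gmail"},
    "brain-ops": {"enrich", "query"},
}


def area_of(slug, areas):
    """Return the dispatcher area a slug belongs to, or None if unknown."""
    for dispatcher, members in areas.items():
        if slug == dispatcher or slug in members:
            return dispatcher
    return None


def strict_match(predicted, expected):
    """STRICT: the predicted slug must equal the expected slug exactly."""
    return predicted == expected


def lenient_match(predicted, expected, areas):
    """LENIENT: predicted and expected must fall in the same dispatcher area."""
    predicted_area = area_of(predicted, areas)
    return predicted_area is not None and predicted_area == area_of(expected, areas)


# The LLM picked the more specific sub-skill for an email intent.
print(strict_match("gmail", "executive-assistant"))          # False
print(lenient_match("gmail", "executive-assistant", AREAS))  # True
```

A STRICT miss that is a LENIENT hit is the "more-specific sub-skill" case described above; a prediction in a different area fails both rules.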
cd evals/functional-area-resolver
node harness.mjs --model opus # ~225 LLM calls, ~$1.70 at Opus pricing
node harness.mjs --model sonnet # ~$1.00
node harness.mjs --model haiku # ~$0.30
node rescore.mjs baseline-runs/2026-05-11-opus-4-7.jsonl # zero-cost re-score
Receipts (model, prompt_template_hash, fixtures_hash, harness_sha, ts):
evals/functional-area-resolver/baseline-runs/2026-05-11-{opus-4-7,sonnet-4-6,haiku-4-5}.jsonl.
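The call counts and costs quoted for the harness appear consistent with the eval design described earlier (25 fixtures, 3 variants, 3 repeats). The arithmetic below is a sanity check only; the per-call price is the approximate Opus figure quoted in the verification step of this document, and actual pricing may differ.

```python
fixtures = 20 + 5        # training + held-out fixtures
variants = 3             # baseline, functional-areas, resolver-of-resolvers
repeats = 3              # n=3 seeded repeats per (fixture, variant)
cost_per_call = 0.0076   # approximate Opus cost per call quoted in this document

calls_full = fixtures * variants * repeats   # full three-variant run
calls_single = fixtures * 1 * repeats        # one variant, e.g. your own edited file

print(calls_full, round(calls_full * cost_per_call, 2))      # 225 1.71
print(calls_single, round(calls_single * cost_per_call, 2))  # 75 0.57
```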
Without a dispatcher-aware prompt (one that tells the LLM how to read `(dispatcher for: ...)`), every compression variant collapses to ~30-60% on Opus. The dispatcher-aware prompt is in `evals/functional-area-resolver/harness-runner.ts:PROMPT_TEMPLATE`. Use it as the template for your agent's harness; without it, compression breaks.

The pattern is a static-prompt analog of hierarchical agent routing, a 2024-2025 research direction. In those terms, the `(dispatcher for: ...)` clause is the meta-agent's view collapsed into a single LLM pass.

The 2025-2026 literature has no published benchmark for static-prompt hierarchical routing (every published hierarchical scheme resolves the hierarchy at runtime via a second LLM call). Our finding, that the hierarchy can be inlined into a single-LLM-pass dispatcher list and retain routing accuracy, is the open contribution. See `evals/functional-area-resolver/README.md` for methodology details.
Refuse to compress if either gate fails:
- `git status` shows uncommitted changes to the routing file (the compressor's edit would entangle with whatever the user was doing).

If a user wants to override either gate, they ask explicitly with `--force`.
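The uncommitted-changes gate could be checked along these lines. This is a sketch, not the skill's implementation: the function names are hypothetical, and it assumes `git` is on the PATH and the routing file lives inside a git work tree. `git status --porcelain -- <path>` prints one line per changed path and nothing when the file is clean.

```python
import subprocess


def has_uncommitted_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` printed at least one entry."""
    return any(line.strip() for line in porcelain_output.splitlines())


def routing_file_is_dirty(path: str) -> bool:
    """Ask git whether the routing file has uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--", path],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails (e.g. not a repository)
    )
    return has_uncommitted_changes(result.stdout)


def may_compress(path: str, force: bool = False) -> bool:
    """Refuse when the file is dirty unless the user explicitly forced it."""
    return force or not routing_file_is_dirty(path)


# The parsing step can be exercised without a repository:
print(has_uncommitted_changes(" M AGENTS.md\n"))  # True
print(has_uncommitted_changes(""))                # False
```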
GBrain workspaces often have TWO routing files merged at runtime (per
src/core/check-resolvable.ts v0.31.7): skills/RESOLVER.md and a sibling
../AGENTS.md. Choose which to compress:
If the deployment uses only one routing file, this section is a no-op — compress that one.
Group skills by domain. Typical areas (adjust per deployment):
Each area entry follows this template:
- **{Area Name}**: {comma-separated trigger phrases} -> `{dispatcher-skill}`
(dispatcher for: {comma-separated sub-skill names})
Rules:
Gates and always-on entries (acknowledge, multi-user, entity-detector, etc.) stay as individual rows — they're checked on every message, not dispatched.
Run two gates before committing the compressed file. Do NOT commit if either fails.
Gate 1: Structural verification. Confirms your routing-eval.jsonl
fixtures still resolve to the right skills under the compressed routing file.
Run from the workspace whose routing file you just edited:
gbrain routing-eval --json
If accuracy on your fixtures drops below 95%, revert and tune the area entries before re-running.
Gate 2: LLM A/B verification on YOUR edited file. Confirms a frontier
LLM can still drill into the dispatcher list and reach sub-skills under
your specific compression. Requires a gbrain repo checkout because the
harness lives there. Copy your edited routing file into the harness's
variants directory, then invoke the harness with --variants pointing
at it:
# In your agent workspace, identify the routing file you just compressed.
EDITED=/path/to/your/AGENTS.md # or skills/RESOLVER.md, whichever you edited
# In your gbrain repo checkout:
cd /path/to/gbrain/evals/functional-area-resolver
TMP=$(mktemp -d)/variants && mkdir -p "$TMP"
cp "$EDITED" "$TMP/my-edit.md"
# Run the harness against your file (sequential, ~75 calls × $0.0076 ≈ $0.57 on Opus).
ANTHROPIC_API_KEY=... node harness.mjs --variants-dir "$TMP" --variants my-edit \
--model opus --parallel 3 --yes
The harness uses gbrain's bundled fixture set, so this verifies "did the LLM
land in the right sub-skill for routing intents the gbrain-bundled fixtures
cover" — a regression check on shared skills, not a full re-eval of YOUR
fixture set. For full eval coverage, mirror this skill's
fixtures.jsonl + fixtures-held-out.jsonl setup with intents specific
to your skills.
If the lenient (same-area) score on your variant drops below 95%, revert the compression and tune. Common causes:
- A sub-skill is missing from its area's `(dispatcher for: ...)` list.
- ASCII `->` vs Unicode `→` mismatch: the harness now accepts both, but earlier versions only matched Unicode. Pin gbrain to v0.32.3.0+.

Common false negatives on the harness eval (NOT bugs in your compression):
- The bundled fixtures expect gbrain skills such as `enrich`, `query`, `gmail`, and `executive-assistant`. If your routing file doesn't expose those skills at all, expect strict-scoring failures on those fixtures. Lenient scoring stays accurate for any sub-skill present in your `(dispatcher for: ...)` lists.

Show the user the proposed edit (or the actual git diff) and wait for explicit approval before staging. Same convention as `skills/book-mirror/SKILL.md`.
This skill guarantees:
- It refuses to compress when a pre-flight gate fails (unless the user passes `--force`).
- It runs `gbrain routing-eval --json` AND the gbrain-repo harness (`node harness.mjs --variants-dir <tmp> --variants my-edit`) before committing the compressed file.

The full behavior contract is documented in the body sections above; this section exists for the conformance test.
The compressed routing file follows the area-entry template documented in Step 4 ("Build the area entry format"). Each entry: ``- **{Area Name}**: {trigger phrases} -> `{dispatcher-skill}` (dispatcher for: {sub-skill list})``. The dispatcher arrow may be either ASCII `->` (default in this template) or Unicode `→` (used in some production deployments); the gbrain harness accepts both.
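A parser for this entry format, tolerant of both arrow styles, might look like the sketch below. The regular expression and the `parse_area_entry` name are assumptions for illustration; the harness's real matching logic is not shown in this document and may differ.

```python
import re

# Accepts ASCII "->" or Unicode "→" between the triggers and the dispatcher skill.
ENTRY_RE = re.compile(
    r"^- \*\*(?P<area>.+?)\*\*: (?P<triggers>.+?) (?:->|→) `(?P<dispatcher>[^`]+)` "
    r"\(dispatcher for: (?P<subs>[^)]*)\)$"
)


def parse_area_entry(text):
    """Parse one area entry (possibly wrapped over several lines) into a dict."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    match = ENTRY_RE.match(flat)
    if match is None:
        return None
    return {
        "area": match["area"],
        "triggers": [t.strip() for t in match["triggers"].split(",")],
        "dispatcher": match["dispatcher"],
        # A literal "..." placeholder, if present, is kept as-is.
        "sub_skills": [s.strip() for s in match["subs"].split(",") if s.strip()],
    }


ascii_entry = (
    "- **Brain & knowledge**: create/enrich brain pages, citations -> `brain-ops`\n"
    "  (dispatcher for: enrich, citation-fixer)"
)
unicode_entry = ascii_entry.replace("->", "→")

print(parse_area_entry(ascii_entry))
print(parse_area_entry(ascii_entry) == parse_area_entry(unicode_entry))
```

Normalizing whitespace before matching lets the same pattern handle entries wrapped across lines, as in the examples earlier in this document.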
Resolver-of-resolvers with pipe tables. Tested and failed (see eval table). The LLM picks area names from the table instead of drilling into sub-skills.
Removing sub-skill names. Without the (dispatcher for: ...) list,
the LLM can't route to specific sub-skills. The list is the routing signal.
Too few areas. Collapsing to <5 areas makes each area too broad. 12-15 areas is the sweet spot.
Too many areas. Defeats the purpose. If you have 50 areas, just keep individual rows.
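The area-count guidance above can be turned into a simple lint. This is a hedged sketch: the thresholds for "too few" (<5), the sweet spot (12-15), and "too many" (50) come from the text, while treating every other count as merely "acceptable" is an assumption of this example, as are the function names.

```python
def count_area_entries(routing_text: str) -> int:
    """Count dispatcher entries by their `(dispatcher for:` clause."""
    return routing_text.count("(dispatcher for:")


def classify_area_count(n_areas: int) -> str:
    """Classify an area count against the guidance in this document."""
    if n_areas < 5:
        return "too-few"      # each area becomes too broad
    if 12 <= n_areas <= 15:
        return "sweet-spot"
    if n_areas >= 50:
        return "too-many"     # just keep individual rows
    return "acceptable"       # assumption: counts in between are not flagged


sample = (
    "- **Brain & knowledge**: citations -> `brain-ops` (dispatcher for: enrich)\n"
    "- **Email**: inbox triage -> `executive-assistant` (dispatcher for: gmail)\n"
)
print(count_area_entries(sample), classify_area_count(count_area_entries(sample)))
```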
When adding a new skill:
- Add it to the matching area's `(dispatcher for: ...)` list.

When adding a new functional area:
- Re-run the eval (`evals/functional-area-resolver/`).

Renamed from `compress-agents-md` to `functional-area-resolver` pre-release; the contribution is the pattern, not the filename.