schema-author — evolve your schema pack

Non-goals (use these other skills instead)

This skill AUTHORS the schema pack (adds page types, link verbs, prefixes, flags). For these adjacent jobs, route elsewhere:

Filing one specific page → skills/brain-taxonomist/SKILL.md. Brain- taxonomist routes at WRITE TIME ("where does this note go?"). schema-author changes the rules at AUTHORING TIME ("what types and prefixes exist?").
Schema-check as part of EIIRP iteration → skills/eiirp/SKILL.md already has a schema-check phase. Don't duplicate.
Just looking up a type's settings → gbrain schema explain <type> directly. This skill is for CHANGING the pack, not READING from it.
Querying who knows about X → skills/expert-routing/SKILL.md (or gbrain whoknows directly). schema-author makes a type expert-routable; it does not run the query.

Convention

Convention: see conventions/brain-first.md for the lookup chain (search → query → get_page → external).

Convention: see conventions/schema-evolution.md for "when to add a type vs alias vs prefix" — the heuristic.

When to invoke

Invoke when the user (or a sibling skill) says any of:

"Add a researcher type to my schema"
"I have 4000 untyped pages under meetings/"
"My brain doesn't know that journal-article is a type"
"Set paper to be extractable"
"Propose types from what I've ingested"
"Sync the new types to backfill existing pages"

DON'T invoke for "where does THIS note go" (use brain-taxonomist) or "who knows about X" (use expert-routing / gbrain whoknows).

Tutorial + vision

Why this matters: docs/what-schemas-unlock.md — 7 killer use cases (4000 invisible meetings made queryable, founder ops brain, research brain, legal brain, team brain, agent-as-co-curator) plus the structural argument for why types matter at query time. Read this before pitching schema authoring to a user — it's the doc that explains the difference between a pile of notes and a brain with structure.
5-minute walkthrough: docs/schema-author-tutorial.md — fork the bundled pack, add a researcher type, sync, prove the T1.5 wiring via gbrain whoknows. Use placeholder pages so it runs against any brain without affecting real content.

Workflow

Phase 1 — Brain (know which pack is active)

gbrain schema active --json

Output gives you pack_name, version, sha8, page_types_count, source_tier. If source_tier === "default", the user is on bundled gbrain-base and any mutation will need a fork first (Phase 4).

Phase 2 — Assess (what does the current pack cover?)

gbrain schema stats --json

Returns per-type page counts, untyped count, and dead_prefixes (pack- declared prefixes with zero matching pages — probable mis-declarations). If coverage < 90%, there's untyped content worth typing.

gbrain schema review-orphans --limit 50 --json

Untyped pages drilldown. Look for shared path prefixes (e.g. "12 of these are under research/papers/") — those are candidates for a new type.

Phase 3 — Propose (what types should the pack add?)

gbrain schema detect --json

Clusters pages by source_path and proposes candidate types. Heuristic only (no LLM call).

gbrain schema suggest --json

LLM-refined candidates with confidence scores. Use the top-3 hit rate as the signal for which to promote.

Phase 4 — Apply (mutate the pack)

If the active pack is bundled (gbrain-base or gbrain-recommended), fork it first:

gbrain schema fork gbrain-base mine
gbrain schema use mine

Then add the types one at a time:

gbrain schema add-type researcher \
  --primitive entity \
  --prefix people/researchers/ \
  --extractable \
  --expert

For complex multi-mutation refactors (e.g. add a type AND the link verb that points to it), agents reaching this surface over MCP can use the batched schema_apply_mutations op:

{"op": "add_type", "name": "researcher", "primitive": "entity", "prefix": "people/researchers/", "extractable": true, "expert_routing": true}
{"op": "add_type", "name": "paper", "primitive": "annotation", "prefix": "research/papers/", "extractable": true}
{"op": "add_link_type", "name": "authored", "inference": {"page_type": "researcher", "target_type": "paper"}}

Validate before sync:

gbrain schema lint --with-db

The --with-db flag opts into the 2 DB-aware rules (extractable_empty_corpus, mutation_count_anomaly) that detect mis-declared types you'd otherwise discover only at runtime.

Phase 5 — Sync (backfill existing pages with the new types)

Dry-run first:

gbrain schema sync --json

Returns per-prefix would_apply counts + sample slugs. If the numbers look right:

gbrain schema sync --apply

Chunked UPDATE in 1000-row batches; never wedges concurrent writers. Idempotent on re-run (second --apply finds nothing to backfill).

Phase 6 — Verify

gbrain schema stats --json

Coverage should be ≥95% now. Spot-check the new type:

gbrain whoknows "machine learning"

If researcher was declared --expert, results should include researcher-typed pages. (The pack-aware wiring at the query path was added in v0.40.6.0 — pre-v0.40.6 brains silently ignored custom expert-routed types.)

Phase 7 — Commit (preserve the change)

If the pack is in source control, commit:

cd ~/.gbrain/schema-packs/mine
git add pack.json
git commit -m "schema: add researcher + paper types + authored link"
git push

If the brain daemon is running (gbrain serve --http), other processes pick up the change within 1 second (stat-mtime TTL gate in loadActivePack — v0.40.6.0 closed the cross-process invalidation gap).

Outputs

Mutated pack file at ~/.gbrain/schema-packs/<name>/pack.{json,yaml}.
Audit row in ~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl per mutation.
pages.type backfilled on matching rows after sync --apply.
Query paths (whoknows, find_experts) now route through the new expert types.

Contract

Inputs: a natural-language request that names a type / prefix / link verb / flag change, OR the result of gbrain schema review-orphans showing untyped pages that need a new type.
Outputs: mutated pack file at ~/.gbrain/schema-packs/<name>/pack.{json,yaml} + an audit row in ~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl + (if sync --apply ran) backfilled pages.type on matching rows.
Side effects: invalidates the in-process pack cache + the query cache for the source. Other processes pick up the change within 1 second (stat-mtime TTL).
Idempotency: every primitive is idempotent. add-alias/add-prefix no-op on duplicate; sync --apply finds nothing to update on second run.
Trust: CLI = local trust (no scope check). MCP = OAuth admin scope (write ops). Audit log captures actor: mcp:<clientId8> per mutation.
Atomicity: every mutation is wrapped in withMutation's atomic write (.tmp + fsync + rename) + per-pack O_CREAT|O_EXCL lock. Crash mid-write leaves the original file untouched.

Anti-Patterns

Don't mutate gbrain-base or gbrain-recommended. Fork first (gbrain schema fork gbrain-base mine). These are bundled packs; edits would be lost on upgrade. The mutation primitives refuse with PACK_READONLY.
Don't add a type for a directory you imported once for triage. Pack types are permanent decisions; one-time imports are not. See skills/conventions/schema-evolution.md for the <20-pages-don't-pack-codify heuristic.
Don't add --expert to a type with no path_prefixes. The expert_routing_without_prefix lint warns about this — expert-routed types with no prefix never match a put_page inference, so whoknows silently never surfaces them.
Don't promote a schema suggest candidate without verifying the prefix matches real content. Run lint --with-db before add-type to catch prefix collisions pre-write.
Don't conflate "filing one page" with "evolving the schema." Filing routes via brain-taxonomist; schema-author is for authoring the type taxonomy itself. The Non-goals section above names the boundary.
Don't skip the dry-run before sync --apply. Always run sync first to see would_apply counts + sample slugs. A pack prefix that matches 50,000 pages is recoverable but slow; verifying first is cheap.
Don't remove a type without checking references. remove-type refuses with STILL_REFERENCED if another type's aliases / enrichable_types / link_types / frontmatter_links references it. Break the references first; don't add --force.

Output Format

When invoked, this skill produces structured output suitable for both human + JSON consumption:

Per-mutation result (JSON):

{"schema_version": 1, "pack": "mine", "path": "/Users/.../pack.json", "format": "json", "prev_sha8": "a1b2c3d4", "new_sha8": "e5f6g7h8"}

Per-batch result (from schema_apply_mutations MCP op):

{"schema_version": 1, "pack": "mine", "batch_id": "batch-1716491400-abc123", "mutations_applied": 3, "results": [{...}, {...}, {...}]}

Stats JSON (per-source + aggregate + dead-prefix hints):

{"schema_version": 1, "pack_identity": "[email protected]+abc12345", "aggregate": {"total_pages": 4823, "typed_pages": 4710, "untyped_pages": 113, "coverage": 0.9766, "by_type": [{"type": "person", "count": 2104}, ...]}, "per_source": [...], "dead_prefixes": [{"type": "researcher", "prefix": "people/researchers/"}]}

Sync dry-run JSON:

{"schema_version": 1, "apply": false, "pack_identity": "[email protected]+abc12345", "per_prefix": [{"type": "meeting", "prefix": "meetings/", "would_apply": 4000, "sample_slugs": ["meetings/2026-01-01-foo", ...], "dead_prefix": false, "applied": 0}], "total_would_apply": 4000, "total_applied": 0}

Human output (the agent's final summary):

One line per mutation: Pack: <name> (<format>) and Sha8: <prev> → <new>
Stats: total pages, typed %, untyped count, per-type breakdown, dead-prefix list
Sync: per-prefix would_apply/applied count + sample slugs in dry-run mode

On failure, the error envelope follows the standard StructuredAgentError shape from src/core/errors.ts: {error, code, message, details?}. Codes from the mutation primitives: PACK_NOT_FOUND, PACK_READONLY, PACK_CORRUPT, TYPE_EXISTS, TYPE_NOT_FOUND, INVALID_PRIMITIVE, INVALID_RESULT, IO_ERROR, STILL_REFERENCED, LOCK_BUSY.

Failure modes

PACK_READONLY → you tried to mutate gbrain-base or gbrain-recommended. Fork first.
INVALID_RESULT → the mutation would create a dangling reference or prefix collision. The pre-write lint gate caught it. Read the error message; the lint rule name names the problem.
STILL_REFERENCED → you tried to remove a type that another type's aliases / enrichable_types / link_types / frontmatter_links references. The error names every reference. Remove those first.
LOCK_BUSY → another process is mid-mutation. Wait 30s and retry, or pass --force if you know the holder is wedged.
permission_denied (MCP only) → your OAuth client doesn't have admin scope. Re-register with gbrain auth register-client --scopes admin.

schema-author — evolve your schema pack

Non-goals (use these other skills instead)

This skill AUTHORS the schema pack (adds page types, link verbs, prefixes, flags). For these adjacent jobs, route elsewhere:

Filing one specific page → skills/brain-taxonomist/SKILL.md. Brain- taxonomist routes at WRITE TIME ("where does this note go?"). schema-author changes the rules at AUTHORING TIME ("what types and prefixes exist?").
Schema-check as part of EIIRP iteration → skills/eiirp/SKILL.md already has a schema-check phase. Don't duplicate.
Just looking up a type's settings → gbrain schema explain <type> directly. This skill is for CHANGING the pack, not READING from it.
Querying who knows about X → skills/expert-routing/SKILL.md (or gbrain whoknows directly). schema-author makes a type expert-routable; it does not run the query.

Convention

Convention: see conventions/brain-first.md for the lookup chain (search → query → get_page → external).

Convention: see conventions/schema-evolution.md for "when to add a type vs alias vs prefix" — the heuristic.

When to invoke

Invoke when the user (or a sibling skill) says any of:

"Add a researcher type to my schema"
"I have 4000 untyped pages under meetings/"
"My brain doesn't know that journal-article is a type"
"Set paper to be extractable"
"Propose types from what I've ingested"
"Sync the new types to backfill existing pages"

DON'T invoke for "where does THIS note go" (use brain-taxonomist) or "who knows about X" (use expert-routing / gbrain whoknows).

Tutorial + vision

Why this matters: docs/what-schemas-unlock.md — 7 killer use cases (4000 invisible meetings made queryable, founder ops brain, research brain, legal brain, team brain, agent-as-co-curator) plus the structural argument for why types matter at query time. Read this before pitching schema authoring to a user — it's the doc that explains the difference between a pile of notes and a brain with structure.
5-minute walkthrough: docs/schema-author-tutorial.md — fork the bundled pack, add a researcher type, sync, prove the T1.5 wiring via gbrain whoknows. Use placeholder pages so it runs against any brain without affecting real content.

Workflow

Phase 1 — Brain (know which pack is active)

gbrain schema active --json

Phase 2 — Assess (what does the current pack cover?)

gbrain schema stats --json

gbrain schema review-orphans --limit 50 --json

Untyped pages drilldown. Look for shared path prefixes (e.g. "12 of these are under research/papers/") — those are candidates for a new type.

Phase 3 — Propose (what types should the pack add?)

gbrain schema detect --json

Clusters pages by source_path and proposes candidate types. Heuristic only (no LLM call).

gbrain schema suggest --json

LLM-refined candidates with confidence scores. Use the top-3 hit rate as the signal for which to promote.

Phase 4 — Apply (mutate the pack)

If the active pack is bundled (gbrain-base or gbrain-recommended), fork it first:

gbrain schema fork gbrain-base mine
gbrain schema use mine

Then add the types one at a time:

gbrain schema add-type researcher \
  --primitive entity \
  --prefix people/researchers/ \
  --extractable \
  --expert

For complex multi-mutation refactors (e.g. add a type AND the link verb that points to it), agents reaching this surface over MCP can use the batched schema_apply_mutations op:

{"op": "add_type", "name": "researcher", "primitive": "entity", "prefix": "people/researchers/", "extractable": true, "expert_routing": true}
{"op": "add_type", "name": "paper", "primitive": "annotation", "prefix": "research/papers/", "extractable": true}
{"op": "add_link_type", "name": "authored", "inference": {"page_type": "researcher", "target_type": "paper"}}

Validate before sync:

gbrain schema lint --with-db

The --with-db flag opts into the 2 DB-aware rules (extractable_empty_corpus, mutation_count_anomaly) that detect mis-declared types you'd otherwise discover only at runtime.

Phase 5 — Sync (backfill existing pages with the new types)

Dry-run first:

gbrain schema sync --json

Returns per-prefix would_apply counts + sample slugs. If the numbers look right:

gbrain schema sync --apply

Chunked UPDATE in 1000-row batches; never wedges concurrent writers. Idempotent on re-run (second --apply finds nothing to backfill).

Phase 6 — Verify

gbrain schema stats --json

Coverage should be ≥95% now. Spot-check the new type:

gbrain whoknows "machine learning"

Phase 7 — Commit (preserve the change)

If the pack is in source control, commit:

cd ~/.gbrain/schema-packs/mine
git add pack.json
git commit -m "schema: add researcher + paper types + authored link"
git push

Outputs

Mutated pack file at ~/.gbrain/schema-packs/<name>/pack.{json,yaml}.
Audit row in ~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl per mutation.
pages.type backfilled on matching rows after sync --apply.
Query paths (whoknows, find_experts) now route through the new expert types.

Contract

Inputs: a natural-language request that names a type / prefix / link verb / flag change, OR the result of gbrain schema review-orphans showing untyped pages that need a new type.
Outputs: mutated pack file at ~/.gbrain/schema-packs/<name>/pack.{json,yaml} + an audit row in ~/.gbrain/audit/schema-mutations-YYYY-Www.jsonl + (if sync --apply ran) backfilled pages.type on matching rows.
Side effects: invalidates the in-process pack cache + the query cache for the source. Other processes pick up the change within 1 second (stat-mtime TTL).
Idempotency: every primitive is idempotent. add-alias/add-prefix no-op on duplicate; sync --apply finds nothing to update on second run.
Trust: CLI = local trust (no scope check). MCP = OAuth admin scope (write ops). Audit log captures actor: mcp:<clientId8> per mutation.
Atomicity: every mutation is wrapped in withMutation's atomic write (.tmp + fsync + rename) + per-pack O_CREAT|O_EXCL lock. Crash mid-write leaves the original file untouched.

Anti-Patterns

Don't mutate gbrain-base or gbrain-recommended. Fork first (gbrain schema fork gbrain-base mine). These are bundled packs; edits would be lost on upgrade. The mutation primitives refuse with PACK_READONLY.
Don't add a type for a directory you imported once for triage. Pack types are permanent decisions; one-time imports are not. See skills/conventions/schema-evolution.md for the <20-pages-don't-pack-codify heuristic.
Don't add --expert to a type with no path_prefixes. The expert_routing_without_prefix lint warns about this — expert-routed types with no prefix never match a put_page inference, so whoknows silently never surfaces them.
Don't promote a schema suggest candidate without verifying the prefix matches real content. Run lint --with-db before add-type to catch prefix collisions pre-write.
Don't conflate "filing one page" with "evolving the schema." Filing routes via brain-taxonomist; schema-author is for authoring the type taxonomy itself. The Non-goals section above names the boundary.
Don't skip the dry-run before sync --apply. Always run sync first to see would_apply counts + sample slugs. A pack prefix that matches 50,000 pages is recoverable but slow; verifying first is cheap.
Don't remove a type without checking references. remove-type refuses with STILL_REFERENCED if another type's aliases / enrichable_types / link_types / frontmatter_links references it. Break the references first; don't add --force.

Output Format

When invoked, this skill produces structured output suitable for both human + JSON consumption:

Per-mutation result (JSON):

{"schema_version": 1, "pack": "mine", "path": "/Users/.../pack.json", "format": "json", "prev_sha8": "a1b2c3d4", "new_sha8": "e5f6g7h8"}

Per-batch result (from schema_apply_mutations MCP op):

{"schema_version": 1, "pack": "mine", "batch_id": "batch-1716491400-abc123", "mutations_applied": 3, "results": [{...}, {...}, {...}]}

Stats JSON (per-source + aggregate + dead-prefix hints):

{"schema_version": 1, "pack_identity": "[email protected]+abc12345", "aggregate": {"total_pages": 4823, "typed_pages": 4710, "untyped_pages": 113, "coverage": 0.9766, "by_type": [{"type": "person", "count": 2104}, ...]}, "per_source": [...], "dead_prefixes": [{"type": "researcher", "prefix": "people/researchers/"}]}

Sync dry-run JSON:

{"schema_version": 1, "apply": false, "pack_identity": "[email protected]+abc12345", "per_prefix": [{"type": "meeting", "prefix": "meetings/", "would_apply": 4000, "sample_slugs": ["meetings/2026-01-01-foo", ...], "dead_prefix": false, "applied": 0}], "total_would_apply": 4000, "total_applied": 0}

Human output (the agent's final summary):

One line per mutation: Pack: <name> (<format>) and Sha8: <prev> → <new>
Stats: total pages, typed %, untyped count, per-type breakdown, dead-prefix list
Sync: per-prefix would_apply/applied count + sample slugs in dry-run mode

Failure modes

PACK_READONLY → you tried to mutate gbrain-base or gbrain-recommended. Fork first.
INVALID_RESULT → the mutation would create a dangling reference or prefix collision. The pre-write lint gate caught it. Read the error message; the lint rule name names the problem.
STILL_REFERENCED → you tried to remove a type that another type's aliases / enrichable_types / link_types / frontmatter_links references. The error names every reference. Remove those first.
LOCK_BUSY → another process is mid-mutation. Wait 30s and retry, or pass --force if you know the holder is wedged.
permission_denied (MCP only) → your OAuth client doesn't have admin scope. Re-register with gbrain auth register-client --scopes admin.

Adoption

garrytan/schema-author

$ install --global

Security Scan Results

SKILL.md

schema-author — evolve your schema pack

Non-goals (use these other skills instead)

Convention

When to invoke

Tutorial + vision

Workflow

Phase 1 — Brain (know which pack is active)

Phase 2 — Assess (what does the current pack cover?)

Phase 3 — Propose (what types should the pack add?)

Phase 4 — Apply (mutate the pack)

Phase 5 — Sync (backfill existing pages with the new types)

Phase 6 — Verify

Phase 7 — Commit (preserve the change)

Outputs

Contract

Anti-Patterns

Output Format

Failure modes

Related Skills

garrytan/frontmatter-guard

garrytan/idea-lineage

garrytan/voice-persona-venus

garrytan/voice-persona-mars

garrytan/schema-author

$ install --global

Security Scan Results

SKILL.md

schema-author — evolve your schema pack

Non-goals (use these other skills instead)

Convention

When to invoke

Tutorial + vision

Workflow

Phase 1 — Brain (know which pack is active)

Phase 2 — Assess (what does the current pack cover?)

Phase 3 — Propose (what types should the pack add?)

Phase 4 — Apply (mutate the pack)

Phase 5 — Sync (backfill existing pages with the new types)

Phase 6 — Verify

Phase 7 — Commit (preserve the change)

Outputs

Contract

Anti-Patterns

Output Format

Failure modes

Related Skills

garrytan/frontmatter-guard

garrytan/idea-lineage

garrytan/voice-persona-venus

garrytan/voice-persona-mars