Featuring

Generate _FEATURES.md files — top-down documentation of what a codebase does, organized by feature/capability, anchored to specific source symbols.

tree-sitting tells you WHAT symbols exist. _FEATURES.md tells you WHY they exist and what they accomplish together.

For large codebases, the root _FEATURES.md decomposes into sub-feature files linked by capability area — not by folder structure. An agent starts at the root and is drawn into sub-files only when working on a relevant area.

Dependency

Requires tree-sitting skill. Uses its engine for AST scanning.

uv venv /home/claude/.venv 2>/dev/null
uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python

For quick structural orientation before running gather.py, use tree-sitting's CLI:

TREESIT=/mnt/skills/user/tree-sitting/scripts/treesit.py

# Complete tree, sparse detail — see the full shape
/home/claude/.venv/bin/python $TREESIT /path/to/repo --depth=-1 --detail=sparse

Workflow: Multi-Pass Synthesis

Feature documentation is built in three passes. The overview is written LAST, after all features are understood — not first.

Pass 1: Orientation (quick scan)

/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/gather.py /path/to/repo \
  --skip tests,.github,node_modules --source-budget 8000

Read the gather output. Before writing anything, form a hypothesis:

"This codebase appears to be a [what it is] that provides [capability A], [capability B], and [capability C]."

Write this down as a DRAFT overview. It will be wrong or incomplete — that's fine. The point is to orient before diving into detail.

How to identify capability areas:

What can a user/consumer DO with this? (commands, API endpoints, UI actions)
What problems does it solve? (the WHY behind the code)
What are the main workflows? (how features compose)
What are the constraints/invariants? (rules the code enforces)

Pass 2: Detailed feature extraction

For each capability area identified in Pass 1:

Gather the symbols that implement it (from gather output + targeted get_source())
Understand how they collaborate — the workflow
Identify constraints and invariants
Write the feature section

During this pass, you'll discover:

Capabilities you missed in Pass 1
Features that are more complex than expected (decomposition candidates)
Features that are simpler than expected (merge candidates)
Cross-cutting concerns that span multiple capability areas

Hierarchy decision (per feature, during this pass):

| Signal | Action | |--------|--------| | ≤6 key symbols, self-contained | Inline in root _FEATURES.md | | >6 key symbols OR clear sub-capabilities | Own _FEATURES.md sub-file | | Spans many files but is ONE capability | Inline (breadth ≠ complexity) | | Has sub-features that are independently useful | Own sub-file | | Is infrastructure (logging, DB layer) | Inline briefly, unless it IS the product |

Pass 3: Overview rewrite

NOW — after all features are documented — rewrite the overview. The Pass 1 draft was a hypothesis. Pass 3 replaces it with a proper progressive-disclosure overview that:

States what the codebase is in one sentence
Lists the top-level capability areas (3-8 items)
For each area that has a sub-file: one sentence + link + "read when" guidance
For inline features: just the list entry (detail is below in the same file)

This is the most important part. The overview IS the entry point for every agent session. It must be accurate, complete, and fast to scan.

_FEATURES.md Format

Root file

# Features: {project-name}

> One-sentence description of what this codebase is and does.

**Capability areas:**
- **[Area A]** — one-sentence summary
- **[Area B]** — one-sentence summary → [details](path/to/_FEATURES.md)
- **[Area C]** — one-sentence summary

## {Inline Feature Name}

{2-3 sentences: what this feature does from a user perspective.}

**Key symbols:**
- `file.py#function_name` — role in this feature
- `file.py#ClassName` — role in this feature

**Workflow:** {How a user exercises this feature or how symbols collaborate.}

**Constraints:** {Invariants, limits, rules.}

---

## {Complex Feature Area}

> One-sentence summary of what this area covers.

This area is documented in detail in [{area-name}/_FEATURES.md]({path}).
Read it when working on {specific trigger — e.g., "the memory retrieval pipeline",
"adding a new API endpoint", "modifying the build system"}.

At a glance, this area provides:
- {sub-capability 1} — one line
- {sub-capability 2} — one line
- {sub-capability 3} — one line

Sub-feature files

Sub-feature files follow the SAME format as the root, recursively. They can contain inline features and further sub-file references. Each sub-file:

Has its own # Features: {area-name} header
Has its own overview paragraph
Is self-contained — an agent reading only this file understands the area
Links back to the root: ← [Root features](../_FEATURES.md)

Format rules

Organized by capability, not by file/directory
Symbol references use file#symbol notation (relative to repo root)
Leading paragraph per feature: what a user gets, not implementation details
Key symbols: the 2-6 most important symbols, with their role explained
Workflow: how the feature works end-to-end (include when non-obvious)
Constraints: rules/invariants (include when they exist)
No source code in _FEATURES.md — it's a map, not a mirror
"Read when" guidance on every sub-file link — tells agents WHEN to drill in

What makes a good feature entry

Good: "Memory Storage — Persist observations across sessions. Stores typed, tagged memories to a Turso database with BM25 full-text search. Memories have priority levels that affect retrieval ranking."

Bad: "memory.py — Contains remember(), recall(), forget(), and supersede() functions."

The first tells you WHAT you can do. The second describes file contents — tree-sitting already gives you that.

Hierarchy design principles

The hierarchy is feature-driven, not folder-driven. Folders are natural candidates for decomposition boundaries, but the decision is based on:

Does this capability area have enough complexity to warrant its own file? (>6 key symbols, multiple sub-workflows, or independently useful sub-features)
Would an agent working on this area benefit from focused context? (if yes, a sub-file saves them from parsing unrelated features)
Is this area likely to be read independently of the rest? (if yes, it should be self-contained in its own file)

Counter-examples — do NOT split just because:

The code lives in a separate folder (folder ≠ feature)
There are many files (files ≠ complexity)
A class has many methods (one class = one feature unless methods serve distinct user-facing purposes)

Identifying features

Heuristics for finding feature boundaries:

Entry points (main, CLI commands, route handlers) often map 1:1 to features
Public API functions that aren't helpers are usually feature surfaces
Type hierarchies (class + methods) often represent a cohesive feature
Config/constants clusters sometimes reveal features (e.g., a group of timeout constants → a retry feature)
Import clusters — files that import each other heavily are likely co-implementing a feature

Features to SKIP in _FEATURES.md:

Pure infrastructure (logging, error handling) unless it's the project's purpose
Internal utilities that only serve other features
Test code (unless the testing approach IS a feature, e.g., a testing framework)

Keeping _FEATURES.md in Sync

Three mechanisms, layered:

1. Check script (detect drift)

/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/check.py /path/to/repo \
  [--features _FEATURES.md] [--skip tests,.github]

Parses file#symbol references from ALL _FEATURES.md files (root + sub-files), resolves them against the live codebase via tree-sitting, and reports:

Broken refs — symbol deleted or renamed (exit code 1)
Moved symbols — symbol exists but in a different file than referenced
Dead features — ALL key symbols in a feature section are gone
Uncovered symbols — new public API not mentioned in any feature
Orphan sub-files — sub-feature files not linked from any parent

Exit code 0 = clean, 1 = drift detected. Suitable for CI or pre-commit hooks.

2. Agent instructions (prevent drift)

Add to CLAUDE.md or equivalent:

## Feature Documentation

- `_FEATURES.md` documents what this codebase does, organized by capability.
- Start here when orienting to the codebase. Follow sub-file links as needed.
- After changing behavior (new feature, renamed API, deleted functionality):
  run `python featuring/scripts/check.py .` and fix any broken refs.
- After adding a new public API surface: add it to the appropriate feature
  section, or create a new feature section if it's a new capability.
- Run check before committing. Broken refs = broken documentation.

3. Targeted regeneration (fix drift)

When check reports broken refs, the fix is usually surgical: update the file#symbol reference to the new name/location. For dead features (all refs gone), either delete the section or regenerate it.

Full regeneration (re-running all three passes) is the nuclear option. Prefer targeted updates — they're cheaper and preserve hand-written narrative.

CI Integration

# .github/workflows/features-check.yml
name: Check _FEATURES.md
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v4
      - run: uv pip install tree-sitter-language-pack
      - run: python featuring/scripts/check.py . --skip tests

Claude Code Integration

In Claude Code, use tree-sitting's CLI or engine directly. The agent should:

Run treesit.py /path --depth=-1 --detail=sparse for full structural overview
Pass 1: Form a hypothesis about what the codebase does
Run treesit.py /path --path=DIR --detail=full for each capability area
Run treesit.py /path --no-tree 'source:symbol_name' where intent isn't clear
Pass 2: Write detailed feature sections, deciding hierarchy per-feature
Pass 3: Rewrite the overview now that all features are documented

Add to CLAUDE.md:

## Codebase Understanding

Read `_FEATURES.md` for top-down feature orientation before modifying code.
Follow links to sub-feature files when working on a specific area.
Use tree-sitting MCP tools for structural queries (symbol lookup, source retrieval).
After adding new features or changing behavior, update the relevant _FEATURES.md.

Example: Small Codebase (flat)

A CLI tool with 15 public symbols → single _FEATURES.md, all features inline. No sub-files needed.

Example: Large Codebase (hierarchical)

The remembering skill (memory system for an AI agent) has ~60 public symbols across 8 files. Hierarchical decomposition:

_FEATURES.md              (root — overview + 3 inline features + 3 sub-file refs)
├── scripts/_FEATURES.md  (memory operations — storage, retrieval, lifecycle, maintenance)
└── utils/_FEATURES.md    (utility modules — therapy, reminders, blog publishing)

Root _FEATURES.md would contain:

Overview: "Persistent memory system for Muninn. Stores, retrieves, and maintains typed memories across sessions via Turso."
Inline: Boot Sequence, Configuration, Task Tracking (simple, ≤4 symbols each)
Sub-file ref: Memory Operations → scripts/_FEATURES.md ("Read when working on storage, retrieval, or memory lifecycle")
Sub-file ref: Utility Modules → utils/_FEATURES.md ("Read when working on therapy sessions, reminders, or blog publishing")

Relationship to Other Skills

| Skill | What it provides | Drift detection | |-------|-----------------|-----------------| | tree-sitting | Structural inventory (symbols, signatures) | N/A (live queries) | | featuring | Feature documentation (what/why), hierarchical | check.py — docs → code | | generating-lattice | Bidirectional knowledge graph | lat check — docs ↔ code | | mapping-webapp | Web app behavioral docs (pages, flows) | None |

featuring's check is lighter than lattice's: no source code annotations needed, no @lat: comments, just reference resolution. The trade-off is that new code without docs is only flagged as "uncovered symbols" — it's advisory, not enforced. Use lattice when you need strict bidirectional traceability; use featuring when you need good-enough orientation docs that catch renames and deletions.

Featuring

Generate _FEATURES.md files — top-down documentation of what a codebase does, organized by feature/capability, anchored to specific source symbols.

tree-sitting tells you WHAT symbols exist. _FEATURES.md tells you WHY they exist and what they accomplish together.

Dependency

Requires tree-sitting skill. Uses its engine for AST scanning.

uv venv /home/claude/.venv 2>/dev/null
uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python

For quick structural orientation before running gather.py, use tree-sitting's CLI:

TREESIT=/mnt/skills/user/tree-sitting/scripts/treesit.py

# Complete tree, sparse detail — see the full shape
/home/claude/.venv/bin/python $TREESIT /path/to/repo --depth=-1 --detail=sparse

Workflow: Multi-Pass Synthesis

Feature documentation is built in three passes. The overview is written LAST, after all features are understood — not first.

Pass 1: Orientation (quick scan)

/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/gather.py /path/to/repo \
  --skip tests,.github,node_modules --source-budget 8000

Read the gather output. Before writing anything, form a hypothesis:

"This codebase appears to be a [what it is] that provides [capability A], [capability B], and [capability C]."

Write this down as a DRAFT overview. It will be wrong or incomplete — that's fine. The point is to orient before diving into detail.

How to identify capability areas:

What can a user/consumer DO with this? (commands, API endpoints, UI actions)
What problems does it solve? (the WHY behind the code)
What are the main workflows? (how features compose)
What are the constraints/invariants? (rules the code enforces)

Pass 2: Detailed feature extraction

For each capability area identified in Pass 1:

Gather the symbols that implement it (from gather output + targeted get_source())
Understand how they collaborate — the workflow
Identify constraints and invariants
Write the feature section

During this pass, you'll discover:

Capabilities you missed in Pass 1
Features that are more complex than expected (decomposition candidates)
Features that are simpler than expected (merge candidates)
Cross-cutting concerns that span multiple capability areas

Hierarchy decision (per feature, during this pass):

Pass 3: Overview rewrite

NOW — after all features are documented — rewrite the overview. The Pass 1 draft was a hypothesis. Pass 3 replaces it with a proper progressive-disclosure overview that:

States what the codebase is in one sentence
Lists the top-level capability areas (3-8 items)
For each area that has a sub-file: one sentence + link + "read when" guidance
For inline features: just the list entry (detail is below in the same file)

This is the most important part. The overview IS the entry point for every agent session. It must be accurate, complete, and fast to scan.

_FEATURES.md Format

Root file

# Features: {project-name}

> One-sentence description of what this codebase is and does.

**Capability areas:**
- **[Area A]** — one-sentence summary
- **[Area B]** — one-sentence summary → [details](path/to/_FEATURES.md)
- **[Area C]** — one-sentence summary

## {Inline Feature Name}

{2-3 sentences: what this feature does from a user perspective.}

**Key symbols:**
- `file.py#function_name` — role in this feature
- `file.py#ClassName` — role in this feature

**Workflow:** {How a user exercises this feature or how symbols collaborate.}

**Constraints:** {Invariants, limits, rules.}

---

## {Complex Feature Area}

> One-sentence summary of what this area covers.

This area is documented in detail in [{area-name}/_FEATURES.md]({path}).
Read it when working on {specific trigger — e.g., "the memory retrieval pipeline",
"adding a new API endpoint", "modifying the build system"}.

At a glance, this area provides:
- {sub-capability 1} — one line
- {sub-capability 2} — one line
- {sub-capability 3} — one line

Sub-feature files

Sub-feature files follow the SAME format as the root, recursively. They can contain inline features and further sub-file references. Each sub-file:

Has its own # Features: {area-name} header
Has its own overview paragraph
Is self-contained — an agent reading only this file understands the area
Links back to the root: ← [Root features](../_FEATURES.md)

Format rules

Organized by capability, not by file/directory
Symbol references use file#symbol notation (relative to repo root)
Leading paragraph per feature: what a user gets, not implementation details
Key symbols: the 2-6 most important symbols, with their role explained
Workflow: how the feature works end-to-end (include when non-obvious)
Constraints: rules/invariants (include when they exist)
No source code in _FEATURES.md — it's a map, not a mirror
"Read when" guidance on every sub-file link — tells agents WHEN to drill in

What makes a good feature entry

Bad: "memory.py — Contains remember(), recall(), forget(), and supersede() functions."

The first tells you WHAT you can do. The second describes file contents — tree-sitting already gives you that.

Hierarchy design principles

The hierarchy is feature-driven, not folder-driven. Folders are natural candidates for decomposition boundaries, but the decision is based on:

Does this capability area have enough complexity to warrant its own file? (>6 key symbols, multiple sub-workflows, or independently useful sub-features)
Would an agent working on this area benefit from focused context? (if yes, a sub-file saves them from parsing unrelated features)
Is this area likely to be read independently of the rest? (if yes, it should be self-contained in its own file)

Counter-examples — do NOT split just because:

The code lives in a separate folder (folder ≠ feature)
There are many files (files ≠ complexity)
A class has many methods (one class = one feature unless methods serve distinct user-facing purposes)

Identifying features

Heuristics for finding feature boundaries:

Entry points (main, CLI commands, route handlers) often map 1:1 to features
Public API functions that aren't helpers are usually feature surfaces
Type hierarchies (class + methods) often represent a cohesive feature
Config/constants clusters sometimes reveal features (e.g., a group of timeout constants → a retry feature)
Import clusters — files that import each other heavily are likely co-implementing a feature

Features to SKIP in _FEATURES.md:

Pure infrastructure (logging, error handling) unless it's the project's purpose
Internal utilities that only serve other features
Test code (unless the testing approach IS a feature, e.g., a testing framework)

Keeping _FEATURES.md in Sync

Three mechanisms, layered:

1. Check script (detect drift)

/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/check.py /path/to/repo \
  [--features _FEATURES.md] [--skip tests,.github]

Parses file#symbol references from ALL _FEATURES.md files (root + sub-files), resolves them against the live codebase via tree-sitting, and reports:

Broken refs — symbol deleted or renamed (exit code 1)
Moved symbols — symbol exists but in a different file than referenced
Dead features — ALL key symbols in a feature section are gone
Uncovered symbols — new public API not mentioned in any feature
Orphan sub-files — sub-feature files not linked from any parent

Exit code 0 = clean, 1 = drift detected. Suitable for CI or pre-commit hooks.

2. Agent instructions (prevent drift)

Add to CLAUDE.md or equivalent:

## Feature Documentation

- `_FEATURES.md` documents what this codebase does, organized by capability.
- Start here when orienting to the codebase. Follow sub-file links as needed.
- After changing behavior (new feature, renamed API, deleted functionality):
  run `python featuring/scripts/check.py .` and fix any broken refs.
- After adding a new public API surface: add it to the appropriate feature
  section, or create a new feature section if it's a new capability.
- Run check before committing. Broken refs = broken documentation.

3. Targeted regeneration (fix drift)

Full regeneration (re-running all three passes) is the nuclear option. Prefer targeted updates — they're cheaper and preserve hand-written narrative.

CI Integration

# .github/workflows/features-check.yml
name: Check _FEATURES.md
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v4
      - run: uv pip install tree-sitter-language-pack
      - run: python featuring/scripts/check.py . --skip tests

Claude Code Integration

In Claude Code, use tree-sitting's CLI or engine directly. The agent should:

Run treesit.py /path --depth=-1 --detail=sparse for full structural overview
Pass 1: Form a hypothesis about what the codebase does
Run treesit.py /path --path=DIR --detail=full for each capability area
Run treesit.py /path --no-tree 'source:symbol_name' where intent isn't clear
Pass 2: Write detailed feature sections, deciding hierarchy per-feature
Pass 3: Rewrite the overview now that all features are documented

Add to CLAUDE.md:

## Codebase Understanding

Read `_FEATURES.md` for top-down feature orientation before modifying code.
Follow links to sub-feature files when working on a specific area.
Use tree-sitting MCP tools for structural queries (symbol lookup, source retrieval).
After adding new features or changing behavior, update the relevant _FEATURES.md.

Example: Small Codebase (flat)

A CLI tool with 15 public symbols → single _FEATURES.md, all features inline. No sub-files needed.

Example: Large Codebase (hierarchical)

The remembering skill (memory system for an AI agent) has ~60 public symbols across 8 files. Hierarchical decomposition:

_FEATURES.md              (root — overview + 3 inline features + 3 sub-file refs)
├── scripts/_FEATURES.md  (memory operations — storage, retrieval, lifecycle, maintenance)
└── utils/_FEATURES.md    (utility modules — therapy, reminders, blog publishing)

Root _FEATURES.md would contain:

Overview: "Persistent memory system for Muninn. Stores, retrieves, and maintains typed memories across sessions via Turso."
Inline: Boot Sequence, Configuration, Task Tracking (simple, ≤4 symbols each)
Sub-file ref: Memory Operations → scripts/_FEATURES.md ("Read when working on storage, retrieval, or memory lifecycle")
Sub-file ref: Utility Modules → utils/_FEATURES.md ("Read when working on therapy sessions, reminders, or blog publishing")

Adoption

oaustegard/featuring

$ install --global

Security Scan Results

SKILL.md

Featuring

Dependency

Workflow: Multi-Pass Synthesis

Pass 1: Orientation (quick scan)

Pass 2: Detailed feature extraction

Pass 3: Overview rewrite

_FEATURES.md Format

Root file

Sub-feature files

Format rules

What makes a good feature entry

Hierarchy design principles

Identifying features

Keeping _FEATURES.md in Sync

1. Check script (detect drift)

2. Agent instructions (prevent drift)

3. Targeted regeneration (fix drift)

CI Integration

Claude Code Integration

Example: Small Codebase (flat)

Example: Large Codebase (hierarchical)

Relationship to Other Skills

Related Skills

oaustegard/writing-instructions

oaustegard/finding-skills

oaustegard/transcribing-images

oaustegard/svg-portrait-mode

oaustegard/featuring

$ install --global

Security Scan Results

SKILL.md

Featuring

Dependency

Workflow: Multi-Pass Synthesis

Pass 1: Orientation (quick scan)

Pass 2: Detailed feature extraction

Pass 3: Overview rewrite

_FEATURES.md Format

Root file

Sub-feature files

Format rules

What makes a good feature entry

Hierarchy design principles

Identifying features

Keeping _FEATURES.md in Sync

1. Check script (detect drift)

2. Agent instructions (prevent drift)

3. Targeted regeneration (fix drift)

CI Integration

Claude Code Integration

Example: Small Codebase (flat)

Example: Large Codebase (hierarchical)

Relationship to Other Skills

Related Skills

oaustegard/writing-instructions

oaustegard/finding-skills

oaustegard/transcribing-images

oaustegard/svg-portrait-mode