skills/codebase-learn/SKILL.md
Learn codebase structure with tree-sitter + SSL patterns
npx skillsauth add genomewalker/cc-soul codebase-learn
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Two-phase codebase understanding:
C++ tool (learn_codebase): AST extraction, provenance, hierarchical state
Claude: architectural understanding, recorded as SSL patterns
[codebase-learn] tool + understanding
phase1: learn_codebase→tree-sitter→symbols+triplets+hierarchy
handles: parsing, storage, provenance, staleness tracking
output: Symbol nodes, file→contains→symbol triplets, ModuleState
phase2: Claude→architecture→SSL patterns
handles: why, how, relationships between components
output: Wisdom nodes with [LEARN] markers
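As a rough illustration of the phase 1 output, the file→contains→symbol triplets can be pictured as plain subject/predicate/object records. The `Triplet` class and `contains_triplets` helper below are hypothetical names chosen for this sketch, not part of chitta; the real tool stores these internally in its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """Hypothetical in-memory form of one stored relationship."""
    subject: str
    predicate: str
    obj: str


def contains_triplets(file_path, symbols):
    """Build one file→contains→symbol triplet per extracted symbol name."""
    return [Triplet(file_path, "contains", name) for name in symbols]


# Assumes the parser found a single symbol, Mind, in this header.
triplets = contains_triplets("include/chitta/mind.hpp", ["Mind"])
for t in triplets:
    print(f"{t.subject} {t.predicate} {t.obj}")
```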
Tree-sitter parsers available:
C/C++: .c, .h, .cpp, .hpp, .cc, .cxx, .hxx
Python: .py, .pyw
JavaScript/TypeScript: .js, .jsx, .mjs, .ts, .tsx
Go: .go
Rust: .rs
Java: .java
Ruby: .rb
C#: .cs
chitta learn_codebase --path /path/to/project --project myproject
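This single command parses the supported files with tree-sitter, stores symbols and triplets, and bootstraps the hierarchical module state.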
Output:
Learned codebase: myproject
Files: 47 analyzed (of 52 found)
Symbols: 1234 stored
Triplets: 2567 created
Modules: 15 bootstrapped
Hierarchical State Modules:
Mind @include/chitta/mind.hpp
Storage @include/chitta/storage.hpp
...
After learn_codebase runs, I add architectural understanding:
[LEARN] [myproject] Mind→orchestrator→recall/observe/grow API
[ε] Central class managing tiered storage + embeddings + graph. @mind.hpp:52
[TRIPLET] Mind uses TieredStorage
[TRIPLET] Mind uses HierarchicalState
[TRIPLET] Mind provides recall
[LEARN] [myproject] HierarchicalState→token compression→3-level injection
[ε] L0=ProjectEssence(50t) + L1=ModuleState(20t) + L2=PatternState(10t)
[TRIPLET] HierarchicalState contains ProjectEssence
[TRIPLET] injection_protocol saves tokens
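The [TRIPLET] lines above follow a simple subject, predicate, object layout, so they are easy to pull out of an SSL block. The sketch below is one possible reader for that layout; `parse_triplets` is a hypothetical helper written for illustration, and it assumes each triplet line has exactly three whitespace-separated fields after the marker.

```python
def parse_triplets(ssl_text):
    """Return (subject, predicate, object) tuples for each [TRIPLET] line."""
    marker = "[TRIPLET]"
    triplets = []
    for line in ssl_text.splitlines():
        line = line.strip()
        if not line.startswith(marker):
            continue  # skip [LEARN] and [ε] lines
        parts = line[len(marker):].split(maxsplit=2)
        if len(parts) == 3:
            triplets.append(tuple(parts))
    return triplets


example = """\
[LEARN] [myproject] Mind→orchestrator→recall/observe/grow API
[ε] Central class managing tiered storage + embeddings + graph. @mind.hpp:52
[TRIPLET] Mind uses TieredStorage
[TRIPLET] Mind uses HierarchicalState
[TRIPLET] Mind provides recall
"""

result = parse_triplets(example)
for subject, predicate, obj in result:
    print(subject, predicate, obj)
```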
SSL captures what the AST can't: the why, the how, and the relationships between components.
When code changes:
# Re-learn (only re-analyzes changed files with incremental: true default)
chitta learn_codebase --path /path/to/project
# Force full re-index if needed
chitta learn_codebase --path /path/to/project --force true
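One way to picture the incremental behaviour is a content-hash comparison against the previous run. The sketch below is an assumption about the general technique, not chitta's actual code: the use of SHA-256, the in-memory dictionaries, and the function names are all illustrative.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents; SHA-256 is an assumed choice for this sketch."""
    return hashlib.sha256(data).hexdigest()


def files_to_reanalyze(current, stored_hashes, force=False):
    """Pick files whose contents differ from the last recorded hash.

    current: mapping of path -> current file bytes
    stored_hashes: mapping of path -> hash recorded on the previous run
    force: when True, return every file (mirrors a full re-index)
    """
    if force:
        return sorted(current)
    return sorted(
        path
        for path, data in current.items()
        if stored_hashes.get(path) != content_hash(data)
    )


current = {
    "mind.hpp": b"class Mind {};",
    "storage.hpp": b"class Storage {};",
}
stored = {
    "mind.hpp": content_hash(b"class Mind {};"),   # unchanged since last run
    "storage.hpp": content_hash(b"class Storage;"),  # edited since last run
}

changed = files_to_reanalyze(current, stored)
everything = files_to_reanalyze(current, stored, force=True)
print(changed, everything)
```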
Incremental tracking means:
Use --force true to re-index everything
Traditional: inject full code context (~thousands of tokens)
Smart context approach:
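A rough sketch of the token arithmetic, using the per-level sizes quoted in the HierarchicalState pattern earlier (L0=50t, L1=20t, L2=10t, read here as token counts). How many modules and patterns get injected per request is an assumption of this example, as is the `injection_budget` helper itself.

```python
# Per-level sizes taken from the HierarchicalState SSL pattern.
L0_PROJECT_ESSENCE = 50
L1_MODULE_STATE = 20
L2_PATTERN_STATE = 10


def injection_budget(modules=1, patterns=1):
    """Estimate tokens injected for one project essence plus N modules and patterns."""
    return (
        L0_PROJECT_ESSENCE
        + modules * L1_MODULE_STATE
        + patterns * L2_PATTERN_STATE
    )


single = injection_budget()
wider = injection_budget(modules=3, patterns=2)
print(single, wider)
```

Even with several modules in scope, the estimate stays in the low hundreds of tokens under these assumptions.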
View codebase structure:
chitta codebase_overview --project myproject
# Step 1: C++ tool does the heavy lifting
chitta learn_codebase --path /path/to/cc-soul/chitta --project cc-soul
# Step 2: I add architectural SSL
[LEARN] [cc-soul] chitta→semantic memory substrate→tiered storage + SSL + triplets
[ε] C++ daemon: hot/warm/cold storage, JSON-RPC socket, Hebbian learning.
[TRIPLET] chitta contains Mind
[TRIPLET] Mind orchestrates recall
[TRIPLET] Mind orchestrates observe
[LEARN] [cc-soul] provenance→staleness tracking→source_path+hash→Fresh|MaybeStale|Stale
[ε] Two-phase: immediate MaybeStale marking, background verification.
[TRIPLET] Node has provenance
[TRIPLET] provenance tracks staleness
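The Fresh|MaybeStale|Stale lifecycle described in the provenance pattern can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the `Provenance` class, its method names, and the SHA-256 hash are hypothetical, and the real daemon runs the verification step in the background rather than on demand.

```python
import hashlib
from enum import Enum


class Staleness(Enum):
    FRESH = "Fresh"
    MAYBE_STALE = "MaybeStale"
    STALE = "Stale"


class Provenance:
    """Hypothetical record tying a node to its source file and content hash."""

    def __init__(self, source_path, source_bytes):
        self.source_path = source_path
        self.hash = hashlib.sha256(source_bytes).hexdigest()
        self.state = Staleness.FRESH

    def on_file_changed(self):
        # Phase 1: cheap, immediate marking when the file is touched.
        self.state = Staleness.MAYBE_STALE

    def verify(self, current_bytes):
        # Phase 2: compare hashes to decide whether the node is really stale.
        current = hashlib.sha256(current_bytes).hexdigest()
        self.state = Staleness.FRESH if current == self.hash else Staleness.STALE
        return self.state


prov = Provenance("include/chitta/mind.hpp", b"class Mind {};")
prov.on_file_changed()
after_touch = prov.state
after_same = prov.verify(b"class Mind {};")
prov.on_file_changed()
after_edit = prov.verify(b"class Mind { int x; };")
print(after_touch.value, after_same.value, after_edit.value)
```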
The daemon automatically generates semantic descriptions for symbols using a local LLM (Ollama/vLLM):
# Check enrichment status
chitta soul_context # Shows pending count at startup
# Query described symbols
chitta recall --query "memory storage class" --tag code-intel
Enrichment progress:
ChittaField @store.rs:29
Daemon options:
chittad daemon --enrich-interval 2 --enrich-batch 10 # defaults
chittad daemon --no-enrich # disable enrichment
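To make the batch option concrete, the sketch below shows one enrichment pass that describes at most a fixed number of pending symbols. It is an assumption about the general shape of such a loop, not the daemon's implementation: `enrichment_pass`, the `describe` callback (standing in for the local LLM call), and the queue are all hypothetical, and the scheduling implied by the interval option is left out.

```python
from collections import deque


def enrichment_pass(pending, describe, batch_size=10):
    """Describe up to batch_size pending symbols and return the results."""
    done = []
    for _ in range(min(batch_size, len(pending))):
        symbol = pending.popleft()
        done.append((symbol, describe(symbol)))
    return done


# 25 placeholder symbols waiting for a description.
pending = deque(f"symbol_{i}" for i in range(25))

# Stand-in for the LLM call; a real describer would query Ollama or vLLM.
first = enrichment_pass(pending, describe=lambda s: f"description of {s}")
print(len(first), "described,", len(pending), "still pending")
```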
After running:
recall("Mind architecture") → finds Symbol nodes AND architectural SSL
recall("memory storage") → finds enriched code descriptions
codebase_overview --project cc-soul → see full structure at a glance
query --subject Mind → find all Mind relationships
search_symbols --query "storage" → semantic search across symbols
The soul knows both structure (symbols) and meaning (SSL + semantic descriptions).