orienting-codebases/SKILL.md
Interactive codebase orientation for human learning. Companion to exploring-codebases (which builds Claude's understanding); this skill builds the user's understanding through guided exercises grounded in learning science. Uses the same tree-sitting + featuring pipeline but synthesizes into interactive teaching via HTML artifacts rather than analysis documents. Triggers on "orient me to this repo", "teach me this codebase", "help me understand this code", "learning orientation", or when the user wants to build genuine comprehension of an unfamiliar codebase rather than just getting work done in it.
npx skillsauth add oaustegard/claude-skills orienting-codebasesInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Interactive codebase orientation for the human, not the agent. Same tree-sitting + featuring pipeline as exploring-codebases, but synthesizes into guided HTML exercises via composing-html instead of analysis documents. See README.md for design rationale.
uv venv /home/claude/.venv 2>/dev/null
uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python
export PYTHON=/home/claude/.venv/bin/python
export TREESIT=/mnt/skills/user/tree-sitting/scripts/treesit.py
export GATHER=/mnt/skills/user/featuring/scripts/gather.py
export COMPOSE=/mnt/skills/user/composing-html/scripts/build.py
OWNER=... REPO=... REF=main
curl -sL -H "Authorization: Bearer $GH_TOKEN" \
"https://api.github.com/repos/$OWNER/$REPO/tarball/$REF" -o /tmp/$REPO.tar.gz
mkdir -p /tmp/$REPO && tar -xzf /tmp/$REPO.tar.gz -C /tmp/$REPO --strip-components=1
For local repos, skip the curl — point directly at the path.
$PYTHON $TREESIT /tmp/$REPO --stats
$PYTHON $GATHER /tmp/$REPO \
--skip tests,.github,node_modules --source-budget 8000
For each exercise target identified from gather output, extract the actual source using treesit queries:
# Entry point source
$PYTHON $TREESIT /tmp/$REPO --no-tree 'source:main'
# Specific function for an exercise
$PYTHON $TREESIT /tmp/$REPO --no-tree 'source:validate_token'
# Imports for a hub file
$PYTHON $TREESIT /tmp/$REPO --no-tree 'imports:src/api.py'
This source feeds directly into the HTML artifact — the user sees real code, syntax-highlighted, without having to navigate the repo.
Do not show raw pipeline output to the user. It's material for exercise design. The user sees HTML artifacts and conversation.
After the pipeline steps, synthesize into an interactive orientation. Three phases.
Summarize the repo in ONE sentence (what it does, who it's for). Then:
I can walk you through a hands-on orientation — about 15 minutes, two exercises that'll give you a working mental model of this codebase. Want to try it?
Do not start exercises without confirmation.
For each exercise, generate an HTML artifact using composing-html's
freeform template, then continue the conversation around it.
Build a spec with body_html containing:
<section class="stack">
<div class="eyebrow">EXERCISE 1 OF 2</div>
<h2>[Exercise title]</h2>
<div class="card">
<div class="eyebrow">CONTEXT</div>
<p>[Brief explanation of what this code does in the system
and why it matters for orientation.]</p>
</div>
<div class="card">
<div class="eyebrow">CODE</div>
<pre><code>[Actual source extracted by treesit — entry point,
function, config, imports, or test names. Escaped properly.]</code></pre>
</div>
<details class="card">
<summary>
<span class="badge badge--clay">Your turn</span>
[Specific comprehension/synthesis question]
</summary>
<div style="margin-top:1rem">
<div class="eyebrow">KEY POINTS</div>
<p>[Pre-generated feedback covering what the code reveals about
the system's architecture, design decisions, or workflow.
This is the "answer key" — hidden until clicked.]</p>
</div>
</details>
</section>
Answers live inside <details> — the DOM hides them until the user
clicks. Do not preview or summarize the key points in chat after
presenting the artifact; that defeats the exercise.
Build and present the artifact:
python3 $COMPOSE build freeform --spec /tmp/exercise_N.json --out /tmp/exercise_N.html
After presenting the artifact, tell the user:
Take a look at the code and the question. You can either:
- Tell me your answer in chat and I'll give you specific feedback
- Click to reveal the key points in the artifact when you're ready
If they respond in chat: give personalized feedback based on what they actually said — confirm what's right, be specific about gaps, explore misconceptions. This is pedagogically richer than pre-canned feedback.
If they click through: that's fine too — the artifact's hidden feedback covers the essential points. Move to the next exercise.
Each exercise type maps to specific pipeline output:
Entry-point walkthrough — gather found clear entry points: Show the main function/handler source. Ask what the program does on startup and what the 2-3 most important operations are.
Architecture synthesis — treesit --stats shows clear directory structure: Show the directory tree with file counts and symbol density. Ask what the system's main components are and how they relate.
Dependency detective — gather found import clusters: Show a hub file's imports. Ask what the import list reveals about the file's role — integration point, orchestrator, leaf node?
Config reader — manifest/config files are rich: Show the config file. Ask which 2-3 settings they'd change first for a new project and why.
Test-as-spec — test files present and readable: Show test names (not bodies). Ask what the tests tell them about what the module is supposed to do.
After both exercises, ask in chat (no artifact needed):
What's one thing about this codebase that surprised you, or that you want to dig into further?
Use their answer to either:
If the user asks for a persistent orientation document, or if the session produced insights worth preserving, generate a standalone HTML artifact that any teammate can open in a browser without tooling.
Build via composing-html freeform with this body structure:
<section class="stack">
<div class="card">
<div class="eyebrow">PURPOSE</div>
<p>[One-line: what this repo does and why it exists.]</p>
</div>
<div class="card">
<div class="eyebrow">LANGUAGES</div>
<p>[From treesit --stats.]</p>
</div>
<h2 class="rule">Key files</h2>
<div class="grid grid--2">
[For each of 6-10 key files from gather's density ranking:]
<div class="card card--soft">
<code>[path/to/file]</code>
<p>[What it does — why a new developer should read it.]</p>
</div>
</div>
<h2 class="rule">Core concepts</h2>
[For each of 3-5 concepts:]
<details class="card">
<summary><strong>[Concept name]</strong> — [one-line summary]</summary>
<p>[Where it lives in the code. Why it matters.]</p>
</details>
<h2 class="rule">Orientation exercises</h2>
<p>Two exercises to build a working mental model. Read the code,
answer the question, then click to check your understanding.</p>
[Exercise 1 — same card+details structure as Phase B artifacts]
[Exercise 2 — same structure]
</section>
Spec keys:
{
"title": "Orientation: [repo name]",
"eyebrow": "CODEBASE ORIENTATION",
"subtitle": "[one-line purpose]",
"show_masthead": true,
"page_class": "page page--narrow",
"body_html": "..."
}
Build and present:
python3 $COMPOSE build freeform --spec /tmp/orientation_spec.json \
--out /mnt/user-data/outputs/orientation.html
When the user responds in chat (rather than clicking the reveal):
Adjust the amount of code shown and question specificity based on demonstrated familiarity — but always keep the answer as the user's responsibility.
| Level | Artifact shows | Question asks | Use when | |-------|---------------|---------------|----------| | High | Full function source, line numbers, file path | "What does this function check?" | First exercise, unfamiliar language | | Medium | Function signature + key lines only | "What's the validation logic here?" | Second exercise, or user nailed the first | | Low | File path only, no source in artifact | "Where would you look to change how auth works?" | Follow-up if user wants more |
Fading adjusts the difficulty of finding and reading the code, not explaining it. At every level the user still produces the synthesis.
If the user struggles, move UP the ladder (show more code, be more specific), not sideways (hint at the answer).
| Situation | Use | |-----------|-----| | "I just cloned this, what is it?" (Claude needs to understand) | exploring-codebases | | "Help me understand this codebase" (user needs to understand) | orienting-codebases (this skill) | | "Where is the retry logic?" | searching-codebases | | "I want to set a learning goal" | learning-goal | | "Help me learn [specific concept] from my code" | learning-opportunities (if installed) |
This skill is the orientation layer — first-encounter mental model building. For ongoing learning during development (exercises after commits, retrieval check-ins, prediction drills), pair with learning-opportunities from DrCatHicks/learning-opportunities.
testing
Disciplined, validation-gated revision of an EXISTING skill so each edit is a measured improvement rather than a guess. Use when editing, revising, or tuning a skill that already exists and there is evidence it underperforms (observed failures, drift, complaints) — invoke by name, or have versioning-skills / creating-skill defer to it before applying edits. Not for authoring a brand-new skill from scratch (use creating-skill) or one-off prose.
development
Skill-aware orchestration with context routing. Decomposes complex tasks into skill-typed subtasks, extracts targeted context subsets, executes subagents in parallel, and synthesizes results. Self-answers trivial lookups inline. No SDK dependency — uses raw HTTP via httpx. Use when tasks require multiple analytical perspectives, when context is large and subtasks only need portions, or when orchestrating-agents spawns too many redundant subagents.
tools
Orchestrates parallel API instances, delegated sub-tasks, and multi-agent workflows with streaming and tool-enabled delegation patterns. Use for parallel analysis, multi-perspective reviews, or complex task decomposition.
development
Invokes Google Gemini models for structured outputs, image generation, multi-modal tasks, and Google-specific features. Use when users request Gemini, image generation, structured JSON output, Google API integration, or cost-effective parallel processing.