building-github-index-v2/SKILL.md
Generate progressive disclosure indexes for GitHub repositories to use as Claude project knowledge. Use when setting up projects referencing external documentation, creating searchable indexes of technical blogs or knowledge bases, combining multiple repos into one index, or when user mentions "index", "github repo", "project knowledge", or "documentation reference".
npx skillsauth add oaustegard/claude-skills building-github-index
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Create markdown indexes of GitHub repositories optimized for Claude project knowledge. Indexes enable retrieval via GitHub API with semantic descriptions for effective matching.
# Documentation repos (markdown/notebooks)
python scripts/github_index.py owner/repo -o index.md
# Code repos (extract symbols via tree-sitter)
python scripts/github_index.py owner/repo --code-symbols -o index.md
# Multiple repos combined
python scripts/github_index.py owner/repo1 owner/repo2 -o combined.md
| Flag | Description |
|------|-------------|
| -o, --output | Output file (default: github_index.md) |
| --token | GitHub PAT; also reads GITHUB_TOKEN env |
| --include-patterns | Only index matching globs: "docs/**" "src/**" |
| --exclude-patterns | Skip matching globs: "test/**" |
| --max-files | Cap files per repo (default: 200) |
| --skip-fetch | Tree only, no content fetch (fast, filename-only descriptions) |
| --code-symbols | Include code files, extract function/class names via tree-sitter |
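The filtering flags above can be pictured with a small sketch. This is not the code in `scripts/github_index.py`; the function name `select_paths` and the order of operations (include, then exclude, then cap) are assumptions made for illustration, and the real script may match globs differently.

```python
from fnmatch import fnmatchcase


def select_paths(paths, include=None, exclude=None, max_files=200):
    """Hypothetical helper: filter repo paths by globs, then cap the count."""
    selected = []
    for path in paths:
        # When include patterns are given, a path must match at least one.
        if include and not any(fnmatchcase(path, pat) for pat in include):
            continue
        # Any matching exclude pattern drops the path.
        if exclude and any(fnmatchcase(path, pat) for pat in exclude):
            continue
        selected.append(path)
    # Mirror the idea of --max-files by truncating the result.
    return selected[:max_files]


paths = ["docs/intro.md", "docs/api/auth.md", "src/main.py", "test/test_main.py"]
print(select_paths(paths, include=["docs/**", "src/**"], exclude=["test/**"]))
```

Note that `fnmatchcase` lets `*` match across `/`, so `docs/**` and `docs/*` behave the same here; a stricter glob implementation would distinguish them.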
title: and description: fields
--code-symbols)
Some repos have stub files (links to external docs, empty READMEs). In these cases:
Manual curation recommended. Use the tree output and domain knowledge:
# Get tree structure only (fast)
python scripts/github_index.py owner/repo --skip-fetch -o skeleton.md
# Then manually enhance descriptions based on domain knowledge
For code-heavy repos with embedded apps:
acc_wav_gen → "ACC waveform generation"
# {Repo} - Content Index
**Repository:** {url}
**Branch:** `{branch}`
## Retrieval Method
{API curl commands}
---
## {Category}
| Description | Path |
|-------------|------|
| {What this covers} | `{path/file.md}` |
Description column leads (relevance matching), path follows (retrieval key).
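To make the layout above concrete, here is a minimal sketch that renders it from a dictionary of entries. The function name `render_index` and its input shape are hypothetical, and the Retrieval Method section is left out for brevity; the actual generator may build its output differently.

```python
def render_index(repo_url, branch, categories):
    """Hypothetical renderer for a description-first index.

    `categories` maps a category name to a list of (description, path) pairs.
    """
    title = repo_url.rstrip("/").split("/")[-1]
    lines = [
        f"# {title} - Content Index",
        f"**Repository:** {repo_url}",
        f"**Branch:** `{branch}`",
        "---",
    ]
    for category, entries in categories.items():
        lines += [f"## {category}", "| Description | Path |", "|-------------|------|"]
        for description, path in entries:
            # Escape pipes so a description cannot break the table row.
            safe = description.replace("|", "\\|")
            lines.append(f"| {safe} | `{path}` |")
    return "\n".join(lines)


index = render_index(
    "https://github.com/owner/repo",
    "main",
    {"Guides": [("Install and first-run walkthrough", "docs/install.md")]},
)
print(index)
```

Keeping the description in the first column follows the stated rule: the text used for relevance matching is read first, and the path serves only as the retrieval key.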
Enumerate files:
curl -sL "https://api.github.com/repos/OWNER/REPO/git/trees/BRANCH?recursive=1"
Fetch content:
curl -s "https://api.github.com/repos/OWNER/REPO/contents/PATH?ref=BRANCH" \
-H "Accept: application/vnd.github+json" | \
python3 -c "import sys,json,base64; print(base64.b64decode(json.load(sys.stdin)['content']).decode())"
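The two curl commands above can also be expressed with the Python standard library. The sketch below is an assumption-laden equivalent, not part of the skill: the helper names are hypothetical, and the optional `token` argument simply sends a PAT as a Bearer header for private repos or higher rate limits.

```python
import base64
import json
import urllib.request

API = "https://api.github.com"


def tree_url(owner, repo, branch):
    """URL that enumerates every file on a branch."""
    return f"{API}/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"


def contents_url(owner, repo, path, branch):
    """URL that returns one file as a JSON payload."""
    return f"{API}/repos/{owner}/{repo}/contents/{path}?ref={branch}"


def decode_contents(payload):
    """Decode the base64 `content` field of a contents API response."""
    # b64decode ignores the newlines GitHub inserts into the encoded text.
    return base64.b64decode(payload["content"]).decode("utf-8")


def fetch_file(owner, repo, path, branch, token=None):
    """Fetch and decode a single file (performs a network request)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(
        contents_url(owner, repo, path, branch), headers=headers
    )
    with urllib.request.urlopen(request) as response:
        return decode_contents(json.load(response))
```

A call such as `fetch_file("OWNER", "REPO", "README.md", "main")` returns the file text; unauthenticated requests are subject to GitHub's lower rate limits.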
Allowlist: api.github.com, raw.githubusercontent.com
accessing-github-repos - Private repos, PAT setup, tarball download
mapping-codebases - Detailed code structure (methods, imports, line numbers)
For token-constrained project knowledge, use the condensed script:
python scripts/pk_index.py owner/repo -o repo_pk.md
Produces ~80% smaller output:
path — description
Ideal when adding multiple repo indexes to project knowledge.