graphify/SKILL.md
Use for any question about a codebase, its architecture, file relationships, or project content — especially when graphify-out/ exists, where the question should be treated as a graphify query first. Turns any input (code, docs, papers, images, videos) into a persistent knowledge graph with god nodes, community detection, and query/path/explain tools.
npx skillsauth add safishamsi/graphify graphifyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
4 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Turn any folder of files into a navigable knowledge graph with community detection, an honest audit trail, and three outputs: interactive HTML, GraphRAG-ready JSON, and a plain-language GRAPH_REPORT.md.
/graphify # full pipeline on current directory → Obsidian vault
/graphify <path> # full pipeline on specific path
/graphify https://github.com/<owner>/<repo> # clone repo then run full pipeline on it
/graphify https://github.com/<owner>/<repo> --branch <branch> # clone a specific branch
/graphify <url1> <url2> ... # clone multiple repos, build each, merge into one cross-repo graph
/graphify <path> --mode deep # thorough extraction, richer INFERRED edges
/graphify <path> --update # incremental - re-extract only new/changed files
/graphify <path> --directed # build directed graph (preserves edge direction: source→target)
/graphify <path> --whisper-model medium # use a larger Whisper model for better transcription accuracy
/graphify <path> --cluster-only # rerun clustering on existing graph
/graphify <path> --no-viz # skip visualization, just report + JSON
/graphify <path> --html # (HTML is generated by default - this flag is a no-op)
/graphify <path> --svg # also export graph.svg (embeds in Notion, GitHub)
/graphify <path> --graphml # export graph.graphml (Gephi, yEd)
/graphify <path> --neo4j # generate graphify-out/cypher.txt for Neo4j
/graphify <path> --neo4j-push bolt://localhost:7687 # push directly to Neo4j
/graphify <path> --mcp # start MCP stdio server for agent access
/graphify <path> --watch # watch folder, auto-rebuild on code changes (no LLM needed)
/graphify <path> --wiki # build agent-crawlable wiki (index.md + one article per community)
/graphify <path> --obsidian --obsidian-dir ~/vaults/my-project # write vault to custom path (e.g. existing vault)
/graphify add <url> # fetch URL, save to ./raw, update graph
/graphify add <url> --author "Name" # tag who wrote it
/graphify add <url> --contributor "Name" # tag who added it to the corpus
/graphify query "<question>" # BFS traversal - broad context
/graphify query "<question>" --dfs # DFS - trace a specific path
/graphify query "<question>" --budget 1500 # cap answer at N tokens
/graphify path "AuthModule" "Database" # shortest path between two concepts
/graphify explain "SwinTransformer" # plain-language explanation of a node
Drop any folder of code, docs, papers, images, or video into graphify and get a queryable knowledge graph. Persistent across sessions, honest audit trail (EXTRACTED/INFERRED/AMBIGUOUS), community detection surfaces cross-document connections you wouldn't think to ask about.
If the user invoked /graphify --help or /graphify -h (with no other arguments), print the contents of the ## Usage section above verbatim and stop. Do not run any commands, do not detect files, do not default the path to .. Just print the Usage block and return.
Fast path — existing graph: Before doing anything else, check whether graphify-out/graph.json exists. The expected location is graphify-out/graph.json relative to the current working directory (i.e. the project root where you are running commands). If it exists AND the user's request is a natural-language question about the codebase (e.g. "How does X work?", "What calls Y?", "Trace the data flow through Z") and NOT an explicit rebuild command (--update, --cluster-only, or a bare path/URL that implies fresh extraction): skip Steps 1–5 entirely and jump straight to ## For /graphify query. Run graphify query "<question>" immediately. Do not run detect. Do not check corpus size. Do not ask the user to narrow. The graph is already built — use it.
If no path was given, use . (current directory). Do not ask the user for a path.
If the path argument starts with https://github.com/ or http://github.com/, treat it as a GitHub URL - run Step 0 before anything else, then continue with the resolved local path.
Follow these steps in order. Do not skip steps.
Only when the path is one or more https://github.com/... URLs, or several local subfolders to merge. See references/github-and-merge.md for the clone, cross-repo merge, and monorepo flow, then continue with the resolved local path. A plain local path skips this step.
# Detect the correct Python interpreter (handles uv tool, pipx, venv, system installs)
PYTHON=""
GRAPHIFY_BIN=$(which graphify 2>/dev/null)
# 1. uv tool installs — most reliable on modern Mac/Linux
if [ -z "$PYTHON" ] && command -v uv >/dev/null 2>&1; then
_UV_PY=$(uv tool run graphifyy python -c "import sys; print(sys.executable)" 2>/dev/null)
if [ -n "$_UV_PY" ]; then PYTHON="$_UV_PY"; fi
fi
# 2. Read shebang from graphify binary (pipx and direct pip installs)
if [ -z "$PYTHON" ] && [ -n "$GRAPHIFY_BIN" ]; then
_SHEBANG=$(head -1 "$GRAPHIFY_BIN" | tr -d '#!')
case "$_SHEBANG" in
*[!a-zA-Z0-9/_.-]*) ;;
*) "$_SHEBANG" -c "import graphify" 2>/dev/null && PYTHON="$_SHEBANG" ;;
esac
fi
# 3. Fall back to python3
if [ -z "$PYTHON" ]; then PYTHON="python3"; fi
if ! "$PYTHON" -c "import graphify" 2>/dev/null; then
if command -v uv >/dev/null 2>&1; then
uv tool install --upgrade graphifyy -q 2>&1 | tail -3
_UV_PY=$(uv tool run graphifyy python -c "import sys; print(sys.executable)" 2>/dev/null)
if [ -n "$_UV_PY" ]; then PYTHON="$_UV_PY"; fi
else
"$PYTHON" -m pip install graphifyy -q 2>/dev/null \
|| "$PYTHON" -m pip install graphifyy -q --break-system-packages 2>&1 | tail -3
fi
fi
# Write interpreter path for all subsequent steps (persists across invocations)
mkdir -p graphify-out
"$PYTHON" -c "import sys; open('graphify-out/.graphify_python', 'w', encoding='utf-8').write(sys.executable)"
# Save scan root so `graphify update` (no args) knows where to look next time
echo "$(cd INPUT_PATH && pwd)" > graphify-out/.graphify_root
If the import succeeds, print nothing and move straight to Step 2.
In every subsequent bash block, replace python3 with $(cat graphify-out/.graphify_python) to use the correct interpreter.
$(cat graphify-out/.graphify_python) -c "
import json
from graphify.detect import detect
from pathlib import Path
result = detect(Path('INPUT_PATH'))
print(json.dumps(result, ensure_ascii=False))
" > graphify-out/.graphify_detect.json
Replace INPUT_PATH with the actual path the user provided. Do NOT cat or print the JSON - read it silently and present a clean summary instead:
Corpus: X files · ~Y words
code: N files (.py .ts .go ...)
docs: N files (.md .txt ...)
papers: N files (.pdf ...)
images: N files
video: N files (.mp4 .mp3 ...)
Omit any category with 0 files from the summary.
Then act on it:
total_files is 0: stop with "No supported files found in [path]."skipped_sensitive is non-empty: mention file count skipped, not the file names.total_words > 2,000,000 OR total_files > 500: show the warning. Then compute the top 5 first-level subdirectories by file count:
scan_root from the detect JSON (always an absolute path to the resolved INPUT_PATH).code, document, paper, image, video).scan_root + "/graphify-out/" to exclude converted sidecars.scan_root prefix and take the first path component. Files directly in scan_root with no subdirectory count as (root).(root) with no subdirectories, do not ask to narrow — no subfolders exist. Instead suggest --no-cluster to skip the expensive clustering step and proceed.Skip this step entirely if detect returned zero video files. When the corpus has video or audio, see references/transcribe.md to transcribe them to text first, then treat the transcripts as doc files in Step 3.
Before starting: note whether --mode deep was given. You must pass DEEP_MODE=true to every subagent in Step B2 if it was. Track this from the original invocation - do not lose it.
This step has two parts: structural extraction (deterministic, free) and semantic extraction (LLM, costs tokens).
Before dispatching subagents: check whether GEMINI_API_KEY or GOOGLE_API_KEY is set. If neither is set, print this one-liner to the user:
Tip: set
GEMINI_API_KEYorGOOGLE_API_KEYto use Gemini for semantic extraction (pip install 'graphifyy[gemini]').
Print it once, then continue. If GEMINI_API_KEY or GOOGLE_API_KEY IS set, use graphify.llm.extract_corpus_parallel(files, backend="gemini") for semantic extraction instead of dispatching Claude subagents. The default Gemini model is gemini-3-flash-preview; set GRAPHIFY_GEMINI_MODEL or pass --model in headless CLI flows to override it.
No other API keys are read. If
GEMINI_API_KEY/GOOGLE_API_KEYare unset, fall straight through to Claude Code subagent dispatch (Part B below) — the host session itself is the LLM. graphify does not readANTHROPIC_API_KEY,OPENAI_API_KEY, or any other provider key from the environment. If a host agent prompts the user forANTHROPIC_API_KEYto run extraction, that prompt is a misread of this skill — ignore it and dispatch subagents as written.
Run Part A (AST) and Part B (semantic) in parallel. Dispatch all semantic subagents AND start AST extraction in the same message. Both can run simultaneously since they operate on different file types. Merge results in Part C as before.
Note: Parallelizing AST + semantic saves 5-15s on large corpora. AST is deterministic and fast; start it while subagents are processing docs/papers.
For any code files detected, run AST extraction in parallel with Part B subagents:
$(cat graphify-out/.graphify_python) -c "
import sys, json
from graphify.extract import collect_files, extract
from pathlib import Path
import json
code_files = []
detect = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
for f in detect.get('files', {}).get('code', []):
code_files.extend(collect_files(Path(f)) if Path(f).is_dir() else [Path(f)])
if code_files:
result = extract(code_files, cache_root=Path('.'))
Path('graphify-out/.graphify_ast.json').write_text(json.dumps(result, indent=2, ensure_ascii=False), encoding=\"utf-8\")
print(f'AST: {len(result[\"nodes\"])} nodes, {len(result[\"edges\"])} edges')
else:
Path('graphify-out/.graphify_ast.json').write_text(json.dumps({'nodes':[],'edges':[],'input_tokens':0,'output_tokens':0}, ensure_ascii=False), encoding=\"utf-8\")
print('No code files - skipping AST extraction')
"
Fast path: If detection found zero docs, papers, and images (code-only corpus), skip Part B entirely and go straight to Part C. AST handles code - there is nothing for semantic subagents to do.
MANDATORY: You MUST use the Agent tool here. Reading files yourself one-by-one is forbidden - it is 5-10x slower. If you do not use the Agent tool you are doing this wrong.
Before dispatching subagents, print a timing estimate:
total_words and file counts from graphify-out/.graphify_detect.jsonceil(uncached_non_code_files / 22) (chunk size is 20-25)Step B0 - Check extraction cache first
Before dispatching any subagents, check which files already have cached extraction results:
$(cat graphify-out/.graphify_python) -c "
import json
from graphify.cache import check_semantic_cache
from pathlib import Path
detect = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
all_files = [f for files in detect['files'].values() for f in files]
cached_nodes, cached_edges, cached_hyperedges, uncached = check_semantic_cache(all_files)
if cached_nodes or cached_edges or cached_hyperedges:
Path('graphify-out/.graphify_cached.json').write_text(json.dumps({'nodes': cached_nodes, 'edges': cached_edges, 'hyperedges': cached_hyperedges}, ensure_ascii=False), encoding=\"utf-8\")
Path('graphify-out/.graphify_uncached.txt').write_text('\n'.join(uncached), encoding=\"utf-8\")
print(f'Cache: {len(all_files)-len(uncached)} files hit, {len(uncached)} files need extraction')
"
Only dispatch subagents for files listed in graphify-out/.graphify_uncached.txt. If all files are cached, skip to Part C directly.
Step B1 - Split into chunks
Load files from graphify-out/.graphify_uncached.txt. Split into chunks of 20-25 files each. Each image gets its own chunk (vision needs separate context). When splitting, group files from the same directory together so related artifacts land in the same chunk and cross-file relationships are more likely to be extracted.
Step B2 - Dispatch ALL subagents in a single message
Call the Agent tool multiple times IN THE SAME RESPONSE - one call per chunk. This is the only way they run in parallel. If you make one Agent call, wait, then make another, you are doing it sequentially and defeating the purpose.
IMPORTANT - subagent type: Always use subagent_type="general-purpose". Do NOT use Explore - it is read-only and cannot write chunk files to disk, which silently drops extraction results. General-purpose has Write and Bash access which the subagent needs.
Concrete example for 3 chunks:
[Agent tool call 1: files 1-15, subagent_type="general-purpose"]
[Agent tool call 2: files 16-30, subagent_type="general-purpose"]
[Agent tool call 3: files 31-45, subagent_type="general-purpose"]
All three in one message. Not three separate messages.
Each subagent receives this exact prompt (substitute FILE_LIST, CHUNK_NUM, TOTAL_CHUNKS, DEEP_MODE, and CHUNK_PATH).
CHUNK_PATH must be an absolute path — derive it before dispatching:
PROJECT_ROOT=$(cat graphify-out/.graphify_root)
# Then for chunk N: CHUNK_PATH="${PROJECT_ROOT}/graphify-out/.graphify_chunk_0N.json"
Subagent prompt template:
See references/extraction-spec.md for the exact subagent prompt (JSON schema, node-ID rules, confidence rubric, frontmatter, hyperedge, and vision rules). Load it only here, only when at least one chunk holds a doc, paper, or image; a pure-code corpus has skipped Part B and never reads it. Pass each subagent that prompt verbatim with FILE_LIST, CHUNK_NUM, TOTAL_CHUNKS, DEEP_MODE, and CHUNK_PATH substituted, and have it write the result to CHUNK_PATH.
Step B3 - Collect, cache, and merge
Wait for all subagents. For each result:
graphify-out/.graphify_chunk_NN.json exists on disk — this is the success signalnodes and edges, include it and save to cacheIf more than half the chunks failed or are missing, stop and tell the user to re-run and ensure subagent_type="general-purpose" is used.
Merge all chunk files into .graphify_semantic_new.json. After each Agent call completes, read the real token counts from the Agent tool result's usage field and write them back into the chunk JSON before merging — the chunk JSON itself always has placeholder zeros. Then run:
$(cat graphify-out/.graphify_python) -c "
import json, glob
from pathlib import Path
chunks = sorted(glob.glob('graphify-out/.graphify_chunk_*.json'))
all_nodes, all_edges, all_hyperedges = [], [], []
total_in, total_out = 0, 0
for c in chunks:
d = json.loads(Path(c).read_text(encoding=\"utf-8\"))
all_nodes += d.get('nodes', [])
all_edges += d.get('edges', [])
all_hyperedges += d.get('hyperedges', [])
total_in += d.get('input_tokens', 0)
total_out += d.get('output_tokens', 0)
Path('graphify-out/.graphify_semantic_new.json').write_text(json.dumps({
'nodes': all_nodes, 'edges': all_edges, 'hyperedges': all_hyperedges,
'input_tokens': total_in, 'output_tokens': total_out,
}, indent=2, ensure_ascii=False), encoding=\"utf-8\")
print(f'Merged {len(chunks)} chunks: {total_in:,} in / {total_out:,} out tokens')
"
Save new results to cache:
$(cat graphify-out/.graphify_python) -c "
import json
from graphify.cache import save_semantic_cache
from pathlib import Path
new = json.loads(Path('graphify-out/.graphify_semantic_new.json').read_text(encoding=\"utf-8\")) if Path('graphify-out/.graphify_semantic_new.json').exists() else {'nodes':[],'edges':[],'hyperedges':[]}
saved = save_semantic_cache(new.get('nodes', []), new.get('edges', []), new.get('hyperedges', []))
print(f'Cached {saved} files')
"
Merge cached + new results into graphify-out/.graphify_semantic.json:
$(cat graphify-out/.graphify_python) -c "
import json
from pathlib import Path
cached = json.loads(Path('graphify-out/.graphify_cached.json').read_text(encoding=\"utf-8\")) if Path('graphify-out/.graphify_cached.json').exists() else {'nodes':[],'edges':[],'hyperedges':[]}
new = json.loads(Path('graphify-out/.graphify_semantic_new.json').read_text(encoding=\"utf-8\")) if Path('graphify-out/.graphify_semantic_new.json').exists() else {'nodes':[],'edges':[],'hyperedges':[]}
all_nodes = cached['nodes'] + new.get('nodes', [])
all_edges = cached['edges'] + new.get('edges', [])
all_hyperedges = cached.get('hyperedges', []) + new.get('hyperedges', [])
seen = set()
deduped = []
for n in all_nodes:
if n['id'] not in seen:
seen.add(n['id'])
deduped.append(n)
merged = {
'nodes': deduped,
'edges': all_edges,
'hyperedges': all_hyperedges,
'input_tokens': new.get('input_tokens', 0),
'output_tokens': new.get('output_tokens', 0),
}
Path('graphify-out/.graphify_semantic.json').write_text(json.dumps(merged, indent=2, ensure_ascii=False), encoding=\"utf-8\")
print(f'Extraction complete - {len(deduped)} nodes, {len(all_edges)} edges ({len(cached[\"nodes\"])} from cache, {len(new.get(\"nodes\",[]))} new)')
"
Clean up temp files: rm -f graphify-out/.graphify_cached.json graphify-out/.graphify_uncached.txt graphify-out/.graphify_semantic_new.json
$(cat graphify-out/.graphify_python) -c "
import sys, json
from pathlib import Path
ast = json.loads(Path('graphify-out/.graphify_ast.json').read_text(encoding=\"utf-8\"))
sem = json.loads(Path('graphify-out/.graphify_semantic.json').read_text(encoding=\"utf-8\"))
# Merge: AST nodes first, semantic nodes deduplicated by id
seen = {n['id'] for n in ast['nodes']}
merged_nodes = list(ast['nodes'])
for n in sem['nodes']:
if n['id'] not in seen:
merged_nodes.append(n)
seen.add(n['id'])
merged_edges = ast['edges'] + sem['edges']
merged_hyperedges = sem.get('hyperedges', [])
merged = {
'nodes': merged_nodes,
'edges': merged_edges,
'hyperedges': merged_hyperedges,
'input_tokens': sem.get('input_tokens', 0),
'output_tokens': sem.get('output_tokens', 0),
}
Path('graphify-out/.graphify_extract.json').write_text(json.dumps(merged, indent=2, ensure_ascii=False), encoding=\"utf-8\")
total = len(merged_nodes)
edges = len(merged_edges)
print(f'Merged: {total} nodes, {edges} edges ({len(ast[\"nodes\"])} AST + {len(sem[\"nodes\"])} semantic)')
"
Before starting: note whether --directed was given. If so, pass directed=True to build_from_json() in the code block below. This builds a DiGraph that preserves edge direction (source→target) instead of the default undirected Graph.
mkdir -p graphify-out
$(cat graphify-out/.graphify_python) -c "
import sys, json
from graphify.build import build_from_json
from graphify.cluster import cluster, score_all
from graphify.analyze import god_nodes, surprising_connections, suggest_questions
from graphify.report import generate
from graphify.export import to_json
from pathlib import Path
extraction = json.loads(Path('graphify-out/.graphify_extract.json').read_text(encoding=\"utf-8\"))
detection = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
G = build_from_json(extraction)
communities = cluster(G)
cohesion = score_all(G, communities)
tokens = {'input': extraction.get('input_tokens', 0), 'output': extraction.get('output_tokens', 0)}
gods = god_nodes(G)
surprises = surprising_connections(G, communities)
labels = {cid: 'Community ' + str(cid) for cid in communities}
# Placeholder questions - regenerated with real labels in Step 5
questions = suggest_questions(G, communities, labels)
report = generate(G, communities, cohesion, labels, gods, surprises, detection, tokens, '.', suggested_questions=questions)
Path('graphify-out/GRAPH_REPORT.md').write_text(report, encoding=\"utf-8\")
to_json(G, communities, 'graphify-out/graph.json')
analysis = {
'communities': {str(k): v for k, v in communities.items()},
'cohesion': {str(k): v for k, v in cohesion.items()},
'gods': gods,
'surprises': surprises,
'questions': questions,
}
Path('graphify-out/.graphify_analysis.json').write_text(json.dumps(analysis, indent=2, ensure_ascii=False), encoding=\"utf-8\")
if G.number_of_nodes() == 0:
print('ERROR: Graph is empty - extraction produced no nodes.')
print('Possible causes: all files were skipped, binary-only corpus, or extraction failed.')
raise SystemExit(1)
print(f'Graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges, {len(communities)} communities')
"
If this step prints ERROR: Graph is empty, stop and tell the user what happened - do not proceed to labeling or visualization.
Replace INPUT_PATH with the actual path.
Read graphify-out/.graphify_analysis.json. For each community key, look at its node labels and write a 2-5 word plain-language name (e.g. "Attention Mechanism", "Training Pipeline", "Data Loading").
Then regenerate the report and save the labels for the visualizer:
$(cat graphify-out/.graphify_python) -c "
import sys, json
from graphify.build import build_from_json
from graphify.cluster import score_all
from graphify.analyze import god_nodes, surprising_connections, suggest_questions
from graphify.report import generate
from pathlib import Path
extraction = json.loads(Path('graphify-out/.graphify_extract.json').read_text(encoding=\"utf-8\"))
detection = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
analysis = json.loads(Path('graphify-out/.graphify_analysis.json').read_text(encoding=\"utf-8\"))
G = build_from_json(extraction)
communities = {int(k): v for k, v in analysis['communities'].items()}
cohesion = {int(k): v for k, v in analysis['cohesion'].items()}
tokens = {'input': extraction.get('input_tokens', 0), 'output': extraction.get('output_tokens', 0)}
# LABELS - replace these with the names you chose above
labels = LABELS_DICT
# Regenerate questions with real community labels (labels affect question phrasing)
questions = suggest_questions(G, communities, labels)
report = generate(G, communities, cohesion, labels, analysis['gods'], analysis['surprises'], detection, tokens, '.', suggested_questions=questions)
Path('graphify-out/GRAPH_REPORT.md').write_text(report, encoding=\"utf-8\")
Path('graphify-out/.graphify_labels.json').write_text(json.dumps({str(k): v for k, v in labels.items()}, ensure_ascii=False), encoding=\"utf-8\")
print('Report updated with community labels')
"
Replace LABELS_DICT with the actual dict you constructed (e.g. {0: "Attention Mechanism", 1: "Training Pipeline"}).
Replace INPUT_PATH with the actual path.
Generate HTML always (unless --no-viz). Obsidian vault only if --obsidian was explicitly given — skip it otherwise, it generates one file per node.
If --obsidian was given:
--obsidian-dir <path> was also given, pass it via --dir. Otherwise defaults to graphify-out/obsidian.graphify export obsidian
# or with custom dir: graphify export obsidian --dir ~/vaults/my-project
Generate the HTML graph (always, unless --no-viz):
graphify export html # auto-aggregates to community view if graph > 5000 nodes
# or: graphify export html --no-viz
These run only when their flag is present (--wiki, --neo4j/--neo4j-push, --svg, --graphml, --mcp) or, for the token-reduction benchmark, when total_words exceeds 5,000. A default run with no export flags skips all of them. See references/exports.md for each one. Run any --wiki export before Step 9 cleanup so .graphify_labels.json is still available.
$(cat graphify-out/.graphify_python) -c "
import json
from pathlib import Path
from datetime import datetime, timezone
from graphify.detect import save_manifest
# Save manifest for --update
detect = json.loads(Path('graphify-out/.graphify_detect.json').read_text(encoding=\"utf-8\"))
# In --update mode, 'all_files' carries the full corpus; 'files' is the changed
# subset. Full-rebuild mode populates only 'files', so the fallback handles that.
save_manifest(detect.get('all_files') or detect['files'])
# Update cumulative cost tracker
extract = json.loads(Path('graphify-out/.graphify_extract.json').read_text(encoding=\"utf-8\"))
input_tok = extract.get('input_tokens', 0)
output_tok = extract.get('output_tokens', 0)
cost_path = Path('graphify-out/cost.json')
if cost_path.exists():
cost = json.loads(cost_path.read_text(encoding=\"utf-8\"))
else:
cost = {'runs': [], 'total_input_tokens': 0, 'total_output_tokens': 0}
cost['runs'].append({
'date': datetime.now(timezone.utc).isoformat(),
'input_tokens': input_tok,
'output_tokens': output_tok,
'files': detect.get('total_files', 0),
})
cost['total_input_tokens'] += input_tok
cost['total_output_tokens'] += output_tok
cost_path.write_text(json.dumps(cost, indent=2, ensure_ascii=False), encoding=\"utf-8\")
print(f'This run: {input_tok:,} input tokens, {output_tok:,} output tokens')
print(f'All time: {cost[\"total_input_tokens\"]:,} input, {cost[\"total_output_tokens\"]:,} output ({len(cost[\"runs\"])} runs)')
"
rm -f graphify-out/.graphify_detect.json graphify-out/.graphify_extract.json graphify-out/.graphify_ast.json graphify-out/.graphify_semantic.json graphify-out/.graphify_analysis.json graphify-out/.graphify_chunk_*.json
rm -f graphify-out/.needs_update 2>/dev/null || true
Tell the user (omit the obsidian line unless --obsidian was given):
Graph complete. Outputs in PATH_TO_DIR/graphify-out/
graph.html - interactive graph, open in browser
GRAPH_REPORT.md - audit report
graph.json - raw graph data
obsidian/ - Obsidian vault (only if --obsidian was given)
If graphify saved you time, consider supporting it: https://github.com/sponsors/safishamsi
Replace PATH_TO_DIR with the actual absolute path of the directory that was processed.
Then paste these sections from GRAPH_REPORT.md directly into the chat:
Do NOT paste the full report - just those three sections. Keep it concise.
Then immediately offer to explore. Pick the single most interesting suggested question from the report - the one that crosses the most community boundaries or has the most surprising bridge node - and ask:
"The most interesting question this graph can answer: [question]. Want me to trace it?"
If the user says yes, run /graphify query "[question]" on the graph and walk them through the answer using the graph structure - which nodes connect, which community boundaries get crossed, what the path reveals. Keep going as long as they want to explore. Each answer should end with a natural follow-up ("this connects to X - want to go deeper?") so the session feels like navigation, not a one-shot report.
The graph is the map. Your job after the pipeline is to be the guide.
Before running any subcommand below (--update, --cluster-only, query, path, explain, add), check that .graphify_python exists. If it's missing (e.g. user deleted graphify-out/), re-resolve the interpreter first:
if [ ! -f graphify-out/.graphify_python ]; then
GRAPHIFY_BIN=$(which graphify 2>/dev/null)
if [ -n "$GRAPHIFY_BIN" ]; then
PYTHON=$(head -1 "$GRAPHIFY_BIN" | tr -d '#!')
case "$PYTHON" in *[!a-zA-Z0-9/_.-]*) PYTHON="python3" ;; esac
else
PYTHON="python3"
fi
mkdir -p graphify-out
"$PYTHON" -c "import sys; open('graphify-out/.graphify_python', 'w', encoding='utf-8').write(sys.executable)"
fi
Both are non-default subcommands. --update re-extracts only new or changed files; --cluster-only reruns clustering on the existing graph. See references/update.md for both flows.
When graphify-out/graph.json already exists and the user asks a question about the corpus, run the query directly:
graphify query "<question>"
Answer using only what the graph output contains, and quote source_location when citing a specific fact. Before traversal, expand the question against the graph's own vocabulary so a wording mismatch does not collapse the answer to noise. For that vocab-expansion step, the --dfs / --budget modes, save-result feedback, and the /graphify path and /graphify explain flows, see references/query.md.
Neither is part of the default build. When the user runs /graphify add <url> to fetch a URL into the corpus, or passes --watch to auto-rebuild on file changes, see references/add-watch.md.
When the user asks to install the post-commit auto-rebuild hook or wire graphify into a project's CLAUDE.md, see references/hooks.md.
tools
Use when work should span one or more detached tasks but still behave like one job with a single owner context. TaskFlow is the durable flow substrate under authoring layers like Lobster, ACPX, plugins, or plain code. Keep conditional logic in the caller; use TaskFlow for flow identity, child-task linkage, waiting state, revision-checked mutations, and user-facing emergence.
tools
# Lobster Lobster executes multi-step workflows with approval checkpoints. Use it when: - User wants a repeatable automation (triage, monitor, sync) - Actions need human approval before executing (send, post, delete) - Multiple tool calls should run as one deterministic operation ## When to use Lobster | User intent | Use Lobster? | | ------------------------------------------------------ | --------------------------
tools
# Lobster Lobster executes multi-step workflows with approval checkpoints. Use it when: - User wants a repeatable automation (triage, monitor, sync) - Actions need human approval before executing (send, post, delete) - Multiple tool calls should run as one deterministic operation ## When to use Lobster | User intent | Use Lobster? | | ------------------------------------------------------ | --------------------------
tools
A CLI tool for making authenticated requests to the X (Twitter) API. Use this skill when you need to post tweets, reply, quote, search, read posts, manage followers, send DMs, upload media, or interact with any X API v2 endpoint.