plugins/claude-memory/skills/get-token-insights/SKILL.md
Use this skill when the user wants to analyze Claude token usage, understand Claude API spending, check cache hit rates, review Claude Code workflow patterns (skills, agents, hooks), or get cost optimization recommendations.
npx skillsauth add gupsammy/claudest get-token-insightsInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Parse JSONL conversation files from ~/.claude/projects/*/ into per-turn analytics tables, then analyze both cost-optimization opportunities and Claude Code workflow patterns (skills, agents, hooks).
Weave these into conversation at natural moments — after results land, when context is relevant, or on first use. One or two per run, not all at once.
python3 ${CLAUDE_PLUGIN_ROOT}/skills/get-token-insights/scripts/ingest_token_data.py
First run processes all files (~100s for ~2500 files) — warn the user about the wait before running. Incremental runs complete in under 5s. The script populates analytics tables, deploys an interactive dashboard to ~/.claude-memory/dashboard.html (built from templates/dashboard.html), and prints a slim JSON blob to stdout (full data goes to dashboard only).
If the script exits non-zero, report the error and stop.
After parsing the JSON stdout from Step 1, construct a personalized prompt for a claude-code-guide agent using the actual data — not generic descriptions. For each of the top 3 insights (by waste_usd), include verbatim: the finding text, root_cause text, waste_usd value, solution.action, and solution.detail. Also include the specific project names, counts, and numbers mentioned in the insight (e.g. "meta-ads-cli: 75 cliffs across 53 sessions") so the agent's response is grounded in the user's real usage patterns.
Spawn the agent with subagent_type: "claude-code-guide" in foreground (do not use run_in_background). Wait for the agent to return before proceeding to Step 2. Weave its suggestions into the analysis in Step 2.
Capture the JSON stdout from Step 1 as the analysis input. Structure the analysis in two parts:
State the total spend, session count, date range, and average cost per session in one paragraph.
For each insight from the insights array (sorted by waste_usd):
If cache_bust_ttl_impact.material == true, weave into the Priority Insights narrative: "Your 5-min cache-bust costs average $X.XX/day — a protective hook set can surface a warning before each rebuild fires (Step 4 at the end of this run will offer to install it)."
Compare cost across models. If one model dominates spend, call it out and estimate savings from switching routine tasks to a cheaper model.
List top 3 projects by dollar spend. For the most expensive project, identify what drives the cost.
Summarize which skills are invoked most, error rates per skill, and any skills that appear underused relative to the user's workflow.
Show which subagent types are spawned, how often, and whether model overrides are being used. Flag if subagent_type is frequently omitted (defaults to general-purpose when Explore would suffice).
Identify the slowest hooks by total runtime and average latency. Flag any hooks with high error rates.
If the trends object in the JSON output is non-empty, present a week-on-week comparison:
State the current and prior window session counts and total cost.
For each item in trends.improved, state the metric and its percentage change. Explain why it likely improved if you can infer from context (e.g., hook fix, retired skill, CLAUDE.md rule).
For each item in trends.regressed, flag it and suggest what might have caused it.
List any new or retired skills and hooks. For new items, note whether they appear intentional. For retired items, confirm they are no longer needed.
Highlight the hooks with the biggest latency changes (from trends.hook_trends). For hooks that improved significantly, credit the fix. For hooks that got slower, flag for investigation.
If trends is empty or has no current_window, skip Part C and note that not enough historical data exists for comparison yet.
Present the full analysis as markdown with the sections above. Do not pause or ask questions — proceed immediately to Step 3.
python3 -c "import webbrowser, pathlib; webbrowser.open((pathlib.Path.home() / '.claude-memory' / 'dashboard.html').as_uri())"
Note the dashboard is available for deeper exploration — Section 2 (Context Management) shows the cache-bust cost charts and an amber alert banner if costs are material. Section 6 (Claude Code Ecosystem) has the skill, agent, and hook charts.
If cache_bust_ttl_impact.material == true, tell the user: "One last thing coming up — I'll offer to install the cache-bust warning hooks." Then proceed immediately to Step 4 without pausing.
Run this step only if cache_bust_ttl_impact.material == true in the JSON from Step 1.
First run the status check:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/get-token-insights/scripts/install_cache_hooks.py --status
Then use AskUserQuestion with exactly three options:
~/.claude/hooks/ and wires them into ~/.claude/settings.json. Backs up settings.json before any write.Full explanation text (for option 2): These hooks create a 3-step warning ladder when you go idle. When Claude stops responding, a timestamp is written. When you start a resumed session, a flag is set. When you next type a prompt after 5+ minutes of idle time, the prompt is blocked once with a cost warning — you see the message, your text stays in the box. From there:
One warning per idle gap. Not per prompt. After you confirm once, subsequent prompts in the same idle gap go through without interruption. There is no nag loop.
If user selects Install:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/get-token-insights/scripts/install_cache_hooks.py
If the script reports "All hooks already installed", confirm that to the user and skip. If it installs, tell the user: "Restart Claude Code (quit + reopen) for the hooks to activate."
If cache_bust_ttl_impact.material == false, skip Step 4 entirely.
After the AskUserQuestion in Step 4 resolves — regardless of which option the user chose — ask: "Want to dive deeper into any specific project, skill, or insight from the analysis?"
tools
This skill should be used when the user asks to "design a CLI", "help me design command-line flags", "what flags should my tool have", "create a CLI spec", "refactor my CLI interface", "design a CLI my agent can call", or wants to design command-line UX (args/flags/subcommands/help/output/errors/config) before implementation or audit an existing CLI surface for consistency and composability.
testing
Recall, search, continue, or analyze past conversations. Triggers on recall phrases ("what did we discuss", "continue where we left off", "we decided"), retrospective phrases ("do a retro", "post-mortem", "what went well", "lessons learned", "find antipatterns"), and implicit signals (past-tense references, possessives without context, assumptive questions like "do you remember").
data-ai
Persist learnings to memory or maintain existing memories. Triggers on "extract learnings", "save this for next time", "remember this pattern", "consolidate memories", "dream", "clean up memories".
development
Use for any image creation or editing request — logo, sticker, product mockup, nano banana, t2i, i2i, multi-reference compositing via generate.py. Not for HTML/CSS mockups, diagrams, or coded UI.