skills/mnemos/SKILL.md
Task-scoped memory lifecycle — typed MnemoGraph prevents lossy context compaction by treating facts/decisions/code-refs/handoffs as distinct node types with per-type eviction policies
npx skillsauth add alinaqi/claude-bootstrap mnemosInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Mnemos prevents lossy context compaction from destroying the structured knowledge you need most. It treats your working memory as a typed graph (MnemoGraph) where different types of knowledge have different eviction policies:
Mnemos monitors 4 dimensions of "agent fatigue" — all passively observed from hook data, no manual input needed:
| Dimension | Weight | Signal Source | What It Measures | |-----------|--------|--------------|-----------------| | Token utilization | 0.40 | Statusline JSON | How full the context window is | | Scope scatter | 0.25 | PreToolUse file paths | How many directories the agent is bouncing between | | Re-read ratio | 0.20 | PreToolUse Read calls | How often the agent re-reads files it already read (context loss) | | Error density | 0.15 | PostToolUse outcomes | What fraction of tool calls are failing (agent struggling) |
Fatigue states and actions:
| State | Score | Action | |-------|-------|--------| | FLOW | 0.0–0.4 | Normal operation | | COMPRESS | 0.4–0.6 | Micro-consolidation runs (compress 3 ResultNodes, evict 1 cold ContextNode) | | PRE-SLEEP | 0.6–0.75 | Checkpoint written, consolidation runs | | REM | 0.75–0.9 | Emergency checkpoint, consider wrapping up | | EMERGENCY | 0.9+ | Checkpoint written, hand off immediately |
fatigue.json on every API callWhen Claude Code compacts the context (~83% full), Mnemos uses three layers:
.mnemos/just-compacted marker.mnemos-post-compact-inject.sh which detects the marker and injects. Safety net only.The result: after compaction, you'll see a "CONTEXT RESTORED AFTER COMPACTION" block with your goal, constraints, what you were working on, and progress. Resume from there.
mnemos init # Initialize .mnemos/
mnemos status # Show node counts + fatigue
mnemos fatigue # Detailed fatigue breakdown
mnemos checkpoint --force # Write checkpoint now
mnemos resume # Output checkpoint for context
mnemos consolidate # Run micro-consolidation
mnemos nodes --type goal # List active GoalNodes
mnemos add goal "Build auth" # Add a GoalNode
mnemos bridge-icpg # Import iCPG ReasonNodes
mnemos ingest-claude --all # Ingest Claude Code transcripts (see below)
mnemos haze --recent 10 # Show per-session haziness scores
Mnemos can ingest Claude Code session transcripts (the per-session JSONL under
~/.claude/projects/) and score each session's haziness — a measure of how
much the agent struggled. The Stop hook does this automatically on session
exit; it is also available manually.
What's stored: only structural fields (roles, tool names, file paths, error flags, timestamps) plus a redacted, 200-char preview of each turn. Full content is never persisted, and secrets (API keys, tokens, PEM blocks, JWTs, credentials) are redacted before anything touches disk.
Haziness is a weighted score over five dimensions, each in [0,1]:
| Dimension | Weight | What it measures |
|-----------|--------|------------------|
| correction_density | 0.30 | User corrections per eligible user turn |
| redo_ratio | 0.25 | Edits re-touched after an error |
| first_try_error_rate | 0.20 | Edits followed by errors within 3 turns |
| orphan_tool_use_rate | 0.15 | Tool calls with no matching result |
| backtrack_norm | 0.10 | git revert/reset --hard/restore calls |
The composite maps to a band: clear < 0.25 ≤ cloudy < 0.50 ≤ hazy < 0.75 ≤ lost.
mnemos ingest-claude --all # ingest every transcript + score
mnemos ingest-claude --session <id> # one session by id
mnemos ingest-claude --transcript <f> # a specific JSONL file
mnemos haze --recent 10 # table of recent sessions
mnemos haze --session <id> # per-dimension breakdown
Ingestion is idempotent (resumes via last_line_offset). Opt out per project
with touch .mnemos/claude-log.disabled.
When working on a task:
mnemos add goal "what you're trying to achieve" --task-id session-1mnemos add constraint "API backward compatibility" --scope src/api/mnemos fatiguemnemos checkpointMnemos bridges with iCPG (Intent-Augmented Code Property Graph):
mnemos bridge-icpg imports active ReasonNodes as GoalNodesEverything lives in .mnemos/ (gitignored):
mnemo.db — SQLite MnemoGraphfatigue.json — Live token metrics (updated per API call by statusline)signals.jsonl — Behavioral signal log (appended by PreToolUse + PostToolUse hooks)checkpoint-latest.json — Most recent checkpointcheckpoints/ — Archived checkpointstesting
Multi-model validation council — auto-validate plans, architecture changes, and PRs via validate-plan/review before executing
development
Mandatory code reviews via /code-review before commits and deploys
development
# Visual Validation — Autonomous Screenshot Verification ## Philosophy Every UI change should be visually verified before it ships. Peekaboo captures pixel-accurate screenshots. The system compares before/after and flags visual regressions. No manual "looks good to me" — the machine verifies what the machine built. ## Autonomous Flow ``` static/* files modified (detected by auto-review-hook or E2E testkit) ↓ peekaboo image --mode screen → ~/.maggy/visual-verify/after-{ts}.png ↓ Compa
tools
# Model Routing System ## How Routing Decisions Are Made Every user prompt goes through a 9-tier classification pipeline before any AI model processes it. The system answers three questions: 1. **Which model should handle this?** — 9-tier cost/complexity classification 2. **Is the classifier itself working?** — Cascading fallback (qwen3 → kimi → deepseek → cache) 3. **Can we verify the result?** — Tool-level fallback + auto-evaluation ### The Pipeline ``` User types prompt ↓ UserPromptS