self-improving/SKILL.md
Self-reflection + Self-criticism + learning from corrections. Agent evaluates its own work, catches mistakes, and improves permanently.
Use this skill when: the user corrects you or points out mistakes; you complete significant work and want to evaluate the outcome; you notice something in your own output that could be better; or knowledge should compound over time without manual maintenance.
Memory lives in ~/self-improving/ with a tiered structure. See memory-template.md for setup.
~/self-improving/
├── memory.md # HOT: ≤100 lines, always loaded
├── index.md # Topic index with line counts
├── projects/ # Per-project learnings
├── domains/ # Domain-specific (code, writing, comms)
├── archive/ # COLD: decayed patterns
└── corrections.md # Last 50 corrections log
| Topic | File |
|-------|------|
| Setup guide | setup.md |
| Learning mechanics | learning.md |
| Security boundaries | boundaries.md |
| Scaling rules | scaling.md |
| Memory operations | operations.md |
| Self-reflection log | reflections.md |
All data is stored in ~/self-improving/. Create it on first use:
mkdir -p ~/self-improving/{projects,domains,archive}
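The same first-use setup can be sketched in Python. This is a minimal sketch, not part of the skill itself: `init_memory` is a hypothetical helper, and creating empty `memory.md`, `index.md`, and `corrections.md` files up front is an assumption (the `mkdir` command above only creates the directories).

```python
from pathlib import Path

# Names taken from the directory layout described above.
SUBDIRS = ("projects", "domains", "archive")
# Assumption: starter files are created empty on first use.
FILES = ("memory.md", "index.md", "corrections.md")


def init_memory(root: Path) -> Path:
    """Create the tiered layout under root; safe to call repeatedly."""
    for name in SUBDIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        # touch() creates a missing file and leaves existing content untouched.
        (root / name).touch(exist_ok=True)
    return root
```

Calling `init_memory(Path.home() / "self-improving")` would target the location the skill describes; passing a temporary directory is a safer way to try it out.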
Log automatically when you notice these patterns:
Corrections → add to corrections.md, then evaluate for memory.md.
Preference signals → add to memory.md if explicit.
Pattern candidates → track, and promote after 3x.
Ignore (don't log):
After completing significant work, pause and evaluate:
When to self-reflect:
Log format:
CONTEXT: [type of task]
REFLECTION: [what I noticed]
LESSON: [what to do differently]
Example:
CONTEXT: Building Flutter UI
REFLECTION: Spacing looked off, had to redo
LESSON: Check visual spacing before showing user
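If entries are written programmatically, the three-field format above could be modeled as follows. This is a rough sketch: the `Reflection` class and `parse_reflection` function are hypothetical names, and the only thing taken from the skill is the `CONTEXT` / `REFLECTION` / `LESSON` layout.

```python
from dataclasses import dataclass


@dataclass
class Reflection:
    context: str
    reflection: str
    lesson: str

    def render(self) -> str:
        """Render the entry in the three-line log format."""
        return (
            f"CONTEXT: {self.context}\n"
            f"REFLECTION: {self.reflection}\n"
            f"LESSON: {self.lesson}"
        )


def parse_reflection(text: str) -> Reflection:
    """Parse one three-line entry back into a Reflection."""
    fields = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip().lower()] = value.strip()
    return Reflection(fields["context"], fields["reflection"], fields["lesson"])


entry = Reflection(
    context="Building Flutter UI",
    reflection="Spacing looked off, had to redo",
    lesson="Check visual spacing before showing user",
)
```

Here `entry.render()` reproduces the example entry, and `parse_reflection` reads it back into the same fields.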
Self-reflection entries follow the same promotion rules: 3x applied successfully → promote to HOT.
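The "3x, then promote" rule can be illustrated with a small in-memory counter. This is a sketch under stated assumptions: `PatternTracker` is a hypothetical class, patterns are identified by plain strings, and persistence to `memory.md` is left out.

```python
from collections import Counter

PROMOTE_AFTER = 3  # the "3x" threshold from the text


class PatternTracker:
    def __init__(self) -> None:
        self.counts = Counter()  # successful applications per pattern
        self.hot = []            # patterns already promoted to HOT

    def record(self, pattern: str) -> bool:
        """Count one successful application; return True when it triggers promotion."""
        self.counts[pattern] += 1
        if self.counts[pattern] >= PROMOTE_AFTER and pattern not in self.hot:
            self.hot.append(pattern)
            return True
        return False
```

The third call to `record` with the same pattern returns `True`; later calls return `False` because the pattern is already in the HOT list.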
| User says | Action |
|-----------|--------|
| "What do you know about X?" | Search all tiers for X |
| "What have you learned?" | Show last 10 from corrections.md |
| "Show my patterns" | List memory.md (HOT) |
| "Show [project] patterns" | Load projects/{name}.md |
| "What's in warm storage?" | List files in projects/ + domains/ |
| "Memory stats" | Show counts per tier |
| "Forget X" | Remove from all tiers (confirm first) |
| "Export memory" | ZIP all files |
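As an illustration of the "Search all tiers for X" action, the sketch below scans every Markdown file under the memory root. It assumes a plain case-insensitive substring match is what "search" means here; `search_memory` is a hypothetical helper, not something the skill defines.

```python
from pathlib import Path


def search_memory(root: Path, term: str) -> list:
    """Return (relative path, line number, line) for each case-insensitive match."""
    needle = term.lower()
    hits = []
    # rglob covers memory.md plus projects/, domains/ and archive/.
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), 1):
            if needle in line.lower():
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

Returning the file and line number alongside each match makes it easy to tell which tier an answer came from.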
On "memory stats" request, report:
📊 Self-Improving Memory
HOT (always loaded):
memory.md: X entries
WARM (load on demand):
projects/: X files
domains/: X files
COLD (archived):
archive/: X files
Recent activity (7 days):
Corrections logged: X
Promotions to HOT: X
Demotions to WARM: X
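The per-tier counts in this report could be gathered roughly as follows. This is a sketch with assumptions: `memory_stats` and `render_stats` are hypothetical helpers, an "entry" is taken to be any non-blank line of `memory.md` that is not a heading, and the 7-day activity figures are omitted because the text does not say how timestamps are recorded.

```python
from pathlib import Path


def memory_stats(root: Path) -> dict:
    """Count HOT entries and WARM/COLD files under root."""
    def count_md(sub: str) -> int:
        folder = root / sub
        return len(list(folder.glob("*.md"))) if folder.is_dir() else 0

    hot = root / "memory.md"
    entries = 0
    if hot.is_file():
        # Assumption: one non-blank, non-heading line is one entry.
        entries = sum(
            1
            for line in hot.read_text(encoding="utf-8").splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        )
    return {
        "hot_entries": entries,
        "projects": count_md("projects"),
        "domains": count_md("domains"),
        "archive": count_md("archive"),
    }


def render_stats(stats: dict) -> str:
    """Format the tier counts in the report layout shown above."""
    return "\n".join([
        "📊 Self-Improving Memory",
        "HOT (always loaded):",
        f"memory.md: {stats['hot_entries']} entries",
        "WARM (load on demand):",
        f"projects/: {stats['projects']} files",
        f"domains/: {stats['domains']} files",
        "COLD (archived):",
        f"archive/: {stats['archive']} files",
    ])
```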
| Tier | Location | Size Limit | Behavior |
|------|----------|------------|----------|
| HOT | memory.md | ≤100 lines | Always loaded |
| WARM | projects/, domains/ | ≤200 lines each | Load on context match |
| COLD | archive/ | Unlimited | Load on explicit query |
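The size limits in the tier table can be checked mechanically. The sketch below assumes paths are given relative to `~/self-improving/` with forward slashes; `line_limit` and `over_limit` are hypothetical helpers, and what to do once a file is over its limit is left to the skill's own rules.

```python
from pathlib import PurePosixPath
from typing import Optional

HOT_LIMIT = 100   # memory.md
WARM_LIMIT = 200  # each file in projects/ or domains/


def line_limit(relative_path: str) -> Optional[int]:
    """Return the line limit for a memory file, or None if the tier is unlimited."""
    parts = PurePosixPath(relative_path).parts
    if parts == ("memory.md",):
        return HOT_LIMIT
    if parts and parts[0] in ("projects", "domains"):
        return WARM_LIMIT
    return None  # archive/ is unlimited; other files are not covered by the table


def over_limit(relative_path: str, text: str) -> bool:
    """True when the file content has more lines than its tier allows."""
    limit = line_limit(relative_path)
    return limit is not None and len(text.splitlines()) > limit
```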
When patterns contradict:
When file exceeds limit:
See boundaries.md — never store credentials, health data, third-party info.
If the context limit is hit:
This skill ONLY:
This skill NEVER:

Install with clawhub install <slug> if the user confirms:
- memory: Long-term memory patterns for agents
- learning: Adaptive teaching and explanation
- decide: Auto-learn decision patterns
- escalate: Know when to ask vs act autonomously

clawhub star self-improving
clawhub sync