skills/self-healing/SKILL.md
Self-healing codebase system. Monitors runtime errors, test failures, and build breaks. Automatically diagnoses root cause, generates fix, validates fix, and applies if safe. Zero human intervention for routine failures.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

    npx skillsauth add rubicanjr/FinCognis self-healing
Automated fault detection and repair. When a test fails, a build breaks, or a runtime error is logged, the healing pipeline wakes up, finds the root cause, generates a minimal fix, validates it, and applies it -- all without human intervention, as long as confidence is above the safety threshold.
The healing pipeline activates on any of these events:
| Trigger | Signal | Source |
|---------|--------|--------|
| Test failure | Any test exits non-zero | CI logs, npm test output |
| Build break | Compiler/bundler error | CI logs, npm run build output |
| Runtime error | Unhandled exception, 5xx response | Application logs, Vercel logs |
| Type error | tsc --noEmit fails | Pre-commit hook, CI |
| Lint failure | ESLint/Prettier error blocking CI | CI logs |
The following always require human review -- do NOT auto-heal:
- ... replay agent instead)

    DETECT  ->  DIAGNOSE   ->  FIX        ->  VALIDATE
      |            |            |                |
    Trigger     Root cause     Minimal        Re-run
    fires       identified     patch          test suite
                confidence     applied        Pass? DONE
                scored         to branch      Fail? ROLLBACK
Capture the full failure context:
- Error message (exact, not summarized)
- Stack trace (all frames, not truncated)
- File and line number of origin
- Recent git changes touching that file (last 3 commits)
- Environment (Node version, OS, CI vs local)
Pass all context to the Diagnose phase. Incomplete context = low confidence = no auto-fix.
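The captured context can be modeled as a record with a completeness check that runs before handing off to Diagnose. This is only a sketch: the `FailureContext` type, its field names, and `missingContextFields` are hypothetical and chosen for illustration; the skill does not prescribe a schema.

```typescript
// Hypothetical shape for the failure context listed above; field names are assumptions.
interface FailureContext {
  errorMessage: string;     // exact, not summarized
  stackTrace: string[];     // all frames, not truncated
  originFile: string;
  originLine: number;
  recentCommits: string[];  // last 3 commits touching originFile (may be empty for a new file)
  environment: { nodeVersion: string; os: string; ci: boolean };
}

// Returns the names of the fields that are missing or empty.
function missingContextFields(ctx: Partial<FailureContext>): string[] {
  const missing: string[] = [];
  if (!ctx.errorMessage) missing.push("errorMessage");
  if (!ctx.stackTrace || ctx.stackTrace.length === 0) missing.push("stackTrace");
  if (!ctx.originFile) missing.push("originFile");
  if (ctx.originLine === undefined) missing.push("originLine");
  if (ctx.recentCommits === undefined) missing.push("recentCommits");
  if (!ctx.environment) missing.push("environment");
  return missing;
}

// Example: a context with no stack trace, commit history, or environment.
const partialContext: Partial<FailureContext> = {
  errorMessage: "Cannot read properties of undefined (reading 'id')",
  originFile: "src/api/users.ts",
  originLine: 87,
};
const contextGaps = missingContextFields(partialContext);
```

A non-empty result is a signal that the evidence quality bonus should stay low, which in turn keeps the pipeline from attempting a fix.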
The sleuth agent performs root cause analysis:
- ... `git log -5 -- <file>`)

| Type | Examples | Confidence Boost |
|------|----------|-----------------|
| Regression | Line added in last commit causes error | +20% |
| Null reference | Cannot read property X of undefined | +10% |
| Import error | Module not found, wrong path | +15% |
| Type mismatch | TS type error at known location | +15% |
| Config error | Missing env var, wrong format | +10% |
| Logic error | Wrong condition, off-by-one | 0% (harder to auto-fix) |
Confidence score = base 50% + type bonus + evidence quality bonus (0-30%).
If final confidence < 80%: log the diagnosis, notify in thoughts/HEALING.md, stop. Do not attempt fix.
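The scoring rule and the 80% gate can be written down as a small function. A minimal sketch: the bonuses come from the table above, while the identifiers (`RootCauseType`, `confidenceScore`, `shouldAttemptFix`) and the clamping of the evidence bonus to its documented 0-30 range are assumptions made for illustration.

```typescript
// Type bonuses copied from the classification table; identifier names are hypothetical.
type RootCauseType =
  | "regression"
  | "nullReference"
  | "importError"
  | "typeMismatch"
  | "configError"
  | "logicError";

const TYPE_BONUS: Record<RootCauseType, number> = {
  regression: 20,
  nullReference: 10,
  importError: 15,
  typeMismatch: 15,
  configError: 10,
  logicError: 0,
};

const BASE_CONFIDENCE = 50;
const AUTO_FIX_THRESHOLD = 80;

// Assumption: an out-of-range evidence bonus is clamped to 0-30 rather than rejected.
function confidenceScore(type: RootCauseType, evidenceBonus: number): number {
  const evidence = Math.min(30, Math.max(0, evidenceBonus));
  return BASE_CONFIDENCE + TYPE_BONUS[type] + evidence;
}

// Below the threshold the pipeline logs the diagnosis and stops.
function shouldAttemptFix(score: number): boolean {
  return score >= AUTO_FIX_THRESHOLD;
}

const nullRefScore = confidenceScore("nullReference", 25); // 50 + 10 + 25
const logicScore = confidenceScore("logicError", 25);      // 50 + 0 + 25
```

With the same evidence quality, the null reference case clears the gate and the logic error does not, which matches the table's note that logic errors are harder to auto-fix.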
The spark agent generates the minimal fix:
Before touching any file:
    git stash push -u -m "self-healing: pre-fix stash for <error-id>"
Fix rules:
- ... `// self-healing fix: <short reason>`

Fix size limits:
The verifier agent runs the relevant test suite:
    # For test failures: run the specific failing test + its file's full suite
    npm test -- --testPathPattern="<failing-file>"

    # For build breaks: re-run the build
    npm run build

    # For type errors: re-run type check
    npx tsc --noEmit

    # For runtime errors: run integration tests if available
    npm run test:integration
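The mapping from trigger to validation command could be captured in one function. This is a sketch under assumptions: `TriggerType` and `validationCommand` are hypothetical names, the fallback to a plain `npm test` when no failing file is known is not stated in the skill, and lint failures are left out because no validation command is listed for them.

```typescript
// Hypothetical trigger labels; the commands are the ones listed above.
type TriggerType = "test" | "build" | "type" | "runtime";

function validationCommand(trigger: TriggerType, failingFile?: string): string {
  switch (trigger) {
    case "test":
      // Assumption: fall back to the full suite when the failing file is unknown.
      return failingFile
        ? `npm test -- --testPathPattern="${failingFile}"`
        : "npm test";
    case "build":
      return "npm run build";
    case "type":
      return "npx tsc --noEmit";
    case "runtime":
      return "npm run test:integration";
  }
}

const testCommand = validationCommand("test", "src/api/users");
const typeCommand = validationCommand("type");
```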
Decision:
Validation PASS:
-> git stash drop (discard the safety stash)
-> Commit the fix: "fix: self-healing repair for <error-id>"
-> Log to thoughts/HEALING.md
-> Update MTTR metric
Validation FAIL:
-> Discard the failed patch, then git stash pop --index (restore pre-fix state exactly)
-> Log failure to thoughts/HEALING.md
-> Escalate to human with full diagnosis report
-> Do NOT retry automatically (prevents loop)
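The decision step is a two-way branch with no retry path. A minimal sketch of that branch follows; the names (`HealDecision`, `decideAfterValidation`) and the `"ESCALATED"` label are hypothetical, and the explicit "discard failed patch" step is an assumption added because popping a stash onto a modified working tree can conflict.

```typescript
// "HEALED" appears in the log format; "ESCALATED" is an assumed label.
type HealOutcome = "HEALED" | "ESCALATED";

interface HealDecision {
  outcome: HealOutcome;
  steps: string[];             // ordered actions, described rather than executed
  retryAutomatically: boolean; // always false, to prevent a healing loop
}

function decideAfterValidation(passed: boolean, errorId: string): HealDecision {
  if (passed) {
    return {
      outcome: "HEALED",
      steps: [
        "git stash drop",
        `commit: "fix: self-healing repair for ${errorId}"`,
        "log to thoughts/HEALING.md",
        "update MTTR metric",
      ],
      retryAutomatically: false,
    };
  }
  return {
    outcome: "ESCALATED",
    steps: [
      // Assumption: the failed patch is removed before the stash is restored.
      "discard failed patch",
      "git stash pop --index",
      "log failure to thoughts/HEALING.md",
      "escalate to human with full diagnosis report",
    ],
    retryAutomatically: false,
  };
}

// "err-42" is a made-up error id.
const failedDecision = decideAfterValidation(false, "err-42");
const passedDecision = decideAfterValidation(true, "err-42");
```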
| Agent | Phase | Responsibility |
|-------|-------|----------------|
| sleuth | Diagnose | Root cause analysis, confidence scoring |
| spark | Fix | Minimal patch generation |
| verifier | Validate | Test suite execution, pass/fail decision |
| self-learner | Post-heal | Record the pattern for future prevention |
| coroner | Post-heal | Check if the same bug exists elsewhere in the codebase |
Append each healing event to thoughts/HEALING.md:
## Healing Event: 2026-04-07T10:15:00Z
### Trigger
- Type: Test failure
- File: src/api/users.test.ts
- Error: "Cannot read properties of undefined (reading 'id')"
- Stack: users.test.ts:42 -> users.ts:87 -> db/query.ts:31
### Diagnosis (sleuth)
- Root cause: db/query.ts:31 returns `null` when no row found; callers assume non-null
- Regression: commit abc123 (2026-04-06) removed the null guard
- Classification: Null reference (confidence: +10%)
- Evidence: stack trace + git blame confirm exact line + recent commit
- Final confidence: 85%
### Fix (spark)
- File: src/api/users.ts line 87
- Change: Added null check before accessing .id
- Lines changed: 3
- Stash: self-healing-20260407-101500
### Validation (verifier)
- Command: npm test -- --testPathPattern="src/api/users"
- Result: PASS (14 tests, 0 failures)
- Decision: KEEP
### Outcome
- Status: HEALED
- Time to repair: 4 minutes
- MTTR contribution: recorded
---
## Healing Event: 2026-04-07T14:22:00Z
### Trigger
- Type: Build break
- Error: Module '"./config"' has no exported member 'DATABASE_URL'
### Diagnosis (sleuth)
- Root cause: config.ts was refactored, export renamed to DB_URL
- Confidence: 91%
### Fix (spark)
- File: src/api/db.ts line 3
- Change: import { DB_URL as DATABASE_URL } from './config'
- Lines changed: 1
### Validation (verifier)
- Command: npm run build
- Result: PASS
### Outcome
- Status: HEALED
- Time to repair: 2 minutes
Track in thoughts/HEALING.md header section (update after each event):
# Self-Healing Metrics (updated: 2026-04-07)
| Metric | Value |
|--------|-------|
| Total healing events | 23 |
| Successful auto-fixes | 19 (82.6%) |
| Escalated to human | 4 (17.4%) |
| Average MTTR (auto-fixed) | 3.8 minutes |
| Average MTTR (escalated) | 47 minutes |
| Most common trigger | Test failure (61%) |
| Most common fix type | Null reference guard |
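The header table can be recomputed from the list of events instead of being edited by hand. A sketch, assuming each event is reduced to a hypothetical `HealingEvent` record with an auto-fixed flag and a repair time in minutes; the sample data below is invented for the example and is not taken from the metrics above.

```typescript
// Hypothetical per-event record; field names are assumptions.
interface HealingEvent {
  autoFixed: boolean;    // false means the event was escalated to a human
  repairMinutes: number;
}

interface HealingMetrics {
  total: number;
  autoFixed: number;
  escalated: number;
  autoFixRatePct: number;
  avgMttrAutoFixed: number;
  avgMttrEscalated: number;
}

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
}

// Round to one decimal place, as in the table.
function round1(x: number): number {
  return Math.round(x * 10) / 10;
}

function summarize(events: HealingEvent[]): HealingMetrics {
  const fixed = events.filter((e) => e.autoFixed);
  const escalated = events.filter((e) => !e.autoFixed);
  return {
    total: events.length,
    autoFixed: fixed.length,
    escalated: escalated.length,
    autoFixRatePct: events.length === 0 ? 0 : round1((fixed.length / events.length) * 100),
    avgMttrAutoFixed: round1(mean(fixed.map((e) => e.repairMinutes))),
    avgMttrEscalated: round1(mean(escalated.map((e) => e.repairMinutes))),
  };
}

// Invented sample: two auto-fixes and one escalation.
const sampleEvents: HealingEvent[] = [
  { autoFixed: true, repairMinutes: 4 },
  { autoFixed: true, repairMinutes: 2 },
  { autoFixed: false, repairMinutes: 45 },
];
const sampleMetrics = summarize(sampleEvents);
```

Keeping auto-fixed and escalated MTTR separate, as the table does, avoids one slow escalation hiding how quickly routine repairs land.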
The healing pipeline will NEVER auto-fix:
- Files matching: **/auth/**, **/security/**, **/middleware/auth*
- Files matching: **/migrations/**, **/schema.*
- Environment variables or secrets
- Package versions in package.json / requirements.txt
- CI/CD configuration (.github/**, vercel.json, Dockerfile)
- Any file touched by a security-related commit message
If the failing code is in a scoped-out path, the pipeline logs the trigger and immediately escalates with the full diagnosis -- no fix attempt.
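The path-based part of this check can be sketched with a small glob matcher. Everything here is illustrative: `globToRegExp` and `isScopedOut` are hypothetical helpers, the matcher supports only `**` and `*`, paths are assumed to be repo-relative with forward slashes, and the non-path rules (secrets, package versions, security-related commits) are not modeled.

```typescript
// Path globs copied from the list above.
const SCOPED_OUT_GLOBS: string[] = [
  "**/auth/**",
  "**/security/**",
  "**/middleware/auth*",
  "**/migrations/**",
  "**/schema.*",
  ".github/**",
  "vercel.json",
  "Dockerfile",
];

// Minimal glob translation: "**/" spans zero or more directories,
// "**" spans anything, "*" spans anything except "/".
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      pattern += "(?:.*/)?";
      i += 3;
    } else if (glob.startsWith("**", i)) {
      pattern += ".*";
      i += 2;
    } else if (glob.charAt(i) === "*") {
      pattern += "[^/]*";
      i += 1;
    } else {
      // Escape regex metacharacters in literal segments.
      pattern += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${pattern}$`);
}

function isScopedOut(repoRelativePath: string): boolean {
  return SCOPED_OUT_GLOBS.some((glob) => globToRegExp(glob).test(repoRelativePath));
}

const authBlocked = isScopedOut("src/auth/login.ts");
const usersBlocked = isScopedOut("src/api/users.ts");
```

Running this check before the Fix phase means a match can short-circuit straight to escalation. A production setup would more likely rely on an established glob library than on this hand-rolled matcher.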
This skill is referenced by:
- sentinel agent during incident response
- verifier agent when a final validation fails

To invoke manually:
Use self-healing to diagnose and fix the failing test in src/api/users.test.ts.
Error: "Cannot read properties of undefined (reading 'id')"
Stack trace: [paste full trace]