skills/aeo-core/SKILL.md
Main AEO skill - calculates confidence scores and decides execution path. Auto-loads on /aeo command.
npx skillsauth add ivzc07/aeo-skills aeo-coreInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Purpose: Calculate confidence scores (0-1) and decide whether to execute autonomously or involve the human.
Loads when user types /aeo in Claude Code.
// Invoke spec-validator to check if task is well-defined
spec_score = aeo_spec_validator.validate(task)
if spec_score < 40:
return REFUSE("Spec too unclear - need more details")
// Continue with confidence calculation
Start with base confidence of 0.50, then adjust:
Add for Clarity (+0.15 each):
Add for Context (+0.10 each):
Subtract for Risk (-0.10 each):
base_confidence = clamp(0.50 + clarity_score + context_score - risk_score, 0.0, 1.0)
// Adjust based on spec quality
if spec_score >= 80: base_confidence += 0.10 // Excellent spec
elif spec_score < 60: base_confidence -= 0.10 // Poor spec
// Critical areas get confidence penalty
if task.touches_payments: base_confidence *= 0.5 // Payments need human oversight
elif task.touches_auth: base_confidence *= 0.7 // Auth needs review
If you have uncertainty about the task, adjust ±0.10:
final_confidence = clamp(base_confidence + llm_adjustment, 0.0, 1.0)
Based on final_confidence, decide execution path:
⚠️ CONFIDENCE BELOW THRESHOLD
Confidence: 0.XX
Threshold: 0.70
Concerns:
• [Spec] Missing acceptance criteria
• [Risk] Touching authentication
• [Context] No similar tasks in memory
Options:
1. Proceed with assumptions
2. Clarify spec first
3. Break into smaller tasks
Please confirm (1-3):
❌ CANNOT EXECUTE - INSUFFICIENT CONFIDENCE
Confidence: 0.XX
Why:
• Spec score: 35/100 (below 40 threshold)
• Touching security without clear requirements
• No acceptance criteria defined
What's needed:
1. Clear acceptance criteria
2. Security requirements specified
3. Test requirements defined
Please improve spec and try again.
After task completes, write signal to memory:
# Append to signal log
echo '{
"timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
"task_id": "unique-id",
"task_description": "brief description",
"predicted_confidence": 0.85,
"actual_difficulty": "easy|medium|hard",
"success": true,
"adjustment": +0.05
}' >> ~/.claude/MEMORY/aeo-signals.jsonl
Actual Difficulty Rating:
Confidence Adjustment:
Rolling Window: Keep last 100 signals, calculate adjustment average
On startup, read recent signals to calibrate:
# Get last 100 signals
tail -100 ~/.claude/MEMORY/aeo-signals.jsonl | jq -s '. | map(.adjustment) | add / length'
Apply average adjustment as offset to all confidence calculations.
/aeo$PAI_DIR/MEMORY/aeo-signals.jsonl$PAI_DIR/MEMORY/aeo-escalations.jsonl$PAI_DIR/MEMORY/aeo-failure-patterns.jsonInvoke aeo-escalation skill when:
User: /aeo
User: Add user authentication with email verification
AEO: [Invoking aeo-spec-validator]
AEO: Spec score: 72/100
AEO: Calculating confidence...
- Base: 0.50
- Clarity: +0.30 (acceptance criteria, tech stack)
- Context: +0.10 (similar task in memory)
- Risk: -0.10 (touching auth)
- Spec adj: -0.10 (spec < 80)
- Security mult: ×0.7
- Final: 0.49
AEO: [Invokes aeo-escalation]
AEO: ❌ CONFIDENCE BELOW THRESHOLD
Confidence: 0.49
Concerns:
• [Spec] Missing security requirements
• [Risk] Touching authentication
• [Context] Need email service details
Options:
1. Add security requirements and proceed
2. Provide email service details
3. Break into smaller tasks
Please clarify (1-3):
User: 2
User: We use Resend for emails, API key in .env
AEO: Recalculating confidence with added context...
Final: 0.71
AEO: ⚡ ADVISORY - Confidence: 0.71
[Acceptance criteria defined]
[Tech stack: Node.js, bcrypt, jwt]
[Email: Resend, API key in .env]
Proceeding with implementation. I'll pause if issues arise.
[Implementation proceeds]
If same task done successfully 3+ times:
Never reach full autonomy for:
If task causes test failures or errors:
testing
Validate that tasks are sufficiently defined before execution. Returns score 0-100.
development
Internal code reviewer with veto power. Reviews changes before commit, blocks security issues.
testing
--- name: aeo-failure-patterns description: Recognize common errors and apply known fixes automatically. Hybrid: core patterns + project-specific learning. --- # AEO Failure Patterns **Purpose:** Recognize common errors and apply known fixes automatically. Uses hybrid architecture: core curated patterns + project-specific learned patterns. ## Architecture **Core Patterns (in this SKILL.md):** - 20-30 curated patterns with high-confidence fixes - Portable across projects - Confidence ≥ 0.85 →
testing
Human-AI interface for when to interrupt and involve humans. Presents clear options and records decisions.