ace/cli/skills/kayba-pipeline/SKILL.md
End-to-end agent evaluation and improvement pipeline. Takes a traces folder and optional HITL flag, then orchestrates sub-agents through 7 stages — each stage is its own skill invoked by a dedicated sub-agent. Trigger when the user says "run the pipeline", "kayba pipeline", "evaluate and fix", "full eval", "analyze traces and fix", or provides a traces folder with intent to improve their agent.
npx skillsauth add kayba-ai/agentic-context-engine kayba-pipeline
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
End-to-end pipeline: analyze traces → define metrics → build rubric → plan fixes → implement fixes.
Each stage is a separate skill file that can be run independently or as part of this pipeline.
The user provides two things:
- TRACES_FOLDER: path to a directory containing trace JSON files
- HITL: true or false; whether to pause for human review before implementing fixes

If the user doesn't specify HITL, default to true (safe default).
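
The input handling above can be sketched as follows. This is a minimal illustration, not the skill's own code: `PipelineInputs` and `resolve_inputs` are hypothetical names, and accepting the flag as the strings "true" or "false" is an assumption about how the user supplies it.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class PipelineInputs:
    """The two user-supplied values (hypothetical container)."""
    traces_folder: Path
    hitl: bool


def resolve_inputs(traces_folder: str, hitl: Optional[str] = None) -> PipelineInputs:
    """Normalize TRACES_FOLDER and HITL (hypothetical helper)."""
    if hitl is None:
        # HITL was not specified: fall back to true, the safe default.
        hitl_flag = True
    else:
        value = hitl.strip().lower()
        if value not in ("true", "false"):
            raise ValueError(f"HITL must be 'true' or 'false', got {hitl!r}")
        hitl_flag = value == "true"
    return PipelineInputs(Path(traces_folder), hitl_flag)
```

Defaulting to true means an unattended run can never reach the fix stage without a human decision unless the user opts out explicitly.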
┌─────────────────────────────────────────────────────────────────────┐
│ Stage 1: Kayba API Analysis → skill: kayba-pipeline:stage-1-api-analysis │
│ Stage 2: Domain Context Gathering → skill: kayba-pipeline:stage-2-domain-context │
│ ─── stages 1 & 2 run in parallel ─── │
│ Stage 3: Metrics & Analysis → skill: kayba-pipeline:stage-3-metrics │
│ Stage 4: Rubric Definition → skill: kayba-pipeline:stage-4-rubric │
│ Stage 5: Action Plan → skill: kayba-pipeline:stage-5-action-plan │
│ Stage 6: HITL Gate → skill: kayba-pipeline:stage-6-hitl │
│ Stage 7: Fix Implementation → skill: kayba-pipeline:stage-7-fixer │
└─────────────────────────────────────────────────────────────────────┘
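
The ordering in the diagram can be expressed as a small dependency map. The sketch below is illustrative only: `DEPENDS_ON` and `execution_waves` are hypothetical names, and the map is simply read off the diagram (stages 1 and 2 have no prerequisites, and every later stage waits for the one before it).

```python
from typing import Dict, List

# Stage number -> stages that must finish first, read off the diagram.
# Stage 7 waits on stage 6, which may complete or be skipped.
DEPENDS_ON: Dict[int, List[int]] = {
    1: [],
    2: [],
    3: [1, 2],
    4: [3],
    5: [4],
    6: [5],
    7: [6],
}


def execution_waves(depends_on: Dict[int, List[int]]) -> List[List[int]]:
    """Group stages into waves; stages in the same wave can run in parallel."""
    done = set()
    remaining = dict(depends_on)
    waves: List[List[int]] = []
    while remaining:
        ready = sorted(
            stage for stage, deps in remaining.items()
            if all(dep in done for dep in deps)
        )
        if not ready:
            raise ValueError("dependency cycle between stages")
        waves.append(ready)
        done.update(ready)
        for stage in ready:
            del remaining[stage]
    return waves
```

For the map above, `execution_waves(DEPENDS_ON)` yields one wave containing stages 1 and 2, followed by one single-stage wave for each of stages 3 through 7.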
You are the orchestrator. Your job is to:
Create the eval/ directory and initialize eval/pipeline_log.md:
# Pipeline Log
| Stage | Name | Status | Started | Completed | Notes |
|-------|------|--------|---------|-----------|-------|
| 1 | Kayba API Analysis | pending | | | |
| 2 | Domain Context | pending | | | |
| 3 | Metrics & Analysis | pending | | | |
| 4 | Rubric Definition | pending | | | |
| 5 | Action Plan | pending | | | |
| 6 | HITL Gate | pending | | | |
| 7 | Fix Implementation | pending | | | |
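
One possible way to generate this initial log programmatically is sketched below. The function names are hypothetical, and passing the project root as a parameter is an assumption; the stage names and table layout are copied from the template above.

```python
from pathlib import Path

# Stage numbers and names, copied from the log template.
STAGES = [
    (1, "Kayba API Analysis"),
    (2, "Domain Context"),
    (3, "Metrics & Analysis"),
    (4, "Rubric Definition"),
    (5, "Action Plan"),
    (6, "HITL Gate"),
    (7, "Fix Implementation"),
]


def render_pipeline_log() -> str:
    """Return the initial log text with every stage marked pending."""
    lines = [
        "# Pipeline Log",
        "",
        "| Stage | Name | Status | Started | Completed | Notes |",
        "|-------|------|--------|---------|-----------|-------|",
    ]
    lines += [f"| {number} | {name} | pending | | | |" for number, name in STAGES]
    return "\n".join(lines) + "\n"


def init_eval_dir(root: Path) -> Path:
    """Create root/eval and write a fresh pipeline_log.md (hypothetical helper)."""
    eval_dir = root / "eval"
    eval_dir.mkdir(parents=True, exist_ok=True)
    log_path = eval_dir / "pipeline_log.md"
    log_path.write_text(render_pipeline_log(), encoding="utf-8")
    return log_path
```

Note that this sketch overwrites an existing log, so a re-run starts from a clean table.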
Spawn two sub-agents in parallel using the Agent tool:
Agent 1:
- Name: api-analyst
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-1-api-analysis" using the Skill tool. The traces folder is: {TRACES_FOLDER}. Follow the skill instructions completely.

Agent 2:
- Name: domain-scout
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-2-domain-context" using the Skill tool. The traces folder is: {TRACES_FOLDER}. Follow the skill instructions completely.

Wait for both to complete before proceeding.
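
The spawn-in-parallel-then-wait pattern for stages 1 and 2 can be sketched as below. This is an illustration under stated assumptions: the `spawn` callable is a hypothetical stand-in for the Agent tool (which is not a Python API), and threads are just one way to model two sub-agents running at the same time. The prompt text follows the wording given above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def stage_prompt(skill: str, traces_folder: str) -> str:
    """Fill the prompt wording used for stages 1 and 2."""
    return (
        f'Invoke the skill "{skill}" using the Skill tool. '
        f"The traces folder is: {traces_folder}. "
        "Follow the skill instructions completely."
    )


def stage_1_and_2_prompts(traces_folder: str) -> Dict[str, str]:
    """Map each sub-agent name to its prompt."""
    return {
        "api-analyst": stage_prompt("kayba-pipeline:stage-1-api-analysis", traces_folder),
        "domain-scout": stage_prompt("kayba-pipeline:stage-2-domain-context", traces_folder),
    }


def run_in_parallel(spawn: Callable[[str, str], str], prompts: Dict[str, str]) -> Dict[str, str]:
    """Start every sub-agent, then block until all of them have finished.

    `spawn(name, prompt)` is a hypothetical stand-in for the Agent tool.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        futures = {name: pool.submit(spawn, name, prompt) for name, prompt in prompts.items()}
        # result() blocks, so this returns only after every agent is done.
        return {name: future.result() for name, future in futures.items()}
```

Because `run_in_parallel` returns only after both results are in, stage 3 cannot start early.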
Spawn one sub-agent after stages 1 & 2 complete:
- Name: metric-engineer
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-3-metrics" using the Skill tool. The traces folder is: {TRACES_FOLDER}. Follow the skill instructions completely; this includes iterating on the metrics until you're satisfied.

Spawn one sub-agent after stage 3 completes:
- Name: rubric-builder
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-4-rubric" using the Skill tool. Follow the skill instructions completely.

Spawn one sub-agent after stage 4 completes:
- Name: action-planner
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-5-action-plan" using the Skill tool. Follow the skill instructions completely.

If HITL is true:
Spawn one sub-agent after stage 5 completes:
- Name: hitl-reviewer
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-6-hitl" using the Skill tool. Follow the skill instructions completely. Present the full review to the user and collect their decision before proceeding.

Wait for the sub-agent to complete. Check eval/stage6_decision.md for the outcome:
- … eval/stage6_decision.md, then re-run Stage 6

If HITL is false:
- … eval/pipeline_log.md

Spawn one sub-agent after stage 6 completes (or is skipped):
- Name: fixer
- Type: general-purpose
- Prompt: Invoke the skill "kayba-pipeline:stage-7-fixer" using the Skill tool. Follow the skill instructions completely.

- … eval/pipeline_log.md with the stage number and error

Update eval/pipeline_log.md with final status for all stages. Report to the user:
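
The log updates mentioned above (recording a stage's error, then writing final statuses) could be implemented along these lines. This is a sketch under assumptions: `set_stage_status` is a hypothetical helper, the status words "failed" and "skipped" are illustrative rather than taken from the skill, and notes are assumed not to contain the `|` character.

```python
def set_stage_status(log_text: str, stage: int, status: str, notes: str = "") -> str:
    """Return log_text with the Status (and optionally Notes) of one stage row replaced."""
    updated = []
    for line in log_text.splitlines():
        # Split a table row into its six cells: Stage, Name, Status, Started, Completed, Notes.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) == 6 and cells[0] == str(stage):
            cells[2] = status
            if notes:
                cells[5] = notes
            line = "| " + " | ".join(cells) + " |"
        updated.append(line)
    return "\n".join(updated) + "\n"
```

For example, `set_stage_status(log, 7, "failed", "stage 7: fixer could not apply patch")` rewrites only the stage 7 row and leaves every other line of the log untouched.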
# ACE — Learn from Traces

This skill ships `learn_from_traces.py`, a script that reads OpenClaw session transcripts, feeds them through the ACE learning pipeline, and writes an updated skillbook to disk.

## Usage

```bash
python learn_from_traces.py [OPTIONS] [FILES...]
```

The script auto-discovers new sessions from `~/.openclaw/agents/<agent>/sessions/` and only processes files that haven't been processed before. Processed filenames are tracked in `ace_processed.txt`.

## Options

| Flag |
Implement the approved fixes from the action plan and log all changes. Trigger when the user says "run stage 7", "implement fixes", "apply action plan", or when invoked by the kayba-pipeline orchestrator. Requires eval/action_plan.md to exist.
Human-In-The-Loop gate that presents the action plan with full context, collects an informed approval/modification/rejection decision, and records the outcome. Trigger when the user says "run stage 6", "HITL review", "approve action plan", or when invoked by the kayba-pipeline orchestrator. Requires eval/action_plan.md and eval/baseline_metrics.md to exist.
Triage each insight into discard/code-fix/prompt-fix and produce a prioritized action plan with specific recommendations. Trigger when the user says "run stage 5", "make action plan", "triage skills", or when invoked by the kayba-pipeline orchestrator. Requires eval outputs from stages 1-4.