core/capabilities/orchestration/quality-locked/SKILL.md
When --quality-locked is active, loop review/revise at each Quality Gate until findings reach a clean bar (no Critical, no Major, only cosmetic Minor) or the iteration cap (10) is reached. The skill uses a deterministic Python checker for classification and decision logic; the agent runs the actual review and revision steps.
When the workflow has quality_locked_mode: true (set by the --quality-locked
flag; see flag-parser/SKILL.md), every review checkpoint runs as a
deterministic loop instead of a single review-then-user-decides pass.
Decision logic is in code, not prose. The agent calls a Python checker that parses review output, applies the clean bar, and returns the next action. This eliminates "I think this is clean enough" miscounts and silent iteration drift.
The loop runs at these checkpoints (the same ones where auto-review normally fires):
| Workflow | Checkpoint | Review type |
|----------|-----------|-------------|
| /build | After spec [A] | spec review |
| /build | After plan [A] | plan review |
| /build | Gate 3 (during quality gates) | code quality review |
| /build | After gates pass | auto-QA |
| /architect | After design [A] | ADR review |
| /architect | After plan [A] | plan review |
| /fix | After diagnosis [A] | root cause review |
| /fix | After fix plan [A] | fix plan review |
For each iteration (1..10):

1. Run the review sub-agent (Task tool, fresh context).
   Capture the raw text output.
2. Call the quality-locked checker:

       python -m core.quality_locked check \
         --review-output "<sub-agent output text>" \
         --iteration <current iteration number> \
         --history-json '<JSON array of prior iteration records>'

   Returns JSON with: counts, is_clean, cap_reached, stuck, action,
   iteration_record.
3. Append iteration_record to manifest.md under quality_locked_history
   (the agent writes; the checker provides the structured record).
4. Dispatch on `action`:
   - PASS: exit loop, continue the workflow
   - REVISE: apply fixes to the artifact/code, increment, loop
   - CAP_REACHED: present F/R/E/A prompt to the user
   - ESCALATE: present escalation prompt (3 iterations no improvement)
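The four steps above can be sketched as a small driver. This is a minimal illustration, not the skill's actual implementation: `run_review`, `run_checker`, and `apply_revision` are hypothetical callables standing in for the review sub-agent, the `python -m core.quality_locked check` call, and the revision step. Writing the record to manifest.md is left to the caller.

```python
from typing import Callable

MAX_ITERATIONS = 10  # iteration cap stated in this skill


def run_quality_loop(
    run_review: Callable[[], str],
    run_checker: Callable[[str, int, list], dict],
    apply_revision: Callable[[str], None],
) -> tuple:
    """Drive the loop; return (terminal action, list of iteration records).

    All three callables are hypothetical stand-ins, injected so the
    control flow can be exercised without a real sub-agent or checker.
    """
    history: list = []
    for iteration in range(1, MAX_ITERATIONS + 1):
        review_text = run_review()
        # The checker sees only the records of prior iterations.
        result = run_checker(review_text, iteration, list(history))
        history.append(result["iteration_record"])
        action = result["action"]
        if action != "REVISE":
            # PASS, CAP_REACHED and ESCALATE all leave the loop; the caller
            # continues the workflow or shows the matching prompt.
            return action, history
        apply_revision(review_text)
    # Defensive only: the checker is expected to report CAP_REACHED itself.
    return "CAP_REACHED", history
```

The driver never inspects the findings itself; it only dispatches on the `action` field, which keeps the "is this clean enough" decision inside the checker.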
The agent never decides "is this clean enough" — the checker does. The agent only runs the sub-agent and applies the revisions.
The check command, `python -m core.quality_locked check ...`, is the primary path. If Python is unavailable, the agent uses the prose-rule fallback and announces that the checker is unavailable. (Unlike flag-parser, there is no Bash fallback layer here. The parsing and state logic are non-trivial enough that a Bash implementation would be its own reliability risk.)
The checker emits this shape:

    {
      "counts": {
        "critical": 0,
        "major": 0,
        "substantive": 0,
        "cosmetic": 1
      },
      "is_clean": true,
      "cap_reached": false,
      "stuck": false,
      "action": "PASS",
      "iteration_record": {
        "iteration": 3,
        "counts": { ... },
        "result": "PASS"
      }
    }
action is one of: PASS, REVISE, CAP_REACHED, ESCALATE.
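Before dispatching, the agent-side code may want to confirm that the checker's output really has this shape. The sketch below is one hedged way to do that; `parse_checker_output` is a hypothetical helper, and the key and action names are taken from the shape described above.

```python
import json

VALID_ACTIONS = {"PASS", "REVISE", "CAP_REACHED", "ESCALATE"}
REQUIRED_KEYS = {
    "counts", "is_clean", "cap_reached", "stuck", "action", "iteration_record",
}


def parse_checker_output(raw: str) -> dict:
    """Parse the checker's JSON and reject shapes the loop cannot dispatch on."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("checker output must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"checker output missing keys: {sorted(missing)}")
    if data["action"] not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {data['action']!r}")
    return data
```

Failing loudly on an unknown action avoids silently treating a malformed result as PASS.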
A review is "clean" when ALL of:

- critical == 0
- major == 0
- substantive == 0

cosmetic count is ignored for the clean bar. Cosmetic findings never
trigger another iteration.
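The clean bar is small enough to express directly. The function below is an illustrative sketch, not the checker's actual code; treating a missing key as zero is an assumption made here for brevity.

```python
def is_clean(counts: dict) -> bool:
    """Apply the clean bar: no critical, major, or substantive findings.

    The cosmetic count is deliberately not consulted.
    Assumption: a key absent from `counts` is treated as zero.
    """
    blocking = ("critical", "major", "substantive")
    return all(counts.get(key, 0) == 0 for key in blocking)
```

With this rule, `{"critical": 0, "major": 0, "substantive": 0, "cosmetic": 1}` is clean, while a single substantive finding is not.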
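The classifier maps both review formats to this unified schema (critical, major, substantive, cosmetic).

On REVISE, the next pass runs as iteration + 1.

When the cap is reached, the agent presents this prompt:

Sage: --quality-locked cap reached (10 iterations).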
Remaining at {checkpoint_name}:
- Critical: {n}
- Major: {n}
- Minor (substantive): {n}
The same findings keep returning. This suggests:
- The artifact has a structural issue that revision can't fix
- Consider escalating to /architect for a design rethink
- Or accept the findings and proceed manually
[F] Force-proceed — accept the remaining findings, continue
[R] Revise manually — drop --quality-locked, let me edit
[E] Escalate — type /architect to rethink the design
[A] Abort — cancel this workflow
Pick F/R/E/A, or describe what to do.
When --autonomous is also active, the [A] Review checkpoint that
triggers this loop is auto-picked by the agent. The cap-reached and
stuck-escalation prompts still require user input; see
sage/core/capabilities/orchestration/autonomous/SKILL.md section
"Auto-Pick at Checkpoints" for the full rules.
Sage: 3 iterations with no improvement in findings count.
Iteration {n-2}: {c} critical, {m} major
Iteration {n-1}: {c} critical, {m} major
Iteration {n}: {c} critical, {m} major
This suggests architectural-level issues that spec revision can't fix.
[E] Escalate to /architect (recommended)
[C] Continue iterating (up to cap of 10)
[R] Revise manually — drop --quality-locked
Pick E/C/R, or describe what to do.
Always log the chosen action to manifest with the user's selection.
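The stuck-escalation prompt above fires after three iterations with no improvement in the findings count. The document does not spell out the exact comparison, so the sketch below assumes that "no improvement" means the total of blocking findings (critical + major + substantive) did not decrease across the last three records. `blocking_total` and `looks_stuck` are hypothetical names.

```python
def blocking_total(record: dict) -> int:
    """Sum the findings that count against the clean bar."""
    counts = record["counts"]
    return counts["critical"] + counts["major"] + counts["substantive"]


def looks_stuck(history: list, window: int = 3) -> bool:
    """Return True if the last `window` records show no improvement.

    Assumption: improvement means a strictly lower blocking total than
    the previous iteration; the real checker may use a different rule.
    """
    if len(history) < window:
        return False
    totals = [blocking_total(r) for r in history[-window:]]
    if totals[-1] == 0:
        return False  # already clean, so there is nothing to escalate
    return all(later >= earlier for earlier, later in zip(totals, totals[1:]))
```

Records are expected in the same shape as the quality_locked_history entries, each carrying a `counts` mapping.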
After each iteration, the agent appends to manifest.md:

    quality_locked_history:
      - checkpoint: spec
        iteration: 1
        counts: { critical: 2, major: 1, substantive: 0, cosmetic: 1 }
        result: REVISE
      - checkpoint: spec
        iteration: 2
        counts: { critical: 0, major: 0, substantive: 0, cosmetic: 1 }
        result: PASS
Pass the existing array (or [] for first iteration) as
--history-json so the checker can detect "stuck" patterns.
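Serializing the history by hand is easy to get wrong once review text contains quotes. One hedged way to assemble the invocation is to build an argument list and let `json.dumps` produce the `--history-json` value. `build_check_command` is a hypothetical helper, and it uses `sys.executable` in place of a bare `python`, which is an assumption about the environment.

```python
import json
import sys


def build_check_command(review_output: str, iteration: int, history: list) -> list:
    """Return an argv list for the checker invocation described in this skill.

    Passing a list (rather than a shell string) avoids quoting problems
    when the review text contains quotes or newlines.
    """
    return [
        sys.executable, "-m", "core.quality_locked", "check",
        "--review-output", review_output,
        "--iteration", str(iteration),
        # An empty list serializes to "[]", matching the first-iteration case.
        "--history-json", json.dumps(history),
    ]
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)` and the checker's stdout parsed as JSON.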
| Situation | Behavior |
|---|---|
| Sub-agent times out / Task tool absent | Skip the loop entirely; fall back to single self-review pass. Announce: "Task tool not available — --quality-locked degraded to single-pass review." |
| Python checker unavailable | Use prose-rule fallback. Announce: "Quality-locked checker unavailable — using prose rules." |
| Sub-agent output unparseable | Checker returns zero counts; agent surfaces raw output to user and exits the loop with action=REVISE. The user can decide manually. |
| User Ctrl+C mid-iteration | KeyboardInterrupt exits cleanly; current iteration is already logged. |
| Scope violation during auto-revise | Treat as CRITICAL finding for next iteration. Loop continues. |