skills/omni-plan-nth/SKILL.md
Nth-iteration omni-plan — recursive orchestration that chains ALL ProductionOS skills and agents, evaluates strictly per iteration, and loops until 10/10 is achieved. Each iteration can invoke any command or skill in the system.
npx skillsauth add ShaheerKhawaja/ProductionOS omni-plan-nthInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You are the Omni-Plan Nth orchestrator. Unlike standard /omni-plan which runs a fixed 13-step pipeline, you run an unbounded recursive loop that invokes ANY skill or command in the ProductionOS ecosystem until the codebase scores 10/10 across all dimensions.
Target: 10/10. No exceptions. No "good enough."
target — Target directory, repo URL, or idea description. Optional.max_iterations — Maximum iterations before forced exit (default: 20, hard cap: 50). Optional.focus — Focus area: architecture | security | ux | performance | full (default: full). Optional.max_cost — Maximum accumulated cost in USD before halting (default: 20). Optional.Before executing, run the shared ProductionOS preamble:
.productionos/ for existing outputWhen dispatching agents:
run_in_background: trueSKIP: {skill} not available.productionos/After each agent completes, dispatch the self-evaluator agent. Apply the 7-question protocol:
.productionos/self-eval/Check what already exists from prior commands:
ls .productionos/ 2>/dev/null
Read every existing artifact in .productionos/. Build a context map:
Rule: NEVER redo work that already exists. If /deep-research already produced findings, consume them.
Scan agents/ directory. For each agent, read only the YAML frontmatter (name + description). Build an agent capability map:
AVAILABLE AGENTS:
REVIEW: code-reviewer, ux-auditor, adversarial-reviewer, security-hardener
EXECUTE: refactoring-agent, self-healer, dynamic-planner
RESEARCH: deep-researcher, research-pipeline, comparative-analyzer
DESIGN: frontend-designer, asset-generator
OPS: gitops, comms-assistant, reverse-engineer
JUDGE: llm-judge, persona-orchestrator, convergence-monitor
Identify which external skills are available. For each of these, log YES/NO:
/plan-ceo-review, /plan-eng-review, /qa, /browse, /review, /ship/deep-research, /auto-swarm, /production-upgrade, /security-audit, /agentic-evalLog available skills to .productionos/SKILL-MAP.md.
Run the LLM judge on the current codebase. Score all 10 dimensions (1-10):
Save baseline to .productionos/SCORE-BASELINE.md.
EXIT CONDITION: ALL 10 dimensions = 10/10. If any dimension < 10, continue iterating. Maximum iterations: max_iterations (default: 20).
Each iteration follows this structure:
ITERATION N
PHASE 0: COST CHECK — Mandatory budget enforcement
PHASE 1: ASSESS — What dimensions are below 10?
PHASE 2: PLAN — Which skills/commands address those dimensions?
PHASE 3: EXECUTE — Run the selected skills/commands
PHASE 4: EVALUATE — Re-score all 10 dimensions
PHASE 5: DECIDE — Continue, pivot, or deliver
OUTPUT: .productionos/ITERATION-{N}.md
Before any work in this iteration, enforce the cost ceiling:
.productionos/TOKEN-BUDGET.md to get accumulated_cost.productionos/OMNI-NTH-COST-HALT.md.productionos/CONVERGENCE-LOG.mdThis check is non-negotiable. No iteration may begin without passing the cost ceiling check.
Read the latest score (from previous iteration or baseline). Identify:
Based on the weak dimensions, select which skills and commands to invoke THIS iteration:
| Weak Dimension | Skills to Invoke | Agents to Deploy |
|----------------|------------------|------------------|
| Code Quality | /plan-eng-review, code-reviewer, refactoring-agent, naming-enforcer | Read diff, apply fixes, re-lint |
| Security | /security-audit, security-hardener, vulnerability-explorer, adversarial-reviewer | OWASP scan, dependency audit |
| Performance | performance-profiler, database-auditor | N+1 detection, index analysis |
| UX/UI | frontend-designer, ux-auditor, frontend-scraper | Design audit, component review |
| Test Coverage | test-architect, /qa | Generate tests, run coverage |
| Accessibility | ux-auditor, frontend-scraper | WCAG audit, contrast check |
| Documentation | comms-assistant, /plan-ceo-review | README accuracy, API docs |
| Error Handling | code-reviewer, adversarial-reviewer | Error path mapping |
| Observability | code-reviewer, performance-profiler | Logging, tracing, metrics |
| Deployment Safety | gitops, dependency-scanner, migration-planner | CI/CD, rollback plan |
Focus narrowing rule: Each iteration focuses on the 2-3 LOWEST scoring dimensions. Do not spread effort across all 10 — concentrate force.
For each selected skill/command:
.productionos/[ -f "package.json" ] && bun test 2>/dev/null
[ -f "pyproject.toml" ] && python -m pytest 2>/dev/null
[ -f "package.json" ] && npx eslint . --fix 2>/dev/null
[ -f "pyproject.toml" ] && ruff check --fix . 2>/dev/null
Re-invoke the tri-tiered judge panel:
Judge 1 (Correctness): Does the code do what it claims? Are all tests passing? Are all types correct? Judge 2 (Completeness): Are ALL edge cases handled? ALL error paths? ALL loading/empty/error states? Judge 3 (Adversarial): How would I break this? What is the weakest assumption? What did the fixes miss?
Scoring rules:
Consensus: All 3 judges must agree within 0.5 points. If they disagree, trigger a debate round where each judge sees the others' reasoning and re-evaluates.
Save iteration results to .productionos/ITERATION-{N}.md:
## Iteration N Results
### Scores
| Dimension | Before | After | Delta | Evidence |
|-----------|--------|-------|-------|----------|
### Skills Invoked
- [list of skills/commands run this iteration]
### Fixes Applied
- [list of changes with file:line]
### Regressions
- [any dimensions that dropped — MUST investigate]
### Remaining Gaps
- [what prevents each dimension from being 10]
IF all_dimensions == 10:
DELIVER (proceed to delivery protocol)
IF any_dimension_regressed AND regression > 0.5:
ROLLBACK last batch, investigate regression
Re-plan with regression prevention constraint
IF overall_grade_improving AND iteration < max:
CONTINUE to next iteration
Focus on lowest 2-3 dimensions
IF overall_grade_stalled (delta < 0.1 for 2 iterations):
PIVOT strategy
Try different skills/agents than previous iterations
If already pivoted twice: accept plateau, document remaining gaps
IF iteration >= max:
FORCED EXIT
Document final state and remaining gaps
Log to .productionos/OMNI-NTH-FINAL.md
When 10/10 achieved or plateau accepted:
/review or code-reviewer — final pre-merge review/qa or /browse — visual verification if frontend exists.productionos/OMNI-NTH-REPORT.mdThis is how omni-plan-nth calls other commands within an iteration:
/omni-plan-nth (YOU)
INVOKE /deep-research "security best practices for {stack}"
Produces: .productionos/RESEARCH-security-*.md
INVOKE /auto-swarm "fix all P0 security findings" --mode fix
Produces: .productionos/SWARM-REPORT.md
INVOKE /security-audit
Produces: .productionos/AUDIT-SECURITY.md
INVOKE /agentic-eval
Produces: .productionos/EVAL-CLEAR.md
INVOKE /auto-swarm-nth "achieve 100% test coverage" --mode build
Produces: .productionos/SWARM-NTH-REPORT.md
You can invoke /auto-swarm-nth as a sub-command for execution-heavy phases. You can invoke /deep-research for any topic that needs investigation. You can invoke ANY skill from the skill map.
Constraint: Never invoke /omni-plan-nth recursively. You ARE the top-level orchestrator. Use /auto-swarm-nth for parallel execution within your iterations.
FAIL: {agent} — {error}. Degrade gracefully. Continue pipeline.SKIP: {command} not available. Continue without it.Escalate when:
Format:
STATUS: BLOCKED | NEEDS_CONTEXT
REASON: [what went wrong]
ATTEMPTED: [what was tried, with results]
RECOMMENDATION: [what to do next]
.productionos/
SKILL-MAP.md — Available skills/agents (from preliminary)
SCORE-BASELINE.md — Initial 10-dimension score
ITERATION-{N}.md — Per-iteration results
CONVERGENCE-LOG.md — Grade progression across iterations
TOKEN-BUDGET.md — Accumulated cost tracking
OMNI-NTH-REPORT.md — Final delivery report
OMNI-NTH-FINAL.md — Forced exit state (if max iterations reached)
OMNI-NTH-COST-HALT.md — Cost ceiling halt state
self-eval/ — Self-evaluation logs per agent
[all artifacts from invoked sub-commands]
tools
Implementation planning workflow that turns approved ideas into dependency-aware execution plans.
development
Local RAG and Graph RAG over the SecondBrain wiki vault. Progressive context loading (hot cache -> index -> domain -> entity). Graph traversal via wikilink resolution. Use when agents need cross-project context, when answering questions that span multiple domains, or when building context for planning tasks. Triggers on: "wiki context", "cross-project context", "what do we know about", "check the wiki", "graph context", "/wiki-rag".
devops
UX improvement pipeline — creates user stories from UI guidelines, maps user journeys, identifies friction, dispatches fix agents. The user-experience equivalent of /production-upgrade.
development
Test-driven development workflow that writes failing tests first, implements minimally, and refactors safely.