Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

ShaheerKhawaja/productionos-auto-optimize

Name: productionos-auto-optimize
Author: ShaheerKhawaja

codex-skills/productionos-auto-optimize/SKILL.md

npx skillsauth add ShaheerKhawaja/ProductionOS productionos-auto-optimize

productionos-auto-optimize

Use this alias when you want the same workflow through a top-level Codex-safe name without the productionos: namespace.

Overview

This is the Codex-native workflow wrapper for .claude/commands/auto-optimize.md.

Use it when the user wants this exact ProductionOS workflow, not just the umbrella productionos router.

Source of Truth

Read the source command spec at .claude/commands/auto-optimize.md.
Use CODEX-PARITY-HANDOFF.md to confirm runtime support and parity expectations.
Preserve the source workflow's guardrails, scope, artifacts, and verification intent.
Translate Claude-only slash-command and hook semantics into Codex-native execution instead of copying them literally.

Codex Behavior

Summary: Self-improving agent optimization — generates challenger variants of any agent/command, benchmarks against baseline, promotes winners, logs learnings to instincts. Inspired by Karpathy's autoresearch pattern.
Use the source command as the behavioral spec, then execute the same intent with Codex-native tools and constraints.
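
The benchmark-and-promote step described in the summary can be sketched as a small champion/challenger round. This is a minimal illustration, not the skill's actual implementation: `pick_winner` and the `score` callable are hypothetical names, and `score` stands in for whichever benchmark is selected. Treating a tie as "keep the baseline" is also an assumption.

```python
from typing import Callable, List, Tuple


def pick_winner(
    baseline: str,
    challengers: List[str],
    score: Callable[[str], float],
) -> Tuple[str, float, bool]:
    """Run one benchmark round and return (prompt, score, promoted).

    `score` is a hypothetical stand-in for the selected benchmark.
    """
    best_prompt, best_score = baseline, score(baseline)
    promoted = False
    for candidate in challengers:
        candidate_score = score(candidate)
        # Assumption: only a strictly better challenger displaces the current best.
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
            promoted = True
    return best_prompt, best_score, promoted


# Toy benchmark for illustration only: counts occurrences of the word "step".
def toy_score(prompt: str) -> float:
    return float(prompt.count("step"))


winner, winner_score, promoted = pick_winner(
    "Review the code.",
    ["Review the code step by step.", "Review."],
    toy_score,
)
```

If every challenger scores at or below the baseline, the function returns the baseline with `promoted` set to `False`.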

Inputs

target: Agent or command to optimize (e.g., 'code-reviewer', 'security-hardener', '/production-upgrade'). Required.
challengers: Number of challenger variants to generate. Default: 3. Optional.
benchmark: Benchmark to evaluate against: 'self-eval' | 'test-suite' | 'llm-judge' | path to a custom benchmark. Default: self-eval. Optional.
hypothesis: Specific hypothesis to test (e.g., 'add chain-of-thought to security-hardener'). If omitted, hypotheses are auto-generated. Optional.
max_cost: Maximum cost in USD for the optimization run. Default: 5. Optional.
mode: Optimization mode: prompt (modify agent instructions) | model (test different models) | layers (test prompt composition layers) | params (test convergence parameters). Default: prompt. Optional.
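
The inputs above map naturally onto a typed record with defaults. The sketch below is illustrative only: the class name `AutoOptimizeInputs` is hypothetical, and the range checks (at least one challenger, a positive cost ceiling) are assumptions that the source does not state.

```python
from dataclasses import dataclass
from typing import Optional

MODES = {"prompt", "model", "layers", "params"}
NAMED_BENCHMARKS = {"self-eval", "test-suite", "llm-judge"}


@dataclass(frozen=True)
class AutoOptimizeInputs:
    target: str                       # required
    challengers: int = 3
    benchmark: str = "self-eval"      # a named benchmark or a path to a custom one
    hypothesis: Optional[str] = None  # None means hypotheses are auto-generated
    max_cost: float = 5.0             # USD
    mode: str = "prompt"

    def __post_init__(self) -> None:
        if not self.target:
            raise ValueError("target is required")
        # Assumed bounds; the source lists defaults but no valid ranges.
        if self.challengers < 1:
            raise ValueError("challengers must be at least 1")
        if self.max_cost <= 0:
            raise ValueError("max_cost must be positive")
        if self.mode not in MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")

    def uses_custom_benchmark(self) -> bool:
        # Anything that is not a named benchmark is treated as a path.
        return self.benchmark not in NAMED_BENCHMARKS


inputs = AutoOptimizeInputs(target="code-reviewer")
```

Only `target` has to be supplied; every other field falls back to the documented default.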

Execution Outline

Preamble

Agents And Assets

Agents: metaclaw-learner, prompt-optimizer, rubric-evolver
Templates: PREAMBLE.md, PROMPT-COMPOSITION.md
Artifacts: .productionos/AUTO-OPTIMIZE-BASELINE.md, .productionos/AUTO-OPTIMIZE-HARVEST.md, .productionos/AUTO-OPTIMIZE-HYPOTHESES.md, .productionos/AUTO-OPTIMIZE-REPORT.md, .productionos/AUTO-OPTIMIZE-RESULTS.md, .productionos/analytics/skill-usage.jsonl, .productionos/calibration/, .productionos/challengers/challenger-{N}.md, .productionos/instincts/, .productionos/instincts/project/
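
The artifact paths follow a regular pattern, which a couple of helpers can make explicit. This is a sketch with hypothetical function names. Numbering challengers from 1 is an assumption, since the source shows only the `{N}` placeholder.

```python
from pathlib import PurePosixPath
from typing import List

ARTIFACT_ROOT = PurePosixPath(".productionos")


def challenger_paths(count: int) -> List[str]:
    """Expand .productionos/challengers/challenger-{N}.md for N = 1..count.

    Assumption: N starts at 1; the source does not say.
    """
    return [
        str(ARTIFACT_ROOT / "challengers" / f"challenger-{n}.md")
        for n in range(1, count + 1)
    ]


def report_path(name: str) -> str:
    """Build a path such as .productionos/AUTO-OPTIMIZE-BASELINE.md."""
    return str(ARTIFACT_ROOT / f"AUTO-OPTIMIZE-{name.upper()}.md")


paths = challenger_paths(3)
baseline_report = report_path("baseline")
```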

Workflow

Load only the agents, templates, prompts, and docs referenced by the source command.
Execute the workflow intent with Codex-native tools.
If the source command implies parallel agent work, only delegate when the user explicitly wants that overhead.
Verify with the smallest relevant checks before concluding.
Summarize what changed, what was verified, and what still needs human approval.

Guardrails

Do not claim that Claude-only marketplace, hook, or slash-command behavior runs directly in Codex.
Keep the scope faithful to the source command rather than broadening into a generic repo audit.
Prefer concrete outputs and validation over describing the workflow abstractly.
Cost ceiling: $ARGUMENTS.max_cost (default $5). Hard halt when exceeded.
No regression allowed: If ALL challengers score lower than baseline, keep baseline.
Prompt length limit: Challenger prompts cannot exceed 2x the baseline length.
Model safety: Model changes require human approval before promotion.
Idempotent: Running auto-optimize twice with the same inputs produces the same baseline measurement.
Rollback: If promoted winner causes test failures in subsequent runs, revert automatically.
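
Three of the guardrails above (cost ceiling, prompt length limit, model safety) are mechanical enough to sketch in code. All names here are hypothetical. Measuring prompt length in characters is an assumption, because the source does not name a unit, and so is treating a tied score as "keep the baseline".

```python
class CostCeilingExceeded(RuntimeError):
    """Raised to hard-halt a run once spend passes max_cost."""


class Budget:
    """Tracks spend in USD against the max_cost ceiling (default 5)."""

    def __init__(self, max_cost: float = 5.0) -> None:
        self.max_cost = max_cost
        self.spent = 0.0

    def charge(self, amount: float) -> None:
        self.spent += amount
        if self.spent > self.max_cost:
            raise CostCeilingExceeded(
                f"spent ${self.spent:.2f} of ${self.max_cost:.2f}"
            )


def within_length_limit(baseline: str, challenger: str) -> bool:
    # Assumption: length is counted in characters, not tokens.
    return len(challenger) <= 2 * len(baseline)


def may_auto_promote(mode: str, challenger_score: float, baseline_score: float) -> bool:
    if challenger_score <= baseline_score:
        return False  # no regression: keep the baseline
    # Model changes need human approval, so they are never promoted automatically.
    return mode != "model"
```

A caller would charge the budget after each benchmark call and let `CostCeilingExceeded` propagate, so the run stops instead of continuing over budget.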

7 stars

data-ai

Updated Apr 23, 2026

npx skillsauth add ShaheerKhawaja/ProductionOS productionos-auto-optimize

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: Apr 23, 2026, 2:09 AM. Scan time: 25.7s. 1 file scanned.

SKILL.md

name: productionos-auto-optimize
description: Self-improving agent optimization that generates challenger variants of any agent/command, benchmarks against baseline, promotes winners, logs learnings to instincts. Inspired by Karpathy's autoresearch pattern.
argument-hint: [repo path, target, or task context]

Related Skills

ShaheerKhawaja/writing-plans

tools

Verified · Trusted · Community

Implementation planning workflow that turns approved ideas into dependency-aware execution plans.

7 · SKILL.md · Updated Apr 23, 2026

ShaheerKhawaja/wiki-rag

development

Verified · Trusted · Community

Local RAG and Graph RAG over the SecondBrain wiki vault. Progressive context loading (hot cache -> index -> domain -> entity). Graph traversal via wikilink resolution. Use when agents need cross-project context, when answering questions that span multiple domains, or when building context for planning tasks. Triggers on: "wiki context", "cross-project context", "what do we know about", "check the wiki", "graph context", "/wiki-rag".

7 · SKILL.md · Updated Apr 23, 2026

ShaheerKhawaja/ux-genie

devops

Verified · Trusted · Community

UX improvement pipeline: creates user stories from UI guidelines, maps user journeys, identifies friction, dispatches fix agents. The user-experience equivalent of /production-upgrade.

7 · SKILL.md · Updated Apr 23, 2026

ShaheerKhawaja/tdd

development

Verified · Trusted · Community

Test-driven development workflow that writes failing tests first, implements minimally, and refactors safely.

7 · SKILL.md · Updated Apr 23, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/ShaheerKhawaja/ProductionOS.git

# Copy into Claude Code skills folder (global)
cp -r ProductionOS/codex-skills/productionos-auto-optimize ~/.claude/skills/

Repository

ShaheerKhawaja/ProductionOS

7 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT