.claude/skills/codex-cli/SKILL.md
OpenAI Codex CLI orchestration for AI-assisted development using gpt-5.3-codex model family. Model variants: gpt-5.3-codex (medium), gpt-5.3-codex-high, gpt-5.3-codex-xhigh. Capabilities: code generation, refactoring, automated editing, parallel task execution, session management, code review, architecture analysis, and MCP integration. Actions: analyze, implement, review, fix, refactor with Codex. Keywords: Codex CLI, gpt-5.3-codex, codex exec, code generation, refactoring, parallel execution, session resume, code review, second opinion, independent review, architecture validation, Context7 MCP. Use when: delegating complex code tasks to Codex, running multi-agent workflows, executing automated reviews, implementing features with AI assistance, resuming previous sessions, querying OpenAI documentation. Triggers: 'use codex', 'codex exec', 'run with codex', 'codex resume', 'implement with codex', 'review with codex', 'codex docs'.
npx skillsauth add alfredolopez80/multi-agent-ralph-loop codex-cli
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
gpt-5.3-codex with three reasoning tiers, selected with the -m flag
Reasoning Tier Selection:
| Tier | Model | Use Case | Command |
|------|-------|----------|---------|
| medium | gpt-5.3-codex | Default, fast code tasks | codex exec "prompt" |
| high | gpt-5.3-codex-high | Complex analysis, security | codex exec --profile high "prompt" |
| xhigh | gpt-5.3-codex-xhigh | Architecture, critical review | codex exec --profile xhigh "prompt" |
ultrathink - Take a deep breath. We're not here to write code. We're here to make a dent in the universe.
Codex orchestration should feel inevitable: minimal risk, maximum clarity.
This skill enables Claude to orchestrate OpenAI's Codex CLI (v0.79+) with the gpt-5.3-codex model for code generation, review, analysis, and automated editing. Includes Context7 MCP integration for documentation access.
Ideal Use Cases:
Verify Codex CLI installation:
codex --version # Should show v0.79 or later
Authentication (first time):
codex # Interactive login via ChatGPT account
# Or: export CODEX_API_KEY=sk-...
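The version check can be scripted so that a workflow stops early on an outdated install. The following is a minimal sketch rather than part of the skill: `version_ge` and `MIN_VERSION` are hypothetical names, the 0.79.0 minimum is taken from the overview, and it assumes that `codex --version` prints a dotted version number somewhere in its output and that `sort -V` is available.

```shell
# Hypothetical helper: true when version $1 is at least version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

MIN_VERSION="0.79.0"   # assumption: minimum taken from the overview

if command -v codex >/dev/null 2>&1; then
  # Assumption: the output contains a version such as 0.79.0
  installed=$(codex --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)
  if [ -n "$installed" ] && version_ge "$installed" "$MIN_VERSION"; then
    echo "codex $installed OK"
  else
    echo "codex ${installed:-unknown} is older than $MIN_VERSION" >&2
  fi
else
  echo "codex not found on PATH" >&2
fi
```

Comparing with `sort -V` rather than plain string comparison keeps versions such as 0.100.0 ordered after 0.79.0.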
Model family: gpt-5.3-codex with reasoning tiers
| Model | Reasoning | Use Case |
|-------|-----------|----------|
| gpt-5.3-codex | medium | Default, fast code tasks (recommended) |
| gpt-5.3-codex-high | high | Complex analysis, security review |
| gpt-5.3-codex-xhigh | xhigh | Architecture design, critical decisions |
| o3 | - | Highest reasoning capability |
| o4-mini | - | Fast, simple tasks |
Usage by tier:
# Medium (default) - fast iteration
codex exec "refactor authentication module"
# High - complex analysis
codex exec -m gpt-5.3-codex-high "security audit of payment flow"
# XHigh - architectural decisions
codex exec -m gpt-5.3-codex-xhigh "design microservices architecture"
| Mode | Permission | Use Case |
|------|------------|----------|
| read-only | Read files only (default) | Analysis, review |
| workspace-write | Read/write workspace | Code editing, refactoring |
| danger-full-access | Full system access | Install deps, network |
# Read-only analysis (default)
codex exec -m gpt-5.3-codex "analyze src/auth for security issues"
# Code editing (workspace-write)
codex exec -m gpt-5.3-codex --full-auto "fix bug in login.py"
# With reasoning effort
codex exec -m gpt-5.3-codex --config model_reasoning_effort=high "complex analysis"
# Skip git check (non-git directories)
codex exec --skip-git-repo-check "analyze code"
Add 2>/dev/null to suppress stderr (thinking tokens):
codex exec -m gpt-5.3-codex "review code" 2>/dev/null
# Resume last session (stdin for prompt - required due to CLI bug)
echo "continue with fixes" | codex exec resume --last 2>/dev/null
# Resume with full-auto
echo "apply fixes" | codex exec resume --last --full-auto 2>/dev/null
# Resume specific session
echo "follow-up" | codex exec resume SESSION_ID
Important: Resume inherits model, reasoning, and sandbox from original session.
# JSON Lines output
codex exec --json -m gpt-5.3-codex "analyze code" > output.jsonl
# Extract session ID
SID=$(grep -o '"thread_id":"[^"]*"' output.jsonl | head -1 | cut -d'"' -f4)
# Extract agent message
grep '"type":"agent_message"' output.jsonl | jq -r '.item.text'
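If the session ID comes back empty, a later `resume` call would run with a missing argument, so it may be worth failing loudly instead. The sketch below wraps the same grep/cut pipeline in a hypothetical `extract_session_id` function and runs it on sample data; the two sample events are an assumption about what the JSON Lines output looks like, not captured Codex output.

```shell
# Hypothetical helper: print the first thread_id in a JSON Lines file,
# or return non-zero when none is present.
extract_session_id() {
  local file="$1" sid
  sid=$(grep -o '"thread_id":"[^"]*"' "$file" | head -1 | cut -d'"' -f4)
  if [ -z "$sid" ]; then
    echo "no thread_id found in $file" >&2
    return 1
  fi
  printf '%s\n' "$sid"
}

# Sample events standing in for real `codex exec --json` output (assumed shape).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"type":"thread.started","thread_id":"0199-example"}
{"type":"item.completed","item":{"type":"agent_message","text":"done"}}
EOF

if SID=$(extract_session_id "$sample"); then
  echo "session: $SID"
  # Real use: echo "follow-up" | codex exec resume "$SID"
fi
```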
Claude collects information first, then injects it into the prompt for faster execution:
# Collect errors
ERRORS=$(npm run lint 2>&1 | grep error)
# Inject context
codex exec -m gpt-5.3-codex --full-auto "Fix these errors:
$ERRORS
Files: src/auth/login.ts, src/utils/token.ts
Constraint: Only modify listed files."
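One case the command above does not cover is a clean lint run, where `$ERRORS` is empty and Codex would be asked to fix nothing. A possible guard is sketched below; `build_fix_prompt` is a hypothetical helper and the sample error line is made up for illustration.

```shell
# Hypothetical helper: assemble the fix prompt, or return non-zero when
# there are no errors so the caller can skip the Codex run.
build_fix_prompt() {
  # $1: collected error text, $2: list of files Codex may edit
  local errors="$1" files="$2"
  if [ -z "$errors" ]; then
    return 1
  fi
  printf 'Fix these errors:\n%s\nFiles: %s\nConstraint: Only modify listed files.\n' \
    "$errors" "$files"
}

# Sample input standing in for: npm run lint 2>&1 | grep error
ERRORS="src/auth/login.ts:12 error no-unused-vars"

if PROMPT=$(build_fix_prompt "$ERRORS" "src/auth/login.ts, src/utils/token.ts"); then
  echo "$PROMPT"
  # Real use: codex exec -m gpt-5.3-codex --full-auto "$PROMPT"
else
  echo "lint is clean, skipping Codex" >&2
fi
```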
Related tasks reuse sessions for context preservation:
# First: analyze
codex exec -m gpt-5.3-codex "analyze src/auth for issues"
# Continue: fix (reuses context)
echo "fix the issues you found" | codex exec resume --last --full-auto
When to reuse:
Independent tasks run simultaneously:
# Parallel analysis (stderr is discarded so each .jsonl file holds only JSON Lines)
codex exec --json -m gpt-5.3-codex "analyze auth" > auth.jsonl 2>/dev/null &
codex exec --json -m gpt-5.3-codex "analyze api" > api.jsonl 2>/dev/null &
wait
# Parallel fixes with resume
AUTH_SID=$(grep -o '"thread_id":"[^"]*"' auth.jsonl | head -1 | cut -d'"' -f4)
echo "fix issues" | codex exec resume "$AUTH_SID" --full-auto &
# ...
wait
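A bare `wait` returns success regardless of how the background jobs ended, so a failed task can go unnoticed. The sketch below shows one way to track each job's exit status by waiting on its PID. `run_task` is a stand-in so the example runs without Codex installed, and the task names and the simulated failure are hypothetical.

```shell
# Stand-in for a Codex run so the sketch executes anywhere. In real use the
# body would be something like: codex exec --json -m gpt-5.3-codex "analyze $1"
run_task() {
  echo "{\"task\":\"$1\"}"
  [ "$1" != "api" ]   # hypothetical: pretend the "api" task fails
}

outdir=$(mktemp -d)
jobs_list=""
for name in auth api db; do
  run_task "$name" > "$outdir/$name.jsonl" 2>/dev/null &
  jobs_list="$jobs_list $name:$!"   # remember which PID belongs to which task
done

# Wait on each PID separately so every exit status is observed.
failed=""
for entry in $jobs_list; do
  name=${entry%%:*}
  pid=${entry##*:}
  if ! wait "$pid"; then
    failed="$failed $name"
  fi
done
failed=${failed# }
echo "failed tasks: ${failed:-none}"
```

Only tasks whose output file exists and whose status was zero are safe candidates for a follow-up `resume`.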
Parallelizable:
Must serialize:
Before running Codex tasks, confirm with user:
- Model: gpt-5.3-codex or gpt-5.2?
- Reasoning effort: low, medium, or high?

| Task Type | Sandbox | Flags |
|-----------|---------|-------|
| Review/analysis | read-only | --sandbox read-only 2>/dev/null |
| Apply local edits | workspace-write | --full-auto 2>/dev/null |
| Network/deps | danger-full-access | --sandbox danger-full-access --full-auto |
| Resume session | Inherited | echo "prompt" \| codex exec resume --last |
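The table above can be encoded as a small lookup so that an orchestration script never picks flags ad hoc. This is only a sketch: `sandbox_flags` and the task-type labels `review`, `edit`, and `deps` are hypothetical names, while the flags themselves are copied from the table.

```shell
# Hypothetical helper: print the flags for a task type, mirroring the table.
sandbox_flags() {
  case "$1" in
    review) echo "--sandbox read-only" ;;
    edit)   echo "--full-auto" ;;
    deps)   echo "--sandbox danger-full-access --full-auto" ;;
    *)      echo "unknown task type: $1" >&2; return 1 ;;
  esac
}

# Preview the command line instead of running it.
if flags=$(sandbox_flags review); then
  echo "codex exec $flags \"review src/auth\""
fi
```

Rejecting unknown task types keeps a typo from silently falling back to a more permissive mode.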
Use Codex as a second opinion on Claude's work:
codex exec -m gpt-5.3-codex --sandbox read-only "Review src/payment/processor.py for:
1. Race conditions in transaction processing
2. Proper error handling and rollback
3. Security issues with payment data
4. Edge cases that could cause data loss
Provide specific line numbers and severity ratings."
# Security audit
codex exec -m gpt-5.3-codex --sandbox read-only --config model_reasoning_effort=high \
"Perform security audit of src/auth. Check for:
- Authentication/authorization issues
- Input validation vulnerabilities
- Cryptographic weaknesses
- Sensitive data exposure"
# Performance review
codex exec -m gpt-5.3-codex --sandbox read-only \
"Analyze src/database for performance:
- N+1 query problems
- Missing indexes
- Blocking operations"
codex exec -m gpt-5.3-codex --sandbox read-only \
"Run 'git diff main...HEAD' to see changes.
Review for:
1. Breaking changes
2. Performance implications
3. Test coverage
4. Security concerns
Provide feedback by file with severity levels."
[Verb] + [Scope] + [Requirements] + [Output Format] + [Constraints]
| Read-only | Write |
|-----------|-------|
| analyze, review, find, explain | fix, refactor, implement, add |
Bad vs Good:
# Bad: vague
codex exec "review code"
# Good: specific
codex exec -m gpt-5.3-codex --sandbox read-only \
"Review src/auth for SQL injection, XSS.
Output: markdown with severity levels.
Format: file:line, description, fix suggestion."
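The five-part structure can also be assembled programmatically, which helps keep prompts uniform across many calls. The sketch below is one possible arrangement; `build_prompt` and the exact wording of the template are hypothetical, not something the skill prescribes.

```shell
# Hypothetical helper: compose a prompt from the five parts.
# $1 verb, $2 scope, $3 requirements, $4 output format, $5 constraints
build_prompt() {
  printf '%s %s for %s.\nOutput: %s\nConstraints: %s\n' "$1" "$2" "$3" "$4" "$5"
}

PROMPT=$(build_prompt "Review" "src/auth" "SQL injection, XSS" \
  "markdown with severity levels (file:line, description, fix suggestion)" \
  "do not modify any files")
echo "$PROMPT"
# Real use: codex exec -m gpt-5.3-codex --sandbox read-only "$PROMPT"
```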
# Consistent structure for aggregation
FORMAT="Output JSON: {category, items: [{file, line, description}]}"
codex exec -m gpt-5.3-codex "review security. $FORMAT" &
codex exec -m gpt-5.3-codex "review performance. $FORMAT" &
codex exec -m gpt-5.3-codex "review quality. $FORMAT" &
wait
# Phase 2: Codex validates Claude's plan
echo "Review this implementation plan for issues:
[Claude's plan here]
Check for:
- Logic errors
- Missing edge cases
- Architecture flaws
- Security concerns" | codex exec -m gpt-5.3-codex --sandbox read-only
# Phase 4: Codex reviews Claude's code
codex exec -m gpt-5.3-codex --sandbox read-only \
"Review implementation in src/feature for:
- Bugs
- Performance issues
- Best practices
- Security vulnerabilities"
--full-auto, --sandbox danger-full-access
After every Codex command:
model = "gpt-5.3-codex"
model_reasoning_effort = "medium"
# Reasoning tier profiles
[profiles.medium]
model = "gpt-5.3-codex"
model_reasoning_effort = "medium"
[profiles.high]
model = "gpt-5.3-codex-high"
model_reasoning_effort = "high"
[profiles.xhigh]
model = "gpt-5.3-codex-xhigh"
model_reasoning_effort = "high"
# Task-specific profiles
[profiles.review]
model = "gpt-5.3-codex-high"
model_reasoning_effort = "high"
sandbox_mode = "read-only"
[profiles.implement]
model = "gpt-5.3-codex"
model_reasoning_effort = "medium"
sandbox_mode = "workspace-write"
[profiles.architect]
model = "gpt-5.3-codex-xhigh"
model_reasoning_effort = "high"
sandbox_mode = "read-only"
Usage:
# Use reasoning tier profiles
codex exec --profile medium "quick fix"
codex exec --profile high "security audit"
codex exec --profile xhigh "architecture review"
# Use task-specific profiles
codex exec --profile review "analyze code"
codex exec --profile implement "add feature"
codex exec --profile architect "design system"
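Because `--profile` refers to a table in the config file, it can be useful to confirm the profile is defined before relying on it. The sketch below does this with a whole-line match; `has_profile` is a hypothetical helper, the default path `~/.codex/config.toml` is the one named later in this document, and the check assumes the table header sits alone on its line with no trailing comment.

```shell
# Hypothetical helper: true when the config file has a line that is
# exactly [profiles.<name>].
has_profile() {
  local name="$1" config="${2:-$HOME/.codex/config.toml}"
  [ -f "$config" ] && grep -qxF "[profiles.$name]" "$config"
}

# Demo against a throwaway config so the sketch does not touch ~/.codex.
demo_config=$(mktemp)
printf '[profiles.review]\nmodel = "gpt-5.3-codex-high"\n' > "$demo_config"

for p in review architect; do
  if has_profile "$p" "$demo_config"; then
    echo "profile $p: defined"
  else
    echo "profile $p: missing, add it before using --profile $p"
  fi
done
```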
| Use Case | Command |
|----------|---------|
| Medium (default) | codex exec "prompt" 2>/dev/null |
| High reasoning | codex exec -m gpt-5.3-codex-high "prompt" 2>/dev/null |
| XHigh reasoning | codex exec -m gpt-5.3-codex-xhigh "prompt" 2>/dev/null |
| Edit files | codex exec --full-auto "prompt" 2>/dev/null |
| High effort | --config model_reasoning_effort=high |
| Resume last | echo "prompt" \| codex exec resume --last |
| JSON output | codex exec --json "prompt" > out.jsonl |
| Specific dir | codex exec -C /path "prompt" |
| Non-git dir | --skip-git-repo-check |
| Profile: medium | codex exec --profile medium "prompt" |
| Profile: high | codex exec --profile high "prompt" |
| Profile: xhigh | codex exec --profile xhigh "prompt" |
| Profile: review | codex exec --profile review "prompt" |
| Profile: architect | codex exec --profile architect "prompt" |
Both Claude and Codex have Context7 MCP configured. Use it to access OpenAI documentation:
| Library ID | Content | Snippets |
|------------|---------|----------|
| /websites/developers_openai_codex | Codex CLI docs | 614 |
| /websites/platform_openai | OpenAI API docs | 9,418 |
| /openai/openai-python | Python SDK | 429 |
| /openai/openai-node | Node.js SDK | 437 |
# Before running complex Codex commands, verify syntax
mcp__context7__query-docs:
libraryId: "/websites/developers_openai_codex"
query: "exec sandbox modes full-auto workspace-write"
# On errors, look up solutions
mcp__context7__query-docs:
libraryId: "/websites/developers_openai_codex"
query: "error troubleshooting session resume"
# ~/.codex/config.toml - Codex MCP configuration
# STDIO server (local command)
[mcp_servers.context7]
command = "npx"
args = ["-y", "@upstash/context7-mcp@latest"]
# Remote HTTP server
[mcp_servers.remote]
url = "https://example.com/mcp"
bearer_token_env_var = "API_TOKEN"
# With environment variables
[mcp_servers.server.env]
API_KEY = "value"
# List configured MCP servers
codex mcp list
# Add new MCP server
codex mcp add context7 -- npx -y @upstash/context7-mcp
# Test MCP server
npx @modelcontextprotocol/inspector codex mcp-server
- /openai-docs - OpenAI documentation access skill
- references/cli_reference.md - Complete CLI arguments
- references/prompt_patterns.md - Advanced prompt design
- references/parallel_execution.md - Parallel orchestration details