Heavy3 Code Audit - Multi-model code review for coding agents (Sponsored by Heavy3.ai)
Sponsored by Heavy3.ai - Multi-model AI for high-stakes decisions
You are helping the user get AI-powered code reviews via OpenRouter.
All features are free and open source:
$ARGUMENTS can contain:
Explicit targets (no confirmation needed):
- `pr <number>` - Review a GitHub pull request by number
- `plan <path>` - Review a specific plan file
- `<file>.md` - Shorthand for plan review (any .md file)
- `<range>` - Review a commit range (e.g., HEAD~3..HEAD, abc123..def456)

Scope modifiers:
- `--staged` - Force review of only staged changes
- `--commit` - Force review of the last commit only

Mode options:
- `--council` - Use 3-model council (GPT 5.4 + Gemini 3.1 Pro + Grok 4)
- `--free` - Use rotating free model from config
- `--model <name>` - Override model (shortcuts: glm, gpt, kimi, deepseek, free)

When /h3 is invoked without explicit targets, automatically detect intent and confirm with user.
| Priority | Condition | Action |
|----------|-----------|--------|
| 1 | Explicit argument provided | Execute directly, no confirmation |
| 2 | Uncommitted changes exist | Confirm: review changes? |
| 3 | No changes + plan detected | Confirm: review the plan? |
| 4 | No changes + no plan | Ask: review commits or specify target? |
Step 1: Check for explicit arguments
If $ARGUMENTS contains any of these, skip detection and execute directly:
- `pr <number>` → PR review
- `plan <path>` → Plan review of specific file
- `<file>.md` (any markdown file path) → Plan review
- `<range>` (commit range like HEAD~3..HEAD) → Code review of range
- `--staged` → Staged changes review
- `--commit` → Last commit review

Step 2: Check for uncommitted changes
Run: git status --porcelain
If output is NOT empty (changes exist):
## Review Scope
I detected uncommitted changes:
- **Staged**: [X] files
- **Unstaged**: [Y] files
**Review all changes?** (y/n)
If user confirms, proceed with code review of all changes (git diff HEAD).
Step 3: Check for plan (if no changes)
Check these locations in order:
1. Do `plan.md`, `PLAN.md`, or `*.plan.md` exist in the current directory?
2. Is there a `.md` file in `~/.claude/plans/`?

If plan found:
## Plan Detected
Found plan: `[path/to/plan.md]`
Last modified: [date]
**Review this plan?** (y/n)
If user confirms, proceed with plan review.
Step 4: No changes and no plan - ask user
## No Changes Detected
No uncommitted changes or plans found.
**What would you like to review?**
1. Latest commit (`HEAD~1..HEAD`)
2. Recent commits (specify range, e.g., `HEAD~3..HEAD`)
3. Specific file or folder
4. Cancel
Wait for user response and proceed accordingly.
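The four-step detection order above can be sketched as a small decision function. This is a minimal illustration, not the skill's actual implementation: the function names, the returned action labels, and the range regex are assumptions made for the example.

```python
from __future__ import annotations

import re

# Scope flags listed under "Scope modifiers"
SCOPE_FLAGS = {"--staged", "--commit"}
# Rough match for a commit range such as HEAD~3..HEAD or abc123..def456
# (hypothetical pattern; the skill may parse ranges differently)
RANGE_RE = re.compile(r"^[^\s./-]\S*?\.\.[^\s.]\S*$")


def has_explicit_target(args: list[str]) -> bool:
    """Return True if the arguments name a target, so no confirmation is needed."""
    for i, arg in enumerate(args):
        if arg in SCOPE_FLAGS:
            return True
        if arg in ("pr", "plan") and i + 1 < len(args):
            return True
        if arg.endswith(".md") or RANGE_RE.match(arg):
            return True
    return False


def detect_action(args: list[str], porcelain_output: str, plan_path: str | None) -> str:
    """Apply the priority order: explicit target, changes, plan, then ask."""
    if has_explicit_target(args):
        return "execute"            # Priority 1: run directly
    if porcelain_output.strip():
        return "confirm_changes"    # Priority 2: uncommitted changes exist
    if plan_path:
        return "confirm_plan"       # Priority 3: no changes, plan detected
    return "ask_user"               # Priority 4: nothing found


# porcelain_output would come from `git status --porcelain`
print(detect_action([], " M src/app.py\n", None))
```

Mode options such as `--council` or `--model glm` do not count as targets in this sketch, so they still go through detection.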
For reviewing features/bug fixes spanning multiple commits:
| Input | Git Command | Description |
|-------|-------------|-------------|
| HEAD~1..HEAD | git diff HEAD~1..HEAD | Last 1 commit |
| HEAD~3..HEAD | git diff HEAD~3..HEAD | Last 3 commits |
| abc123..HEAD | git diff abc123..HEAD | From specific commit to HEAD |
| abc123..def456 | git diff abc123..def456 | Between two commits |
When user specifies a range, show commit summary before the review:
## Reviewing Commit Range: HEAD~3..HEAD
| Commit | Date | Author | Message |
|--------|------|--------|---------|
| abc123 | 2025-01-28 | John | feat: Add login |
| def456 | 2025-01-29 | John | fix: Handle edge case |
| ghi789 | 2025-01-30 | John | test: Add unit tests |
**3 commits, +150/-30 lines across 8 files**
Read the config from: ~/.claude/skills/h3/config.json
{
"model": "z-ai/glm-5",
"free_model": "nvidia/nemotron-3-nano-30b-a3b:free",
"reasoning": "high",
"docs_folder": "documents",
"max_context": 500000,
"enable_web_search": false
}
API key is stored in ~/.claude/skills/h3/.env:
OPENROUTER_API_KEY=your-key-here
!git status --short 2>/dev/null || echo "Not a git repo"
!git diff HEAD --name-only 2>/dev/null || echo "No changes"
!git diff HEAD 2>/dev/null | head -c 10000 || echo "No diff"
- `python3 ~/.claude/skills/h3/scripts/list-free-models.py --json`
- (use `python3 -c "import json; ..."` to update the JSON file)

| Scope | Git Command | Use Case |
|-------|-------------|----------|
| Smart (default) | Auto-detected | Let /h3 figure out what to review |
| --staged | git diff --cached | Force review of only staged changes |
| --commit | git diff HEAD~1..HEAD | Force review of the last commit |
| <range> | git diff <range> | Review multiple commits (e.g., HEAD~3..HEAD) |
Error messages:
git add."

| Max Context | Chars (approx) |
|-------------|----------------|
| 200K tokens | ~800K chars |
Before running a review, estimate and display the cost to the user.
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Typical Review Cost |
|-------|---------------------|----------------------|---------------------|
| GLM 5 (default) | $1.00 | $3.20 | ~$0.008-0.02 |
| GPT 5.4 (council) | $2.50 | $15.00 | ~$0.05-0.20 |
| Gemini 3.1 Pro (council) | $2.00 | $12.00 | ~$0.05-0.18 |
| Grok 4 (council) | $3.00 | $15.00 | ~$0.06-0.22 |
input_tokens = total_context_chars / 4
output_tokens = ~2500 (typical review length)
# Single model mode (GLM 5)
single_cost = (input_tokens * 1.00 + output_tokens * 3.20) / 1_000_000
# Council mode (all 3 models in parallel)
council_cost = (input_tokens * (2.50 + 2.00 + 3.00) + output_tokens * (15 + 12 + 15)) / 1_000_000
≈ input_tokens * 7.50/M + output_tokens * 42/M
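The formulas above translate directly into a small estimator. The sketch below assumes the listed prices are USD per 1M tokens (as the division by 1_000_000 implies); the dictionary keys and the function name are hypothetical labels chosen for the example, not identifiers from the skill's scripts.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing table.
# The keys are illustrative labels, not real model IDs.
PRICES = {
    "glm-5": (1.00, 3.20),
    "gpt-5.4": (2.50, 15.00),
    "gemini-3.1-pro": (2.00, 12.00),
    "grok-4": (3.00, 15.00),
}
COUNCIL = ("gpt-5.4", "gemini-3.1-pro", "grok-4")
TYPICAL_OUTPUT_TOKENS = 2500  # typical review length


def estimate_cost(context_chars: int, models=("glm-5",),
                  output_tokens: int = TYPICAL_OUTPUT_TOKENS) -> float:
    """Estimate review cost in USD using the ~4 chars per token rule."""
    input_tokens = context_chars / 4
    total = 0.0
    for name in models:
        input_price, output_price = PRICES[name]
        total += (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return total


# A 40,000-character context is roughly 10,000 input tokens.
print(f"single:  ${estimate_cost(40_000):.3f}")
print(f"council: ${estimate_cost(40_000, COUNCIL):.3f}")
```

For that 40,000-character context the single-model estimate comes to about $0.018 and the council estimate to about $0.18, which matches the 7.50/M and 42/M shortcut.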
IMPORTANT: Gather ALL context and save the temp JSON file FIRST, then calculate the cost estimate from the actual context size. This ensures an accurate estimate. The cost estimate is the ONLY user-facing prompt — do not interrupt the user for anything else.
After gathering context and saving to the unique context file ($H3_CONTEXT_FILE), but BEFORE calling the review API:
## Cost Estimate
| Metric | Value |
|--------|-------|
| Context size | ~[X]K chars |
| Est. input tokens | ~[X]K |
| Model(s) | [model name(s)] |
| **Est. cost** | **~$[X.XX]** |
**Proceed with review?** (y/n)
Wait for user to confirm before submitting.
If user declines, exit gracefully: "Review cancelled."
Examples:
Before executing any review workflow, check if changes are too large.
Quick size indicators (likely too large):
If estimated context exceeds limit, DO NOT proceed automatically. Present module options to user:
## Large Change Detected - Module Selection Required
| Metric | Value | Limit |
|--------|-------|-------|
| Changed files | [X] | ~50 |
| Lines changed | +[X]/-[Y] | ~10,000 |
| Est. tokens | ~[X]K | 200K |
I found [X] changed files across these areas:
| # | Module | Files | Est. Tokens | Description |
|---|--------|-------|-------------|-------------|
| 1 | src/components | 18 | ~25K | UI components |
| 2 | src/utils | 12 | ~15K | Utility functions |
| 3 | src/api | 10 | ~20K | API handlers |
| 4 | tests | 5 | ~8K | Test files |
**How would you like to proceed?**
1. Review modules separately (4 reviews, ~$X.XX total)
2. Combine modules 2+3 into one review (3 reviews)
3. Review all together (will truncate to fit limit)
4. Custom grouping (tell me which modules to combine)
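One way to produce a module table like the one above is to group changed files by their leading directories and estimate tokens from character counts. This is only a sketch: the two-level grouping depth, the function name, and the input shape (path mapped to changed characters) are assumptions, and the skill may group modules differently.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_module(file_sizes: dict, depth: int = 2) -> list:
    """Group changed files by their first `depth` directory components.

    `file_sizes` maps a repo-relative path to its character count.
    Tokens are estimated with the same ~4 chars per token rule.
    """
    groups = defaultdict(lambda: {"files": 0, "chars": 0})
    for path, chars in file_sizes.items():
        parts = PurePosixPath(path).parent.parts[:depth]
        module = "/".join(parts) or "."  # files in the repo root go under "."
        groups[module]["files"] += 1
        groups[module]["chars"] += chars
    return [
        {"module": name, "files": g["files"], "est_tokens": g["chars"] // 4}
        for name, g in sorted(groups.items())
    ]


sizes = {
    "src/components/Button.tsx": 4000,
    "src/components/List.tsx": 2000,
    "src/utils/date.ts": 800,
    "tests/test_date.py": 400,
    "README.md": 100,
}
for row in group_by_module(sizes):
    print(row)
```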
After user selects grouping:
## Review Progress
| Module | Files | Status | Key Findings |
|--------|-------|--------|--------------|
| src/components | 18 | ⏳ In Progress | - |
| src/utils + src/api | 22 | ⏸️ Pending | - |
| tests | 5 | ⏸️ Pending | - |
Review each module group:
Update progress after each module:
## Review Progress (Updated)
| Module | Files | Status | Key Findings |
|--------|-------|--------|--------------|
| src/components | 18 | ✅ Complete | 2 security, 1 perf issue |
| src/utils + src/api | 22 | ⏳ In Progress | - |
| tests | 5 | ⏸️ Pending | - |
## Cross-Module Summary
### All Issues by Category
**Security Issues (across all modules):**
- [src/components:42] XSS vulnerability in user input
- [src/api:15] Missing authentication check
**Performance Issues (across all modules):**
- [src/components:88] N+1 query in list render
**Correctness Issues (across all modules):**
- [src/utils:23] Off-by-one error in pagination
### Recommended Fix Priority
1. **CRITICAL**: [Security issue from module 1]
2. **HIGH**: [Performance issue from module 2]
3. **MEDIUM**: [Other issues...]
### Cross-Module Concerns
- [Any issues that span multiple modules]
- [Architectural concerns from combined view]
Follow the Smart Detection workflow, then execute the appropriate review.
Check for explicit targets first (skip detection if found):
- `pr <number>` → Go to PR Review workflow
- `plan <path>` or `<file>.md` → Go to Plan Review workflow
- `<range>` (e.g., HEAD~3..HEAD) → Go to Commit Range Review workflow
- `--staged` → Go to Staged Changes Review workflow
- `--commit` → Go to Last Commit Review workflow

If no explicit target, run Smart Detection:
- Run `git status --porcelain` for uncommitted changes
- Check for a plan (`plan.md` in cwd, `~/.claude/plans/`)
- `git diff HEAD` (all changes)
- `--staged`: `git diff --cached` (staged only)
- `python3 ~/.claude/skills/h3/scripts/review.py --type code --context-file "$H3_CONTEXT_FILE"`
- `python3 ~/.claude/skills/h3/scripts/council.py --type code --context-file "$H3_CONTEXT_FILE"`

Last Commit Review (`--commit`):
- `git log -1 --oneline 2>/dev/null`
- `git log -1 --pretty=format:"%H|%s|%an|%ad" --date=short` for hash, subject, author, date
- `git diff HEAD~1..HEAD`
- `git diff HEAD~1..HEAD --name-only`

"commit_metadata": {
"hash": "abc123...",
"subject": "feat: Add user authentication",
"author": "John Doe",
"date": "2025-01-25"
}
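The `%H|%s|%an|%ad` output of the `git log` command maps onto the `commit_metadata` fields. The sketch below shows one way to split such a line; the function name is hypothetical, and it assumes the author name contains no `|` character.

```python
def parse_commit_metadata(log_line: str) -> dict:
    """Parse one line of `git log -1 --pretty=format:"%H|%s|%an|%ad" --date=short`."""
    # The subject may itself contain "|", so take the hash from the front
    # and the author and date from the back.
    commit_hash, rest = log_line.strip().split("|", 1)
    subject, author, date = rest.rsplit("|", 2)
    return {"hash": commit_hash, "subject": subject, "author": author, "date": date}


line = "abc123|feat: Add a|b support|John Doe|2025-01-25"
print(parse_commit_metadata(line))
```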
- `python3 ~/.claude/skills/h3/scripts/review.py --type code --context-file "$H3_CONTEXT_FILE"`
- `python3 ~/.claude/skills/h3/scripts/council.py --type code --context-file "$H3_CONTEXT_FILE"`

Commit Range Review (`<range>` like HEAD~3..HEAD):
- `<range>` (e.g., HEAD~3..HEAD, abc123..def456)
- `git rev-parse <start> <end> 2>/dev/null`
- `git log --oneline --reverse <range>`
- `git diff <range> --stat`
- `git diff <range>`
- `git diff <range> --name-only`

"commit_range": {
"range": "HEAD~3..HEAD",
"commits": [
{"hash": "abc123", "subject": "feat: Add login", "date": "2025-01-28"},
{"hash": "def456", "subject": "fix: Edge case", "date": "2025-01-29"}
],
"total_commits": 2
}
- `--context-file "$H3_CONTEXT_FILE"`

Plan Review (`plan <path>` or `<file>.md` or detected plan):
- `<file>.md` provided → Use that file
- `~/.claude/plans/`
- `plan_content` field
- `file_contents`
- `plan_content` included:
{
"review_type": "plan",
"plan_content": "# Full plan content here\n\n## Overview\n...",
"file_contents": { ... },
"documentation": { ... },
"conversation_context": { ... }
}
- `python3 ~/.claude/skills/h3/scripts/review.py --type plan --context-file "$H3_CONTEXT_FILE"`
- `python3 ~/.claude/skills/h3/scripts/council.py --type plan --context-file "$H3_CONTEXT_FILE"`

IMPORTANT: Do NOT pass --plan-file as a separate argument. The plan content MUST be included directly in the context JSON file under the plan_content key.

PR Review (`pr <number>`):
- `gh pr view <number> --json title,body,author,baseRefName,headRefName,files,additions,deletions`
- `gh pr diff <number>`
- `--context-file "$H3_CONTEXT_FILE"`

Review the conversation history and include relevant context that explains the developer's intent:
- If /h3 was run earlier in this session, summarize key findings

Selection criteria for relevant exchanges:
Limits:
For code and PR reviews, search for files that import or reference changed files. This helps reviewers catch breaking changes.
When to include: Only for code and PR reviews (not plan reviews).
How to gather:
For each file in changed_files, extract its basename (e.g., utils.py from src/utils.py)
grep -rn --include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" --include="*.py" --include="*.go" --include="*.rs" --include="*.java" \
-e "import.*<basename_without_ext>" -e "from.*<basename_without_ext>" -e "require.*<basename_without_ext>" \
--exclude-dir=node_modules --exclude-dir=.git --exclude-dir=build --exclude-dir=dist --exclude-dir=__pycache__ --exclude-dir=.next --exclude-dir=venv --exclude-dir=.venv \
--exclude-dir=locales --exclude-dir=locale --exclude-dir=i18n --exclude-dir=translations --exclude-dir=generated --exclude-dir=.generated \
.
- `file_contents` (already being reviewed)
- `test_files`
- `"(+N more files)"` listing just the remaining file paths, one per line (no content snippets)
- `dependent_files` dict (keyed by file path, values are the relevant snippets)
- `dependent_files` key entirely

IMPORTANT: All content must be included IN the context JSON file. Do NOT pass separate file arguments.
{
"review_type": "plan" | "code" | "pr",
"conversation_context": {
"original_request": "Brief summary of what user originally asked for",
"approach_notes": "Key decisions made during implementation",
"relevant_exchanges": [
{"role": "user", "content": "Can you add validation to the form?"},
{"role": "assistant", "content": "I'll add Zod validation. Using inline validation rather than form-level because..."}
],
"previous_review_findings": "Summary of any prior /h3 review in this session"
},
"plan_content": "# Full plan markdown content (REQUIRED for plan reviews)",
"diff": "git diff output (for code/pr reviews)",
"changed_files": ["path1", "path2"],
"file_contents": {
"path1": "full file content...",
"path2": "full file content..."
},
"documentation": {
"CLAUDE.md": "...",
"documents/feature.md": "..."
},
"test_files": {
"path1.test.ts": "..."
},
"dependent_files": {
"src/components/UserList.tsx": "import { validateEmail } from '../utils';\n...\nconst isValid = validateEmail(user.email);",
"src/api/handlers.ts": "import { calculateTotal } from '../utils';\n...\nreturn calculateTotal(cart.items);"
},
"pr_metadata": {
"number": 123,
"title": "...",
"body": "...",
"author": "...",
"base_branch": "main",
"head_branch": "feature",
"additions": 100,
"deletions": 50
},
"commit_metadata": {
"hash": "abc123...",
"subject": "feat: Add user authentication",
"author": "John Doe",
"date": "2025-01-25"
}
}
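Before writing the context file, it can help to sanity-check the dict against the schema above. The following is a minimal sketch under stated assumptions: the function name and messages are hypothetical. The `plan_content` rule comes from the schema's "REQUIRED for plan reviews" note, while treating `diff` as mandatory for code and PR reviews is an assumption made for the example.

```python
VALID_REVIEW_TYPES = ("plan", "code", "pr")


def validate_context(context: dict) -> list:
    """Return a list of problems found in a context dict (empty if none)."""
    problems = []
    review_type = context.get("review_type")
    if review_type not in VALID_REVIEW_TYPES:
        problems.append(f"review_type must be one of {VALID_REVIEW_TYPES}")
        return problems
    # Stated in the schema: plan_content is required for plan reviews.
    if review_type == "plan" and not context.get("plan_content"):
        problems.append("plan reviews require a non-empty plan_content")
    # Assumption for this sketch: code and PR reviews need a diff.
    if review_type in ("code", "pr") and not context.get("diff"):
        problems.append(f"{review_type} reviews expect a non-empty diff")
    return problems


print(validate_context({"review_type": "plan"}))
```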
CRITICAL: Use Bash to write the context JSON — NEVER use the Write tool.
CRITICAL: Use a UNIQUE filename per review session to prevent concurrent reviews from overwriting each other.
Generate a unique context file path at the start of each review using a timestamp and random suffix:
H3_CONTEXT_FILE="/tmp/h3-context-$(date +%s)-$RANDOM.json"
Then use $H3_CONTEXT_FILE for all subsequent commands in that review session. Example:
H3_CONTEXT_FILE="/tmp/h3-context-$(date +%s)-$RANDOM.json"
cat > "$H3_CONTEXT_FILE" << 'CONTEXT_EOF'
{...the JSON...}
CONTEXT_EOF
This ensures the only user-facing prompt is the cost estimate confirmation. Do NOT use the Write tool for this file.
## Heavy3 Code Audit [Single/Council] (from [model name(s)])
[Output from the review script]
If council mode, show all 3 reviews clearly labeled with their roles.
For council reviews, YOU (Claude) MUST synthesize with a comparison table showing all 3 reviews:
## Claude's Synthesis
### Comparison of All Three Reviews
| Aspect | Correctness (GPT 5.4) | Performance (Gemini 3.1) | Security (Grok 4) |
|--------|----------------------|----------------------|---------------------|
| **Focus** | Bugs, Logic, Edge Cases | Scaling, Memory, N+1 | Vulnerabilities, Auth |
| **Findings** | ❌ 1 bug: null check missing | ⚠️ Potential N+1 query | ✅ No XSS, SQL injection |
| **Verdict** | REQUEST CHANGES | APPROVE WITH NOTES | APPROVE |
Legend: ✅ = No issues | ⚠️ = Warning/Concern | ❌ = Critical issue
### Consensus Issues (Flagged by 2+ reviewers)
- [Issue that multiple reviewers agree on]
### Notable Findings (From individual reviewers)
- **Correctness Expert**: [Specific finding]
- **Security Analyst**: [Specific finding]
- **Performance Critic**: [Specific finding]
### Final Recommendation
[Your overall assessment: APPROVE / APPROVE WITH CHANGES / REQUEST CHANGES]
**Priority Actions:**
1. [Most important fix]
2. [Second priority]
3. [Lower priority]
CRITICAL REQUIREMENT: The 3-column comparison table is Heavy3's TRADEMARK FEATURE.
You MUST ALWAYS include this table for council reviews. This is what differentiates Heavy3 from single-model reviews and provides unique value to users.
Checklist for Council Synthesis:
DO NOT just list the three reviews sequentially without synthesis. DO NOT skip the comparison table even if reviews are similar. DO actively identify where reviewers agree or disagree.
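Identifying where reviewers agree can be illustrated with a small helper that collects issues flagged by two or more reviewers. This sketch assumes each review has already been reduced to short, normalized issue labels; real reviews are free text, so matching them takes judgment. The function name and input shape are hypothetical.

```python
from collections import defaultdict


def consensus_issues(findings: dict, min_reviewers: int = 2) -> list:
    """Return issue labels flagged by at least `min_reviewers` reviewers.

    `findings` maps a reviewer role to a list of short issue labels.
    """
    flagged_by = defaultdict(set)
    for reviewer, issues in findings.items():
        for issue in issues:
            # Compare labels case-insensitively and ignore stray whitespace
            flagged_by[issue.strip().lower()].add(reviewer)
    return sorted(
        issue for issue, reviewers in flagged_by.items()
        if len(reviewers) >= min_reviewers
    )


findings = {
    "correctness": ["Missing null check", "off-by-one in pagination"],
    "performance": ["N+1 query", "missing null check"],
    "security": ["Missing authentication check"],
}
print(consensus_issues(findings))
```

Issues that only one reviewer raised stay out of the consensus list and belong under the notable findings instead.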
The comparison table is just the start. What makes Heavy3 valuable is turning diverse perspectives into actionable next steps:
The goal isn't just to show three opinions. It's to synthesize them into one clear action plan the developer can execute immediately.
## My Assessment
| # | Issue | Reviewer Says | My Take | Action |
|---|-------|---------------|---------|--------|
| 1 | [Brief] | [Concern] | ✅/⚠️/❌ | [What to do] |
## Proposed Actions
**Immediate fixes I can make:**
1. [Fix with file:line]
**Needs your decision:**
1. [Tradeoff to discuss]
**No action needed:**
1. [Why disagree]
**What would you like me to do?**
1. **Fix all** - Apply all immediate fixes
2. **Fix specific items** - Tell me which (e.g., "fix 1, 3")
3. **Discuss first** - Talk through items
4. **Skip** - No changes
After presenting results and user makes a choice (or after any review completes), display:
---
**Like Heavy3 Code Audit?** ⭐ [Star on GitHub](https://github.com/heavy3-ai/code-audit) | 🤝 [Contribute](https://github.com/heavy3-ai/code-audit/issues) | 📢 Share with your team
When to show:
Keep it brief - one line, non-intrusive.
/tmp/ with a unique name per session (see Temp File Handling); the Write tool is not available.