b-open-io/hunter-skeptic-referee

Name: hunter-skeptic-referee
Author: b-open-io

skills/hunter-skeptic-referee/SKILL.md

npx skillsauth add b-open-io/prompts hunter-skeptic-referee

Hunter / Skeptic / Referee

An adversarial code review workflow designed by danpeguine (@danpeguine). Three agents run in isolated contexts, so no agent sees what any other agent "wants" to hear. The isolation is meant to counter sycophantic confirmation bias and to produce bug reports that are as close to ground truth as possible.

User command: /bug-hunt [path | -b branch [--base base]]

Why Isolated Contexts

When a single agent both finds bugs and evaluates them, it anchors on its own earlier judgments. By resetting context between phases and giving each agent only what it needs, every verdict is genuinely independent. The Skeptic cannot see the Hunter's enthusiasm. The Referee cannot see the Skeptic's skepticism.

The Three Agents

| Phase | Agent | Subagent Type | Role |
|-------|-------|---------------|------|
| 1. Hunter | Jerry | bopen-tools:code-auditor | Find every possible bug. Maximize recall. False positives OK. |
| 2. Skeptic | Kayle | bopen-tools:architecture-reviewer | Challenge every finding. Risk/EV calculation. 2x penalty for wrong dismissals. |
| 3. Referee | Jason | bopen-tools:tester | Final arbiter. Read code independently. Produce ground truth. |

Target Resolution

The skill supports two modes:

Path mode (default): Scan a file, directory, or the entire project.

/bug-hunt              # Entire project
/bug-hunt src/         # Directory
/bug-hunt lib/auth.ts  # Specific file

Branch diff mode (-b): Scan only files changed between branches. Reads full file contents, not just diffs.

/bug-hunt -b feature-xyz              # vs main
/bug-hunt -b feature-xyz --base dev   # vs dev

In branch diff mode, run git diff --name-only <base>...<branch> to get the file list.
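The two modes can be sketched as a small argument parser. This is an illustration only, not the skill's actual implementation: the names Target, parse_bug_hunt, and changed_files_command are hypothetical, the default base of main is taken from the "# vs main" example, and treating a missing path as "." is an assumption.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    mode: str                      # "path" or "branch"
    path: Optional[str] = None
    branch: Optional[str] = None
    base: Optional[str] = None


def parse_bug_hunt(command: str) -> Target:
    """Parse '/bug-hunt [path | -b branch [--base base]]' into a Target."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "/bug-hunt":
        raise ValueError("expected a /bug-hunt command")
    args = tokens[1:]
    if args and args[0] == "-b":
        if len(args) < 2:
            raise ValueError("-b requires a branch name")
        base = "main"  # default base, per the '# vs main' example
        rest = args[2:]
        if rest:
            if len(rest) != 2 or rest[0] != "--base":
                raise ValueError("expected '--base <base>' after the branch")
            base = rest[1]
        return Target(mode="branch", branch=args[1], base=base)
    if len(args) > 1:
        raise ValueError("path mode takes at most one path")
    # Assumption: no argument means the project root, written here as "."
    return Target(mode="path", path=args[0] if args else ".")


def changed_files_command(target: Target) -> List[str]:
    """Build the git invocation that lists files changed on the branch."""
    if target.mode != "branch":
        raise ValueError("only branch diff mode needs git diff")
    return ["git", "diff", "--name-only", f"{target.base}...{target.branch}"]
```

The three-dot range compares the branch against its merge base with the base branch, so only changes made on the branch itself are listed.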

Scoring Systems

Hunter (+1/+5/+10)

| Score | Meaning |
|-------|---------|
| +1 | Low: minor edge case, cosmetic, code smell |
| +5 | Medium: functional issue, data inconsistency, missing validation |
| +10 | Critical: security vulnerability, data loss, race condition, crash |

Skeptic (risk-calibrated)

- Disprove a false positive: +[bug's original points]
- Wrongly dismiss a real bug: -2x [bug's original points]
- EV formula: EV = (confidence% × points) - ((100 - confidence%) × 2 × points)
- Only DISPROVE when EV is positive, which happens when confidence is above two-thirds (about 66.7%, rounded to 67% here)
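A short worked sketch of the Skeptic's decision rule follows. The function names are hypothetical. It applies the formula exactly as written, with confidence expressed as a percentage, so the result is 100 times the expected points; the sign, which is all the decision depends on, is unaffected.

```python
def skeptic_ev(confidence_pct: float, points: int) -> float:
    """EV of choosing DISPROVE, using the formula as written (confidence in percent)."""
    gain = confidence_pct * points               # reward if the bug really is a false positive
    loss = (100 - confidence_pct) * 2 * points   # 2x penalty if the bug was real
    return gain - loss


def should_disprove(confidence_pct: float, points: int) -> bool:
    """DISPROVE only when the expected value is positive."""
    return skeptic_ev(confidence_pct, points) > 0


# 80% confident that a +5 (Medium) finding is a false positive:
# 80 * 5 - 20 * 2 * 5 = 200, so disproving is worth the risk.
print(skeptic_ev(80, 5), should_disprove(80, 5))

# 60% confident about a +10 (Critical) finding:
# 60 * 10 - 40 * 2 * 10 = -200, so the Skeptic should ACCEPT.
print(skeptic_ev(60, 10), should_disprove(60, 10))
```

Because the penalty is twice the reward, the break-even confidence is the same for every severity; the point value only scales how much is at stake.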

Referee (symmetric ±1)

- Correct judgment: +1
- Incorrect judgment: -1
- Framed against "known ground truth" to induce precision

Structured Output Format

All three agents use a consistent BUG-ID format for cross-phase traceability:

Hunter output:

**BUG-[N]** | Severity: [Low/Medium/Critical] | Points: [1/5/10]
- **File:** [path]
- **Line(s):** [number or range]
- **Category:** [logic|security|error-handling|concurrency|edge-case|performance|data-integrity|type-safety|other]
- **Claim:** [one sentence]
- **Evidence:** [code quote]

Skeptic output:

**BUG-[N]** | Original: [points] pts
- **Counter-argument:** [technical argument citing code]
- **Evidence:** [code quote]
- **Confidence:** [0-100]%
- **Risk calc:** EV = ...
- **Decision:** DISPROVE / ACCEPT

Referee output:

**BUG-[N]**
- **Hunter's claim:** [summary]
- **Skeptic's response:** [DISPROVE/ACCEPT + summary]
- **Your analysis:** [independent assessment]
- **VERDICT: REAL BUG / NOT A BUG**
- **Confidence:** High / Medium / Low
- **True severity:** [Low/Medium/Critical]
- **Suggested fix:** [brief direction]
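Because every phase uses the same BUG-ID layout, a report can be parsed mechanically before it is handed to the next agent. Here is a minimal sketch for the Hunter format, assuming each finding arrives as one header line followed by "- **Key:** value" lines; the function name and the dictionary keys are hypothetical.

```python
import re
from typing import Dict

HEADER = re.compile(
    r"^\*\*BUG-(?P<id>\d+)\*\*\s*\|\s*Severity:\s*(?P<severity>Low|Medium|Critical)"
    r"\s*\|\s*Points:\s*(?P<points>\d+)\s*$"
)
FIELD = re.compile(r"^- \*\*(?P<key>[^:*]+):\*\*\s*(?P<value>.*)$")


def parse_hunter_finding(block: str) -> Dict[str, object]:
    """Turn one Hunter finding into a dict keyed by lower-cased field name."""
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty finding")
    header = HEADER.match(lines[0])
    if header is None:
        raise ValueError("not a Hunter finding header")
    finding: Dict[str, object] = {
        "id": f"BUG-{header['id']}",
        "severity": header["severity"],
        "points": int(header["points"]),
    }
    for line in lines[1:]:
        field = FIELD.match(line)
        if field:  # lines that do not match the field layout are ignored
            finding[field["key"].strip().lower()] = field["value"].strip()
    return finding
```

The stable "id" value is what lets the Skeptic's and Referee's entries be joined back to the Hunter's original claim.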

Execution Protocol

Step 1 — Resolve target

Parse arguments for path mode vs branch diff mode. In branch diff mode, run git diff --name-only to get the file list.

Step 2 — Spawn the Hunter (Jerry)

Dispatch bopen-tools:code-auditor with the target scope. The Hunter uses Glob/Read/Grep to examine actual code. It must NOT speculate about unread files.

Step 2b — Early exit check

If Hunter reports TOTAL FINDINGS: 0, skip Skeptic and Referee. Present a clean report directly.

Step 3 — Spawn the Skeptic (Kayle)

Dispatch bopen-tools:architecture-reviewer with ONLY the structured bug list (BUG-IDs, files, lines, claims, evidence, severity). Do NOT pass the full codebase or any narrative text. The Skeptic reads code independently.

Step 4 — Spawn the Referee (Jason)

Dispatch bopen-tools:tester with the Hunter's full report AND the Skeptic's full report. The Referee reads code independently.

Step 5 — Present the report

Display the Referee's verified report:

- Summary stats (found / dismissed / confirmed by severity)
- Confirmed bugs table sorted by severity
- Low-confidence items flagged for manual review
- Collapsed <details> section with dismissed bugs for transparency

A clean report (zero confirmed bugs) is a valid result — say so clearly.
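The control flow of Steps 2 through 4 can be sketched as follows. The subagent type names come from the skill, but everything else is an assumption made for illustration: dispatch stands in for whatever mechanism the host tool uses to spawn a subagent, and extract_bug_list is a hypothetical helper that reduces the Hunter's report to the structured bug list.

```python
import re
from typing import Callable, Dict

HUNTER = "bopen-tools:code-auditor"
SKEPTIC = "bopen-tools:architecture-reviewer"
REFEREE = "bopen-tools:tester"

# Hypothetical dispatcher: (subagent_type, prompt) -> report text
Dispatch = Callable[[str, str], str]


def run_bug_hunt(
    scope: str,
    dispatch: Dispatch,
    extract_bug_list: Callable[[str], str],
) -> Dict[str, str]:
    # Step 2: the Hunter receives the target scope.
    hunter_report = dispatch(HUNTER, f"Scope: {scope}")

    # Step 2b: early exit when the Hunter reports zero findings.
    if re.search(r"TOTAL FINDINGS:\s*0\b", hunter_report):
        return {"status": "clean", "report": hunter_report}

    # Step 3: the Skeptic receives ONLY the structured bug list.
    skeptic_report = dispatch(SKEPTIC, extract_bug_list(hunter_report))

    # Step 4: the Referee receives both reports.
    referee_report = dispatch(REFEREE, hunter_report + "\n\n" + skeptic_report)
    return {"status": "reviewed", "report": referee_report}
```

Each call to dispatch is assumed to start a fresh context, which is what keeps the three verdicts independent.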

Context Boundary Rules

| Phase | Gets access to |
|-------|----------------|
| Hunter (Jerry) | Full codebase (or changed files in branch diff mode) |
| Skeptic (Kayle) | Structured bug list + referenced file paths only |
| Referee (Jason) | Hunter findings + Skeptic verdicts only |

Violating these boundaries reintroduces the sycophancy problem. If the Skeptic sees the Hunter's confidence, it anchors on it. If the Referee sees either agent's emotional register, it drifts toward consensus rather than truth.
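One way to enforce the Skeptic's boundary is an allow-list applied to each finding before dispatch. The sketch below assumes findings are held as dictionaries; the key names and the skeptic_payload function are hypothetical, chosen to mirror the fields listed in Step 3 (BUG-IDs, files, lines, claims, evidence, severity).

```python
from typing import Dict, Iterable, List

# Fields the Skeptic may see, following the list in Step 3.
SKEPTIC_FIELDS = ("id", "file", "lines", "claim", "evidence", "severity")


def skeptic_payload(findings: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only allow-listed fields so no narrative or confidence leaks through."""
    return [
        {key: finding[key] for key in SKEPTIC_FIELDS if key in finding}
        for finding in findings
    ]
```

An allow-list fails safe: any field added to the Hunter's output later is withheld from the Skeptic until someone decides it belongs in the list.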

When to Use

- Pre-release security audits
- Reviewing unfamiliar or legacy codebases
- High-stakes modules (auth, payments, data integrity)
- Pull requests with broad scope or architectural changes
- Branch review before merge (-b mode)

For quick informal reviews, just use Jerry directly in normal mode.

This skill should be used when the user asks to 'find bugs', 'do a thorough code review', 'run a security audit', 'hunt for bugs', 'check for correctness issues', or 'review this code for edge cases'. Orchestrates a three-phase adversarial review using three isolated agents — Jerry (Hunter), Kayle (Skeptic), Jason (Referee) — to neutralize sycophancy and produce high-fidelity bug reports. User-facing command: /bug-hunt

13 stars

development

Updated Jul 10, 2026

Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):

npx skillsauth add b-open-io/prompts hunter-skeptic-referee

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jul 10, 2026, 3:52 AM · 128.7s · 4 files scanned

SKILL.md

name: hunter-skeptic-referee
description: This skill should be used when the user asks to 'find bugs', 'do a thorough code review', 'run a security audit', 'hunt for bugs', 'check for correctness issues', or 'review this code for edge cases'. Orchestrates a three-phase adversarial review using three isolated agents, Jerry (Hunter), Kayle (Skeptic), and Jason (Referee), to neutralize sycophancy and produce high-fidelity bug reports. User-facing command: /bug-hunt
version: 1.1.2
user-invocable: false

Install manually

# Clone the repo
git clone https://github.com/b-open-io/prompts.git

# Copy into the Claude Code skills folder (global), creating it first if needed
mkdir -p ~/.claude/skills
cp -r prompts/skills/hunter-skeptic-referee ~/.claude/skills/

Repository: b-open-io/prompts

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT