in-the-loop-labs/review-roulette

Name: review-roulette
Author: in-the-loop-labs

.pi/skills/review-roulette/SKILL.md

npx skillsauth add in-the-loop-labs/pair-review review-roulette

Review Roulette

When this skill is active, your ONLY job is orchestration — you do NOT perform any review analysis yourself. You randomly select 3 reasoning models, dispatch the review to all of them in parallel, and merge the results.

Step 1: Discover Available Reasoning Models

Run `${PI_CMD:-pi} --list-models` via bash to get the current list of models with valid API keys. Eligible models are those that show `thinking: yes` in the output; these are the reasoning-capable (premium) models.

Excluded models: Never select `openai/o3-pro`, because it is prohibitively expensive. If it appears in the model list, skip it entirely.

Example reasoning models you might see (provider/model format):

anthropic/claude-opus-4-6
anthropic/claude-sonnet-4-5 (with thinking)
openai/o3
openai/o4-mini
openai/gpt-5-pro
openai/gpt-5.2-pro
google/gemini-2.5-pro (with thinking)
google/gemini-2.5-flash (with thinking)
google/gemini-3.1-pro-preview
xai/grok-4

The exact list depends on which API keys are configured. Always check — do not assume models are available.
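
The eligibility rule in this step can be sketched in Python. The real output format of `--list-models` is not shown here, so this sketch assumes (hypothetically) one model per line, with the `provider/model` id as the first whitespace-separated token and a literal `thinking: yes` marker on eligible lines. `eligible_models` and the sample listing are illustrative names and data, not part of the tool.

```python
EXCLUDED = {"openai/o3-pro"}  # never selected: prohibitively expensive


def eligible_models(listing: str) -> list[str]:
    """Return provider/model ids that advertise thinking support.

    ASSUMPTION: each line starts with the model id, and eligible lines
    contain the text 'thinking: yes'. Adjust to the real output format.
    """
    models = []
    for line in listing.splitlines():
        tokens = line.split()
        if not tokens or "thinking: yes" not in line:
            continue  # blank line or non-reasoning model
        model_id = tokens[0]
        if "/" in model_id and model_id not in EXCLUDED:
            models.append(model_id)
    return models


# Hypothetical listing, for illustration only.
sample = """\
anthropic/claude-opus-4-6   thinking: yes
openai/o3-pro               thinking: yes
openai/gpt-4o               thinking: no
xai/grok-4                  thinking: yes
"""
print(eligible_models(sample))
```

Keeping the exclusion list in one place makes it easy to add other models that should never be picked.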

Step 2: Randomly Select 3 Models

From the eligible reasoning models, pick exactly 3 at random.

CRITICAL — true randomness and diversity:

Do NOT always pick the same 3 models. The entire point of review roulette is variety of perspectives across runs.
Prefer different providers when possible. If you have reasoning models from Anthropic, OpenAI, Google, and xAI, pick from 3 different providers. Only double up on a provider if fewer than 3 providers have eligible models.
Shuffle or randomize your selection each time. Do not default to alphabetical order or any fixed preference.
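
One way to satisfy both constraints (random choice, provider diversity first) is a shuffled round-robin over providers. This is a minimal sketch, not the skill's prescribed method: `pick_models` is a hypothetical helper, and raising an error when fewer than 3 models are eligible is an assumption, since the text above does not say what to do in that case.

```python
import random
from collections import defaultdict


def pick_models(eligible, k=3, rng=random):
    """Pick k models at random, spreading across providers first."""
    eligible = list(dict.fromkeys(eligible))  # drop duplicates, keep order
    if len(eligible) < k:
        # ASSUMPTION: the skill text does not define this case.
        raise ValueError(f"need at least {k} eligible models, got {len(eligible)}")

    # Group models by provider (the part before the first slash).
    by_provider = defaultdict(list)
    for model in eligible:
        by_provider[model.split("/", 1)[0]].append(model)

    # Randomize both the provider order and the order within each provider.
    for models in by_provider.values():
        rng.shuffle(models)
    providers = list(by_provider)
    rng.shuffle(providers)

    # Round-robin: take one model per provider per pass, so a provider is
    # reused only after every other provider has contributed.
    picked = []
    while len(picked) < k:
        for provider in providers:
            if by_provider[provider] and len(picked) < k:
                picked.append(by_provider[provider].pop())
    return picked


pool = ["anthropic/claude-opus-4-6", "openai/o3", "openai/o4-mini",
        "google/gemini-2.5-pro", "xai/grok-4"]
print(pick_models(pool))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible for debugging; the default uses the module-level generator so selections vary across runs.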

Step 3: Dispatch the Review in Parallel

Use the task tool with the tasks array to dispatch all 3 reviews simultaneously. Each task object must include:

model: The selected model in provider/model format.
task: The FULL original review prompt/instructions. Each subtask starts fresh with NO conversation history and NO context from the parent. You must forward EVERYTHING you were asked to do — the complete prompt, all instructions, the diff, file contents, any constraints or formatting requirements, the expected JSON output schema, etc. Do not summarize or abbreviate. Pass it all through verbatim.

Example structure:

```json
{
  "tasks": [
    {
      "task": "<the ENTIRE original review prompt and instructions>",
      "model": "anthropic/claude-opus-4-6"
    },
    {
      "task": "<the ENTIRE original review prompt and instructions>",
      "model": "openai/o3"
    },
    {
      "task": "<the ENTIRE original review prompt and instructions>",
      "model": "google/gemini-2.5-pro"
    }
  ]
}
```
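
The same structure can be built programmatically so that every subtask receives an identical, unabridged prompt. This is a small sketch under the assumption that the payload is exactly the `tasks` array shown above; `build_dispatch` is a hypothetical helper name.

```python
import json


def build_dispatch(prompt: str, models: list[str]) -> dict:
    """Build the `tasks` payload: one entry per model, same verbatim prompt."""
    # The prompt is passed through untouched: no summarizing, no trimming.
    return {"tasks": [{"task": prompt, "model": model} for model in models]}


prompt = "<the ENTIRE original review prompt and instructions>"
payload = build_dispatch(
    prompt,
    ["anthropic/claude-opus-4-6", "openai/o3", "google/gemini-2.5-pro"],
)
print(json.dumps(payload, indent=2))
```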

Step 4: Merge Results

Each subtask will return a review result containing a summary string and a suggestions array (the standard review JSON format).

Collect the results from all 3 subtask responses and merge them:

Suggestions: Concatenate all suggestions arrays into a single array.
Summary: Concatenate all summaries with model attribution. Format the merged summary as:
```
<provider/model1>:
<summary1>

<provider/model2>:
<summary2>

<provider/model3>:
<summary3>
```
This attributed format also serves as a record of which models were used in the review.

Return the merged result as the final JSON response.

Do NOT:

Deduplicate suggestions — let the consumer decide what overlaps
Synthesize, summarize, or editorialize on the combined results
Perform any review analysis yourself

Do:

Concatenate all suggestion arrays: [...model1, ...model2, ...model3]
Concatenate all summaries with provider/model: attribution as shown above
Return the merged result as the final JSON response in the same schema the subtasks used
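
The merge rules above amount to plain concatenation, which can be sketched as follows. `merge_reviews` is a hypothetical helper, and the `title` field inside each suggestion is invented for illustration; only the `summary` and `suggestions` keys come from the description above.

```python
def merge_reviews(results):
    """Merge (model_id, review) pairs, given in dispatch order.

    Each review is a dict with a 'summary' string and a 'suggestions' list.
    Nothing is deduplicated, reordered, or rewritten.
    """
    summary = "\n\n".join(
        f"{model}:\n{review['summary']}" for model, review in results
    )
    suggestions = [s for _, review in results for s in review["suggestions"]]
    return {"summary": summary, "suggestions": suggestions}


# Hypothetical subtask results, for illustration only.
results = [
    ("openai/o3", {"summary": "Looks fine.",
                   "suggestions": [{"title": "Rename variable"}]}),
    ("xai/grok-4", {"summary": "One bug.",
                    "suggestions": [{"title": "Rename variable"},
                                    {"title": "Fix off-by-one"}]}),
]
merged = merge_reviews(results)
print(merged["summary"])
print(len(merged["suggestions"]))
```

The overlapping "Rename variable" suggestion appears twice in the merged array on purpose, since deduplication is left to the consumer.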

Summary

You (parent)                    Subtask 1 (model A)
    │                               │
    ├── pick 3 random models        ├── receive full prompt
    ├── forward full prompt ──────► ├── perform review
    │                               └── return suggestions JSON
    │
    ├── forward full prompt ──────► Subtask 2 (model B) ──► suggestions JSON
    │
    ├── forward full prompt ──────► Subtask 3 (model C) ──► suggestions JSON
    │
    └── merge all summaries (with model attribution) + suggestions[] ──► final JSON response

The parent does zero analysis. It is purely a dispatcher and merger. Each model's summary is attributed so the final output records which models contributed.

Dispatch a review task to 3 randomly-selected reasoning models in parallel for diverse perspectives, then merge all suggestions into a single result.

28 stars

data-ai

Updated Apr 17, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add in-the-loop-labs/pair-review review-roulette

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
| --- | --- | --- | --- |
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 17, 2026, 11:16 AM · 50.0s · 1 file scanned

Related Skills

in-the-loop-labs/user-critic

development

Verified · Trusted · Community

Fetch human review comments from pair-review and make code changes to address them. Use when the user says "address review feedback", "fix review comments", "address comments", or wants to iterate on code based on feedback left by a human reviewer in pair-review.

28 · SKILL.md · Updated Apr 20, 2026

in-the-loop-labs/review-requests

development

Verified · Trusted · Community

Open outstanding GitHub review requests in pair-review for AI-powered code review. Finds open PRs where my review is pending from the past week and starts pair-review analysis for each. Use when the user says "review requests", "review my PRs", "check review requests", "open review requests", "pair-review my requests", or wants to batch-review their outstanding GitHub review requests.

28 · SKILL.md · Updated Apr 20, 2026

in-the-loop-labs/pr

tools

Verified · Trusted · Community

Open the GitHub pull request for the current branch in the pair-review web UI. This only opens the browser — it does not run AI analysis or generate suggestions. Once open, the user can browse the diff, leave comments, and trigger analysis from the web UI themselves. Use when the user says "review this PR", "review pull request", "open PR review", or wants to open a pair-review session for the current branch's pull request. If the user wants automated AI analysis of the PR rather than just opening the browser, use the `code-critic:analyze` skill (standalone, requires code-critic plugin) or `pair-review:analyze` skill (requires MCP server) instead. Note that the user can also trigger AI analysis from within the pair-review web UI after opening it.

28 · SKILL.md · Updated Apr 20, 2026

in-the-loop-labs/local

tools

Verified · Trusted · Community

Open local uncommitted changes for review in the pair-review web UI. This only opens the browser — it does not run AI analysis or generate suggestions. Once open, the user can browse the diff, leave comments, and trigger analysis from the web UI themselves. Use when the user says "review my local changes", "review local", "open local review", or wants to open a pair-review session for uncommitted work in the current directory. If the user wants automated AI analysis of their local changes rather than just opening the browser, use the `code-critic:analyze` skill (standalone, requires code-critic plugin) or `pair-review:analyze` skill (requires MCP server) instead. Note that the user can also trigger AI analysis from within the pair-review web UI after opening it.

28 · SKILL.md · Updated Apr 20, 2026

Install manually

# Clone the repo
git clone https://github.com/in-the-loop-labs/pair-review.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r pair-review/.pi/skills/review-roulette ~/.claude/skills/

Repository

in-the-loop-labs/pair-review

28 stars

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT