Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

roocodeinc/evals-context

Name: evals-context
Author: roocodeinc

.roo/skills/evals-context/SKILL.md

npx skillsauth add roocodeinc/roo-code evals-context


Evals Codebase Context

When to Use This Skill

Use this skill when the task involves:

Modifying or debugging the evals execution infrastructure
Adding new eval exercises or languages
Working with the evals web interface (apps/web-evals)
Modifying the public evals display page on roocode.com
Understanding where evals code lives in this monorepo

When NOT to Use This Skill

Do NOT use this skill when:

Working on unrelated parts of the codebase (extension, webview-ui, etc.)
The task is purely about the VS Code extension's core functionality
Working on the main website pages that don't involve evals

Key Disambiguation: Two "Evals" Locations

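This monorepo has two distinct evals-related areas that are easy to confuse: the evals execution system (packages/evals/ and apps/web-evals/) and the public website evals page (apps/web-roo-code/src/app/evals/). The coding exercises themselves live in a separate repository.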

| Component | Path | Purpose |
| --- | --- | --- |
| Evals Execution System | packages/evals/ | Core eval infrastructure: CLI, DB schema, Docker configs |
| Evals Management UI | apps/web-evals/ | Next.js app for creating/monitoring eval runs (localhost:3446) |
| Website Evals Page | apps/web-roo-code/src/app/evals/ | Public roocode.com page displaying eval results |
| External Exercises Repo | Roo-Code-Evals | Actual coding exercises (NOT in this monorepo) |
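The disambiguation table can be expressed as a small lookup. The sketch below is illustrative only: `classifyEvalsPath` and the `EvalsComponent` type are hypothetical names that do not exist in the monorepo. Only the path prefixes and component labels come from the table.

```typescript
// Component labels mirror the rows of the disambiguation table.
type EvalsComponent =
  | "Evals Execution System"
  | "Evals Management UI"
  | "Website Evals Page"
  | "Not evals-related";

// Path prefixes taken from the table; checked in order.
const EVALS_PREFIXES: Array<[string, EvalsComponent]> = [
  ["packages/evals/", "Evals Execution System"],
  ["apps/web-evals/", "Evals Management UI"],
  ["apps/web-roo-code/src/app/evals/", "Website Evals Page"],
];

// Hypothetical helper: map a repo-relative path to the component that owns it.
function classifyEvalsPath(repoRelativePath: string): EvalsComponent {
  // Normalize Windows separators and a leading "./" so the prefixes match.
  const normalized = repoRelativePath.replace(/\\/g, "/").replace(/^\.\//, "");
  for (const [prefix, component] of EVALS_PREFIXES) {
    if (normalized.startsWith(prefix)) return component;
  }
  return "Not evals-related";
}

console.log(classifyEvalsPath("packages/evals/src/cli/runTask.ts"));
console.log(classifyEvalsPath("apps/web-roo-code/src/app/evals/plot.tsx"));
console.log(classifyEvalsPath("webview-ui/src/App.tsx"));
```

Files under the external Roo-Code-Evals repository are deliberately left out, since they are not part of this monorepo's tree.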

Directory Structure Reference

`packages/evals/` - Core Evals Package

packages/evals/
├── ARCHITECTURE.md          # Detailed architecture documentation
├── ADDING-EVALS.md          # Guide for adding new exercises/languages
├── README.md                # Setup and running instructions
├── docker-compose.yml       # Container orchestration
├── Dockerfile.runner        # Runner container definition
├── Dockerfile.web           # Web app container
├── drizzle.config.ts        # Database ORM config
├── src/
│   ├── index.ts             # Package exports
│   ├── cli/                 # CLI commands for running evals
│   │   ├── runEvals.ts      # Orchestrates complete eval runs
│   │   ├── runTask.ts       # Executes individual tasks in containers
│   │   ├── runUnitTest.ts   # Validates task completion via tests
│   │   └── redis.ts         # Redis pub/sub integration
│   ├── db/
│   │   ├── schema.ts        # Database schema (runs, tasks)
│   │   ├── queries/         # Database query functions
│   │   └── migrations/      # SQL migrations
│   └── exercises/
│       └── index.ts         # Exercise loading utilities
└── scripts/
    └── setup.sh             # Local macOS setup script

`apps/web-evals/` - Evals Management Web App

apps/web-evals/
├── src/
│   ├── app/
│   │   ├── page.tsx         # Home page (runs list)
│   │   ├── runs/
│   │   │   ├── new/         # Create new eval run
│   │   │   └── [id]/        # View specific run status
│   │   └── api/runs/        # SSE streaming endpoint
│   ├── actions/             # Server actions
│   │   ├── runs.ts          # Run CRUD operations
│   │   ├── tasks.ts         # Task queries
│   │   ├── exercises.ts     # Exercise listing
│   │   └── heartbeat.ts     # Controller health checks
│   ├── hooks/               # React hooks (SSE, models, etc.)
│   └── lib/                 # Utilities and schemas

`apps/web-roo-code/src/app/evals/` - Public Website Evals Page

apps/web-roo-code/src/app/evals/
├── page.tsx      # Fetches and displays public eval results
├── evals.tsx     # Main evals display component
├── plot.tsx      # Visualization component
└── types.ts      # EvalRun type (extends packages/evals types)

This page displays eval results on the public roocode.com website. It imports types from @roo-code/evals but does NOT run evals.

Architecture Overview

The evals system is a distributed evaluation platform that runs AI coding tasks in isolated VS Code environments:

┌─────────────────────────────────────────────────────────────┐
│  Web App (apps/web-evals)  ──────────────────────────────── │
│        │                                                    │
│        ▼                                                    │
│  PostgreSQL ◄────► Controller Container                     │
│        │               │                                    │
│        ▼               ▼                                    │
│     Redis ◄───► Runner Containers (1-25 parallel)           │
└─────────────────────────────────────────────────────────────┘

Key components:

Controller: Orchestrates eval runs, spawns runners, manages task queue (p-queue)
Runner: Isolated Docker container with VS Code + Roo Code extension + language runtimes
Redis: Pub/sub for real-time events (NOT task queuing)
PostgreSQL: Stores runs, tasks, metrics
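To illustrate why pub/sub suits real-time events but not task queuing, here is a minimal in-memory sketch of pub/sub delivery. It is not a Redis client and not the project's code. `InMemoryPubSub` and the channel name are hypothetical. It models only two properties of pub/sub: every current subscriber receives each message, and messages published with no subscribers are not stored.

```typescript
type Listener = (message: string) => void;

// Minimal in-memory stand-in for pub/sub semantics (not a Redis client).
class InMemoryPubSub {
  private listeners: Record<string, Listener[]> = {};

  subscribe(channel: string, listener: Listener): void {
    const existing = this.listeners[channel] ?? [];
    existing.push(listener);
    this.listeners[channel] = existing;
  }

  // Delivers to current subscribers only and returns how many received it.
  publish(channel: string, message: string): number {
    const current = this.listeners[channel] ?? [];
    current.forEach((listener) => listener(message));
    return current.length;
  }
}

const bus = new InMemoryPubSub();
const webUi: string[] = [];
const logger: string[] = [];

// Published before anyone subscribes: nothing is stored for later delivery.
const missed = bus.publish("hypothetical:run-events", "task 1 started");

bus.subscribe("hypothetical:run-events", (m) => webUi.push(m));
bus.subscribe("hypothetical:run-events", (m) => logger.push(m));

// Every subscriber sees the same event (fan-out), unlike a work queue,
// where each task should go to exactly one worker.
const delivered = bus.publish("hypothetical:run-events", "task 1 finished");

console.log({ missed, delivered, webUi, logger });
```

A task queue needs the opposite guarantees (each task handled once, nothing lost while no worker is listening), which is consistent with the controller keeping its queue in p-queue rather than in Redis.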

Common Tasks Quick Reference

Adding a New Eval Exercise

Add exercise to Roo-Code-Evals repo (external)
See packages/evals/ADDING-EVALS.md for structure

Modifying Eval CLI Behavior

Edit files in packages/evals/src/cli/:

runEvals.ts - Run orchestration
runTask.ts - Task execution
runUnitTest.ts - Test validation

Modifying the Evals Web Interface

Edit files in apps/web-evals/src/:

app/runs/new/new-run.tsx - New run form
actions/runs.ts - Run server actions

Modifying the Public Evals Display Page

Edit files in apps/web-roo-code/src/app/evals/:

evals.tsx - Display component
plot.tsx - Charts

Database Schema Changes

Edit packages/evals/src/db/schema.ts
Generate migration: cd packages/evals && pnpm drizzle-kit generate
Apply migration: pnpm drizzle-kit migrate

Running Evals Locally

# From repo root
pnpm evals

# Opens web UI at http://localhost:3446

Ports (defaults):

PostgreSQL: 5433
Redis: 6380
Web: 3446
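The default ports can be collected in one place when writing local tooling. This is a minimal sketch under stated assumptions: `DEFAULT_PORTS`, `SCHEMES`, and `localUrl` are hypothetical names, and the `localhost` host and URL schemes are assumptions. Only the three port numbers come from the list above.

```typescript
// Default ports as listed in this document.
const DEFAULT_PORTS = {
  postgres: 5433,
  redis: 6380,
  web: 3446,
} as const;

type Service = keyof typeof DEFAULT_PORTS;

// URL scheme per service; these and the "localhost" host are assumptions.
const SCHEMES: Record<Service, string> = {
  postgres: "postgres",
  redis: "redis",
  web: "http",
};

// Hypothetical helper: build a local base URL, optionally overriding the port.
function localUrl(service: Service, port: number = DEFAULT_PORTS[service]): string {
  return `${SCHEMES[service]}://localhost:${port}`;
}

console.log(localUrl("web")); // http://localhost:3446
console.log(localUrl("postgres")); // postgres://localhost:5433
console.log(localUrl("redis", 6379)); // redis://localhost:6379
```

Note that 5433 and 6380 differ from the stock PostgreSQL (5432) and Redis (6379) ports, so clients that assume the stock ports need an explicit override.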

Testing

# packages/evals tests
cd packages/evals && npx vitest run

# apps/web-evals tests
cd apps/web-evals && npx vitest run

Key Types/Exports from `@roo-code/evals`

The package exports are defined in packages/evals/src/index.ts:

Database queries: getRuns, getTasks, getTaskMetrics, etc.
Schema types: Run, Task, TaskMetrics
Used by both apps/web-evals and apps/web-roo-code
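As a rough illustration of how a consumer might work with task records, the sketch below computes a pass rate. The `TaskSketch` shape and its fields are assumptions made up for this example. The real `Run`, `Task`, and `TaskMetrics` types are defined by the schema in packages/evals/src/db/schema.ts and will differ.

```typescript
// Hypothetical, simplified shape; the real Task type comes from
// packages/evals/src/db/schema.ts and will have different fields.
interface TaskSketch {
  runId: number;
  exercise: string;
  passed: boolean | null; // null = not finished yet (assumption)
}

// Share of finished tasks that passed, or null when none have finished.
function passRate(tasks: TaskSketch[]): number | null {
  const finished = tasks.filter((t) => t.passed !== null);
  if (finished.length === 0) return null;
  const passed = finished.filter((t) => t.passed === true).length;
  return passed / finished.length;
}

const sampleTasks: TaskSketch[] = [
  { runId: 1, exercise: "hypothetical-a", passed: true },
  { runId: 1, exercise: "hypothetical-b", passed: false },
  { runId: 1, exercise: "hypothetical-c", passed: null },
];

console.log(passRate(sampleTasks)); // 0.5
```

In real code, the types would be imported from `@roo-code/evals` instead of being redeclared locally.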


Provides context about the Roo Code evals system structure in this monorepo. Use when tasks mention "evals", "evaluation", "eval runs", "eval exercises", or working with the evals infrastructure. Helps distinguish between the evals execution system (packages/evals, apps/web-evals) and the public website evals display page (apps/web-roo-code/src/app/evals).

23,083 stars

development

Updated Apr 11, 2026

$ install --global

skillsauth

npx skillsauth add roocodeinc/roo-code evals-context

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Status | Scanner | Description | Percentage |
| --- | --- | --- | --- |
| Clean | Trivy | Container and dependency vulnerability scanner | 95% |
| Clean | Semgrep | Static code analysis for vulnerabilities | 95% |
| Clean | mcp-scan (Snyk) | Model Context Protocol security validation | 95% |
| Skipped | Snyk (dep) | Open source security scanning | 50% |
| Skipped | Socket.dev | Supply chain security analysis | 50% |
| Skipped | VirusTotal | Multi-engine malware detection | 50% |
| Skipped | CrowdStrike | Advanced threat intelligence | 50% |
| Skipped | OSV-Scanner | Open Source Vulnerability database check | 50% |
| Skipped | OWASP Dep-Check | | 50% |

Last scanned: Apr 4, 2026, 11:17 PM (30.2s, 1 file scanned)


Related Skills

roocodeinc/roo-translation

tools

Verified · Trusted · Community

Provides comprehensive guidelines for translating and localizing Roo Code extension strings. Use when tasks involve i18n, translation, localization, adding new languages, or updating existing translation files. This skill covers both core extension (src/i18n/locales/) and WebView UI (webview-ui/src/i18n/locales/) localization.

23,083 · SKILL.md · Updated Apr 11, 2026

roocodeinc/roo-conflict-resolution

development

Verified · Trusted · Community

Provides comprehensive guidelines for resolving merge conflicts intelligently using git history and commit context. Use when tasks involve merge conflicts, rebasing, PR conflicts, or git conflict resolution. This skill analyzes commit messages, git blame, and code intent to make intelligent resolution decisions.

23,083 · SKILL.md · Updated Apr 11, 2026

openclaw/openclaw-secret-scanning-maintainer

development

Verified · Trusted · Community

Maintainer-only workflow for handling GitHub Secret Scanning alerts on OpenClaw. Use when Codex needs to triage, redact, clean up, and resolve secret leakage found in issue comments, issue bodies, PR comments, or other GitHub content.

357,764 · SKILL.md · Updated Apr 15, 2026

openclaw/openclaw-release-maintainer

development

Verified · Trusted · Community

Maintainer workflow for OpenClaw releases, prereleases, changelog release notes, and publish validation. Use when Codex needs to prepare or verify stable or beta release steps, align version naming, assemble release notes, check release auth requirements, or validate publish-time commands and artifacts.

357,764 · SKILL.md · Updated Apr 10, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/roocodeinc/roo-code.git

# Make sure the global Claude Code skills folder exists
mkdir -p ~/.claude/skills

# Copy into Claude Code skills folder (global)
cp -r roo-code/.roo/skills/evals-context ~/.claude/skills/


Repository

roocodeinc/roo-code

23,083 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
