.roo/skills/evals-context/SKILL.md
Provides context about the Roo Code evals system structure in this monorepo. Use when tasks mention "evals", "evaluation", "eval runs", "eval exercises", or working with the evals infrastructure. Helps distinguish between the evals execution system (packages/evals, apps/web-evals) and the public website evals display page (apps/web-roo-code/src/app/evals).
Use this skill when the task involves:
Do NOT use this skill when:
This monorepo has two distinct evals-related locations that can cause confusion:
| Component | Path | Purpose |
| --------------------------- | -------------------------------------------------------------- | -------------------------------------------------------------- |
| Evals Execution System | packages/evals/ | Core eval infrastructure: CLI, DB schema, Docker configs |
| Evals Management UI | apps/web-evals/ | Next.js app for creating/monitoring eval runs (localhost:3446) |
| Website Evals Page | apps/web-roo-code/src/app/evals/ | Public roocode.com page displaying eval results |
| External Exercises Repo | Roo-Code-Evals | Actual coding exercises (NOT in this monorepo) |
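The table above can be read as a simple path-prefix lookup. The following is a minimal sketch of that idea, not code from the monorepo: `classifyEvalsPath`, `EvalsComponent`, and `COMPONENT_PREFIXES` are hypothetical names introduced for illustration, and only the three in-repo prefixes come from the table.

```typescript
// Hypothetical helper (not part of the monorepo): maps a repo-relative
// path to the evals component that owns it, using the table's prefixes.
type EvalsComponent =
  | "Evals Execution System"
  | "Evals Management UI"
  | "Website Evals Page"
  | "Not evals-related";

// The prefixes do not overlap, so lookup order does not matter.
const COMPONENT_PREFIXES: ReadonlyArray<[string, EvalsComponent]> = [
  ["packages/evals/", "Evals Execution System"],
  ["apps/web-evals/", "Evals Management UI"],
  ["apps/web-roo-code/src/app/evals/", "Website Evals Page"],
];

function classifyEvalsPath(path: string): EvalsComponent {
  // Tolerate a leading "./" so both "./packages/..." and "packages/..." match.
  const normalized = path.replace(/^\.\//, "");
  for (const [prefix, component] of COMPONENT_PREFIXES) {
    if (normalized.startsWith(prefix)) return component;
  }
  return "Not evals-related";
}
```

For example, `classifyEvalsPath("apps/web-evals/src/actions/runs.ts")` falls under the management UI, while other paths under `apps/web-roo-code/` outside `src/app/evals/` are not evals-related at all.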
packages/evals/ - Core Evals Package
packages/evals/
├── ARCHITECTURE.md # Detailed architecture documentation
├── ADDING-EVALS.md # Guide for adding new exercises/languages
├── README.md # Setup and running instructions
├── docker-compose.yml # Container orchestration
├── Dockerfile.runner # Runner container definition
├── Dockerfile.web # Web app container
├── drizzle.config.ts # Database ORM config
├── src/
│ ├── index.ts # Package exports
│ ├── cli/ # CLI commands for running evals
│ │ ├── runEvals.ts # Orchestrates complete eval runs
│ │ ├── runTask.ts # Executes individual tasks in containers
│ │ ├── runUnitTest.ts # Validates task completion via tests
│ │ └── redis.ts # Redis pub/sub integration
│ ├── db/
│ │ ├── schema.ts # Database schema (runs, tasks)
│ │ ├── queries/ # Database query functions
│ │ └── migrations/ # SQL migrations
│ └── exercises/
│   └── index.ts # Exercise loading utilities
└── scripts/
  └── setup.sh # Local macOS setup script
apps/web-evals/ - Evals Management Web App
apps/web-evals/
├── src/
│ ├── app/
│ │ ├── page.tsx # Home page (runs list)
│ │ ├── runs/
│ │ │ ├── new/ # Create new eval run
│ │ │ └── [id]/ # View specific run status
│ │ └── api/runs/ # SSE streaming endpoint
│ ├── actions/ # Server actions
│ │ ├── runs.ts # Run CRUD operations
│ │ ├── tasks.ts # Task queries
│ │ ├── exercises.ts # Exercise listing
│ │ └── heartbeat.ts # Controller health checks
│ ├── hooks/ # React hooks (SSE, models, etc.)
│ └── lib/ # Utilities and schemas
apps/web-roo-code/src/app/evals/ - Public Website Evals Page
apps/web-roo-code/src/app/evals/
├── page.tsx # Fetches and displays public eval results
├── evals.tsx # Main evals display component
├── plot.tsx # Visualization component
└── types.ts # EvalRun type (extends packages/evals types)
This page displays eval results on the public roocode.com website. It imports types from @roo-code/evals but does NOT run evals.
The evals system is a distributed evaluation platform that runs AI coding tasks in isolated VS Code environments:
┌─────────────────────────────────────────────────────────────┐
│ Web App (apps/web-evals) ──────────────────────────────── │
│ │ │
│ ▼ │
│ PostgreSQL ◄────► Controller Container │
│ │ │ │
│ ▼ ▼ │
│ Redis ◄───► Runner Containers (1-25 parallel) │
└─────────────────────────────────────────────────────────────┘
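The "1-25 parallel" bound in the diagram can be illustrated with a bounded worker pool. This is only a sketch under stated assumptions: it uses in-process promises rather than Docker containers, and `clampConcurrency`, `runWithConcurrency`, `MIN_RUNNERS`, and `MAX_RUNNERS` are hypothetical names, not functions from `packages/evals`.

```typescript
// Hypothetical bounds, taken from the "1-25 parallel" note in the diagram.
const MIN_RUNNERS = 1;
const MAX_RUNNERS = 25;

// Force a requested concurrency into the supported range.
function clampConcurrency(requested: number): number {
  if (!Number.isFinite(requested)) return MIN_RUNNERS;
  return Math.min(MAX_RUNNERS, Math.max(MIN_RUNNERS, Math.floor(requested)));
}

// Run `worker` over `items` with at most `requested` (clamped) tasks in flight.
// Results are returned in the same order as `items`.
async function runWithConcurrency<T, R>(
  items: readonly T[],
  requested: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const limit = clampConcurrency(requested);
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the queue is empty.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

In the real system each unit of work would presumably be a task executed in a runner container; here `worker` stands in for that step so the concurrency limit can be seen in isolation.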
Key components:
packages/evals/ADDING-EVALS.md for structure
Edit files in packages/evals/src/cli/:
- runEvals.ts - Run orchestration
- runTask.ts - Task execution
- runUnitTest.ts - Test validation
Edit files in apps/web-evals/src/:
- app/runs/new/new-run.tsx - New run form
- actions/runs.ts - Run server actions
Edit files in apps/web-roo-code/src/app/evals/:
- evals.tsx - Display component
- plot.tsx - Charts

packages/evals/src/db/schema.ts
cd packages/evals && pnpm drizzle-kit generate
pnpm drizzle-kit migrate

# From repo root
pnpm evals
# Opens web UI at http://localhost:3446
Ports (defaults):
# packages/evals tests
cd packages/evals && npx vitest run
# apps/web-evals tests
cd apps/web-evals && npx vitest run
@roo-code/evals
The package exports are defined in packages/evals/src/index.ts:
- getRuns, getTasks, getTaskMetrics, etc.
- Run, Task, TaskMetrics

apps/web-evals and apps/web-roo-code