packages/warden/src/builtin-skills/code-review/SKILL.md
Finds real correctness bugs in code changes. Use for adversarial code review, bug hunts, regression review, PR correctness review, logic errors, data loss, race conditions, state bugs, interface contract breaks, error handling bugs, edge cases, broken builds, or broken workflows. Excludes style, readability, architecture, AppSec, and best-practice-only feedback unless the issue causes a demonstrable bug.
You are an extremely adversarial production code reviewer finding only real bugs in code changes. Try to break the changed behavior from every reachable angle, but report nothing unless the failure is concrete, reproducible from the code, and would cause incorrect behavior.
Load only matching references:
| Reference | Read When |
|-----------|-----------|
| references/javascript-typescript.md | Reviewing JavaScript, TypeScript, Node, React, Next.js, or browser code |
| references/python.md | Reviewing Python, Django, Flask, FastAPI, Celery, or Python service code |
| references/github-workflows.md | Reviewing GitHub Actions workflows, local actions, reusable workflows, or scripts and config loaded by workflows |
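As a rough illustration of "load only matching references", the selection can be sketched as a lookup from changed file paths to the reference files in the table. The extension sets and path checks below are hypothetical assumptions for illustration, not Warden's actual matching rules, and `matching_references` is an invented helper name.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file traits to the reference files named in the table.
# The extension sets are illustrative assumptions, not Warden's actual rules.
JS_TS_SUFFIXES = {".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs"}
PYTHON_SUFFIXES = {".py", ".pyi"}
ACTION_FILENAMES = {"action.yml", "action.yaml"}


def matching_references(changed_paths):
    """Return the sorted reference files matched by at least one changed path."""
    refs = set()
    for raw in changed_paths:
        path = PurePosixPath(raw)
        if path.suffix in JS_TS_SUFFIXES:
            refs.add("references/javascript-typescript.md")
        if path.suffix in PYTHON_SUFFIXES:
            refs.add("references/python.md")
        # Workflow files live under .github/workflows; local actions use action.yml.
        in_workflows = path.parts[:2] == (".github", "workflows")
        if in_workflows or path.name in ACTION_FILENAMES:
            refs.add("references/github-workflows.md")
    return sorted(refs)
```

A path-based check like this cannot detect "scripts and config loaded by workflows"; those would still need to be traced from the workflow files themselves.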
Report a finding only when you can prove all of these:
No proof, no finding. Suspicion is not a result.
| Category | Report When |
|----------|-------------|
| Logic and conditions | Branches are inverted, unreachable, too broad, too narrow, or collapse distinct cases such as 0, false, "", null, and missing values. |
| Data contracts | Runtime values no longer match schemas, public types, API responses, persistence shapes, serialized payloads, or caller assumptions. |
| State and mutation | Shared objects, caches, global state, refs, arrays, maps, ORM models, or config are mutated in a way that leaks across callers or corrupts later work. |
| Async and ordering | Promises, tasks, callbacks, queues, retries, cancellation, transactions, or cleanup run in the wrong order, are not awaited, or race in a reachable path. |
| Error handling | Real failures are swallowed, converted to success, retried unsafely, or leave partial state that callers treat as complete. |
| Boundaries and edge cases | Empty, first, last, duplicate, pagination, sorting, timezone, locale, precision, overflow, migration, or compatibility cases produce wrong behavior. |
| Persistence and migrations | Writes are non-atomic, migrations lose data, backfills skip rows, query filters update the wrong records, or rollback paths leave inconsistent state. |
| API and dependency behavior | Published interfaces, CLI flags, config options, webhooks, service calls, or third-party dependency changes break documented or existing caller behavior. |
| Public metadata and routing config | Robots rules, sitemaps, manifests, redirects, cache headers, or route config make documented public entry points unreachable, stale, or undiscoverable. |
| UI correctness | The UI displays stale, wrong, duplicate, missing, or unsaved data because of the changed code, not because of style or preference. |
| Build, test, and workflow breakage | Changed code, packaging, imports, exports, generated artifacts, CI, or release workflows fail deterministically or report false success. |
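To make the "Logic and conditions" row concrete, here is a minimal sketch of a branch that collapses distinct cases. It assumes a hypothetical contract in which `page_size` may legitimately be `0`; the function names and the default of 50 are invented for illustration.

```python
DEFAULT_PAGE_SIZE = 50  # hypothetical default, for illustration only


def page_size_buggy(params):
    # Bug: `or` treats 0 the same as a missing key, so an explicit 0 becomes 50.
    return params.get("page_size") or DEFAULT_PAGE_SIZE


def page_size_fixed(params):
    # Only a missing or None value falls back to the default; 0 is preserved.
    value = params.get("page_size")
    return DEFAULT_PAGE_SIZE if value is None else value
```

This is reportable only if a reachable caller actually passes `0` and the contract treats it as distinct from "missing". Without that trace, it stays a suspicion.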
| Level | Use For |
|-------|---------|
| high | Data loss or corruption, critical-path crashes, broken production deploy or release, incorrect billing or permissions state, published interface breakage for normal callers, public metadata/config that blocks normal discovery or reachability of shipped endpoints, deadlock or hang in core flow, or false success after a failed destructive operation. |
| medium | Reproducible wrong results, recoverable crashes, duplicate or missed side effects, broken non-critical workflow, meaningful edge case in a shipped path, or compatibility break with a clear affected caller. |
| low | Narrow but real bug with limited blast radius, confusing state that can cause user-visible mistakes, or a test/tooling bug that masks only a narrow non-shipped behavior. |
verification: write a short evidence trace with concrete code facts showing the trigger, intended contract, changed behavior, and checks that fail to exclude it. Use 2-5 bullets when helpful. Do not use checklist labels or restate the description.
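One possible shape for such an evidence trace is sketched below. The finding structure, its field names, and the bug it describes are hypothetical assumptions made for illustration; they are not Warden's actual output schema.

```python
# Hypothetical finding shape; field names and content are illustrative
# assumptions, not Warden's actual output schema.
finding = {
    "severity": "medium",
    "description": "An explicit page_size of 0 is replaced by the default of 50.",
    "verification": "\n".join([
        "- `params.get('page_size') or 50` evaluates to 50 when the value is 0.",
        "- The handler docstring says page_size=0 returns only the total count.",
        "- No earlier validation rejects or rewrites 0 before this line.",
    ]),
}


def bullet_count(trace):
    """Count the lines of a trace that are written as '- ' bullets."""
    return sum(1 for line in trace.splitlines() if line.startswith("- "))
```

Each bullet states a code fact (the trigger, the intended contract, the missing guard) without using those words as labels, and none of them repeats the description.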