skills/visual-regression/SKILL.md
--- name: visual-regression description: Wire Playwright screenshot baselines with CI PR-diff comments to protect against silent CSS regressions. Composes with `/ro:playwright-check` (browser tool) and `/ro:visual-diff` (per-image diff). Use after polish work is done and you want to keep it intact. category: quality-review argument-hint: [--routes <comma-list>] [--viewports <list>] [--update-baselines] allowed-tools: Bash(*) Read Write Edit Glob Grep content-pipeline: - pipeline:review - pla
Playwright's toHaveScreenshot() is the cheapest-to-maintain visual regression setup that exists. Baseline PNGs live in the repo, diffs appear as PR comments, false positives are rare if you pin viewport + font + timing.
Visual regression is the safety net for polish work. Without it, a global CSS change silently moves a padding value and nobody notices for three weeks.
/ro:visual-regression # baseline the default routes
/ro:visual-regression --routes '/, /how-it-works' # specific routes
/ro:visual-regression --viewports 'mobile,tablet' # override default viewport set
/ro:visual-regression --update-baselines # regen after an intentional design change
- e2e/visual.spec.ts: baseline tests per route × viewport.
- playwright.config.ts updates: deterministic screenshot config.
- e2e/__screenshots__/: baseline PNGs committed to the repo.
- bramblex/playwright-report-to-pr-comment or similar.

// e2e/visual.spec.ts
import { test, expect } from '@playwright/test'

const routes = [
  { path: '/', name: 'home' },
  { path: '/how-it-works', name: 'how-it-works' },
  { path: '/?date=2026-04-12', name: 'home-with-puzzle' },
]

const viewports = [
  { name: 'mobile', width: 390, height: 844 },
  { name: 'tablet', width: 768, height: 1024 },
  { name: 'desktop', width: 1440, height: 900 },
]

for (const route of routes) {
  for (const vp of viewports) {
    test(`${route.name} @ ${vp.name}`, async ({ page }) => {
      await page.setViewportSize({ width: vp.width, height: vp.height })
      await page.goto(route.path)
      // Wait for fonts + images to stabilise.
      await page.waitForLoadState('networkidle')
      await page.evaluate(() => document.fonts.ready)
      // Mask anything that changes run-to-run (time, counters, live data).
      await expect(page).toHaveScreenshot(`${route.name}-${vp.name}.png`, {
        fullPage: true,
        mask: [
          page.locator('[data-test="stats-counter"]'),
          page.locator('[data-test="live-timestamp"]'),
        ],
        maxDiffPixels: 100,
      })
    })
  }
}
mask replaces the element with a pink rectangle before the diff. Use for any content that changes between runs (counters, timestamps, random images).
maxDiffPixels: 100 tolerates small rendering differences (anti-aliasing, sub-pixel positioning). If more than 100 pixels differ, the test fails.
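To get a feel for how tight a 100-pixel budget is, the sketch below compares it with the 1% `maxDiffPixelRatio` used in the config, for each viewport in the spec. The helper names are hypothetical, and the code assumes that when both limits are set the stricter one decides, which is worth checking against the Playwright version in use. With `fullPage: true` the real image is at least as tall as the viewport, so the ratio figures are lower bounds.

```typescript
// Viewports copied from e2e/visual.spec.ts.
interface Viewport { name: string; width: number; height: number }

const viewports: Viewport[] = [
  { name: 'mobile', width: 390, height: 844 },
  { name: 'tablet', width: 768, height: 1024 },
  { name: 'desktop', width: 1440, height: 900 },
]

// Pixel budget implied by a ratio for an image of the given size.
function ratioBudget(width: number, height: number, ratio: number): number {
  return Math.floor(width * height * ratio)
}

// ASSUMPTION: with both limits set, the smaller (stricter) budget applies.
function effectiveBudget(vp: Viewport, maxDiffPixels: number, maxDiffPixelRatio: number): number {
  return Math.min(maxDiffPixels, ratioBudget(vp.width, vp.height, maxDiffPixelRatio))
}

for (const vp of viewports) {
  // Prints e.g. "desktop 12960 100": 1% of 1440x900 is 12960 pixels.
  console.log(vp.name, ratioBudget(vp.width, vp.height, 0.01), effectiveBudget(vp, 100, 0.01))
}
```

Under that assumption the per-test `maxDiffPixels: 100` is far stricter than the 1% ratio at every viewport, so the ratio only matters for tests that do not set `maxDiffPixels`.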
playwright.config.ts determinism

import { defineConfig, devices } from '@playwright/test'

export default defineConfig({
  testDir: './e2e',
  expect: {
    toHaveScreenshot: {
      maxDiffPixelRatio: 0.01, // allow ≤1% pixel diff
      threshold: 0.2, // per-pixel color tolerance (0 to 1, YIQ color space)
      animations: 'disabled',
      caret: 'hide',
    },
  },
  use: {
    ...devices['Desktop Chrome'],
    locale: 'en-GB',
    timezoneId: 'Europe/Amsterdam',
    colorScheme: 'light',
    // Reduced motion prevents animation drift; passed as a browser-context option.
    contextOptions: { reducedMotion: 'reduce' },
    // Force same font rendering across macOS dev and Linux CI.
    launchOptions: {
      args: ['--font-render-hinting=none'],
    },
  },
  projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }],
})
Locale + timezone pinned explicitly. A test running on a CI box in PST with a US locale renders dates differently from local macOS in CET — and the baseline will mismatch.
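The locale and timezone effect is easy to reproduce outside the browser. The sketch below formats one instant twice with the standard `toLocaleDateString` API: once as an unpinned US West Coast CI box would, once with the values pinned in the config. The `renderDate` helper and the sample instant are illustrative choices, not part of the skill.

```typescript
// One instant, rendered the way two differently configured machines would.
const instant = new Date('2026-04-12T23:30:00Z')

// Hypothetical helper: format a date for a given locale and IANA timezone.
function renderDate(date: Date, locale: string, timeZone: string): string {
  return date.toLocaleDateString(locale, { timeZone })
}

const ciBox = renderDate(instant, 'en-US', 'America/Los_Angeles') // unpinned CI defaults
const pinned = renderDate(instant, 'en-GB', 'Europe/Amsterdam')   // values from the config

// The two strings differ in field order and, for this instant, in calendar day.
console.log(ciBox, '|', pinned)
```

Any date rendered into the page inherits the same difference, which is enough to fail a screenshot comparison.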
animations: 'disabled' fast-forwards finite CSS animations and transitions to their end state and resets infinite animations to their initial state. Without this, the 300ms fade-in on a toast is the difference between pass and fail depending on how fast CI runs.
# Local (macOS):
pnpm exec playwright test e2e/visual.spec.ts --update-snapshots
# Commit the baselines:
git add e2e/__screenshots__
git commit -m "🧪 test: add visual regression baselines"
Capture on Linux via Docker for consistency with CI:
# If dev is macOS but CI is Linux, baseline on Linux to avoid font-rendering diff.
# Keep the image tag in sync with the installed @playwright/test version.
docker run --rm --network=host \
  -v "$(pwd)":/work -w /work \
  mcr.microsoft.com/playwright:v1.48.0-jammy \
  pnpm exec playwright test e2e/visual.spec.ts --update-snapshots
This is annoying but necessary. macOS and Linux render fonts differently; the baseline diff will be ~5% even when nothing changed. Either baseline on Linux (docker above), or run CI only on macOS runners (expensive).
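One related detail: by default Playwright stores snapshots in a `-snapshots` folder next to the spec file and appends the project name and platform to each file name, so macOS and Linux baselines end up as separate files. If the goal is a single Linux-generated set under e2e/__screenshots__/, the `snapshotPathTemplate` config option is one way to get there. The fragment below is a sketch to merge into the existing config. The directory layout is an assumption, not something the skill prescribes.

```typescript
// playwright.config.ts (fragment, merge into the existing config)
import { defineConfig } from '@playwright/test'

export default defineConfig({
  testDir: './e2e',
  // Assumed layout: one shared baseline per screenshot, no platform suffix,
  // stored as e2e/__screenshots__/<spec file>/<screenshot name>.png
  snapshotPathTemplate: '{testDir}/__screenshots__/{testFilePath}/{arg}{ext}',
})
```

Dropping the platform suffix only makes sense when every baseline is produced in the same environment, for example the Docker image above. Otherwise a macOS run and a Linux run would overwrite each other's files.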
# .github/workflows/visual-regression.yml
name: Visual Regression
on: [pull_request]
jobs:
  visual:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec playwright install --with-deps chromium
      - run: pnpm build
      - name: Run visual tests
        run: pnpm exec playwright test e2e/visual.spec.ts
      - name: Upload diff on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: visual-diff
          path: test-results/
          retention-days: 7
      - name: Comment on PR
        if: failure()
        uses: mshick/add-pr-comment@v2
        with:
          message: |
            🖼️ Visual regression failed. Download the `visual-diff` artifact to see what changed.
            If the change is intentional, run:
            ```
            pnpm exec playwright test e2e/visual.spec.ts --update-snapshots
            git add e2e/__screenshots__
            git commit
            ```
The failure artifact includes expected.png, actual.png, and diff.png for every failing test. Reviewer opens the diff, decides intentional vs regression, and either approves or asks for a fix.
# After a design change, regenerate baselines:
pnpm exec playwright test e2e/visual.spec.ts --update-snapshots
git diff --stat e2e/__screenshots__
# Review the changes in the PR using something like GitHub's image compare view.
git add e2e/__screenshots__
git commit -m "🎨 style: update visual regression baselines after header redesign"
Commit message discipline matters. Reviewers should be able to audit why baselines changed 3 months later.
High-value routes to baseline:
- /how-it-works, /pricing, marketing pages: silent regression here hurts conversion.

Low-value routes (skip):
Viewport strategy:
/ro:visual-diff does a per-image pixel diff. Use it for one-off "does this PR change this screenshot" checks, outside Playwright. This skill (/ro:visual-regression) composes those diffs into an always-on CI gate.
/ro:visual-diff before.png after.png # one-off check
/ro:visual-regression # always-on, in CI
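For readers unfamiliar with what a per-image pixel diff involves, here is a minimal sketch of the idea. It is not the /ro:visual-diff implementation. It counts pixels that differ between two same-sized RGBA buffers. The function name and the per-channel `tolerance` parameter are hypothetical, and real tools typically compare in a perceptual color space instead.

```typescript
// Count pixels that differ between two same-sized RGBA buffers.
// `tolerance` is a hypothetical per-channel threshold (0-255), not a Playwright option.
function countDiffPixels(a: Uint8Array, b: Uint8Array, tolerance = 0): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error('buffers must be RGBA and the same size')
  }
  let diff = 0
  for (let i = 0; i < a.length; i += 4) {
    // A pixel counts as changed if any of its R, G, B, A channels moved
    // by more than the tolerance.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) {
        diff++
        break
      }
    }
  }
  return diff
}

// Two 2x1 images: the second pixel changes from white to a slightly darker grey.
const before = new Uint8Array([0, 0, 0, 255, 255, 255, 255, 255])
const after = new Uint8Array([0, 0, 0, 255, 250, 250, 250, 255])

console.log(countDiffPixels(before, after))    // 1
console.log(countDiffPixels(before, after, 8)) // 0
```

A CI gate is then a comparison of this count against a budget such as `maxDiffPixels`, repeated for every baseline.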
- mask: [...] or a frozen clock (page.clock.install({ time: '2026-04-12T10:00:00Z' })).
- page.route('**/api/**', (route) => route.fulfill({ json: fixture }))).
- fullPage: false + scoped selectors for sub-components if size matters.
- test.describe.configure({ mode: 'serial' }) if flakiness correlates with parallel runs.
- diff.png artifact before merging is non-negotiable. Don't just click "update baselines".

- /ro:app-polish: umbrella; this is check #8
- /ro:playwright-check: underlying browser tool used by this skill
- /ro:visual-diff: per-image diff primitive this skill composes
- /ro:design-system-audit: runs before visual regression to catch design-token drift
- /ro:accessibility-ci: complementary; focus-ring visual changes are caught here