
RonanCodes/skills/visual-regression

Name: skills/visual-regression
Author: RonanCodes

skills/visual-regression/SKILL.md

npx skillsauth add RonanCodes/ronan-skills skills/visual-regression


Visual Regression

Playwright's toHaveScreenshot() is one of the cheapest visual regression setups to maintain. Baseline PNGs live in the repo, failures surface as PR comments with diff artifacts, and false positives are rare if you pin viewport, fonts, and timing.

Visual regression is the safety net for polish work. Without it, a global CSS change silently moves a padding value and nobody notices for three weeks.

Usage

/ro:visual-regression                                  # baseline the default routes
/ro:visual-regression --routes '/,  /how-it-works'     # specific routes
/ro:visual-regression --viewports 'mobile,tablet'      # override default viewport set
/ro:visual-regression --update-baselines               # regen after an intentional design change
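
The --routes and --viewports flags take comma-separated lists, and the example above includes stray whitespace after the comma. The skill's own argument parsing is not shown here, so the following is only a sketch of how such a list could be normalised; parseCommaList is a hypothetical helper, not part of the skill.

```typescript
// Hypothetical helper: the skill's real argument parsing is not shown in the source.
function parseCommaList(raw: string): string[] {
  return raw
    .split(',')
    .map((item) => item.trim())          // drop whitespace around each entry
    .filter((item) => item.length > 0)   // ignore empty entries such as a trailing comma
}

const parsedRoutes = parseCommaList('/,  /how-it-works')
const parsedViewports = parseCommaList('mobile,tablet')
console.log(parsedRoutes, parsedViewports)
```

Trimming each entry means '/,  /how-it-works' and '/,/how-it-works' resolve to the same two routes.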

What gets wired

e2e/visual.spec.ts — baseline tests per route × viewport.
playwright.config.ts updates — deterministic screenshot config.
e2e/__screenshots__/ — baseline PNGs committed to the repo.
GitHub Actions workflow running on PR with diff artifacts.
PR comment integration via mshick/add-pr-comment (the action used in the workflow in section 4) or a similar action.
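
To get a feel for how many baseline PNGs "per route × viewport" produces, the matrix can be expanded by hand. This is a small sketch that assumes the `${route.name}-${vp.name}.png` naming used by the test file in section 1; baselineNames is an illustrative helper, not something the skill generates.

```typescript
interface Route { path: string; name: string }
interface Viewport { name: string; width: number; height: number }

// Expand routes × viewports into the screenshot names the spec would request.
function baselineNames(routes: Route[], viewports: Viewport[]): string[] {
  const names: string[] = []
  for (const route of routes) {
    for (const vp of viewports) {
      names.push(`${route.name}-${vp.name}.png`)
    }
  }
  return names
}

const names = baselineNames(
  [
    { path: '/', name: 'home' },
    { path: '/how-it-works', name: 'how-it-works' },
    { path: '/?date=2026-04-12', name: 'home-with-puzzle' },
  ],
  [
    { name: 'mobile', width: 390, height: 844 },
    { name: 'tablet', width: 768, height: 1024 },
    { name: 'desktop', width: 1440, height: 900 },
  ],
)
console.log(names.length) // 9
```

Three routes at three viewports already means nine committed images, which is why the route and viewport lists are worth keeping short.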

1. The test file

// e2e/visual.spec.ts
import { test, expect } from '@playwright/test'

const routes = [
  { path: '/', name: 'home' },
  { path: '/how-it-works', name: 'how-it-works' },
  { path: '/?date=2026-04-12', name: 'home-with-puzzle' },
]

const viewports = [
  { name: 'mobile', width: 390, height: 844 },
  { name: 'tablet', width: 768, height: 1024 },
  { name: 'desktop', width: 1440, height: 900 },
]

for (const route of routes) {
  for (const vp of viewports) {
    test(`${route.name} @ ${vp.name}`, async ({ page }) => {
      await page.setViewportSize({ width: vp.width, height: vp.height })
      await page.goto(route.path)

      // Wait for fonts + images to stabilise.
      await page.waitForLoadState('networkidle')
      await page.evaluate(() => document.fonts.ready)

      // Mask anything that changes run-to-run (time, counters, live data).
      await expect(page).toHaveScreenshot(`${route.name}-${vp.name}.png`, {
        fullPage: true,
        mask: [
          page.locator('[data-test="stats-counter"]'),
          page.locator('[data-test="live-timestamp"]'),
        ],
        maxDiffPixels: 100,
      })
    })
  }
}

mask overlays each matched element with a solid pink box (#FF00FF by default) before the comparison. Use it for any content that changes between runs (counters, timestamps, random images).

maxDiffPixels: 100 tolerates minor rendering differences (anti-aliasing, sub-pixel positioning). If more than 100 pixels differ, the assertion fails.
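
An absolute limit of 100 pixels is much tighter than it may look. As a rough comparison, the sketch below converts a ratio tolerance (such as a 1% maxDiffPixelRatio) into an absolute pixel count for the viewport sizes used above. pixelBudget is an illustrative helper, and full-page screenshots are taller than the viewport, so real budgets would be larger.

```typescript
// Convert a ratio tolerance into an absolute pixel count for a given screenshot size.
function pixelBudget(width: number, height: number, maxDiffPixelRatio: number): number {
  return Math.floor(width * height * maxDiffPixelRatio)
}

console.log(pixelBudget(390, 844, 0.01))   // 3291
console.log(pixelBudget(768, 1024, 0.01))  // 7864
console.log(pixelBudget(1440, 900, 0.01))  // 12960
```

At these sizes a 1% ratio allows thousands of differing pixels, while maxDiffPixels: 100 allows only a small icon's worth, so the absolute limit is the one more likely to catch a shifted padding value.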

2. `playwright.config.ts` determinism

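import { defineConfig, devices } from '@playwright/test'

export default defineConfig({
  testDir: './e2e',
  // Store baselines under e2e/__screenshots__/ (Playwright's default is a
  // <spec-file>-snapshots/ folder with project and platform suffixes).
  snapshotPathTemplate: '{testDir}/__screenshots__/{testFilePath}/{arg}{ext}',
  expect: {
    toHaveScreenshot: {
      maxDiffPixelRatio: 0.01,      // allow ≤1% pixel diff
      threshold: 0.2,               // per-pixel color tolerance in YIQ space (0 strict, 1 lax)
      animations: 'disabled',
      caret: 'hide',
    },
  },
  use: {
    ...devices['Desktop Chrome'],
    locale: 'en-GB',
    timezoneId: 'Europe/Amsterdam',
    colorScheme: 'light',
    // Reduced motion is a browser-context option, so it goes under contextOptions.
    contextOptions: { reducedMotion: 'reduce' },
    // Reduce font-hinting differences between machines (this does not make
    // macOS and Linux rendering identical).
    launchOptions: {
      args: ['--font-render-hinting=none'],
    },
  },
  projects: [{ name: 'chromium', use: { ...devices['Desktop Chrome'] } }],
})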

Locale and timezone are pinned explicitly. A test running on a CI box in PST with a US locale renders dates differently from a local macOS machine in CET, and the baseline will mismatch.
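
The effect is easy to reproduce outside the browser. The sketch below formats one instant the way two differently configured machines would, using the standard Intl.DateTimeFormat API; renderDate and the chosen instant are illustrative only.

```typescript
// Format one instant the way two differently configured machines would.
function renderDate(instant: Date, locale: string, timeZone: string): string {
  return new Intl.DateTimeFormat(locale, {
    timeZone,
    year: 'numeric',
    month: 'numeric',
    day: 'numeric',
  }).format(instant)
}

const instant = new Date('2026-04-12T02:00:00Z')
const ciRender = renderDate(instant, 'en-US', 'America/Los_Angeles')
const localRender = renderDate(instant, 'en-GB', 'Europe/Amsterdam')
console.log(ciRender)    // 4/11/2026
console.log(localRender) // 12/04/2026
```

Both the field order and the calendar day differ, so any date rendered on the page would produce different pixels on the two machines.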

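animations: 'disabled' fast-forwards finite CSS animations and transitions to their end state and resets infinite animations to their initial state before the screenshot is taken. Without this, the 300ms fade-in on a toast is the difference between pass and fail depending on how fast CI runs.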
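
The spec in section 1 navigates with relative paths such as page.goto('/'), which only resolve when a baseURL is configured and the app is actually being served. The project may already define these elsewhere; if not, a fragment along the following lines could be merged into the config. The port and the start command are assumptions, not values taken from the skill.

```typescript
// playwright.config.ts (fragment): the port and command below are assumptions.
import { defineConfig } from '@playwright/test'

export default defineConfig({
  use: {
    // Lets page.goto('/how-it-works') resolve against the local server.
    baseURL: 'http://localhost:3000', // assumed port
  },
  webServer: {
    command: 'pnpm start',            // assumed script that serves the built app
    url: 'http://localhost:3000',     // Playwright waits until this URL responds
    reuseExistingServer: !process.env.CI,
    timeout: 120_000,
  },
})
```

With webServer in place, the same playwright test command starts the app locally and in CI, so the workflow does not need a separate "start server" step.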

3. Initial baseline capture

# Local (macOS):
pnpm exec playwright test e2e/visual.spec.ts --update-snapshots

# Commit the baselines:
git add e2e/__screenshots__
git commit -m "🧪 test: add visual regression baselines"

Capture on Linux via Docker for consistency with CI:

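# If dev is macOS but CI is Linux, baseline on Linux to avoid font-rendering diff.
# The image tag should match the installed @playwright/test version.
# The image ships Node.js but not necessarily pnpm, hence npx here.
docker run --rm --network=host --ipc=host \
  -v "$(pwd)":/work -w /work \
  mcr.microsoft.com/playwright:v1.48.0-jammy \
  npx playwright test e2e/visual.spec.ts --update-snapshots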

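This is annoying but necessary. macOS and Linux render fonts differently, so a baseline captured on one and compared on the other can differ by around 5% of pixels even when nothing changed, well above the 1% tolerance configured in section 2. Either baseline on Linux (the Docker command above) or run CI on macOS runners (expensive).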
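
One way to keep macOS-rendered baselines out of the repo is a small guard in whatever script wraps the update command. The following is a hypothetical sketch: canUpdateBaselines is not part of Playwright or the skill, and it assumes CI runs on Linux.

```typescript
// Hypothetical guard: only allow baseline updates on the same platform CI uses.
function canUpdateBaselines(platform: string, ciPlatform: string = 'linux'): boolean {
  return platform === ciPlatform
}

// In a real script you would pass process.platform
// ('darwin' on macOS, 'linux' inside the Playwright container).
console.log(canUpdateBaselines('darwin')) // false
console.log(canUpdateBaselines('linux'))  // true
```

A wrapper script could exit with a message pointing at the Docker command whenever the guard returns false.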

4. GitHub Actions workflow

# .github/workflows/visual-regression.yml
name: Visual Regression
on: [pull_request]

jobs:
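  visual:
    runs-on: ubuntu-latest
    # mshick/add-pr-comment needs write access to pull requests to post the comment.
    permissions:
      contents: read
      pull-requests: write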
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: pnpm
      - run: pnpm install --frozen-lockfile
      - run: pnpm exec playwright install --with-deps chromium
      - run: pnpm build
      - name: Run visual tests
        run: pnpm exec playwright test e2e/visual.spec.ts
      - name: Upload diff on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: visual-diff
          path: test-results/
          retention-days: 7
      - name: Comment on PR
        if: failure()
        uses: mshick/add-pr-comment@v2
        with:
          message: |
            🖼️ Visual regression failed. Download the `visual-diff` artifact to see what changed.

            If the change is intentional, run:
            ```
            pnpm exec playwright test e2e/visual.spec.ts --update-snapshots
            git add e2e/__screenshots__
            git commit
            ```

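For every failing test, the artifact includes the expected, actual, and diff images (named like home-mobile-expected.png, home-mobile-actual.png, and home-mobile-diff.png). The reviewer opens the diff, decides whether the change is intentional or a regression, and either approves or asks for a fix.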
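
When many tests fail at once, it can help to pair the images up before reviewing. The sketch below groups a flat file listing by the -expected/-actual/-diff suffixes that Playwright appends to the screenshot name; groupArtifacts and the example directory name are illustrative assumptions.

```typescript
type Kind = 'expected' | 'actual' | 'diff'
type Triple = Partial<Record<Kind, string>>

// Group artifact paths by screenshot, keyed on the path without its suffix.
function groupArtifacts(files: string[]): Map<string, Triple> {
  const groups = new Map<string, Triple>()
  for (const file of files) {
    const match = /^(.*)-(expected|actual|diff)\.png$/.exec(file)
    if (!match) continue // ignore traces, videos, and other files
    const [, key, kind] = match
    const triple = groups.get(key) ?? {}
    triple[kind as Kind] = file
    groups.set(key, triple)
  }
  return groups
}

// The directory name here is made up for the example.
const grouped = groupArtifacts([
  'test-results/home-mobile/home-mobile-expected.png',
  'test-results/home-mobile/home-mobile-actual.png',
  'test-results/home-mobile/home-mobile-diff.png',
  'test-results/home-mobile/trace.zip',
])
console.log(grouped.size) // 1
```

Each map entry then holds the three paths a reviewer needs to open side by side.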

5. Updating baselines after intentional change

# After a design change, regenerate baselines:
pnpm exec playwright test e2e/visual.spec.ts --update-snapshots
git diff --stat e2e/__screenshots__

# Review the changes in the PR using something like GitHub's image compare view.
git add e2e/__screenshots__
git commit -m "🎨 style: update visual regression baselines after header redesign"

Commit message discipline matters. Reviewers should be able to audit why baselines changed 3 months later.

Picking what to baseline

High-value routes to baseline:

Homepage / landing page.
Primary feature flow (e.g. puzzle view, checkout page, dashboard).
Any page that renders data from a database (empty state, populated state).
/how-it-works, /pricing, marketing pages — silent regression here hurts conversion.

Low-value routes (skip):

Admin pages — internal, not sensitive to regression.
Auth flows — tested at a different layer.
Error pages (404, 500) — usually fine; baselining creates more maintenance than value.

Viewport strategy:

Mobile (390×844 — iPhone 14) — most traffic.
Desktop (1440×900) — dev checks.
Tablet only if the app renders meaningfully differently at tablet widths.

Complementary tool: `/ro:visual-diff`

/ro:visual-diff does per-image pixel diff. Use it for one-off "does this PR change this screenshot" checks, outside Playwright. This skill (/ro:visual-regression) composes those diffs into an always-on CI gate.

/ro:visual-diff before.png after.png              # one-off check
/ro:visual-regression                              # always-on, in CI
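
For intuition about what a per-image pixel diff involves, here is a minimal sketch that counts differing pixels between two same-size RGBA buffers. It is not the implementation behind /ro:visual-diff, and it is simpler than Playwright's comparator (which measures color distance in YIQ space); countDiffPixels and its per-channel tolerance are assumptions made for the example.

```typescript
// Count pixels where any RGBA channel differs by more than `tolerance` (0-255).
function countDiffPixels(a: Uint8Array, b: Uint8Array, tolerance = 0): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error('buffers must be same-size RGBA data')
  }
  let diff = 0
  for (let i = 0; i < a.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) {
        diff++
        break // count each pixel at most once
      }
    }
  }
  return diff
}

// Two 2-pixel images: the second pixel's red channel changes from 0 to 10.
const beforePixels = new Uint8Array([255, 255, 255, 255, 0, 0, 0, 255])
const afterPixels = new Uint8Array([255, 255, 255, 255, 10, 0, 0, 255])
console.log(countDiffPixels(beforePixels, afterPixels))     // 1
console.log(countDiffPixels(beforePixels, afterPixels, 16)) // 0
```

The tolerance plays the same role as a per-pixel threshold: small color shifts are ignored, and only pixels that change visibly are counted.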

Gotchas

Flakiness from fonts. Baseline on the same OS as CI (Linux via Docker) or the diff-pixel ratio will eat you.
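Flakiness from timestamps. Any UI showing "now" or relative time needs mask: [...] or a frozen clock (page.clock.setFixedTime(new Date('2026-04-12T10:00:00Z')), called before page.goto). page.clock.install() on its own sets a starting time but lets the clock keep ticking.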
Flakiness from API data. Tests that render real data are flaky. Mock the API at the fetch layer (page.route('**/api/**', (route) => route.fulfill({ json: fixture }))).
Image size explosion. A full-page screenshot on desktop 4k is ~1-2 MB. Repo can grow fast. Consider fullPage: false + scoped selectors for sub-components if size matters.
Baselines in merge conflicts. Git cannot merge binary PNGs, so a conflicting baseline cannot be resolved by hand. Rebase, regenerate baselines, recommit.
Over-masking dynamic content. If you mask too much, the test stops catching real changes. Masks should cover only regions that genuinely change between runs.
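Running tests in parallel. Playwright runs test files in parallel across workers, while tests inside one file run in order unless fullyParallel is enabled. Each test gets its own page, so screenshots do not share state, but parallel workers compete for CPU and can shift rendering timing. If flakiness correlates with parallel runs, use test.describe.configure({ mode: 'serial' }) or lower the worker count.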
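
Several of the gotchas above (timestamps, API data, image size) can be addressed in a single test. The following is a sketch rather than the skill's generated output: the fixture payload and file name are hypothetical, the route pattern and selector are borrowed from the examples above, and it assumes a Playwright version that includes the clock API and a configured baseURL.

```typescript
// e2e/visual-stable.spec.ts: a sketch; the fixture and file name are assumptions.
import { test, expect } from '@playwright/test'

const fixture = { solved: 1234, streak: 7 } // hypothetical API payload

test('stats counter with pinned time and data', async ({ page }) => {
  // Freeze Date.now() and new Date() so "now" and relative times render identically each run.
  await page.clock.setFixedTime(new Date('2026-04-12T10:00:00Z'))

  // Serve the same JSON for every API call instead of hitting a live backend.
  await page.route('**/api/**', (route) => route.fulfill({ json: fixture }))

  await page.goto('/')

  // Scope the screenshot to one component to keep the baseline PNG small.
  await expect(page.locator('[data-test="stats-counter"]')).toHaveScreenshot('stats-counter.png')
})
```

With time and data pinned, the element no longer needs a mask, so the test can catch real changes inside it.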

When NOT to use visual regression

Early-stage apps with rapid design iteration. You'll regenerate baselines every PR — noise exceeds signal.
Pure-backend projects. Visual regression is for UI.
Marketing sites where content changes daily. Move to a CMS-level approval workflow instead.

Rules

Baseline intentionally. Every baseline in the repo is a contract. If it was captured by accident, it's technical debt.
Update baselines in dedicated PRs when the change is large. Mixing a visual-regression update with feature work makes review harder.
Don't tolerate flaky tests. A flaky visual test trains the team to ignore failures. Fix or delete.
Review diffs. Opening the diff.png artifact before merging is non-negotiable. Don't just click "update baselines".
Pin everything. Locale, timezone, color scheme, font rendering, animation state. Any unpinned variable is future flake.


Updated May 1, 2026


Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 1, 2026, 2:31 AM (140.3s, 1 file scanned)

SKILL.md

name: visual-regression
description: Wire Playwright screenshot baselines with CI PR-diff comments to protect against silent CSS regressions. Composes with `/ro:playwright-check` (browser tool) and `/ro:visual-diff` (per-image diff). Use after polish work is done and you want to keep it intact.
category: quality-review
argument-hint: [--routes <comma-list>] [--viewports <list>] [--update-baselines]
allowed-tools: Bash(*) Read Write Edit Glob Grep
content-pipeline:
- pipeline:review
- platform:agnostic
- role:adapter




Install manually


# Clone the repo
git clone https://github.com/RonanCodes/ronan-skills.git

# Copy into Claude Code skills folder (global)
cp -r ronan-skills/skills/visual-regression ~/.claude/skills/


Repository

RonanCodes/ronan-skills

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
