skills/visual-diff/SKILL.md
Compare images for visual correctness using pixel diff and Claude vision. Use for screenshot comparison, visual regressions, design-match checks (current UI vs a reference image), or per-component diffs against a design.
Install this skill globally with one command: `npx skillsauth add RonanCodes/ronan-skills visual-diff`. Works with Claude Code, Cursor, and Windsurf.
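Compare two images using pixel-level diffing and Claude's multimodal vision. Two workflows: regression (a screenshot against a saved baseline) and design-match (the current UI against a reference image, for the whole page or per component).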
# Regression (as before)
/visual-diff screenshot.png baseline.png
/visual-diff current.png --baseline # save as baseline
# Design-match — whole page vs a local reference
/visual-diff http://localhost:3000 --reference ./design.png
# Design-match — whole page vs an online reference (curl-fetched)
/visual-diff http://localhost:3000 --reference https://example.com/nyt-connections.png
# Per-component — crop the live page by CSS selector
/visual-diff http://localhost:3000 --selector '[data-slot="card"]' --reference ./card-ref.png
# Per-component — reference is a live page too (both sides selected)
/visual-diff http://localhost:3000 --selector '.word-tile' \
--reference-page https://www.nytimes.com/games/connections --reference-selector '[data-testid="card"]'
If ImageMagick is available (which compare):
compare -metric AE image1.png image2.png diff.png 2>&1
# total_pixels=$(identify -format "%w*%h" image1.png | bc)
# match = (1 - differing/total) * 100
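Or `npx pixelmatch image1.png image2.png diff.png 0.1` (the pixelmatch CLI takes the threshold as a positional argument after the diff path, not as a `--threshold` flag). Skip the pixel diff if neither tool is available.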
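The match formula in the comments above can be sketched as a small helper. This is a minimal sketch, not part of the skill itself: the function name `match_percent` is hypothetical, and it assumes the differing-pixel count has already been read from `compare`'s output.

```shell
# match_percent DIFFERING TOTAL
# Prints (1 - differing/total) * 100 with two decimals.
# DIFFERING may arrive in scientific notation (some ImageMagick builds
# print large AE counts that way); awk parses both forms.
match_percent() {
  awk -v d="$1" -v t="$2" 'BEGIN {
    if (t + 0 <= 0) { print "total must be > 0" > "/dev/stderr"; exit 1 }
    printf "%.2f\n", (1 - d / t) * 100
  }'
}

# Possible usage, assuming ImageMagick is installed (compare writes the
# metric to stderr, hence 2>&1; the first field is the pixel count):
#   differing=$(compare -metric AE image1.png image2.png diff.png 2>&1 | awk '{print $1}')
#   total=$(identify -format "%w*%h\n" image1.png | bc)
#   match_percent "$differing" "$total"
```

Note that `compare` exits non-zero when the images differ, so a script running under `set -e` would need to allow for that.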
Size normalisation — if the two images differ in dimensions (common for design-match where the reference was captured at a different scale), resize the smaller to the larger's box before pixel-diffing:
identify -format "%wx%h" image1.png # e.g. 1440x900
identify -format "%wx%h" image2.png # e.g. 1024x640
convert image2.png -resize 1440x900\! image2-scaled.png
compare -metric AE image1.png image2-scaled.png diff.png
Report both the raw metric and the scaled result. Design-match is noisier by nature — set threshold lower (e.g. 85%) and lean harder on Claude vision for the verdict.
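Choosing the target box for the resize can be sketched as follows. The helper name `larger_box` is hypothetical, and "larger" is assumed here to mean "covers more pixels"; the text above does not say how to rank images whose width and height disagree.

```shell
# larger_box WxH WxH
# Prints the geometry of whichever image covers more pixels
# (the first one wins a tie).
larger_box() {
  lb_a="$1"; lb_b="$2"
  lb_aw="${lb_a%x*}"; lb_ah="${lb_a#*x}"
  lb_bw="${lb_b%x*}"; lb_bh="${lb_b#*x}"
  if [ $((lb_aw * lb_ah)) -ge $((lb_bw * lb_bh)) ]; then
    echo "$lb_a"
  else
    echo "$lb_b"
  fi
}

# Possible usage in a script, assuming ImageMagick is installed:
#   box=$(larger_box "$(identify -format '%wx%h' image1.png)" \
#                    "$(identify -format '%wx%h' image2.png)")
#   convert image2.png -resize "${box}!" image2-scaled.png
```

The trailing `!` forces the exact box, so an image with a different aspect ratio will be stretched; that distortion is one reason the scaled metric is only rough evidence.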
Read both images with the Read tool (Claude is multimodal) and analyze the differences.
This is the key insight — Claude sees both images and reasons about visual correctness better than pixel diff alone. For design-match against Figma exports or online screenshots, pixel diff often scores ~60% even when the match is "right enough"; Claude vision is the decisive check.
- `--baseline`: save image1 to `.visual-diff/baselines/` as the new baseline, then exit.
- `--threshold N`: pass/fail percentage for pixel diff (default: 95% for regression, 85% for design-match).

If the first argument looks like a URL (starts with `http://` or `https://`), the skill launches playwright to capture the current state:

- Screenshot the page (or the `--selector` crop, if passed). Save to `.visual-diff/tmp/current-<name>.png`. Wait for `networkidle` before capturing.
- `--reference <local-path>`: read from disk.
- `--reference <http-url>`: `curl -sSL -o .visual-diff/tmp/ref.png <url>` (add a User-Agent header to avoid 403s).
- `--reference-page <url> --reference-selector <css>`: playwright opens the reference URL and screenshots it, optionally cropped.

When cropping by CSS selector (`--selector`):

- Use `page.locator(selector).first().screenshot({ path, omitBackground: true })`. `omitBackground` helps when the component is translucent.
- Wait for the element to be visible (`waitFor({ state: 'visible' })`) before screenshotting.
- If `--reference-selector` is also passed, crop the reference side symmetrically so the comparison is like-for-like.
- Example selectors: `[data-slot="card"]`, `button[data-variant="default"]`, `[role="dialog"]`, `header`, `nav`.

When `--reference` is an HTTP URL:
mkdir -p .visual-diff/tmp
curl -sSL -A "Mozilla/5.0 (compatible; visual-diff)" -o .visual-diff/tmp/ref.png "$URL"
# Verify it's actually an image — URLs behind auth often return HTML instead
file .visual-diff/tmp/ref.png | grep -qE 'image|PNG|JPEG|GIF|WebP' || {
echo "Reference URL did not return an image (probably an auth/HTML page)"; exit 1;
}
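Deciding whether an argument is a URL or a local file, as the `http://` / `https://` prefix rule describes, can be sketched like this. Both function names are hypothetical and only illustrate the dispatch; the skill's actual logic may differ.

```shell
# is_http_url STRING
# Succeeds when STRING starts with http:// or https://.
is_http_url() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# reference_kind ARG
# Prints "url" for an HTTP(S) reference, "file" for an existing local
# file, and fails otherwise so a typo is caught before any diffing.
reference_kind() {
  if is_http_url "$1"; then
    echo url
  elif [ -f "$1" ]; then
    echo file
  else
    echo "reference not found: $1" >&2
    return 1
  fi
}
```

A caller could run the curl fetch only when `reference_kind` prints `url`, and read the path directly when it prints `file`.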
Do not fetch from URLs that require auth (paywalled NYT, etc.) — you'll get an HTML error page. For those, the user should either:
- save the image locally and pass `--reference <local-path>`, or
- use `--reference-page <url>` with a playwright flow that handles login.

.visual-diff/
├── baselines/ # regression baselines (can be committed if you want a tracked reference set)
├── references/ # design-match reference images (rarely committed; gitignore by default)
├── diffs/ # produced diff images (always ignored)
└── tmp/ # fetched/captured intermediates (always ignored)
Add .visual-diff/tmp/, .visual-diff/diffs/, and typically .visual-diff/references/ to .gitignore.
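Adding those entries without duplicating them on repeat runs could look like the sketch below. The helper name `ensure_ignored` is hypothetical; the entries passed in the usage line are the ones listed above.

```shell
# ensure_ignored FILE ENTRY...
# Appends each ENTRY to FILE unless an identical line is already there.
ensure_ignored() {
  gi_file="$1"; shift
  touch "$gi_file"
  # If the file is non-empty and lacks a trailing newline, add one so
  # the first appended entry does not fuse with the last existing line.
  if [ -s "$gi_file" ] && [ -n "$(tail -c 1 "$gi_file")" ]; then
    printf '\n' >> "$gi_file"
  fi
  for gi_entry in "$@"; do
    grep -qxF -- "$gi_entry" "$gi_file" || printf '%s\n' "$gi_entry" >> "$gi_file"
  done
}

# Possible usage from the repository root:
#   ensure_ignored .gitignore .visual-diff/tmp/ .visual-diff/diffs/ .visual-diff/references/
```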
## Visual Diff Report
- Mode: design-match (component)
- Current: http://localhost:3000 → selector `.word-tile` (captured 2026-04-23)
- Reference: ./nyt-tile-reference.png (1024x1024 → scaled to 1440x1440)
- Pixel match: 81.3% (threshold: 85%) — below
- Claude vision verdict: Close match. Background correct (#efefe6). Letter-spacing looks ~0.01em tighter in current; tile radius is 8px in both. One real issue: font-weight is 600 in current vs 700 in reference.
- Diff image: .visual-diff/diffs/word-tile-diff.png
- Status: Fail — font-weight mismatch
Always report Claude's verdict and the pixel metric. The verdict is the decision; the metric is evidence.
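The pixel-metric half of the report can be sketched as a threshold check using the stated defaults (95% for regression, 85% for design-match). The function names and the word "meets" are assumptions made for illustration; the sample report only shows the "below" case. The result is evidence for the report and does not replace the vision verdict.

```shell
# default_threshold MODE
# Prints the default pass percentage for a mode.
default_threshold() {
  case "$1" in
    regression)   echo 95 ;;
    design-match) echo 85 ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# pixel_state MATCH MODE [THRESHOLD]
# Prints "meets" when MATCH >= threshold, otherwise "below".
# THRESHOLD overrides the mode default, mirroring --threshold N.
pixel_state() {
  ps_threshold="$3"
  if [ -z "$ps_threshold" ]; then
    ps_threshold=$(default_threshold "$2") || return 1
  fi
  awk -v m="$1" -v t="$ps_threshold" 'BEGIN {
    if (m + 0 >= t + 0) print "meets"; else print "below"
  }'
}
```

With the numbers from the sample report, `pixel_state 81.3 design-match` prints `below`.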
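- Skip the pixel diff and rely on Claude vision alone if compare/pixelmatch are missing.
- Use file names in `.visual-diff/tmp/` that include the selector or reference source, not just `current.png`, so parallel runs don't clobber each other.
- Keep design-match reference images in `.visual-diff/references/` and gitignore them.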
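One way to build temp file names that carry the page and selector, following the `.visual-diff/tmp/current-<name>.png` pattern, is sketched below. Both helper names are hypothetical, and the slug rule (lowercase, non-alphanumeric runs collapsed to a hyphen) is an assumption rather than the skill's documented scheme.

```shell
# slugify STRING
# Lowercases STRING, turns each run of non-alphanumerics into one "-",
# and trims a leading or trailing "-".
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//'
}

# current_capture_path SOURCE [SELECTOR]
# Prints a capture path whose name encodes the page and, if given,
# the CSS selector.
current_capture_path() {
  ccp_name=$(slugify "$1${2:+ $2}")
  printf '.visual-diff/tmp/current-%s.png\n' "$ccp_name"
}
```

Distinct selectors can still collapse to the same slug (for example `.card` and `#card`), so a script that runs many captures in parallel might also append the process ID (`$$`) to the name.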