skills/visual-verdict/SKILL.md
Screenshot comparison QA for frontend development. Takes a screenshot of the current implementation, scores it across multiple visual dimensions, and returns a structured PASS/REVISE/FAIL verdict with concrete fixes. Use when implementing UI from a design reference or verifying visual correctness.
npx skillsauth add rubicanjr/FinCognis visual-verdictInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Pixel-imprecise human eyes miss visual bugs. This skill provides structured, scored visual review using screenshots, returning actionable verdicts rather than vague impressions.
1. Take screenshot of current implementation
2. Load reference (design mockup, Figma export, or previous version)
3. Score against 6 dimensions (0-100 each)
4. Compute weighted total score
5. Issue verdict: PASS (90+) / REVISE (60-89) / FAIL (<60)
6. List concrete mismatches with file + line fix hints
7. Loop: frontend-dev fixes → new screenshot → rescore → repeat until PASS
~/.mcp.json)| Dimension | Weight | Description | |-----------|--------|-------------| | Layout accuracy | 0.25 | Positioning, spacing, alignment of all elements | | Typography | 0.15 | Font family, size, weight, line-height, letter-spacing | | Color accuracy | 0.15 | Background, text, border, shadow, gradient colors | | Responsive behavior | 0.15 | Breakpoint transitions, mobile/tablet rendering | | Interactive states | 0.15 | Hover, focus, active, disabled, loading states | | Content completeness | 0.15 | All text, images, icons, labels present |
total = (layout * 0.25) + (typography * 0.15) + (color * 0.15)
+ (responsive * 0.15) + (interactive * 0.15) + (content * 0.15)
Layout accuracy
Typography
Color accuracy
Responsive behavior
Interactive states
Content completeness
| Verdict | Score | Meaning | |---------|-------|---------| | PASS | 90-100 | Acceptable for production | | REVISE | 60-89 | Needs fixes, not shippable | | FAIL | 0-59 | Major rework required |
## Visual Verdict: [Component/Page Name]
**Score: 85/100 — REVISE**
**Attempt: 2/5**
**Previous score: 72 (+13 this iteration)**
### Dimension Scores
| Dimension | Score | Weight | Weighted | Issues |
|-----------|-------|--------|----------|--------|
| Layout | 92 | 0.25 | 23.0 | Minor padding on mobile |
| Typography | 68 | 0.15 | 10.2 | Wrong font-weight on H2 |
| Color | 96 | 0.15 | 14.4 | - |
| Responsive | 80 | 0.15 | 12.0 | Card grid breaks at 768px |
| Interactive | 82 | 0.15 | 12.3 | Missing hover on CTA |
| Content | 91 | 0.15 | 13.7 | - |
| **TOTAL** | | | **85.6** | |
### Concrete Mismatches (prioritized by impact)
1. **[Typography / HIGH]** H2 hero heading
Expected: `font-weight: 700`
Actual: `font-weight: 400` (normal weight)
Fix: `src/components/Hero.tsx` → add `font-bold` class to `<h2>`
2. **[Responsive / MEDIUM]** Card grid layout
Expected: 3-column grid collapses at 1024px
Actual: Collapses at 768px — causes overflow between 768-1024
Fix: `src/components/CardGrid.tsx` → change `md:grid-cols-3` to `lg:grid-cols-3`
3. **[Interactive / MEDIUM]** Primary CTA button
Expected: Darker background on hover (`bg-blue-700`)
Actual: No hover state change
Fix: `src/components/Hero.tsx` → add `hover:bg-blue-700 transition-colors` to CTA
4. **[Layout / LOW]** Mobile padding
Expected: 16px horizontal padding on mobile
Actual: 12px
Fix: `src/components/Layout.tsx` → change `px-3` to `px-4`
### Next Steps
Fix issues 1-3 above, then call visual-verdict again.
Estimated score after fixes: ~94 (PASS)
Use browser-use MCP:
1. Navigate to implementation URL (e.g., http://localhost:3000/dashboard)
2. Wait for page fully loaded (no spinners)
3. Take full-page screenshot
4. Save to /tmp/verdict-current-{timestamp}.png
Three reference types accepted:
Figma PNG export: Designer exports component/page as PNG at 2x
Reference path: ~/Desktop/designs/dashboard-v2.png
Previous production screenshot: Regression check
Reference path: ~/.claude/benchmarks/screenshots/dashboard-prod.png
Live URL comparison: Compare two running versions
Reference URL A: http://localhost:3000/dashboard (current)
Reference URL B: https://staging.app.com/dashboard (baseline)
Always test at these three widths unless told otherwise:
| Name | Width | Target | |------|-------|--------| | Mobile | 375px | iPhone SE | | Tablet | 768px | iPad portrait | | Desktop | 1440px | Standard laptop |
The typical iteration loop:
frontend-dev implements component
→ visual-verdict takes screenshot + scores
→ REVISE: sends concrete fix list to frontend-dev
→ frontend-dev applies fixes
→ visual-verdict rescores
→ Repeat until PASS (max 5 iterations)
→ PASS: task marked complete
Maximum 5 iterations per component. If PASS is not reached after 5 attempts:
ESCALATION: Visual QA failed after 5 iterations
Component: [name]
Best score achieved: [score]
Blocker: [specific issue that keeps failing]
Recommendation: [designer clarification needed / fundamentally wrong approach]
When cloning an external site, visual-verdict is the quality gate:
1. clone-website skill produces initial implementation
2. visual-verdict screenshots both original and clone
3. Scores similarity across all 6 dimensions
4. REVISE verdict triggers targeted fixes
5. PASS verdict confirms clone is production-ready
Color is compared by extracting computed CSS values and comparing:
Reference color: #1D4ED8 (rgb 29, 78, 216)
Actual color: #2563EB (rgb 37, 99, 235)
Delta E (perceived): 6.2 — noticeable, score -15
Delta E thresholds:
Typography is checked by reading computed styles, not pixels:
// Extract from running implementation
const element = document.querySelector('.hero-title')
const styles = window.getComputedStyle(element)
{
fontFamily: styles.fontFamily, // Compare against reference
fontSize: styles.fontSize, // "32px" vs expected "36px"
fontWeight: styles.fontWeight, // "400" vs expected "700"
lineHeight: styles.lineHeight, // "1.5" vs expected "1.25"
letterSpacing: styles.letterSpacing // "normal" vs expected "-0.02em"
}
px-3 when design shows px-4After a PASS verdict, save the screenshot as the new baseline:
cp /tmp/verdict-current-{timestamp}.png \
~/.claude/benchmarks/screenshots/{component-name}-{date}.png
On the next change, this saved screenshot becomes the reference for regression comparison.
Remember: A visual that looks "pretty close" in a quick glance often has 3-5 compounding issues that add up to a broken experience. Score it. Trust the number, not the impression.
development
Goal-based workflow orchestration - routes tasks to specialist agents based on user goals
tools
Wiring Verification
development
Connection management, room patterns, reconnection strategies, message buffering, and binary protocol design.
development
Comprehensive verification system covering build, types, lint, tests, security, and diff review before a PR.