skills/abejitsu/ai-visual-accuracy-check/SKILL.md
Use AI to compare rendered HTML to original PDF page. AI makes contextual judgment about visual accuracy with explainable reasoning. BLOCKING quality gate - stops pipeline if score below 85%.
npx skillsauth add aiskillstore/marketplace ai-visual-accuracy-check
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This is a BLOCKING quality gate that uses AI to validate visual accuracy of generated HTML against the original PDF page. Unlike pixel-perfect comparison, AI understands:
The AI provides:
This combines AI's contextual understanding with deterministic gating (must pass 85+ to continue).
Load input files
chapter_XX.html (generated consolidated HTML)
02_page_XX.png (original PDF page image)
Render HTML to image
Invoke Claude with visual comparison
Parse AI response
Save comparison report
output/chapter_XX/chapter_artifacts/ai_visual_accuracy.json
Make gate decision
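The report-saving step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: `report_path`, `save_report`, and the `output_root` parameter are hypothetical names, and the zero-padded chapter number is inferred from file names such as `chapter_02.html`.

```python
import json
from pathlib import Path


def report_path(output_root: Path, chapter: int) -> Path:
    """Build <output_root>/chapter_XX/chapter_artifacts/ai_visual_accuracy.json.

    Assumption: XX is the zero-padded chapter number (e.g. chapter_02).
    """
    chapter_dir = output_root / f"chapter_{chapter:02d}" / "chapter_artifacts"
    return chapter_dir / "ai_visual_accuracy.json"


def save_report(report: dict, output_root: Path, chapter: int) -> Path:
    """Write the report as indented JSON, creating directories as needed."""
    path = report_path(output_root, chapter)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return path
```

Calling `save_report(report, Path("output"), 2)` would write to `output/chapter_02/chapter_artifacts/ai_visual_accuracy.json`.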
html_file: <str> - Path to chapter_XX.html
pdf_page_png: <str> - Path to original PDF page PNG (or multiple for multi-page)
output_dir: <str> - Directory for report
chapter: <int> - Chapter number (for reporting)
book_pages: <str> - Page range (for reporting)
threshold: <float> - Minimum score to pass (default: 85.0)
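The inputs listed above could be grouped into a single typed object. The sketch below is illustrative only: the class name `VisualCheckInputs` and the range check on `threshold` are assumptions, while the field names, types, and the 85.0 default come from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualCheckInputs:
    html_file: str        # Path to chapter_XX.html
    pdf_page_png: str     # Path to the original PDF page PNG
    output_dir: str       # Directory for the report
    chapter: int          # Chapter number (for reporting)
    book_pages: str       # Page range (for reporting)
    threshold: float = 85.0  # Minimum score to pass

    def __post_init__(self) -> None:
        # Assumed sanity check: scores are on a 0-100 scale, so the
        # threshold should be as well.
        if not 0.0 <= self.threshold <= 100.0:
            raise ValueError(f"threshold must be within 0-100, got {self.threshold}")
```

For example, `VisualCheckInputs("chapter_02.html", "02_page_16.png", "output/chapter_02/chapter_artifacts", 2, "16-29")` uses the default threshold of 85.0.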
You are validating the visual accuracy of a generated HTML page against the original PDF.
ORIGINAL PDF PAGE:
[PNG Image of original PDF page attached]
GENERATED HTML (Rendered):
[PNG Image of rendered HTML page attached]
TASK:
Compare these two images and determine if the HTML accurately recreates the visual appearance and layout of the PDF page.
EVALUATION CRITERIA:
1. Layout Match (40% weight)
- Overall page structure matches original
- Sections in correct order and position
- Spacing between elements appropriate
- Page dimensions/aspect ratio similar
2. Visual Hierarchy (30% weight)
- Headings stand out with appropriate prominence
- Section breaks clearly visible
- Emphasis (bold, italic) preserved or equivalent
- Visual relationships between elements clear
3. Content Positioning (20% weight)
- Elements aligned correctly (left, center, right)
- Lists indented with proper spacing
- Tables/exhibits positioned and aligned correctly
- Paragraph flow matches original
4. Typography & Styling (10% weight)
- Font sizes relative to each other correct
- Text styling appropriate (bold, italic, caps)
- Color scheme preserved (if applicable)
- Overall readability equivalent or better
SCORING GUIDELINES:
For each criterion:
- 90-100%: Excellent, no issues
- 80-89%: Good, minor cosmetic differences
- 70-79%: Acceptable, noticeable but not critical
- Below 70%: Poor, significant differences
IMPORTANT CONTEXT:
- HTML rendering in browser may differ slightly from PDF (spacing, fonts)
- Focus on INTENT and READABILITY, not pixel-perfect match
- Small spacing/margin differences (2-5px) are acceptable
- Font rendering differences are acceptable if hierarchy preserved
- Web rendering constraints are acceptable (no absolute PDF positioning)
OUTPUT FORMAT:
Provide your analysis in this exact JSON format:
```json
{
"overall_score": 92.5,
"threshold": 85.0,
"recommendation": "PASS",
"criteria_analysis": {
"layout_match": {
"score": 94,
"feedback": "Overall page structure matches well. Section order correct, spacing appropriate."
},
"visual_hierarchy": {
"score": 90,
"feedback": "Headings clearly distinguished. Visual relationships preserved. Minor font size variance acceptable."
},
"content_positioning": {
"score": 91,
"feedback": "Element alignment correct. Lists properly indented. Tables positioned correctly."
},
"typography_styling": {
"score": 88,
"feedback": "Text styling preserved. Bold and italic distinctions clear. Readability excellent."
}
},
"differences_noted": [
"Paragraph line-height 1.6 vs 1.5 in PDF (acceptable, improves readability)",
"Bullet list indentation 20px vs 15px in PDF (acceptable, clear and readable)"
],
"visual_fidelity_assessment": "EXCELLENT",
"confidence_level": 0.95,
"explanation": "The HTML accurately recreates the PDF page layout and visual hierarchy. All major elements are positioned correctly. Minor spacing and font differences are within acceptable tolerances for web rendering and actually improve readability.",
"pass_fail_verdict": "PASS"
}
```
VALIDATION:
## Process Flow
┌─ Load HTML & PNG ──────────────────┐
│ • chapter_XX.html                  │
│ • 02_page_XX.png                   │
└────────┬───────────────────────────┘
         │
         ▼
┌─ Render HTML to PNG ───────────────┐
│ • Headless browser                 │
│ • Full page screenshot             │
│ • Save to temp location            │
└────────┬───────────────────────────┘
         │
         ▼
┌─ Invoke Claude API ────────────────┐
│ • Send original PDF PNG            │
│ • Send rendered HTML PNG           │
│ • Multi-modal comparison prompt    │
│ • Request JSON response            │
└────────┬───────────────────────────┘
         │
         ▼
┌─ Parse & Save Report ──────────────┐
│ • Extract JSON from response       │
│ • Validate score 0-100             │
│ • Save to JSON file                │
└────────┬───────────────────────────┘
         │
         ▼
┌─ Gate Decision ────────────────────┐
│ • If score ≥ 85: PASS              │
│ • If score < 85: FAIL              │
└────────┬───────────────────────────┘
         │
         ▼
  Exit with code 0 or 1
## Output File Format
**Path**: `output/chapter_XX/chapter_artifacts/ai_visual_accuracy.json`
```json
{
"chapter": 2,
"book_pages": "16-29",
"validation_type": "ai_visual_accuracy",
"validation_timestamp": "2025-11-08T14:45:00Z",
"overall_score": 92.5,
"threshold": 85.0,
"status": "PASS",
"ai_model": "claude-3-5-sonnet-20241022",
"inputs": {
"html_file": "chapter_02.html",
"original_pdf_png": "02_page_16.png",
"rendered_html_png": "rendered_chapter_02.png"
},
"criteria_scores": {
"layout_match": 94,
"visual_hierarchy": 90,
"content_positioning": 91,
"typography_styling": 88
},
"differences": [
"Paragraph line-height 1.6 vs 1.5 in PDF (acceptable)",
"Bullet list indentation 20px vs 15px in PDF (acceptable)"
],
"visual_fidelity": "EXCELLENT",
"confidence": 0.95,
"explanation": "The HTML accurately recreates the PDF page layout and visual hierarchy...",
"recommendation": "PASS",
"notes": "All criteria well within acceptable ranges. Minor web rendering differences do not impact readability or intent."
}
```
For chapters spanning multiple pages:
Option A: Compare key pages
Option B: Compare consolidated view
Approach: Use Option A for thorough validation
Score ≥ 85: PASS → Continue to deployment
Score < 85: FAIL → Trigger hook, block pipeline
Interpretation:
90-100: Excellent, no concerns
85-89: Good, minor cosmetic differences acceptable
< 85: Requires review and likely fixes
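The gate rule and score bands above map directly onto a small function. This is a minimal sketch; `gate_decision` and `interpret` are hypothetical names, and the exit codes follow the 0-for-PASS, 1-for-FAIL convention stated elsewhere in this document.

```python
def gate_decision(overall_score: float, threshold: float = 85.0) -> tuple[str, int]:
    """Return (status, exit_code): PASS/0 at or above the threshold, else FAIL/1."""
    if overall_score >= threshold:
        return "PASS", 0
    return "FAIL", 1


def interpret(overall_score: float) -> str:
    """Map a score to the interpretation bands listed above."""
    if overall_score >= 90:
        return "Excellent, no concerns"
    if overall_score >= 85:
        return "Good, minor cosmetic differences acceptable"
    return "Requires review and likely fixes"
```

A pipeline script could then end with `status, code = gate_decision(score)` followed by `sys.exit(code)`, so that a failing score blocks the next stage.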
If HTML rendering fails:
If AI response is invalid JSON:
If score seems wrong (too high/low):
If original PNG is missing:
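For the invalid-JSON case, one possible recovery is to look for a JSON object inside the response text before giving up. The sketch below is an assumption about how that could be done, not the skill's documented behavior; `extract_json` is a hypothetical helper that uses only the standard library.

```python
import json
import re

# Built from single backticks so the pattern matches a Markdown code fence.
_FENCE = "`" * 3
_FENCED_JSON = re.compile(
    _FENCE + r"(?:json)?\s*(\{.*?\})\s*" + _FENCE, re.DOTALL
)


def extract_json(response_text: str) -> dict:
    """Parse a JSON object out of an AI response.

    Tries, in order: the whole text, a fenced block, then the span from
    the first '{' to the last '}'. Raises ValueError if nothing parses.
    """
    candidates = [response_text.strip()]
    fenced = _FENCED_JSON.search(response_text)
    if fenced:
        candidates.append(fenced.group(1))
    start, end = response_text.find("{"), response_text.rfind("}")
    if start != -1 and end > start:
        candidates.append(response_text[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    raise ValueError("no JSON object found in AI response")
```

A caller could catch the `ValueError` and treat it as a failed check rather than a passing one, since an unreadable verdict cannot satisfy a blocking gate.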
Before saving report:
Score validity
Report completeness
AI reasoning
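The three checks above could be implemented roughly as follows. This is a sketch under assumptions: `validate_report`, `weighted_score`, and the 5-point `tolerance` are hypothetical, and the cross-check against the weighted criteria is an added safeguard rather than something the skill is known to do. The weights are the ones given in the evaluation criteria (40/30/20/10). Note that the sample criteria scores (94, 90, 91, 88) give a weighted mean of 91.6 while the sample overall score is 92.5, so the AI's overall score is not necessarily the exact weighted mean; hence a tolerance instead of an equality check.

```python
WEIGHTS = {
    "layout_match": 0.40,
    "visual_hierarchy": 0.30,
    "content_positioning": 0.20,
    "typography_styling": 0.10,
}


def weighted_score(criteria_scores: dict) -> float:
    """Weighted mean of the four criteria scores."""
    return sum(WEIGHTS[name] * criteria_scores[name] for name in WEIGHTS)


def validate_report(report: dict, tolerance: float = 5.0) -> list[str]:
    """Return a list of problems; an empty list means the report looks sane."""
    problems = []

    # Score validity: a real number on the 0-100 scale.
    score = report.get("overall_score")
    score_ok = (
        isinstance(score, (int, float))
        and not isinstance(score, bool)
        and 0 <= score <= 100
    )
    if not score_ok:
        problems.append("overall_score must be a number in 0-100")

    # Report completeness: all four criteria present.
    criteria = report.get("criteria_scores", {})
    missing = [name for name in WEIGHTS if name not in criteria]
    if missing:
        problems.append("missing criteria: " + ", ".join(missing))
    elif score_ok and abs(weighted_score(criteria) - score) > tolerance:
        problems.append("overall_score deviates from weighted criteria mean")

    # AI reasoning: the explanation must not be empty.
    if not str(report.get("explanation", "")).strip():
        problems.append("explanation is empty")
    return problems
```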
✓ Visual accuracy report generated successfully
✓ Overall score calculated and justified
✓ All criteria scored and explained
✓ Differences clearly documented
✓ Pass/fail decision clear
✓ Exit code 0 if PASS, 1 if FAIL
✓ Report saved in JSON format
If validation passes (score ≥ 85):
If validation fails (score < 85):
calypso-visual-accuracy.sh triggered
To test AI visual accuracy:
# Generate chapter HTML (previous steps)
# Render to PNG
# Compare with original PDF
# Expected behavior:
# - AI compares images
# - Scores layout, hierarchy, positioning, typography
# - Generates report with score and explanation
# - Returns PASS (score ≥ 85) or FAIL (score < 85)
| Aspect | Python Pixel-Diff | AI Visual Comparison |
|--------|-------------------|----------------------|
| Understanding | Detects changes | Understands intent |
| Flexibility | Exact match required | Accepts valid variations |
| Explanation | Pixel coordinates | Semantic feedback |
| Tolerance | Binary (match/no match) | Graduated (85%+ acceptable) |
| Context | No context | Full visual context |
| Human-like | No | Yes, like QA reviewer |
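AI visual accuracy validation judges a page the way a QA reviewer would: it accepts valid rendering variations and explains its verdict, whereas pixel-perfect comparison can only report that pixels differ.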