skills/technical-patterns/seo-checklist-skill/SKILL.md
SEO first principles for PolicyEngine web applications - meta tags, crawlability, performance, and dual-mode (standalone + iframe) considerations
Use this skill when auditing or building web applications that need to be discoverable via search engines. PolicyEngine apps are typically React SPAs deployed to GitHub Pages, often served both standalone and embedded as iframes in policyengine.org research pages.
Google does three things:
Your job is to make all three steps easy. If any step fails, your page won't appear in search results.
The most critical issue for React SPAs. When Googlebot visits a client-side rendered app, it sees:
<div id="root"></div>
All content generated by JavaScript may or may not be indexed. Google can execute JS but:
Test: Run curl -s YOUR_URL | grep -c '<h1>' — if the result is 0, Google likely can't see your content.
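The same check can be scripted. Note that `grep -c` counts matching lines, and the literal `<h1>` misses headings that carry attributes. The following is a minimal sketch, assuming the raw (un-rendered) HTML is already available as a string; `countStaticH1` is a hypothetical helper name, not part of any PolicyEngine tooling.

```javascript
// Hypothetical helper: counts <h1> opening tags in raw, un-rendered HTML.
// Unlike grep '<h1>', the pattern also matches tags with attributes,
// e.g. <h1 class="title">.
function countStaticH1(html) {
  const matches = html.match(/<h1[\s>]/gi);
  return matches ? matches.length : 0;
}

// What a crawler receives from a client-rendered SPA before JavaScript runs.
const spaShell = '<html><body><div id="root"></div></body></html>';
// What it receives when the same page is pre-rendered.
const prerendered =
  '<html><body><div id="root"><h1 class="title">Marriage calculator</h1></div></body></html>';

console.log(countStaticH1(spaShell));    // 0
console.log(countStaticH1(prerendered)); // 1
```

In Node 18 or later, the HTML string could come from the built-in `fetch` (`await (await fetch(url)).text()`), which returns the response body without executing any scripts.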
Solutions (ranked by effectiveness):
| Approach | Description | Effort |
|----------|-------------|--------|
| SSR (Next.js, Remix) | Server renders full HTML on each request | High (framework change) |
| SSG (Static Site Generation) | Pre-build HTML at deploy time | Medium |
| Pre-rendering | Render SPA to static HTML for crawlers | Low-Medium |
| Meta tags only | At minimum, add static meta tags to index.html | Low |
For PolicyEngine calculator apps, pre-rendering or SSG is the sweet spot. The form/landing page is static content; only results are dynamic.
Google ranks pages, not websites. Each URL you want to rank for needs:
https://example.com/#country=us&region=CA&head=45000
Google treats everything after # as the same page. All hash variations = one URL = one indexed page.
https://example.com/us/california?head=45000
Query parameters (?key=value) ARE seen by Google (though they may be treated as variants). Path segments (/us/california) are treated as distinct pages.
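One way to migrate from hash-based state to crawlable URLs is to move the key=value pairs from the fragment into the query string. The sketch below uses only the standard `URL` and `URLSearchParams` APIs; `hashToQueryUrl` is a hypothetical name, and it assumes the fragment holds `key=value` pairs joined by `&`, as in the example above.

```javascript
// Hypothetical helper: moves key=value pairs from the URL fragment
// into the query string, where crawlers can see them.
function hashToQueryUrl(href) {
  const url = new URL(href);
  // url.hash includes the leading "#", so drop it before parsing.
  const hashParams = new URLSearchParams(url.hash.slice(1));
  for (const [key, value] of hashParams) {
    url.searchParams.set(key, value);
  }
  url.hash = "";
  return url.toString();
}

console.log(hashToQueryUrl("https://example.com/#country=us&region=CA&head=45000"));
// https://example.com/?country=us&region=CA&head=45000
```

In the app itself, a URL produced this way could be applied with `history.replaceState` so that existing hash links keep working; that wiring is left out here.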
PolicyEngine apps often run in two modes:
- Standalone (e.g., policyengine.github.io/us-marriage-incentive/)
- Embedded as an iframe (e.g., policyengine.org/us/research/marriage)

| Concern | Standalone | Embedded (iframe) |
|---------|-----------|-------------------|
| Indexed by Google? | Yes (if crawlable) | No; Google indexes the parent page, not iframe content |
| Needs meta tags? | Yes; this is the version Google sees | No; the parent page provides meta tags |
| Needs canonical URL? | Yes; should point to itself OR the parent page | N/A |
| Needs robots.txt? | Yes | N/A (inherits from parent domain) |
| Needs sitemap? | Yes | N/A (parent sitemap covers parent pages) |
If the primary audience should find the app via policyengine.org:
<link rel="canonical" href="https://policyengine.org/us/research/marriage">
This tells Google: "The real version of this content lives on policyengine.org. Index that one."
If the standalone version is the primary:
<link rel="canonical" href="https://policyengine.github.io/us-marriage-incentive/">
Rule: Every page needs exactly one canonical URL. Without it, Google may index both versions and split your ranking power between them (called "duplicate content dilution").
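The "exactly one canonical URL" rule is easy to check mechanically. This is a rough sketch rather than a full HTML parser: the function names are hypothetical, and the pattern assumes `rel` comes before `href` inside the tag, as in the two examples above.

```javascript
// Hypothetical helper: returns every canonical href found in an HTML string.
// Assumes rel appears before href inside the <link> tag.
function findCanonicalUrls(html) {
  const pattern = /<link\s+rel=["']canonical["']\s+href=["']([^"']+)["']/gi;
  return Array.from(html.matchAll(pattern), (match) => match[1]);
}

// True only when the page declares exactly one canonical URL.
function hasSingleCanonical(html) {
  return findCanonicalUrls(html).length === 1;
}

const page =
  '<head><link rel="canonical" href="https://policyengine.org/us/research/marriage"></head>';
console.log(findCanonicalUrls(page));  // [ 'https://policyengine.org/us/research/marriage' ]
console.log(hasSingleCanonical(page)); // true
```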
Most PolicyEngine apps already detect this:
const isEmbedded = window.self !== window.top;
SEO-relevant behavior should NOT depend on this check — meta tags, titles, and structured data must be present in the static HTML regardless of runtime mode.
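A sketch of how the check can stay limited to presentation. The `isEmbedded` function takes the window object as a parameter so it can be exercised outside a browser; `chromeOptions` and its `showHeader` / `showFooter` fields are hypothetical names used only for illustration.

```javascript
// Same comparison as above, with the window passed in for testability.
function isEmbedded(win) {
  return win.self !== win.top;
}

// Hypothetical presentational options: only visual chrome depends on the mode.
// Titles, meta tags, and structured data stay in the static index.html.
function chromeOptions(embedded) {
  return { showHeader: !embedded, showFooter: !embedded };
}

// Minimal stand-ins for a top-level window and an iframe window.
const topWindow = {};
topWindow.self = topWindow;
topWindow.top = topWindow;
const frameWindow = { top: topWindow };
frameWindow.self = frameWindow;

console.log(chromeOptions(isEmbedded(topWindow)));   // { showHeader: true, showFooter: true }
console.log(chromeOptions(isEmbedded(frameWindow))); // { showHeader: false, showFooter: false }
```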
Every PolicyEngine web app needs these in index.html:
<!-- Basic SEO -->
<title>US Marriage Tax Calculator — Marriage Penalty & Bonus | PolicyEngine</title>
<meta name="description" content="Calculate how marriage affects your taxes and government benefits. See your marriage penalty or bonus across income levels for any US state.">
<link rel="canonical" href="https://policyengine.github.io/us-marriage-incentive/">
<!-- Open Graph (Facebook, LinkedIn, Slack, iMessage previews) -->
<meta property="og:type" content="website">
<meta property="og:title" content="US Marriage Tax Calculator">
<meta property="og:description" content="Calculate how marriage affects your taxes and government benefits.">
<meta property="og:image" content="https://policyengine.github.io/us-marriage-incentive/og-image.png">
<meta property="og:url" content="https://policyengine.github.io/us-marriage-incentive/">
<meta property="og:site_name" content="PolicyEngine">
<!-- Twitter / X -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="US Marriage Tax Calculator">
<meta name="twitter:description" content="Calculate how marriage affects your taxes and government benefits.">
<meta name="twitter:image" content="https://policyengine.github.io/us-marriage-incentive/og-image.png">
<!-- Structured Data (JSON-LD) -->
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "WebApplication",
"name": "US Marriage Tax Calculator",
"description": "Calculate how marriage affects your taxes and government benefits.",
"url": "https://policyengine.github.io/us-marriage-incentive/",
"applicationCategory": "FinanceApplication",
"operatingSystem": "Web",
"offers": { "@type": "Offer", "price": "0", "priceCurrency": "USD" },
"author": {
"@type": "Organization",
"name": "PolicyEngine",
"url": "https://policyengine.org"
}
}
</script>
<!-- Theme color for mobile browsers — use --pe-color-primary-500 value -->
<meta name="theme-color" content="#319795">
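A build-time check can catch tags that go missing from this template. The following is a minimal sketch based on plain substring matching: `REQUIRED_HEAD_TAGS` and `findMissingHeadTags` are hypothetical names, and the check assumes double-quoted attributes written as in the template above.

```javascript
// Substrings to look for; the list mirrors the tags in the template above.
const REQUIRED_HEAD_TAGS = [
  "<title>",
  'name="description"',
  'rel="canonical"',
  'property="og:title"',
  'property="og:description"',
  'property="og:image"',
  'property="og:url"',
  'name="twitter:card"',
  'name="twitter:title"',
  'name="twitter:description"',
  'name="twitter:image"',
  "application/ld+json",
  'name="theme-color"',
];

// Returns the entries from REQUIRED_HEAD_TAGS that do not occur in the HTML.
function findMissingHeadTags(html) {
  return REQUIRED_HEAD_TAGS.filter((needle) => !html.includes(needle));
}

const partial =
  '<head><title>US Marriage Tax Calculator</title><meta name="description" content="..."></head>';
console.log(findMissingHeadTags(partial).length); // 11
```

In a Node script, the HTML string could be read from the built `index.html` with `fs.readFileSync`, and a non-empty result could fail the build.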
Place the Open Graph image (og-image.png) in the public/ directory so it's available at build output root.

Place in public/robots.txt (Vite copies public/ contents to build root):
User-agent: *
Allow: /
Sitemap: https://YOUR_DOMAIN/sitemap.xml
Place in public/sitemap.xml:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://YOUR_DOMAIN/</loc>
<lastmod>2025-01-01</lastmod>
<changefreq>monthly</changefreq>
</url>
</urlset>
For apps with multiple distinct pages, add each URL as a separate <url> entry.
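For apps with more than a handful of pages, the file can be generated rather than maintained by hand. This is a sketch under stated assumptions: `buildSitemap` and `escapeXml` are hypothetical helpers, the URL list and the `lastmod` date are supplied by the caller, and `changefreq` is omitted for brevity.

```javascript
// Escapes the characters that are special in XML text content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical generator: one <url> entry per page, same shape as the file above.
function buildSitemap(urls, lastmod) {
  const entries = urls.map((loc) =>
    [
      "  <url>",
      `    <loc>${escapeXml(loc)}</loc>`,
      `    <lastmod>${lastmod}</lastmod>`,
      "  </url>",
    ].join("\n")
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n") + "\n";
}

console.log(
  buildSitemap(["https://YOUR_DOMAIN/", "https://YOUR_DOMAIN/us/california"], "2025-01-01")
);
```

The escaping matters for URLs with query strings: a raw `&` inside `<loc>` makes the sitemap invalid XML.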
Always add an empty .nojekyll file to public/ when deploying to GitHub Pages. Without it, GitHub runs Jekyll processing which can mangle XML files like sitemap.xml and robots.txt, preventing Google from reading them.
Known issue: Google Search Console cannot fetch sitemaps from .github.io domains. Even with a valid, accessible sitemap.xml, Search Console will show "Sitemap could not be read." This is a GitHub infrastructure limitation — GitHub blocks automated Googlebot fetches.
Workarounds:
- Use a custom domain (e.g., tool.policyengine.org) pointing to org.github.io. Sitemaps work correctly on custom domains.

After deploying robots.txt and sitemap.xml:

- On .github.io: expect "Sitemap could not be read"; use URL Inspection instead

Google measures Core Web Vitals:
| Metric | What | Target | How to test |
|--------|------|--------|-------------|
| LCP (Largest Contentful Paint) | Time to render biggest visible element | < 2.5s | PageSpeed Insights |
| INP (Interaction to Next Paint; replaced FID, First Input Delay, as a Core Web Vital in March 2024) | How quickly the page responds to user interactions | < 200ms | PageSpeed Insights |
| CLS (Cumulative Layout Shift) | Visual stability (how much things jump around) | < 0.1 | PageSpeed Insights |
| Issue | Impact | Fix |
|-------|--------|-----|
| Plotly.js bundle (~3-5 MB) | Destroys LCP | Replace with Recharts (~120 KB) or lazy-load aggressively |
| No code splitting | Entire app loads before anything renders | Use React.lazy() + Suspense |
| Unoptimized images | Slow LCP | Use WebP, proper sizing, lazy loading |
| No font preloading | Layout shift when fonts load | Use <link rel="preconnect"> + display=swap |
| Large JSON data files | Blocks initial render | Lazy-load data or move to API calls |
Test: Run PageSpeed Insights at https://pagespeed.web.dev/ with your deployed URL.
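Several fixes in the table above (lazy-loading a chart library, code splitting, deferring large JSON files) share one idea: start the expensive load only when it is first needed, and reuse the result afterwards. The sketch below shows that idea without any framework; `lazyOnce` is a hypothetical helper, not a React or Vite API. `React.lazy()` applies a similar pattern to components loaded through a dynamic `import()`.

```javascript
// Hypothetical helper: wraps a loader so the expensive work (a dynamic import
// or a data fetch) starts on first use and is shared by every later caller.
function lazyOnce(loader) {
  let pending = null;
  return function load() {
    if (pending === null) {
      pending = loader(); // runs at most once
    }
    return pending;
  };
}

// Stand-in for a heavy resource; a real app might use
// () => import("./Chart.jsx") or () => fetch("/data.json").then((r) => r.json()).
let loadCount = 0;
const loadChartData = lazyOnce(() => {
  loadCount += 1;
  return Promise.resolve({ points: [1, 2, 3] });
});

loadChartData();
loadChartData();
console.log(loadCount); // 1
```

Nothing is loaded until `loadChartData()` is first called, so the initial render does not wait on it.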
Search engines use heading hierarchy to understand page structure.
Prefer semantic HTML over generic divs:
<main> <!-- Primary content -->
<nav> <!-- Navigation -->
<section> <!-- Thematic grouping -->
<article> <!-- Self-contained content -->
<aside> <!-- Sidebar/supplementary -->
<footer> <!-- Footer content -->
Google uses accessibility signals as ranking factors. Key items:
- Images: alt text describes the image for screen readers and Google
- Form inputs: <label> elements or aria-label
- Decorative elements: aria-hidden="true"

You cannot improve what you cannot measure.
Add to index.html before </head>:
<!-- Google tag (gtag.js) -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXXXXX"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-XXXXXXXXXX');
</script>
Replace G-XXXXXXXXXX with the actual GA4 measurement ID.
The most powerful ranking signal is other websites linking to yours. For PolicyEngine:
This is not something the plugin can check, but it's important context: the policyengine.org embedding strategy provides backlink authority that standalone GitHub Pages deployments lack.
- <title> is descriptive, < 60 chars, includes keywords
- <meta name="description"> is 150-160 chars with call to action
- <link rel="canonical"> points to preferred URL
- og:title, og:description, og:image, og:url present
- twitter:card, twitter:title, twitter:description, twitter:image present
- robots.txt exists in build output root
- sitemap.xml exists in build output root
- .nojekyll exists in public/ (GitHub Pages only; prevents XML mangling)
- <html lang="en"> attribute set
- <meta name="theme-color"> set
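The two length limits in this checklist can be verified automatically. This is a minimal sketch using regular expressions rather than an HTML parser: `checkTitleAndDescription` is a hypothetical helper, and it assumes the description tag is written as `<meta name="description" content="...">` with double quotes. Lengths are measured on the raw source text, so an entity such as `&amp;` counts as five characters.

```javascript
// Hypothetical helper: returns a list of problems with the <title> and
// meta description lengths; an empty list means both limits are met.
function checkTitleAndDescription(html) {
  const problems = [];
  const title = (html.match(/<title>([^<]*)<\/title>/i) || [])[1];
  const description = (html.match(/<meta\s+name="description"\s+content="([^"]*)"/i) || [])[1];

  if (title === undefined) {
    problems.push("missing <title>");
  } else if (title.length >= 60) {
    problems.push(`title is ${title.length} chars (want < 60)`);
  }

  if (description === undefined) {
    problems.push("missing meta description");
  } else if (description.length < 150 || description.length > 160) {
    problems.push(`description is ${description.length} chars (want 150-160)`);
  }
  return problems;
}

const sample =
  '<title>US Marriage Tax Calculator</title><meta name="description" content="Too short.">';
console.log(checkTitleAndDescription(sample));
// [ 'description is 10 chars (want 150-160)' ]
```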