skills/geo-crawlers/SKILL.md
AI crawler access analysis. Checks robots.txt, meta tags, and HTTP headers to determine which AI crawlers can access the site. Provides a complete access map and recommendations for maximizing AI visibility while maintaining appropriate control.
`npx skillsauth add kennyolofsson23-netizen/claude-code-config geo-crawlers`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill analyzes a website's accessibility to AI crawlers -- the bots that AI companies use to discover, index, and train on web content. If AI crawlers are blocked, the site's content cannot appear in AI-generated responses regardless of its quality. Crawler access is the foundational technical requirement for GEO.
As of early 2026, many websites inadvertently block AI crawlers through overly aggressive robots.txt rules inherited from legacy SEO configurations. An Originality.ai 2025 study found that over 35% of the top 1,000 websites block at least one major AI crawler, and 5-10% block all AI crawlers. Blocking AI crawlers is the single fastest way to become invisible in AI-generated search results.
Tier 1: These crawlers power the AI search products where users actively look for answers. Blocking them directly reduces your visibility in AI-generated responses.

- `GPTBot`: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)`
- `OAI-SearchBot`: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; OAI-SearchBot/1.0; +https://docs.openai.com/bots/overview)`
- `ChatGPT-User`: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-User/1.0; +https://openai.com/bot)`
- `ClaudeBot`: `ClaudeBot/1.0; +https://www.anthropic.com/claude-bot`
- `PerplexityBot`: `Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)`

Tier 2: These crawlers serve large AI platforms or search ecosystems. Allowing them increases your content's reach.

- `Google-Extended`
- `GoogleOther`
- `Applebot-Extended`
- `Amazonbot`: `Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/600.2.5 (KHTML, like Gecko) Version/8.0.2 Safari/600.2.5 (compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)`
- `FacebookBot`

Tier 3: These crawlers are primarily used for AI model training rather than live search features. Blocking them does not affect AI search visibility.

- `CCBot`: `CCBot/2.0 (https://commoncrawl.org/faq/)`
- `anthropic-ai`
- `Bytespider`
- `cohere-ai`

| Crawler | Tier | Recommendation | Reason |
|---|---|---|---|
| GPTBot | 1 | ALLOW | Powers ChatGPT Search (300M+ users) |
| OAI-SearchBot | 1 | ALLOW | Search-only, no training use |
| ChatGPT-User | 1 | ALLOW | User-initiated browsing |
| ClaudeBot | 1 | ALLOW | Claude web search and analysis |
| PerplexityBot | 1 | ALLOW | Best referral traffic AI search |
| Google-Extended | 2 | ALLOW | Gemini features; no search rank impact |
| GoogleOther | 2 | ALLOW | Google AI research |
| Applebot-Extended | 2 | ALLOW | Apple Intelligence (2B+ devices) |
| Amazonbot | 2 | ALLOW | Alexa and Amazon AI |
| FacebookBot | 2 | ALLOW | Meta AI (3B+ app users) |
| CCBot | 3 | Context | Training data only |
| anthropic-ai | 3 | Context | Training data only |
| Bytespider | 3 | BLOCK | Aggressive crawler, low benefit |
| cohere-ai | 3 | Context | Training data only |
For sites wanting maximum AI search visibility:
# AI Crawlers - ALLOWED for AI search visibility
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: anthropic-ai
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: GoogleOther
Allow: /
User-agent: Applebot-Extended
Allow: /
User-agent: Amazonbot
Allow: /
User-agent: FacebookBot
Allow: /
# AI Crawlers - BLOCKED (aggressive/low value)
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
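To confirm how a given robots.txt treats each crawler, the rules can be evaluated programmatically rather than read by eye. The following is a minimal sketch using Python's standard `urllib.robotparser`; the sample file contents, the crawler list, and the `crawler_access` helper are illustrative assumptions, not part of the skill itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample; in practice this text would come from [domain]/robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: Bytespider
Disallow: /

User-agent: *
Disallow: /private/
"""

# Illustrative subset of the crawler names listed in this document.
CRAWLERS = ["GPTBot", "ClaudeBot", "Bytespider"]


def crawler_access(robots_txt: str, crawlers: list[str], path: str = "/") -> dict[str, bool]:
    """Return {crawler name: True if the rules allow fetching `path`}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {name: parser.can_fetch(name, path) for name in crawlers}


access = crawler_access(SAMPLE_ROBOTS_TXT, CRAWLERS)
for name, allowed in access.items():
    print(f"{name}: {'Allowed' if allowed else 'Blocked'}")
```

A crawler with no group of its own (ClaudeBot in this sample) falls back to the `User-agent: *` group. Python's parser applies rules in file order, which may differ from how an individual crawler resolves conflicting `Allow` and `Disallow` lines, so treat the result as an approximation.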
In robots.txt:

- Fetch `[domain]/robots.txt`.
- Check for a wildcard (`User-agent: *`) block that would apply to crawlers without their own rules.
- Check for `Crawl-delay` directives that may slow AI crawler access.
- Check for `Sitemap` directives (AI crawlers use these for discovery).

In page HTML, check for meta robots tags:

- `<meta name="robots" content="noindex">` -- blocks all bots
- `<meta name="robots" content="nofollow">` -- prevents link following
- `<meta name="robots" content="noai">` -- emerging tag to block AI use
- `<meta name="robots" content="noimageai">` -- blocks AI image training
- `<meta name="GPTBot" content="noindex">` -- targets a single crawler

In HTTP response headers:

- `X-Robots-Tag: noindex` -- HTTP header equivalent of meta noindex
- `X-Robots-Tag: noai` -- HTTP header to block AI use
- `X-Robots-Tag: noimageai` -- blocks AI image training
- `X-Robots-Tag: GPTBot: noindex` -- targets a single crawler

AI-specific files:

- `/llms.txt` (emerging standard for AI crawler guidance).
- `/.well-known/ai-plugin.json` (OpenAI plugin manifest).
- `/ai.txt` (proposed standard, similar to ads.txt for AI).

Generate a file called GEO-CRAWLER-ACCESS.md:
# AI Crawler Access Report: [Domain]
**Analysis Date:** [Date]
**Domain:** [Domain]
**robots.txt Status:** [Found/Not Found/Error]
---
## Crawler Access Summary
| Crawler | Operator | Tier | Status | Impact |
|---|---|---|---|---|
| GPTBot | OpenAI | 1 | [Allowed/Blocked/Not Mentioned] | [Impact description] |
| OAI-SearchBot | OpenAI | 1 | [Status] | [Impact] |
| ChatGPT-User | OpenAI | 1 | [Status] | [Impact] |
| ClaudeBot | Anthropic | 1 | [Status] | [Impact] |
| PerplexityBot | Perplexity | 1 | [Status] | [Impact] |
| Google-Extended | Google | 2 | [Status] | [Impact] |
| GoogleOther | Google | 2 | [Status] | [Impact] |
| Applebot-Extended | Apple | 2 | [Status] | [Impact] |
| Amazonbot | Amazon | 2 | [Status] | [Impact] |
| FacebookBot | Meta | 2 | [Status] | [Impact] |
| CCBot | Common Crawl | 3 | [Status] | [Impact] |
| anthropic-ai | Anthropic | 3 | [Status] | [Impact] |
| Bytespider | ByteDance | 3 | [Status] | [Impact] |
| cohere-ai | Cohere | 3 | [Status] | [Impact] |
## AI Visibility Score: [X]/100
**Tier 1 Access:** [X/5 crawlers allowed]
**Tier 2 Access:** [X/5 crawlers allowed]
**Tier 3 Access:** [X/4 crawlers allowed]
---
## Critical Issues
[List any Tier 1 crawlers that are blocked]
## Recommendations
### Immediate Actions
[Specific robots.txt changes needed]
### robots.txt Recommendation
[Complete recommended robots.txt content for AI crawlers]
### Additional Technical Findings
- **Meta Robots Tags:** [Findings]
- **X-Robots-Tag Headers:** [Findings]
- **JavaScript Rendering:** [Assessment]
- **llms.txt:** [Present/Absent]
- **Sitemap Accessibility:** [Assessment]
The AI Crawler Access Score (reported as the AI Visibility Score in the template above) is calculated as:
| Component | Weight | Scoring |
|---|---|---|
| Tier 1 Crawlers Allowed | 50% | 20 points per Tier 1 crawler allowed (5 crawlers = 100 points max, scaled to 50) |
| Tier 2 Crawlers Allowed | 25% | 20 points per Tier 2 crawler allowed (5 crawlers = 100 points max, scaled to 25) |
| No Blanket AI Blocks | 15% | Full points if there is no `User-agent: *` group with `Disallow: /` and no `noai` meta tags |
| AI-Specific Files Present | 10% | 5 points for llms.txt, 5 points for sitemap accessible to AI crawlers |
Final score = sum of all weighted components, capped at 100.
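The weighting can be expressed as a short function. This sketch assumes the "No Blanket AI Blocks" component is all-or-nothing (the table only defines the full-points case), and the function and parameter names are hypothetical.

```python
def ai_access_score(
    tier1_allowed: int,
    tier2_allowed: int,
    blanket_block: bool,
    has_llms_txt: bool,
    sitemap_accessible: bool,
) -> float:
    """Combine the four weighted components into a 0-100 score."""
    # 20 points per allowed crawler (max 5 crawlers), scaled by the component weight.
    tier1 = min(tier1_allowed, 5) * 20 * 0.50
    tier2 = min(tier2_allowed, 5) * 20 * 0.25
    # Assumption: 15 points when no blanket block is present, otherwise 0.
    blanket = 0 if blanket_block else 15
    # 5 points each for llms.txt and an accessible sitemap.
    files = (5 if has_llms_txt else 0) + (5 if sitemap_accessible else 0)
    return min(tier1 + tier2 + blanket + files, 100)


# Example: 3 of 5 Tier 1 and 4 of 5 Tier 2 crawlers allowed, no blanket block,
# no llms.txt, sitemap accessible: 30 + 20 + 15 + 5 = 70.
example_score = ai_access_score(3, 4, False, False, True)
print(example_score)
```

Under these weights each allowed Tier 1 crawler is worth 10 points of the final score and each Tier 2 crawler 5 points, so unblocking a single Tier 1 crawler outweighs adding both AI-specific files.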