jmagly/llms-txt-support

Name: llms-txt-support
Author: jmagly

agentic/code/addons/doc-intelligence/skills/llms-txt-support/SKILL.md

npx skillsauth add jmagly/aiwg llms-txt-support

llms.txt Support Skill

Purpose

Single responsibility: Detect, fetch, and use llms.txt files, which provide LLM-optimized documentation and can make documentation ingestion up to 10x faster than scraping a full site. (BP-4)

Background

The llms.txt standard (https://llmstxt.org/) provides a convention for websites to expose LLM-friendly documentation. Instead of scraping entire sites, check for llms.txt first.

File hierarchy (check in order):

llms-full.txt - Complete documentation (largest)
llms.txt - Standard documentation
llms-small.txt - Condensed documentation (smallest)

Grounding Checkpoint (Archetype 1 Mitigation)

Before executing, VERIFY:

[ ] Base URL is accessible
[ ] All three llms.txt variants have been checked in order
[ ] File content is actual documentation (not an error page)
[ ] File size is reasonable for the documentation scope

DO NOT assume llms.txt exists. Always probe first.

Uncertainty Escalation (Archetype 2 Mitigation)

ASK USER instead of guessing when:

Multiple llms.txt variants are found and it is unclear which size to use
llms.txt content appears partial or outdated
A file is returned but its content looks like an error page
The site has llms.txt but its content doesn't match the expected documentation

NEVER assume llms.txt quality without verification.

Context Scope (Archetype 3 Mitigation)

| Context Type | Included | Excluded |
|--------------|----------|----------|
| RELEVANT | Target base URL, llms.txt content | Full site scraping |
| PERIPHERAL | llms.txt spec reference | Other sites' llms.txt |
| DISTRACTOR | Previous scraping attempts | Unrelated documentation |

Workflow Steps

Step 1: Detect llms.txt (Grounding)

# Check for llms.txt variants (in order of preference)
curl -I https://example.com/llms-full.txt
curl -I https://example.com/llms.txt
curl -I https://example.com/llms-small.txt

# Check common alternate locations
curl -I https://example.com/.well-known/llms.txt
curl -I https://docs.example.com/llms.txt
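
The individual probes above can be wrapped in a small loop. The following is a minimal sketch rather than part of the skill: the function names `candidate_urls` and `probe_variants` are hypothetical, `curl` is assumed to be installed, and only same-host locations are covered (a `docs.` subdomain would need a second call).

```shell
#!/bin/sh
# Hypothetical helpers (not part of the skill): list and probe llms.txt candidates.

# Print candidate URLs for a base URL, in the order of preference used above.
candidate_urls() {
  base=${1%/}   # drop one trailing slash, if present
  for name in llms-full.txt llms.txt llms-small.txt .well-known/llms.txt; do
    printf '%s/%s\n' "$base" "$name"
  done
}

# Probe each candidate with a HEAD request and print "<status> <url>".
# curl reports 000 when the request itself fails (DNS error, timeout, ...).
probe_variants() {
  candidate_urls "$1" | while IFS= read -r url; do
    status=$(curl -s -o /dev/null -I -L --max-time 10 -w '%{http_code}' "$url")
    printf '%s %s\n' "$status" "$url"
  done
}
```

Calling `probe_variants https://example.com` prints one status line per candidate; a 200 only shows that something was returned, so the content still needs the validation in Step 2.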

Step 2: Validate Content

# Fetch and inspect first 100 lines
curl -s https://example.com/llms.txt | head -100

# Check file size
curl -sI https://example.com/llms.txt | grep -i content-length

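# Verify it's not an error page: print the HTTP status, then look for HTML markup near the top
# (matching the word "error" anywhere would also flag legitimate documentation)
curl -s -o /dev/null -w "%{http_code}\n" https://example.com/llms.txt
curl -s https://example.com/llms.txt | head -20 | grep -qiE "<!doctype html|<html|not found" && echo "WARNING: May be error page"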
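
The same check can be applied to a file that has already been downloaded. This is a sketch under stated assumptions: `looks_like_docs` is a hypothetical name, and it assumes a valid llms.txt starts with an H1 title line, as in the format reference later in this document.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) when FILE looks like llms.txt markdown,
# fail (exit 1) when it is missing, empty, or looks like an HTML error page.
looks_like_docs() {
  file=$1
  [ -s "$file" ] || return 1                       # missing or empty
  if head -n 20 "$file" | grep -qiE '<!doctype html|<html'; then
    return 1                                       # HTML, likely an error page
  fi
  # The first non-blank line should be an H1 title such as "# Project Name".
  first=$(awk 'NF { print; exit }' "$file")
  case $first in
    '# '*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

A typical use would be `looks_like_docs docs/llms.txt || echo "WARNING: May be error page"`. A passing result is only a heuristic; partial or outdated content still calls for the escalation rules above.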

Step 3: Choose Variant

| Variant | Size | Use Case |
|---------|------|----------|
| llms-full.txt | Large (1MB+) | Complete documentation, full API reference |
| llms.txt | Medium | Standard use, balanced coverage |
| llms-small.txt | Small (<100KB) | Quick reference, limited context windows |

Decision tree:

If the context window is limited → llms-small.txt
If you need complete coverage → llms-full.txt
Default → llms.txt
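
The decision tree is small enough to express as a single function. This is an illustrative sketch: `choose_variant` and its two keyword arguments (`limited`, `complete`) are hypothetical, and it does not check whether the chosen variant was actually found in Step 1.

```shell
#!/bin/sh
# Hypothetical helper: pick a variant following the decision tree above.
# $1 = "limited" when the context window is constrained, anything else otherwise
# $2 = "complete" when full coverage is needed, anything else otherwise
choose_variant() {
  if [ "${1:-}" = "limited" ]; then
    echo "llms-small.txt"        # rule 1 wins even if full coverage was requested
  elif [ "${2:-}" = "complete" ]; then
    echo "llms-full.txt"
  else
    echo "llms.txt"              # default
  fi
}
```

The rules are evaluated in the order they are listed, so a limited context window takes priority over a request for complete coverage.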

Step 4: Fetch and Process

# Download llms.txt (-f fails on HTTP errors instead of saving an error page; --create-dirs creates docs/)
curl -fsSL --create-dirs -o docs/llms.txt https://example.com/llms.txt

# Convert to skill format (if using skill-seekers)
skill-seekers scrape --llms-txt docs/llms.txt --name myskill

# Or process manually
# llms.txt is already LLM-optimized markdown
mkdir -p output/myskill/references
cp docs/llms.txt output/myskill/references/complete.md

Step 5: Validate Output

# Check content structure
head -50 output/myskill/references/complete.md

# Verify sections
grep "^#" output/myskill/references/complete.md | head -20

# Check for code examples
grep -c '```' output/myskill/references/complete.md
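
The three checks above can be folded into one summary line. The sketch below is an assumption-laden convenience, not part of the skill: `summarize_markdown` is a hypothetical name, and like the `grep "^#"` command above it also counts `#` comment lines inside code blocks as headings.

```shell
#!/bin/sh
# Hypothetical helper: print heading and code-block counts for a markdown file.
# Exits non-zero when the number of fence lines is odd (unterminated block).
summarize_markdown() {
  file=$1
  fence=$(printf '\140\140\140')                   # three backticks
  # grep -c exits 1 when the count is 0, so "|| true" keeps "set -e" scripts alive.
  headings=$(grep -c '^#' "$file" || true)
  fences=$(grep -c "^$fence" "$file" || true)
  # Each fenced block has one opening and one closing fence line.
  printf 'headings=%s code_blocks=%s\n' "$headings" "$((fences / 2))"
  [ $((fences % 2)) -eq 0 ]
}
```

Running `summarize_markdown output/myskill/references/complete.md` gives a quick signal; a result of `headings=0` would suggest the file is not the markdown the skill expects.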

Recovery Protocol (Archetype 4 Mitigation)

On error:

1. PAUSE - Note which variant failed.
2. DIAGNOSE - Check the error type:
   - 404 Not Found → Try the next variant or an alternate location
   - 403 Forbidden → May need authentication or a different User-Agent header
   - Timeout → Retry with a longer timeout
   - Invalid content → Fall back to traditional scraping
3. ADAPT - Try the alternate approach indicated by the diagnosis.
4. RETRY - Allow at most 3 attempts per variant, then move to the next variant.
5. ESCALATE - Inform the user that llms.txt is unavailable and suggest scraping.
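
The DIAGNOSE and RETRY steps could be sketched as follows. Both function names and the action labels are hypothetical, and treating curl's `000` status as a timeout is a simplification, since curl reports `000` for any request that fails before an HTTP response arrives.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status (or curl's 000 for a failed request)
# to the recovery action listed above.
recovery_action() {
  case $1 in
    200) echo "validate-content" ;;
    404) echo "try-next-variant" ;;
    403) echo "retry-with-auth-or-user-agent" ;;
    000) echo "retry-with-longer-timeout" ;;
    *)   echo "escalate" ;;
  esac
}

# Hypothetical helper: try a URL up to 3 times, doubling the timeout each time.
# $1 = URL, $2 = output file. Returns non-zero when all attempts fail.
fetch_with_retry() {
  url=$1; out=$2; timeout=10; attempt=1
  while [ "$attempt" -le 3 ]; do
    if curl -fsSL --max-time "$timeout" -o "$out" "$url"; then
      return 0
    fi
    timeout=$((timeout * 2))
    attempt=$((attempt + 1))
  done
  return 1
}
```

When `fetch_with_retry` fails for every variant, the protocol's final step applies: tell the user and suggest scraping instead.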

Checkpoint Support

State saved to: .aiwg/working/checkpoints/llms-txt-support/

checkpoints/llms-txt-support/
├── detection_results.json    # Which variants found
├── selected_variant.txt      # Which was chosen
└── content_hash.txt          # For cache validation
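
One way `content_hash.txt` might be used for cache validation is sketched below. The source does not say how the hash is computed, so SHA-256 is an assumption, as are the function names and the `CHECKPOINT_DIR` variable.

```shell
#!/bin/sh
# Hypothetical helpers: store a SHA-256 of the fetched file and report whether
# it changed since the last checkpoint. The directory follows the tree above.
CHECKPOINT_DIR=${CHECKPOINT_DIR:-.aiwg/working/checkpoints/llms-txt-support}

content_hash() {
  # sha256sum is GNU coreutils; shasum is the usual fallback on macOS.
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d ' ' -f 1
  else
    shasum -a 256 "$1" | cut -d ' ' -f 1
  fi
}

# Exit 0 when the stored hash matches (cache still valid); otherwise store the
# new hash and exit 1.
checkpoint_is_current() {
  mkdir -p "$CHECKPOINT_DIR"
  new=$(content_hash "$1")
  old=$(cat "$CHECKPOINT_DIR/content_hash.txt" 2>/dev/null || true)
  if [ "$new" = "$old" ]; then
    return 0
  fi
  printf '%s\n' "$new" > "$CHECKPOINT_DIR/content_hash.txt"
  return 1
}
```

For example, `checkpoint_is_current docs/llms.txt || echo "content changed, reprocess"` would skip reprocessing when the download is byte-identical to the previous one.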

llms.txt Format Reference

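Typical llms.txt structure (illustrative; per https://llmstxt.org/ only the H1 title is required, and the H2 sections normally hold lists of links to more detailed pages):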

# Project Name

> Brief description of the project

## Overview
[High-level explanation]

## Installation
[Setup instructions]

## Quick Start
[Getting started guide]

## API Reference
[Detailed API documentation]

## Examples
[Code examples]

## FAQ
[Common questions]
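
A quick way to see which of these sections a fetched file contains is to list its headings. This is a minimal sketch: `list_sections` is a hypothetical name, and the script does not skip fenced code blocks, so `# ` comment lines inside code could be misread as a title.

```shell
#!/bin/sh
# Hypothetical helper: print the H1 title, then each H2 section name, in order.
list_sections() {
  awk '
    /^# /  && !title { sub(/^# /, "");  print "title: " $0; title = 1; next }
    /^## /           { sub(/^## /, ""); print "section: " $0 }
  ' "$1"
}
```

Run against the structure above, `list_sections docs/llms.txt` would print `title: Project Name` followed by one `section:` line per H2 heading, which makes a missing API Reference or Examples section easy to spot.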

Detection Results Output

{
  "base_url": "https://example.com",
  "detected": {
    "llms-full.txt": {
      "found": true,
      "url": "https://example.com/llms-full.txt",
      "size": 1523456,
      "last_modified": "2025-01-15T10:30:00Z"
    },
    "llms.txt": {
      "found": true,
      "url": "https://example.com/llms.txt",
      "size": 245678,
      "last_modified": "2025-01-15T10:30:00Z"
    },
    "llms-small.txt": {
      "found": false
    }
  },
  "recommended": "llms.txt",
  "reason": "Standard size, good for most use cases"
}
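
Individual entries of the `detected` object could be produced from probe results with `printf`. This is a rough sketch only: `detection_entry` is a hypothetical name, `last_modified` is omitted for brevity, and the inputs are assumed to contain no characters that need JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: print one "detected" entry in the shape shown above.
# $1 = variant name, $2 = base URL, $3 = size in bytes (omit when not found)
detection_entry() {
  if [ -n "${3:-}" ]; then
    printf '"%s": {"found": true, "url": "%s/%s", "size": %s}\n' "$1" "$2" "$1" "$3"
  else
    printf '"%s": {"found": false}\n' "$1"
  fi
}
```

For anything beyond a quick checkpoint file, a JSON-aware tool would be a safer choice than string formatting.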

Known Sites with llms.txt

Sites known to support llms.txt (verify before use):

Anthropic documentation
Many modern API documentation sites
Framework documentation following the standard

Always verify - this list may be outdated.

Troubleshooting

| Issue | Diagnosis | Solution |
|-------|-----------|----------|
| No llms.txt found | Site doesn't support | Fall back to doc-scraper |
| Content seems wrong | Error page or redirect | Check actual content, verify URL |
| File too large | llms-full.txt overwhelming | Use llms.txt or llms-small.txt |
| Outdated content | llms.txt not maintained | Consider scraping + llms.txt merge |

Integration with doc-scraper

If llms.txt is incomplete or outdated, combine approaches:

# 1. Fetch llms.txt as base
curl -o base.md https://example.com/llms.txt

# 2. Scrape for additional/updated content
skill-seekers scrape --config config.json --skip-covered-by base.md

# 3. Merge results
# llms.txt provides structure, scraping fills gaps
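
The merge in step 3 is left open above. One possible approach, assuming the scraped output is available as a single markdown file (a hypothetical `scraped.md`), is to keep `base.md` as-is and append only the H2 sections it does not already have. `merge_docs` is an illustrative name, not a skill-seekers command.

```shell
#!/bin/sh
# Hypothetical helper: print base.md followed by every "## " section of the
# extra file whose heading does not already appear in base.md.
# $1 = base file (llms.txt), $2 = extra file (scraped content)
merge_docs() {
  base=$1; extra=$2
  cat "$base"
  awk '
    FILENAME == ARGV[1] { if (/^## /) seen[$0] = 1; next }   # pass 1: base headings
    /^## /              { keep = !($0 in seen); if (keep) print "" }
    keep                { print }                            # pass 2: new sections only
  ' "$base" "$extra"
}
```

Usage would look like `merge_docs base.md scraped.md > merged.md`. Sections are matched by exact heading text, so a renamed section is treated as new, and any scraped text before the first H2 heading is dropped.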

References

llms.txt Standard: https://llmstxt.org/
Skill Seekers llms.txt Detection: https://github.com/jmagly/Skill_Seekers/blob/main/docs/LLMS_TXT_SUPPORT.md
REF-001: Production-Grade Agentic Workflows (BP-4, BP-9)
REF-002: LLM Failure Modes (Archetype 1-4 mitigations)

Detect and use llms.txt files for LLM-optimized documentation. Use when checking if a site has LLM-ready docs before scraping.

122 stars

development

Updated Apr 24, 2026

npx skillsauth add jmagly/aiwg llms-txt-support

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 24, 2026, 5:18 PM (246.0s, 1 file scanned)

SKILL.md

namespace: aiwg
name: llms-txt-support
description: Detect and use llms.txt files for LLM-optimized documentation. Use when checking if a site has LLM-ready docs before scraping.
tools: Read, Write, WebFetch
platforms: [all]

Install manually

# Clone the repo
git clone https://github.com/jmagly/aiwg.git

# Copy into Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r aiwg/agentic/code/addons/doc-intelligence/skills/llms-txt-support ~/.claude/skills/

Repository: jmagly/aiwg (122 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT