.claude/skills/ingest-web/SKILL.md
Extract readable content from a web URL for wiki ingestion. Handles article extraction and HTML-to-markdown conversion.
npx skillsauth add RonanCodes/llm-wiki ingest-web
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Extract readable content from a web URL.
curl -sL -H "User-Agent: Mozilla/5.0" "<url>"
Extract readable content from the HTML. Focus on:
Title (<title>, <h1>, or og:title meta tag)
Author (<meta name="author">, byline elements, or article:author)
Date (<meta> tags, <time> elements)
Convert to clean markdown.
Save to vault's raw/<descriptive-slug>.md with a YAML header:
---
source-url: <original-url>
title: <extracted-title>
author: <extracted-author>
date-fetched: <today>
images-downloaded: <count>
---
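A minimal sketch of writing the raw file with this header, assuming the title, author, and markdown body have already been extracted. The function name `write_raw_note`, its argument order, and the `YYYY-MM-DD` date format are assumptions for illustration, not part of the skill.

```shell
# Hypothetical helper: write a raw note with the YAML header described above.
# Arguments: output path, source URL, title, author, image count, markdown body.
write_raw_note() {
  local out="$1" url="$2" title="$3" author="$4" images="$5" body="$6"
  mkdir -p "$(dirname "$out")"
  {
    printf '%s\n' '---'
    printf 'source-url: %s\n' "$url"
    printf 'title: %s\n' "$title"
    printf 'author: %s\n' "$author"
    # Assumption: <today> is written as an ISO date (YYYY-MM-DD).
    printf 'date-fetched: %s\n' "$(date +%Y-%m-%d)"
    printf 'images-downloaded: %s\n' "$images"
    printf '%s\n\n' '---'
    printf '%s\n' "$body"
  } > "$out"
}
```

A title that contains a colon would need quoting to remain valid YAML; the sketch writes values verbatim to match the header shown above.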
After extracting the article content, download referenced images locally:
Find image URLs in the extracted markdown:
![alt](url) markdown image syntax
URLs ending in .jpg, .jpeg, .png, .gif, .svg, .webp
Download each image to vault's raw/assets/:
VAULT="vaults/<vault-name>"
mkdir -p "$VAULT/raw/assets"
# For each image URL:
FILENAME="<descriptive-name>-<hash>.<ext>"
curl -sL -o "$VAULT/raw/assets/$FILENAME" "<image-url>"
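The first step, finding image URLs in the extracted markdown, could be sketched roughly as follows. The function name `extract_image_urls` is hypothetical, and the patterns are a simple approximation that assumes each image reference sits on a single line.

```shell
# Hypothetical helper: print every image URL found in a markdown file,
# one per line, with duplicates removed (first occurrence wins).
extract_image_urls() {
  {
    # ![alt](url) or ![alt](url "title"): keep only the URL part.
    grep -oE '!\[[^]]*]\([^) ]+' "$1" | sed -E 's/^!\[[^]]*]\(//' || true
    # Bare http(s) links that end in a known image extension.
    grep -oiE 'https?://[^] )"<>]+\.(jpe?g|png|gif|svg|webp)' "$1" || true
  } | awk '!seen[$0]++'
}
```

Each printed URL can then be passed to the curl command above. Relative image paths are printed as-is and would need to be resolved against the source URL before downloading.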
Use a descriptive filename derived from the article slug + image position:
karpathy-llm-wiki-img-01.png, karpathy-llm-wiki-img-02.jpg
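One way to produce names in this pattern is sketched below. The helper `image_filename` is hypothetical; it follows the slug-plus-position examples above, omits the `<hash>` part of the earlier template, and falls back to `.png` when the URL has no recognizable extension, which is an assumption rather than documented behavior.

```shell
# Hypothetical helper: build a local filename from the article slug,
# the image's position in the article, and the URL's file extension.
image_filename() {
  local slug="$1" position="$2" url="$3" path ext
  path="${url%%[?#]*}"                                        # drop query string and fragment
  ext="$(printf '%s' "${path##*.}" | tr 'A-Z' 'a-z')"         # text after the last dot, lowercased
  case "$ext" in
    jpg|jpeg|png|gif|svg|webp) ;;                             # known image extension: keep it
    *) ext="png" ;;                                           # assumption: default when unknown
  esac
  printf '%s-img-%02d.%s\n' "$slug" "$position" "$ext"
}

# Usage: FILENAME="$(image_filename karpathy-llm-wiki 1 "<image-url>")"
```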
# Before:

# After:

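The Before/After pair above appears to describe pointing image references at the downloaded copies instead of the remote URLs. Under that assumption, a rewrite step might look like the sketch below. The helper `rewrite_image_ref` is hypothetical, and the `assets/<name>` relative path assumes the raw file sits directly in `raw/` next to `raw/assets/`.

```shell
# Hypothetical helper: replace one remote image URL with its local asset path
# inside a markdown file. Arguments: markdown file, remote URL, local path.
rewrite_image_ref() {
  local file="$1" url="$2" local_path="$3" tmp
  tmp="$(mktemp)"
  # index/substr do a literal replacement, so URL characters such as ? and &
  # are not treated as pattern syntax.
  awk -v old="$url" -v new="$local_path" '
    {
      out = ""
      while ((i = index($0, old)) > 0) {
        out = out substr($0, 1, i - 1) new
        $0 = substr($0, i + length(old))
      }
      print out $0
    }' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage: rewrite_image_ref "$VAULT/raw/<descriptive-slug>.md" "<image-url>" "assets/$FILENAME"
```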
Skip images that are:
Record count in the raw file's frontmatter: images-downloaded: 3
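Updating the count after the downloads finish could be done along these lines. This is a sketch that assumes the `images-downloaded` key is already present in the header; the helper name `set_image_count` is hypothetical.

```shell
# Hypothetical helper: set images-downloaded in a raw file's frontmatter.
# Arguments: raw markdown file, number of images downloaded.
set_image_count() {
  local file="$1" count="$2" tmp
  tmp="$(mktemp)"
  # Only the first matching line is rewritten, so body text that happens to
  # start with the same key is left untouched.
  awk -v count="$count" '
    !updated && /^images-downloaded:/ { print "images-downloaded: " count; updated = 1; next }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage: set_image_count "$VAULT/raw/<descriptive-slug>.md" 3
```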
If HTML extraction is poor (SPA, heavy JS), note this in the source-note and suggest the user use Obsidian Web Clipper for better extraction.
Dependencies: none. Uses only curl and Claude's text processing.