plugins/knowledge-forge/skills/kb-ingest/SKILL.md
Download an external documentation site into your personal knowledge base as a doc pack and source note. Use when the user says "/kb-ingest <url>", "ingest these docs", "add these docs to the KB", or "download reference docs". Runs webdown crawl via the knowledge repo's ingest-pack recipe, then creates a MANIFEST.md and source note automatically.
npx skillsauth add kelp/kelp-claude-plugins kb-ingestInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Download an external documentation site into the personal knowledge base as a doc pack and source note. Safe to run from any working directory.
Scope: $ARGUMENTS
Two forms:
Web crawl: <url> [category] [tool] [version] [path_prefix] [selector]
Local copy: <local-path> [category] [tool] [version]
url or local-path — required. An https:// URL
to crawl, or an absolute/relative path to a local
directory of already-downloaded docs.category — optional. Where in raw/external/ to
file the pack (e.g. providers, tooling,
protocols, languages, libraries).tool — optional. Tool/service name slug (e.g.
fastmail, jmap, uv).version — optional. Defaults to latest.path_prefix — optional (web only). Limits crawl to
URLs with this prefix. Leave empty to crawl from root.selector — optional (web only). CSS selector for
content extraction. Defaults to main.Run this shell block first. Every subsequent command uses
"$kb_path" (double-quoted).
kb_path=""
if [ -f CLAUDE.md ]; then
raw=$(grep -E "^knowledge-base:" CLAUDE.md | head -1 \
| sed 's/^knowledge-base://; s/^[[:space:]]*//; s/[[:space:]]*$//')
if [ -n "$raw" ]; then
# Expand ~ or $HOME prefix only -- never eval
# untrusted CLAUDE.md content
case "$raw" in
'~'|'~/'*) kb_path="$HOME${raw#\~}" ;;
'$HOME'*) kb_path="$HOME${raw#\$HOME}" ;;
*) kb_path="$raw" ;;
esac
fi
fi
if [ -z "$kb_path" ]; then
kb_path="$HOME/code/knowledge"
fi
kb_path=$(realpath "$kb_path" 2>/dev/null) || {
echo "knowledge-forge: cannot resolve KB path" >&2
exit 1
}
case "$kb_path" in
*$'\n'*|*$'\r'*|*$'\0'*|*\\*)
echo "knowledge-forge: invalid characters in KB path" >&2
exit 1 ;;
esac
if [ ! -f "$kb_path/justfile" ] || [ ! -d "$kb_path/index" ]; then
echo "knowledge-forge: $kb_path is not a knowledge base" >&2
exit 1
fi
Extract the first token from $ARGUMENTS. Detect mode:
/, ~/, or ./ → local mode
(copy from a local directory; skip the web crawl)In local mode, resolve the source path:
src_path=$(realpath "$first_token" 2>/dev/null)
[ -d "$src_path" ] || { echo "Source path not found"; exit 1; }
If category and tool are not provided, derive them
from the URL domain (web mode) or ask once (local mode):
| Domain | category | tool |
|---|---|---|
| fastmail.com | providers | fastmail |
| jmap.io | protocols | jmap |
| anthropic.com | providers | anthropic |
| openai.com | providers | openai |
| docs.*.com or similar | ask | ask |
When unable to classify from the domain, use AskUserQuestion once:
"What category and tool name should I use for
<domain>? (e.g.providers fastmail,tooling uv,protocols jmap)"
Default version to latest if omitted.
Default selector to main if omitted.
Default path_prefix to "" (empty) if omitted.
pack_dir="$kb_path/raw/external/$category/$tool/$version"
if [ -d "$pack_dir" ]; then
echo "Warning: pack already exists at $pack_dir"
echo "Re-ingest will overwrite changed files."
fi
Warn and proceed. Re-ingest is safe; it overwrites existing crawled files in place.
Web mode — crawl the URL:
today=$(date +%Y-%m-%d)
cd "$kb_path" && just ingest-pack \
"$category" "$tool" "$version" "$url" \
"$path_prefix" "$selector"
If the crawl produces no files (or only tiny stubs),
the site likely requires JS rendering. Try a different
CSS selector (e.g. article, .content, body) or
report to the user.
Local mode — copy from the existing directory:
today=$(date +%Y-%m-%d)
mkdir -p "$pack_dir"
cp -r "$src_path/." "$pack_dir/"
This copies the already-downloaded docs into
raw/external/<category>/<tool>/<version>/ without
making any network requests.
find "$pack_dir" -name "*.md" | sort
Record the file list. You will use it in the MANIFEST and source note.
Use the Write tool to create
"$pack_dir/MANIFEST.md". This is what gen-indexes
reads to populate index/doc-packs.md.
---
id: pack-<tool>-<version>
type: doc-pack
name: <Human-Readable Name>
category: <category>
version: <version>
variant: current
source_urls:
- <url>
ingest_method: webdown
downloaded_at: <today>
reviewed_at: <today>
status: fresh
related_projects: []
---
# <Human-Readable Name> doc pack
## Local files
- <one bullet per .md file from the directory listing>
ID pattern: pack-<tool>-<version> (e.g.
pack-fastmail-latest, pack-jmap-latest).
Fill related_projects if the current session has an
obvious project (e.g. building a Fastmail MCP server →
add that project slug). Leave [] if unknown.
Compute the source note path:
"$kb_path/wiki/sources/external/<category>/source-<tool>-<version>.md"
This matches kb-capture's source-note destination
(wiki/sources/external/<category>/<slug>.md), so
ingested and captured source notes share one layout.
If that path already exists, append -2, -3, … to
the slug until free.
Use the Write tool:
---
id: source-<tool>-<version>
type: source
title: <Human-Readable Name> Official Docs
summary: <2-3 sentences covering what the docs contain
and why they are useful in this KB>
topics: [3-5 kebab-case terms]
sources: []
aliases: []
status: draft
updated_at: <today>
raw_path: raw/external/<category>/<tool>/<version>
url: <url>
---
# <Human-Readable Name> Official Docs
## Why it matters
<One paragraph: why these docs were ingested and which
projects or tasks in this KB rely on them>
## Best current files
<Curated list of 3-8 files from the directory listing
that are most useful for navigation — not every file>
Use the compact format of existing external source
notes (see wiki/sources/external/source-github-cli- latest.md for the pattern). Do not use the full
source-note template sections (Abstract, Key claims,
etc.).
cd "$kb_path" && just lint
The linter checks that raw_path exists on disk — this
will pass since the pack was crawled in step 4. Fix any
failure before proceeding.
cd "$kb_path" && just refresh
This runs gen-indexes → shape → qmd update →
qmd embed → validate-qmd in sequence.
If validate-qmd fails, surface the output to the user
and stop. Do not auto-fix retrieval regressions.
Append one line to "$kb_path/index/sources.md":
- `source-<tool>-<version>` — <short description>.
(`wiki/sources/external/<category>/source-<tool>-<version>.md`)
Print:
raw/. They are the
immutable source archive. Write MANIFEST.md there
(it is metadata, not a crawled page), but never edit
the crawled .md files.gen-indexes reads it
to populate index/doc-packs.md. Without it the pack
is invisible to the auto-generated index.status: stable on first ingest. Use
fresh; Travis promotes to stable manually after
review.tools
Correct Zig 0.15.x patterns for I/O, ArrayList, format strings, and build.zig. Use when writing or reviewing any Zig code -- Claude's training data is outdated for these APIs.
tools
Add Zig 0.15.x training corrections to this project's CLAUDE.md. Run this in any Zig project to fix Claude's outdated patterns for I/O, ArrayList, format strings, build.zig, BoundedArray, and usingnamespace.
tools
Audit Zig source files for Zig 0.15.x mistakes -- checks for removed APIs (getStdOut, usingnamespace, BoundedArray, async), missing flush, wrong ArrayList usage, ambiguous format strings, signed division, and renamed stdlib functions.
tools
Tiger Style rules for Zig: assertions (2+ per fn, paired positive/negative space), bounded loops (no recursion), static memory after init, snake_case naming with unit suffixes, 70-line function limit, 100-column line limit, zig fmt. Use when writing or reviewing Zig in a project that follows Tiger Style.