agentic/code/addons/rlm/skills/rlm-prep/SKILL.md
Prepare source content for RLM processing by discovering files, chunking each one, and writing a unified searchable manifest
Prepare source content for RLM processing in one shot: discovers files, chunks each one, builds a searchable index, and writes a unified manifest.json. Run this once on a codebase or document set; then use rlm-search or fanout against the output without re-preparing.
Alternate expressions and non-obvious activations:
| Pattern | Example | Action |
|---------|---------|--------|
| Prep a directory | "prepare src/ for RLM" | rlm-prep src/ |
| Prep a single file | "prep this file for recursive search" | rlm-prep path/to/file.ts |
| Strategy override | "prep with fixed-count chunking" | --strategy fixed-count |
| Size override | "prep in 100-line chunks" | --size 100 |
| Custom output | "prep into tmp/rlm/" | --output tmp/rlm/ |
| Force refresh | "re-prep even if already done" | --force |
| Check status | "is this codebase already prepped?" | Inspect output dir for manifest |
When triggered:
1. Resolve source: determine whether the input is a single file or a directory. For directories, discover all supported file types (.ts, .js, .py, .go, .md, .txt, .yaml, .json, .sql, and others). Respect .gitignore patterns.
2. Check for existing prep: look for a manifest in the output directory. If one is found and --force is not set, report that prep already exists and offer to use it or re-run.
3. Chunk each file: apply the selected strategy per file. Each file produces its own subdirectory under chunks/, named after the file path with each slash replaced by a double underscore (for example, src/auth/middleware.ts becomes src__auth__middleware.ts/).
4. Build index: construct a searchable index (index.json) with chunk summaries.
5. Write unified manifest: a single manifest.json at the output root that references all chunks across all files. This is what fanout and rlm-search consume.
6. Report result: print file count, total chunk count, index size, and output path.
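The chunking step can be sketched for the fixed-count strategy as follows. This is a minimal illustration under stated assumptions, not the aiwg implementation: the function name `chunk_lines` is hypothetical, and it assumes that --overlap means the number of lines shared between adjacent chunks.

```python
def chunk_lines(lines, size=200, overlap=20):
    """Split a list of lines into chunks of at most `size` lines.

    Adjacent chunks share `overlap` lines (assumed meaning of --overlap).
    Returns (start_line, end_line, chunk) tuples with 1-indexed,
    inclusive line numbers, mirroring start_line/end_line in the manifest.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far each new chunk advances
    chunks = []
    start = 0
    while start < len(lines):
        end = min(start + size, len(lines))
        chunks.append((start + 1, end, lines[start:end]))
        if end == len(lines):
            break  # the last chunk reached the end of the file
        start += step
    return chunks
```

With the defaults, a 450-line file yields three chunks covering lines 1-200, 181-380, and 361-450. The semantic-boundary and adaptive strategies would need language-aware boundary detection, which this sketch does not attempt.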
.aiwg/rlm-prep/<source-hash>/
├── manifest.json    # Unified chunk manifest (all files)
├── index.json       # Searchable index with summaries
├── meta.json        # Source path, strategy, timestamp
└── chunks/
    ├── src__auth__middleware.ts/
    │   ├── chunk-0001.txt
    │   ├── chunk-0002.txt
    │   └── chunk-0003.txt
    ├── src__auth__jwt.ts/
    │   ├── chunk-0001.txt
    │   └── chunk-0002.txt
    └── src__core__parser.ts/
        ├── chunk-0001.txt
        ├── chunk-0002.txt
        ├── chunk-0003.txt
        └── chunk-0004.txt
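The naming in the tree above can be reproduced with a small helper. This is a sketch inferred from the example paths only: `chunk_dir_name` and `chunk_file_path` are hypothetical names, and the double-underscore separator and four-digit chunk numbering are read off the layout shown, not taken from the tool's source.

```python
from pathlib import PurePosixPath


def chunk_dir_name(source_path: str) -> str:
    """Map a source file path to its chunk directory name.

    Path separators become "__", as in src__auth__middleware.ts.
    """
    parts = PurePosixPath(source_path.replace("\\", "/")).parts
    # Drop the root marker of absolute paths so names never start with "/".
    return "__".join(p for p in parts if p != "/")


def chunk_file_path(output_dir: str, source_path: str, index: int) -> str:
    """Build the path of the index-th chunk file (1-based) for a source file."""
    name = chunk_dir_name(source_path)
    return f"{output_dir.rstrip('/')}/chunks/{name}/chunk-{index:04d}.txt"
```

For example, `chunk_file_path(".aiwg/rlm-prep/a1b2c3d4/", "src/auth/middleware.ts", 1)` produces `.aiwg/rlm-prep/a1b2c3d4/chunks/src__auth__middleware.ts/chunk-0001.txt`.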
{
  "source": "src/auth/",
  "source_hash": "sha256:a1b2c3d4...",
  "strategy": "semantic-boundary",
  "chunk_size": 200,
  "overlap": 20,
  "created_at": "2026-04-01T14:23:00Z",
  "files": 12,
  "total_chunks": 47,
  "output_dir": ".aiwg/rlm-prep/a1b2c3d4/",
  "chunks": [
    {
      "id": "src__auth__middleware.ts/chunk-0001",
      "file_source": "src/auth/middleware.ts",
      "chunk_file": ".aiwg/rlm-prep/a1b2c3d4/chunks/src__auth__middleware.ts/chunk-0001.txt",
      "start_line": 1,
      "end_line": 218,
      "boundary_label": "validateToken()"
    }
  ]
}
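A consumer of the manifest might group chunk entries by source file before searching them. The sketch below relies only on the fields shown in the example manifest (`chunks`, `file_source`, `start_line`); the helper names `load_manifest` and `chunks_by_file` are hypothetical and are not part of rlm-search or fanout.

```python
import json
from collections import defaultdict


def load_manifest(path):
    """Read a manifest.json file into a dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def chunks_by_file(manifest):
    """Group chunk entries by source file, ordered by start line."""
    grouped = defaultdict(list)
    for chunk in manifest["chunks"]:
        grouped[chunk["file_source"]].append(chunk)
    for entries in grouped.values():
        entries.sort(key=lambda c: c["start_line"])
    return dict(grouped)
```

Grouping this way makes it easy to read a file's chunks in order, or to limit a search to one file without scanning every entry.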
- <file|dir>: Source file or directory to prepare (required)
- --output <dir>: Output directory (default: .aiwg/rlm-prep/<source-hash>/)
- --strategy semantic-boundary|fixed-count|adaptive: Chunking strategy (default: semantic-boundary)
- --size N: Target chunk size in lines (default: 200)
- --overlap N: Overlap lines between adjacent chunks (default: 20)
- --force: Re-prep even if a manifest already exists

User: "prepare src/ for RLM processing"
Action:
aiwg rlm-prep src/
Response: "Prepped src/ for RLM. 12 files, 47 chunks. Strategy: semantic-boundary (200 lines, 20 overlap). Manifest: .aiwg/rlm-prep/a1b2c3d4/manifest.json"
User: "index the entire repo for RLM, use 100-line chunks"
Action:
aiwg rlm-prep . --size 100 --overlap 15
Response: "Prepped . for RLM. 84 files, 312 chunks. Strategy: semantic-boundary (100 lines, 15 overlap). Manifest: .aiwg/rlm-prep/b3c4d5e6/manifest.json"
User: "get the docs folder ready for recursive search"
Action:
aiwg rlm-prep docs/ --strategy fixed-count --size 150
Response: "Prepped docs/ for RLM. 23 files, 89 chunks. Strategy: fixed-count (150 lines, 20 overlap). Manifest: .aiwg/rlm-prep/c4d5e6f7/manifest.json"
User: "re-prep the auth module, I've made changes"
Action:
aiwg rlm-prep src/auth/ --force
Response: "Re-prepped src/auth/ (previous prep from 2026-03-28 replaced). 4 files, 14 chunks. Manifest: .aiwg/rlm-prep/d5e6f7a8/manifest.json"
User: "is src/ already prepped for RLM?"
Action: Check .aiwg/rlm-prep/ for a manifest matching the source hash of src/.
Response: "Yes — src/ was prepped on 2026-04-01 (47 chunks, strategy: semantic-boundary). Run with --force to re-prep."
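A status check along these lines could be sketched as below. Because the way the source hash is computed is not specified here, this version makes a loud assumption: it scans every prep directory and compares the manifest's `source` field against the requested path instead of recomputing the hash. The function name `find_existing_prep` is hypothetical.

```python
import json
from pathlib import Path


def find_existing_prep(source, prep_root=".aiwg/rlm-prep"):
    """Return the manifest of an existing prep for `source`, or None.

    Assumption: matching is done on the manifest's "source" field,
    ignoring a trailing slash, rather than on the source hash.
    """
    root = Path(prep_root)
    if not root.is_dir():
        return None
    wanted = source.rstrip("/")
    for manifest_path in sorted(root.glob("*/manifest.json")):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed manifests
        if str(manifest.get("source", "")).rstrip("/") == wanted:
            return manifest
    return None
```

The returned manifest carries `created_at`, `total_chunks`, and `strategy`, which is enough to compose a response like the one above.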
If the user's intent is ambiguous: