jmagly/chunk

Name: chunk
Author: jmagly

agentic/code/addons/rlm/skills/chunk/SKILL.md

npx skillsauth add jmagly/aiwg chunk

Chunk

Split a file into overlapping chunks suitable for parallel fanout processing. Produces numbered chunk files and a manifest.json describing each chunk's location, line range, and overlap metadata.

Triggers

Alternate expressions and non-obvious activations:

"break this file up for parallel processing" → chunk with defaults
"prepare for fanout" → chunk + write manifest
"split into pieces" → chunk at semantic boundaries
"make this codebase searchable" → chunk directory of files

Trigger Patterns Reference

| Pattern | Example | Action |
|---------|---------|--------|
| Chunk file | "chunk this file" | Apply semantic-boundary strategy, write to .aiwg/rlm-chunks/ |
| Size override | "chunk into 100-line pieces" | --size 100 |
| Overlap override | "chunk with 50-line overlap" | --overlap 50 |
| Fixed count | "split into fixed-size chunks" | --strategy fixed-count |
| JSON output | "chunk as JSON" | --format json |
| Custom directory | "chunk into tmp/chunks/" | --output tmp/chunks/ |
| Dry run | "how would this file be chunked?" | Read file, describe strategy, no writes |

Behavior

When triggered:

1. Parse arguments: identify source file, strategy, size, overlap, format, and output directory from user input.
2. Read the source file: determine total line count and content type (code, markdown, prose, config).
3. Select chunking strategy:
   - semantic-boundary (default): split at headings (##, ###), blank lines between sections, function/class definitions, or import blocks. Preserves logical units.
   - fixed-count: fixed number of lines per chunk regardless of content. Use when content has no clear structure.
   - adaptive: measure content density (code density, average line length) and shrink chunk size for dense regions, expand for sparse ones.
4. Apply overlap: each chunk includes the last --overlap lines of the previous chunk and the first --overlap lines of the next. This ensures queries that span chunk boundaries are answerable from either side.
5. Write output:
   - Text mode: one file per chunk, named chunk-0001.txt, chunk-0002.txt, etc.
   - JSON mode: single chunks.json with each chunk's content embedded as a field.
   - Always write manifest.json regardless of format.
6. Report result: print chunk count, output directory, and manifest path.
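
The fixed-count strategy and the overlap step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the skill's implementation: the names `ChunkRange` and `fixed_count_ranges` are hypothetical, and the sketch reads the overlap rule as "extend each chunk's core range by up to `overlap` lines on both sides, clamped to the file". The skill's actual boundary arithmetic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRange:
    """1-indexed, inclusive line range of one chunk, overlap included."""
    start_line: int
    end_line: int
    overlap_start: int  # lines borrowed from before the chunk's core range
    overlap_end: int    # lines borrowed from after the chunk's core range


def fixed_count_ranges(total_lines: int, size: int = 200, overlap: int = 20) -> list[ChunkRange]:
    """Split lines 1..total_lines into cores of `size` lines, then widen each
    core by up to `overlap` lines on both sides (hypothetical reading of the rule)."""
    if size <= 0 or overlap < 0:
        raise ValueError("size must be positive and overlap must be non-negative")
    ranges: list[ChunkRange] = []
    core_start = 1
    while core_start <= total_lines:
        core_end = min(core_start + size - 1, total_lines)
        start = max(1, core_start - overlap)        # clamp at the first line
        end = min(total_lines, core_end + overlap)  # clamp at the last line
        ranges.append(ChunkRange(start, end, core_start - start, end - core_end))
        core_start = core_end + 1
    return ranges


# A 620-line file with 150-line chunks and the default 20-line overlap.
for index, chunk in enumerate(fixed_count_ranges(620, size=150), start=1):
    print(f"chunk-{index:04d}: lines {chunk.start_line}-{chunk.end_line}")
```

Under these assumptions a 620-line file with `--size 150` yields five ranges, which agrees with the chunk count reported in Example 3.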

Manifest Format

{
  "source": "src/auth/middleware.ts",
  "source_lines": 842,
  "strategy": "semantic-boundary",
  "chunk_size": 200,
  "overlap": 20,
  "format": "text",
  "output_dir": ".aiwg/rlm-chunks/middleware-ts/",
  "created_at": "2026-04-01T14:23:00Z",
  "chunks": [
    {
      "id": "chunk-0001",
      "file": ".aiwg/rlm-chunks/middleware-ts/chunk-0001.txt",
      "start_line": 1,
      "end_line": 218,
      "overlap_start": 0,
      "overlap_end": 20,
      "boundary_type": "function",
      "boundary_label": "validateToken()"
    },
    {
      "id": "chunk-0002",
      "file": ".aiwg/rlm-chunks/middleware-ts/chunk-0002.txt",
      "start_line": 199,
      "end_line": 412,
      "overlap_start": 20,
      "overlap_end": 20,
      "boundary_type": "class",
      "boundary_label": "AuthMiddleware"
    }
  ]
}
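
A downstream consumer might load the manifest and confirm that consecutive chunks leave no source lines uncovered before fanning out. The sketch below assumes only the field names shown in the manifest above; the helper `find_gaps` is hypothetical, and the embedded JSON is a trimmed copy of the example rather than real tool output.

```python
import json

# Trimmed copy of the example manifest: only the fields this sketch reads.
MANIFEST_JSON = """
{
  "source": "src/auth/middleware.ts",
  "source_lines": 842,
  "chunks": [
    {"id": "chunk-0001", "start_line": 1, "end_line": 218,
     "boundary_label": "validateToken()"},
    {"id": "chunk-0002", "start_line": 199, "end_line": 412,
     "boundary_label": "AuthMiddleware"}
  ]
}
"""


def find_gaps(manifest: dict) -> list[tuple[str, str]]:
    """Return (previous_id, next_id) pairs where source lines fall between two chunks."""
    chunks = sorted(manifest["chunks"], key=lambda c: c["start_line"])
    gaps = []
    for prev, nxt in zip(chunks, chunks[1:]):
        # A gap exists when the next chunk starts after the line following prev's end.
        if nxt["start_line"] > prev["end_line"] + 1:
            gaps.append((prev["id"], nxt["id"]))
    return gaps


manifest = json.loads(MANIFEST_JSON)
for chunk in manifest["chunks"]:
    print(chunk["id"], chunk["start_line"], chunk["end_line"], chunk["boundary_label"])
print("gaps:", find_gaps(manifest))
```

The check compares only neighbouring chunks, so it passes on a manifest excerpt such as the two-chunk example, whose last `end_line` (412) is below `source_lines` (842).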

Parameters

<file> — Source file to chunk (required)
--size N — Target chunk size in lines (default: 200). For adaptive, this is the base size before density adjustments.
--overlap N — Lines of overlap on each side of a chunk boundary (default: 20)
--strategy semantic-boundary|fixed-count|adaptive — Chunking strategy (default: semantic-boundary)
--format json|text — Output format (default: text)
--output <dir> — Output directory (default: .aiwg/rlm-chunks/<filename>/)
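
To make the defaults concrete, the parameter list can be mirrored with Python's standard `argparse` module. This is a stand-in for illustration only: `build_parser` and the program name `chunk-sketch` are hypothetical, and nothing here describes how the `aiwg` CLI parses its arguments.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented parameters and defaults."""
    parser = argparse.ArgumentParser(prog="chunk-sketch")
    parser.add_argument("file", type=Path, help="source file to chunk (required)")
    parser.add_argument("--size", type=int, default=200,
                        help="target chunk size in lines")
    parser.add_argument("--overlap", type=int, default=20,
                        help="lines of overlap on each side of a chunk boundary")
    parser.add_argument("--strategy",
                        choices=["semantic-boundary", "fixed-count", "adaptive"],
                        default="semantic-boundary")
    parser.add_argument("--format", choices=["json", "text"], default="text")
    # None stands for "derive the default directory from the file name".
    parser.add_argument("--output", type=Path, default=None)
    return parser


# The arguments from Example 2; unspecified options fall back to their defaults.
args = build_parser().parse_args(["src/core/parser.ts", "--size", "100", "--overlap", "30"])
print(args.size, args.overlap, args.strategy, args.format)
```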

Examples

Example 1: Default chunk

User: "chunk src/auth/middleware.ts"

Action:

aiwg chunk src/auth/middleware.ts

Response: "Split middleware.ts (842 lines) into 5 chunks using semantic-boundary strategy. Overlap: 20 lines. Manifest: .aiwg/rlm-chunks/middleware-ts/manifest.json"

Example 2: Small chunks for a dense file

User: "chunk this file into 100-line pieces with 30-line overlap for the RLM fanout"

Action:

aiwg chunk src/core/parser.ts --size 100 --overlap 30

Response: "Split parser.ts (1,240 lines) into 14 chunks. 100-line target, 30-line overlap. Manifest: .aiwg/rlm-chunks/parser-ts/manifest.json"

Example 3: Fixed-count for a flat config file

User: "split config/nginx.conf into fixed chunks"

Action:

aiwg chunk config/nginx.conf --strategy fixed-count --size 150

Response: "Split nginx.conf (620 lines) into 5 fixed-count chunks. Manifest: .aiwg/rlm-chunks/nginx-conf/manifest.json"

Example 4: JSON format for programmatic use

User: "chunk the migration SQL file as JSON"

Action:

aiwg chunk db/migrations/0042_schema.sql --format json --output .aiwg/rlm-chunks/migration/

Response: "Split 0042_schema.sql (380 lines) into 2 JSON chunks. Output: .aiwg/rlm-chunks/migration/chunks.json. Manifest: .aiwg/rlm-chunks/migration/manifest.json"
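
The manifest paths in Examples 1 to 3 suggest how the default output directory is derived from the source file name. The sketch below encodes that pattern as an inference from the examples: the helpers `default_output_dir` and `chunk_file_name` are hypothetical, and the dot-to-hyphen rule is an assumption that the Parameters list (which says only `<filename>`) does not confirm.

```python
from pathlib import PurePosixPath


def default_output_dir(source: str, root: str = ".aiwg/rlm-chunks") -> str:
    """Guess the default output directory: base name with dots replaced by hyphens.

    Assumption inferred from the examples (middleware.ts -> middleware-ts).
    """
    name = PurePosixPath(source).name.replace(".", "-")
    return f"{root}/{name}/"


def chunk_file_name(index: int) -> str:
    """Text-mode chunk file name for a 1-based index, zero-padded to four digits."""
    return f"chunk-{index:04d}.txt"


for source in ("src/auth/middleware.ts", "src/core/parser.ts", "config/nginx.conf"):
    print(default_output_dir(source) + "manifest.json")
print(chunk_file_name(1), chunk_file_name(2))
```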

Clarification Prompts

If the user's intent is ambiguous:

"Should I split at semantic boundaries (headings, functions) or use fixed line counts?"
"What chunk size would you like? Default is 200 lines."
"Should the output go to .aiwg/rlm-chunks/ or a custom directory?"

References

@$AIWG_ROOT/agentic/code/addons/rlm/skills/fanout/SKILL.md — Next step after chunking
@$AIWG_ROOT/agentic/code/addons/rlm/skills/rlm-prep/SKILL.md — One-shot prep (chunk + index)
@$AIWG_ROOT/agentic/code/addons/rlm/skills/rlm-search/SKILL.md — Full pipeline using chunk output
@$AIWG_ROOT/agentic/code/addons/rlm/schemas/rlm-chunk-manifest.yaml — Manifest schema
@$AIWG_ROOT/agentic/code/addons/aiwg-utils/rules/context-budget.md — Parallel context budget rules

Split a file into overlapping chunks suitable for parallel fanout processing and emit a manifest describing each chunk

126 stars

tools

Updated May 3, 2026

Install this skill globally with one command:

npx skillsauth add jmagly/aiwg chunk

Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 6, 2026, 3:03 AM. Duration: 385.9s. 1 file scanned.

SKILL.md

namespace: aiwg
name: chunk
platforms: [all]
description: Split a file into overlapping chunks suitable for parallel fanout processing and emit a manifest describing each chunk


Install manually

# Clone the repo
git clone https://github.com/jmagly/aiwg.git

# Copy into Claude Code skills folder (global)
cp -r aiwg/agentic/code/addons/rlm/skills/chunk ~/.claude/skills/

Repository

jmagly/aiwg

126 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT