src/datapro/data/skills/context-optimizer/SKILL.md
Decomposes large Markdown documentation into an optimized .agent context structure. Use this skill when: (1) Starting a new project with a large requirements document, (2) Migrating legacy docs to .agent structure, (3) Refactoring existing context files for better organization, (4) Converting PDFs or long READMEs into agent-friendly files, or (5) Optimizing context window usage by splitting monolithic docs into Tasks, Memories, Workflows, and References.
npx skillsauth add pablodiegoo/data-pro-skill context-optimizer
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
This skill transforms large, monolithic documents into a modular .agent/ folder structure optimized for AI agent context consumption. The goal is to minimize context window usage while maximizing information accessibility.
| Content Type | Destination | Naming Convention | When to Use |
|--------------|-------------|-------------------|-------------|
| Core Rules/Facts | memory/ | project_facts.md, conventions.md | Immutable truths, constraints, standards |
| Processes/How-To | workflows/ | deploy.md, review.md | Step-by-step procedures (turbo-enabled) |
| Tasks/Plans | tasks/ | backlog.md, sprint.md | Active work items, implementation plans |
| Reference Docs | references/ | api_docs.md, schema.md | Large docs loaded on-demand |
| Skills | skills/ | <skill-name>/SKILL.md | Reusable capabilities with scripts |
Before splitting, understand the document's structure:
# Preview structure without splitting
head -100 <input_file> | grep -E "^#{1,3} "
Identify:
Use the bundled script to split the document:
python3 .agent/skills/context-optimizer/scripts/decompose.py <input_file> -o <output_dir> [options]
| Argument | Description | Default |
|----------|-------------|---------|
| input_file | Large markdown/text file to split | Required |
| -o, --output | Output directory for chunks | <input>_split/ |
| -l, --level | Header level to split by (1=#, 2=##) | 2 |
| -r, --regex | Custom regex pattern (group 1 = title) | Markdown headers |
| --min | Minimum lines per section | 3 |
# Split by ## (default)
python3 decompose.py project_spec.md -o .agent/temp_split
# Split by # (top-level only)
python3 decompose.py large_doc.md -o chunks -l 1
# Custom pattern (e.g., numbered sections)
python3 decompose.py report.md -r "^(\d+\.\s+.+)$" -o sections
Output: Creates numbered files (01_section_name.md, 02_...) plus 00_INDEX.md and 00_preamble.md.
After decomposition, manually categorize each chunk:
.agent/
├── memory/                      # Persistent context (always loaded)
│   ├── user_global.md           # User preferences, patterns
│   ├── project_facts.md         # Tech stack, constraints, conventions
│   └── decisions.md             # ADRs, architectural decisions
│
├── workflows/                   # Step-by-step procedures
│   ├── deploy.md                # Deployment process
│   ├── review.md                # Code review checklist
│   └── testing.md               # Testing procedures
│
├── tasks/                       # Active work items
│   ├── backlog.md               # Feature backlog
│   ├── current_sprint.md        # Active sprint items
│   └── implementation_plan.md   # Current implementation plan
│
├── references/                  # On-demand documentation
│   ├── api_docs.md              # API specifications
│   ├── schema.md                # Database/data schemas
│   └── external_libs.md         # Third-party library docs
│
└── skills/                      # Reusable capabilities
    └── <skill-name>/
        └── SKILL.md
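If the target folders do not exist yet, the layout above can be scaffolded before any chunks are moved. The following is a small sketch under stated assumptions: `create_agent_skeleton` is a hypothetical helper, not part of the skill, and it creates only the five top-level folders, leaving the individual files to the categorization step.

```python
from pathlib import Path
import tempfile

# Top-level folders taken from the layout above.
AGENT_DIRS = ("memory", "workflows", "tasks", "references", "skills")


def create_agent_skeleton(root: Path) -> list[Path]:
    """Create .agent/<folder> under root for each folder; safe to re-run."""
    created = []
    for name in AGENT_DIRS:
        path = root / ".agent" / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demonstrate in a throwaway directory so nothing in the project is touched.
with tempfile.TemporaryDirectory() as tmp:
    dirs = create_agent_skeleton(Path(tmp))
    print(sorted(p.name for p in dirs))
    # prints ['memory', 'references', 'skills', 'tasks', 'workflows']
```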
For each categorized file, apply these optimizations:
- // turbo annotations for auto-runnable steps
- Task checkboxes ([ ], [/], [x])

Remove temporary files and validate structure:
# Remove decomposition output
rm -rf .agent/temp_split
# Validate structure (optional)
find .agent -name "*.md" -exec wc -l {} \; | sort -n
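Where `find` and `wc` are not available, the same size check can be approximated in Python. This is a sketch, not part of the skill: `markdown_line_counts` is a hypothetical helper, and it counts lines with `splitlines()`, so a file without a trailing newline is reported one line higher than `wc -l` would report it.

```python
from pathlib import Path
import tempfile


def markdown_line_counts(root: Path) -> list[tuple[int, str]]:
    """Return (line_count, relative_path) for each .md file, smallest first."""
    counts = []
    for path in root.rglob("*.md"):
        lines = path.read_text(encoding="utf-8").splitlines()
        counts.append((len(lines), path.relative_to(root).as_posix()))
    return sorted(counts)


# Build a tiny example tree in a temporary directory and report on it.
with tempfile.TemporaryDirectory() as tmp:
    agent = Path(tmp) / ".agent"
    (agent / "memory").mkdir(parents=True)
    (agent / "memory" / "project_facts.md").write_text("a\nb\nc\n", encoding="utf-8")
    (agent / "tasks").mkdir()
    (agent / "tasks" / "backlog.md").write_text("todo\n", encoding="utf-8")
    report = markdown_line_counts(agent)
    for count, name in report:
        print(f"{count:5d}  {name}")
```

Unusually long files in memory/ are worth a second look, since that folder is described as always loaded.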
Use this matrix to decide where content belongs:
┌────────────────────────────────────────────────────────────────────────┐
│                 Is it a PROCESS/HOW-TO?                                │
│                            │                                           │
│            ┌───────────────┴───────────────┐                           │
│            ▼ YES                           ▼ NO                        │
│    ┌────────────────┐              ┌────────────────┐                  │
│    │ workflows/     │              │ Is it ACTIVE   │                  │
│    │                │              │ work to track? │                  │
│    └────────────────┘              └───────┬────────┘                  │
│                                    ┌───────┴───────┐                   │
│                                    ▼ YES           ▼ NO                │
│                              ┌──────────┐   ┌──────────────┐           │
│                              │ tasks/   │   │ Is it a RULE │           │
│                              │          │   │ or FACT?     │           │
│                              └──────────┘   └──────┬───────┘           │
│                                            ┌───────┴───────┐           │
│                                            ▼ YES           ▼ NO        │
│                                      ┌──────────┐   ┌─────────────┐    │
│                                      │ memory/  │   │ references/ │    │
│                                      └──────────┘   └─────────────┘    │
└────────────────────────────────────────────────────────────────────────┘
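The matrix reads as three yes/no questions asked in a fixed order, which can be written as a short function. This is only an illustration of the decision order; the function name `destination` and its boolean parameters are assumptions, and answering the questions for a given chunk remains a manual judgment.

```python
def destination(is_process: bool, is_active_work: bool, is_rule_or_fact: bool) -> str:
    """Walk the decision matrix top to bottom and return the target folder."""
    if is_process:          # Is it a PROCESS/HOW-TO?
        return "workflows/"
    if is_active_work:      # Is it ACTIVE work to track?
        return "tasks/"
    if is_rule_or_fact:     # Is it a RULE or FACT?
        return "memory/"
    return "references/"    # Everything else is loaded on demand


# A deployment checklist is a process, so the later questions are never asked.
print(destination(is_process=True, is_active_work=True, is_rule_or_fact=False))
# prints workflows/
```

Because the questions are ordered, a chunk that is both a procedure and an active work item lands in workflows/.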
Use snake_case.md names for all files.

Automatically categorize your chunks into .agent/ folders:
python3 .agent/skills/context-optimizer/scripts/group_sections.py <split_dir> --move
This script analyzes each chunk for keywords and structural markers to suggest whether it belongs in memory/, workflows/, tasks/, or references/.
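As a rough picture of what keyword and structural-marker analysis can look like, consider the sketch below. It is not the logic of group_sections.py: the keyword lists, the scoring rule, and the name `suggest_folder` are all hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical keyword lists; a real categorizer would likely use different ones.
KEYWORDS = {
    "workflows": ("step", "run", "deploy", "install", "procedure"),
    "tasks": ("todo", "backlog", "sprint", "milestone", "implement"),
    "memory": ("must", "always", "never", "convention", "constraint"),
}


def suggest_folder(chunk: str) -> str:
    """Score a chunk per folder and return the best match, else 'references'."""
    words = Counter(re.findall(r"[a-z]+", chunk.lower()))
    scores = {folder: sum(words[k] for k in keys) for folder, keys in KEYWORDS.items()}
    # Structural markers: numbered steps hint at a workflow, checkboxes at tasks.
    scores["workflows"] += len(re.findall(r"^\s*\d+\.\s", chunk, flags=re.M))
    scores["tasks"] += len(re.findall(r"^\s*[-*] \[[ x/]\]", chunk, flags=re.M))
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "references"


print(suggest_folder("1. Build the image\n2. Deploy to staging\n"))  # prints workflows
print(suggest_folder("- [ ] Add login page\n- [x] Fix bug\n"))       # prints tasks
print(suggest_folder("The API returns JSON."))                       # prints references
```

A heuristic like this can misfile chunks, so it is worth reviewing the suggestions before relying on the moved files.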
...
| Resource | Purpose |
|----------|---------|
| scripts/decompose.py | Split markdown by headers or custom regex |
| scripts/group_sections.py | Automatically categorize chunks using keywords and structural markers |
| references/examples.md | Real-world categorization examples and patterns |