
kendreaditya/liteparse

Name: liteparse
Author: kendreaditya

claude/skills/liteparse/SKILL.md

npx skillsauth add kendreaditya/.config liteparse


LiteParse Skill

Parse unstructured documents (PDF, DOCX, PPTX, XLSX, images, and more) locally with LiteParse: fast, lightweight, no cloud dependencies or LLM required.

Initial Setup

When this skill is invoked, respond with:

I'm ready to use LiteParse to parse files locally. Before we begin, please confirm that:

- `@llamaindex/liteparse` is installed globally (`npm i -g @llamaindex/liteparse`)
- The `lit` CLI command is available in your terminal

If both are set, please provide:

1. One or more files to parse (PDF, DOCX, PPTX, XLSX, images, etc.)
2. Any specific options: output format (json/text), page ranges, OCR preferences, DPI, etc.
3. What you'd like to do with the parsed content.

I will produce the appropriate `lit` CLI command or TypeScript script, and once approved, report the results.

Then wait for the user's input.

Step 0 — Install LiteParse (if needed)

If liteparse is not yet installed, install it globally:

npm i -g @llamaindex/liteparse

Verify installation:

lit --version

For Office document support (DOCX, PPTX, XLSX), LibreOffice is required:

# macOS
brew install --cask libreoffice

# Ubuntu/Debian
apt-get install libreoffice

For image parsing, ImageMagick is required:

# macOS
brew install imagemagick

# Ubuntu/Debian
apt-get install imagemagick

Step 1 — Produce the CLI Command or Script

Parse a Single File

# Basic text extraction
lit parse document.pdf

# JSON output saved to a file
lit parse document.pdf --format json -o output.json

# Specific page range
lit parse document.pdf --target-pages "1-5,10,15-20"

# Disable OCR (faster, text-only PDFs)
lit parse document.pdf --no-ocr

# Use an external HTTP OCR server for higher accuracy
lit parse document.pdf --ocr-server-url http://localhost:8828/ocr

# Higher DPI for better quality
lit parse document.pdf --dpi 300
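When the command is assembled by a script rather than typed by hand, the flags above can be collected into an argument array. The following is a minimal sketch: `ParseOptions` and `buildParseArgs` are hypothetical names introduced for illustration, not part of LiteParse, and only the flags shown above are used.

```typescript
// Hypothetical helper (not part of LiteParse): maps a small options object
// onto the `lit parse` flags documented above.
interface ParseOptions {
  format?: "json" | "text";
  output?: string;        // value for -o
  targetPages?: string;   // e.g. "1-5,10,15-20"
  ocr?: boolean;          // false adds --no-ocr
  ocrServerUrl?: string;
  dpi?: number;
}

function buildParseArgs(file: string, opts: ParseOptions = {}): string[] {
  const args: string[] = ["parse", file];
  if (opts.format !== undefined) args.push("--format", opts.format);
  if (opts.output !== undefined) args.push("-o", opts.output);
  if (opts.targetPages !== undefined) args.push("--target-pages", opts.targetPages);
  if (opts.ocr === false) args.push("--no-ocr");
  if (opts.ocrServerUrl !== undefined) args.push("--ocr-server-url", opts.ocrServerUrl);
  if (opts.dpi !== undefined) args.push("--dpi", String(opts.dpi));
  return args;
}

// Builds: lit parse document.pdf --format json -o output.json --dpi 300
const parseArgs = buildParseArgs("document.pdf", {
  format: "json",
  output: "output.json",
  dpi: 300,
});
console.log(["lit", ...parseArgs].join(" "));
```

Keeping the arguments as an array, rather than one concatenated string, avoids shell-quoting problems when file names contain spaces.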

Batch Parse a Directory

lit batch-parse ./input-directory ./output-directory

# Only process PDFs, recursively
lit batch-parse ./input ./output --extension .pdf --recursive
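The batch command can be assembled the same way. This sketch uses only the `--extension` and `--recursive` flags shown above; `BatchOptions` and `buildBatchParseArgs` are hypothetical names, and normalizing `pdf` to `.pdf` is an assumption made to match the documented form, not confirmed CLI behavior.

```typescript
// Hypothetical helper (not part of LiteParse): builds `lit batch-parse` arguments.
interface BatchOptions {
  extension?: string;  // e.g. ".pdf"
  recursive?: boolean;
}

function buildBatchParseArgs(
  inputDir: string,
  outputDir: string,
  opts: BatchOptions = {}
): string[] {
  const args: string[] = ["batch-parse", inputDir, outputDir];
  if (opts.extension !== undefined) {
    // Assumption: always pass the extension with a leading dot, as in the example above.
    const ext = opts.extension.charAt(0) === "." ? opts.extension : "." + opts.extension;
    args.push("--extension", ext);
  }
  if (opts.recursive === true) args.push("--recursive");
  return args;
}

// Builds: lit batch-parse ./input ./output --extension .pdf --recursive
console.log(
  ["lit", ...buildBatchParseArgs("./input", "./output", { extension: "pdf", recursive: true })].join(" ")
);
```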

Generate Page Screenshots

Screenshots are useful for LLM agents that need to see visual layout.

# All pages
lit screenshot document.pdf -o ./screenshots

# Specific pages
lit screenshot document.pdf --pages "1,3,5" -o ./screenshots

# High-DPI PNG
lit screenshot document.pdf --dpi 300 --format png -o ./screenshots

# Page range
lit screenshot document.pdf --pages "1-10" -o ./screenshots

Step 3 — Key Options Reference

OCR Options

| Option | Description |
|--------|-------------|
| (default) | Tesseract.js (zero setup, built-in) |
| `--ocr-language fra` | Set OCR language (ISO code) |
| `--ocr-server-url <url>` | Use external HTTP OCR server (EasyOCR, PaddleOCR, custom) |
| `--no-ocr` | Disable OCR entirely |

Output Options

| Option | Description |
|--------|-------------|
| `--format json` | Structured JSON with bounding boxes |
| `--format text` | Plain text (default) |
| `-o <file>` | Save output to file |

Performance / Quality Options

| Option | Description |
|--------|-------------|
| `--dpi <n>` | Rendering DPI (default: 150; use 300 for high quality) |
| `--max-pages <n>` | Limit pages parsed |
| `--target-pages <pages>` | Parse specific pages (e.g. "1-5,10") |
| `--no-precise-bbox` | Disable precise bounding boxes (faster) |
| `--skip-diagonal-text` | Ignore rotated/diagonal text |
| `--preserve-small-text` | Keep very small text that would otherwise be dropped |
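Before passing a page specification such as "1-5,10" to `--target-pages`, it can help to check it and see which pages it selects. The sketch below assumes pages are 1-based and ranges are inclusive, which the option description suggests but does not state; `expandPageSpec` is a hypothetical helper, not a LiteParse function.

```typescript
// Hypothetical helper: expands a page specification like "1-5,10" into page numbers.
// Assumptions: pages are 1-based, ranges are inclusive, duplicates are collapsed.
function expandPageSpec(spec: string): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (match === null) {
      throw new Error(`Invalid page token: "${token}"`);
    }
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${token}"`);
    }
    for (let p = start; p <= end; p++) {
      pages.add(p);
    }
  }
  return Array.from(pages).sort((a, b) => a - b);
}

// "1-5,10" selects pages 1, 2, 3, 4, 5 and 10.
console.log(expandPageSpec("1-5,10"));
```

Rejecting malformed tokens up front gives a clearer error than letting the CLI fail on an unexpected argument.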

Step 4 — Using a Config File

For repeated use with consistent options, generate a liteparse.config.json:

{
  "ocrLanguage": "en",
  "ocrEnabled": true,
  "maxPages": 1000,
  "dpi": 150,
  "outputFormat": "json",
  "preciseBoundingBox": true,
  "skipDiagonalText": false,
  "preserveVerySmallText": false
}

For an HTTP OCR server:

{
  "ocrServerUrl": "http://localhost:8828/ocr",
  "ocrLanguage": "en",
  "outputFormat": "json"
}

Use with:

lit parse document.pdf --config liteparse.config.json
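If the config file is generated by a script, a typed object can catch misspelled keys before the file is written. The interface below is inferred from the example keys above and is only a sketch, not LiteParse's published type; the values in `exampleConfig` are copied from the example, and the positive-integer checks are an assumption.

```typescript
// Sketch of the config shape, inferred from the example keys above
// (not an official LiteParse type).
interface LiteParseConfig {
  ocrLanguage: string;
  ocrEnabled: boolean;
  maxPages: number;
  dpi: number;
  outputFormat: "json" | "text";
  preciseBoundingBox: boolean;
  skipDiagonalText: boolean;
  preserveVerySmallText: boolean;
  ocrServerUrl?: string;
}

// Values copied from the example config above.
const exampleConfig: LiteParseConfig = {
  ocrLanguage: "en",
  ocrEnabled: true,
  maxPages: 1000,
  dpi: 150,
  outputFormat: "json",
  preciseBoundingBox: true,
  skipDiagonalText: false,
  preserveVerySmallText: false,
};

// Hypothetical helper: merges overrides into the example values and
// returns the JSON text for liteparse.config.json.
function makeConfigJson(overrides: Partial<LiteParseConfig> = {}): string {
  const config: LiteParseConfig = { ...exampleConfig, ...overrides };
  // Assumption: dpi and maxPages should be positive integers.
  if (!Number.isInteger(config.dpi) || config.dpi <= 0) {
    throw new Error("dpi must be a positive integer");
  }
  if (!Number.isInteger(config.maxPages) || config.maxPages <= 0) {
    throw new Error("maxPages must be a positive integer");
  }
  return JSON.stringify(config, null, 2);
}

console.log(makeConfigJson({ dpi: 300, ocrServerUrl: "http://localhost:8828/ocr" }));
```

The printed text can be saved as `liteparse.config.json` and passed with `--config` as shown above.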

Step 5 — HTTP OCR Server API (Advanced)

If the user wants to plug in a custom OCR backend, the server must implement:

Endpoint: POST /ocr
Accepts: file (multipart) and language (string) parameters
Returns:

{
  "results": [
    { "text": "Hello", "bbox": [x1, y1, x2, y2], "confidence": 0.98 }
  ]
}

Ready-to-use wrappers exist for EasyOCR and PaddleOCR in the LiteParse repo.
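A custom backend can be checked against this response shape before it is wired in. The sketch below assumes `x1, y1, x2, y2` are numeric coordinates and that `confidence` is a number; the type and function names are hypothetical and do not come from the LiteParse repo.

```typescript
// Sketch of the response shape described above (names are illustrative).
interface OcrResult {
  text: string;
  bbox: [number, number, number, number]; // assumed to be [x1, y1, x2, y2] as numbers
  confidence: number;
}

interface OcrResponse {
  results: OcrResult[];
}

function isOcrResult(value: unknown): value is OcrResult {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const text = record["text"];
  const bbox = record["bbox"];
  const confidence = record["confidence"];
  return (
    typeof text === "string" &&
    Array.isArray(bbox) &&
    bbox.length === 4 &&
    bbox.every((n: unknown) => typeof n === "number" && Number.isFinite(n)) &&
    typeof confidence === "number"
  );
}

// Returns true only if the payload has a `results` array of well-formed entries.
function isOcrResponse(value: unknown): value is OcrResponse {
  if (typeof value !== "object" || value === null) return false;
  const results = (value as Record<string, unknown>)["results"];
  return Array.isArray(results) && results.every((r: unknown) => isOcrResult(r));
}

const sample: unknown = {
  results: [{ text: "Hello", bbox: [10, 20, 110, 40], confidence: 0.98 }],
};
console.log(isOcrResponse(sample));
```

Running a check like this against a test request to `POST /ocr` surfaces shape mismatches (for example, a three-element `bbox`) before any document is parsed.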

Supported Input Formats

| Category | Formats |
|----------|---------|
| PDF | .pdf |
| Word | .doc, .docx, .docm, .odt, .rtf |
| PowerPoint | .ppt, .pptx, .pptm, .odp |
| Spreadsheets | .xls, .xlsx, .xlsm, .ods, .csv, .tsv |
| Images | .jpg, .jpeg, .png, .gif, .bmp, .tiff, .webp, .svg |

Office documents require LibreOffice; images require ImageMagick. LiteParse auto-converts these formats to PDF before parsing.
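The format table and the dependency note can be combined into a quick pre-flight check that reports which external tool a given file needs. This is a sketch under two assumptions: every format in the Word, PowerPoint, and Spreadsheets rows (including .csv and .tsv) is treated as needing LibreOffice, and extensions are matched case-insensitively. `classifyInput` is a hypothetical helper.

```typescript
type Category = "PDF" | "Word" | "PowerPoint" | "Spreadsheets" | "Images";
type Dependency = "LibreOffice" | "ImageMagick" | null;

// Extensions copied from the Supported Input Formats table.
const formatTable: Record<Category, string[]> = {
  PDF: [".pdf"],
  Word: [".doc", ".docx", ".docm", ".odt", ".rtf"],
  PowerPoint: [".ppt", ".pptx", ".pptm", ".odp"],
  Spreadsheets: [".xls", ".xlsx", ".xlsm", ".ods", ".csv", ".tsv"],
  Images: [".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp", ".svg"],
};

// Hypothetical helper: returns the category and required external tool,
// or null when the extension is not in the table.
function classifyInput(
  fileName: string
): { category: Category; requires: Dependency } | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null;
  const ext = fileName.slice(dot).toLowerCase();
  for (const category of Object.keys(formatTable) as Category[]) {
    if (formatTable[category].indexOf(ext) !== -1) {
      let requires: Dependency = "LibreOffice";
      if (category === "PDF") requires = null;
      else if (category === "Images") requires = "ImageMagick";
      return { category, requires };
    }
  }
  return null;
}

console.log(classifyInput("report.docx")); // Word, requires LibreOffice
console.log(classifyInput("scan.png"));    // Images, requires ImageMagick
```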


Fast local document parsing and text extraction (PDF, DOCX, PPTX, XLSX, images). Strengths: no page limits (parsed 810 pages in 3s), exact text extraction, structured JSON with bounding boxes, batch processing whole directories, built-in OCR for scanned docs, supports non-PDF formats. Use when: PDF is large (>20 pages), you need precise/searchable text, structured spatial data, batch processing, or OCR. Not suited for: documents where you need to interpret graphs, charts, LaTeX equations, images, or visual layout — it extracts text only.

development

Updated Apr 17, 2026


npx skillsauth add kendreaditya/.config liteparse

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage shown |
|---------|-------------|--------|------------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 24, 2026, 9:03 PM (1.9s, 1 file scanned)

SKILL.md

name: liteparse
description: >
  Fast local document parsing and text extraction (PDF, DOCX, PPTX, XLSX, images).
  Strengths: no page limits (parsed 810 pages in 3s), exact text extraction, structured JSON with bounding boxes, batch processing whole directories, built-in OCR for scanned docs, supports non-PDF formats.
  Use when: PDF is large (>20 pages), you need precise/searchable text, structured spatial data, batch processing, or OCR.
  Not suited for: documents where you need to interpret graphs, charts, LaTeX equations, images, or visual layout; it extracts text only.
compatibility: Requires Node 18+ and `@llamaindex/liteparse` installed globally via npm (`npm i -g @llamaindex/liteparse`)
license: MIT
author: LlamaIndex
version: 0.1.0




Install manually


# Clone the repo
git clone https://github.com/kendreaditya/.config.git

# Copy into Claude Code skills folder (global)
cp -r .config/claude/skills/liteparse ~/.claude/skills/


Repository

kendreaditya/.config

Compatible with

- Claude Code
- OpenAI Codex CLI
- ChatGPT