skills/pdf-to-dataset/SKILL.md
Extract tables, forms, invoices, and semi-structured PDF content into CSV and JSON datasets through the hosted Skills runtime.
npx skillsauth add hasna/skills pdf-to-dataset

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Convert PDF content into structured datasets and extraction reports. The implementation runs through the hosted Skills runtime, so agents get the instruction file and generated artifacts without receiving parser source, scripts, or model code.
Generated artifacts: dataset.json, dataset.csv, schema.json, extraction-report.md, and manifest.json.

skills run pdf-to-dataset -- --input ./invoice.pdf --schema "invoice_number,date,total,vendor"
skills run pdf-to-dataset -- --input ./reports --mode tables --output ./exports/pdf-data
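
When the command is assembled from a script rather than typed by hand, the flags can be built as an argument list. The following is a minimal sketch, not part of the skill itself: `build_command` is a hypothetical helper, it only uses the flags and mode values documented on this page, and it does not execute anything.

```python
import shlex

# Mode values listed in the options table on this page.
ALLOWED_MODES = {"tables", "forms", "invoice", "auto"}


def build_command(input_path, schema=None, mode=None, output=None, pages=None):
    """Assemble the argument list for `skills run pdf-to-dataset`.

    Hypothetical helper for illustration; it only builds the list and
    leaves execution (e.g. via subprocess.run) to the caller.
    """
    if mode is not None and mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")

    argv = ["skills", "run", "pdf-to-dataset", "--", "--input", input_path]
    if schema:
        # Field hints are passed as a single comma-separated value.
        argv += ["--schema", ",".join(schema)]
    if mode:
        argv += ["--mode", mode]
    if output:
        argv += ["--output", output]
    if pages:
        argv += ["--pages", pages]
    return argv


invoice_cmd = build_command(
    "./invoice.pdf",
    schema=["invoice_number", "date", "total", "vendor"],
)
folder_cmd = build_command("./reports", mode="tables", output="./exports/pdf-data")

# shlex.join renders each list as a shell-safe string for logging or review.
print(shlex.join(invoice_cmd))
print(shlex.join(folder_cmd))
```

Passing the list to `subprocess.run` avoids shell quoting issues with paths that contain spaces; whether the `skills` CLI is on `PATH` depends on the local installation.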
| Option | Description | Default |
|--------|-------------|---------|
| --input <path-or-url> | PDF file, folder, or URL | required |
| --schema <fields> | Comma-separated field hints | inferred |
| --mode <mode> | tables, forms, invoice, or auto | auto |
| --output <dir> | Output directory | current run export directory |
| --pages <ranges> | Page ranges like 1-3,8 | all pages |
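
The `--pages` value uses a compact range syntax such as 1-3,8. The sketch below shows one plausible reading of that syntax, assuming 1-based page numbers and inclusive ranges; the hosted runtime's actual parsing rules are not documented here, and `expand_page_ranges` is a hypothetical helper useful mainly for validating a value before submitting a run.

```python
def expand_page_ranges(spec):
    """Expand a range string such as "1-3,8" into a list of page numbers.

    Assumes 1-based, inclusive ranges (an assumption, not confirmed by
    the skill's documentation).
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or start > end:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.append(page)
    return pages


print(expand_page_ranges("1-3,8"))  # [1, 2, 3, 8]
```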
skills auth login

skills mcp --register all