skills/pdf-conversion-router/SKILL.md
Use when converting a PDF into another format such as Markdown, HTML, text, JSON, DOCX, or structured notes and the agent must choose the best extraction route, settings, and cleanup strategy for maxi
npx skillsauth add ranbot-ai/awesome-skills pdf-conversion-router
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Route every PDF conversion through a short analysis step before choosing tools or CLI flags.
The goal is not "extract the most text". The goal is:
.md, .html, .txt, .json, .docx, or structured notes.
Never start with one fixed default pipeline.
Always:
Heuristics are starting points, not guarantees.
Do not promote one flag combination into a universal default just because it worked well on one PDF. Prefer document-specific evidence over habit.
Use opendataloader-pdf as the primary conversion engine for every PDF conversion task by default.
This skill should assume:
opendataloader-pdf is always the first conversion attempt
Use other tools only for one of these reasons:
opendataloader-pdf cannot produce a usable result
Identify the document class as quickly as possible:
Useful fast checks:
pdfinfo input.pdf
pdftotext -layout input.pdf -
If text is missing or very poor, treat OCR as required.
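The fast check above can be scripted. The following is a minimal sketch, assuming `pdftotext` (from Poppler) is on the PATH. The helper names and the 50-characters-per-page threshold are hypothetical and not part of this skill; tune the threshold against real documents.

```python
import subprocess


def needs_ocr(text: str, page_count: int, min_chars_per_page: int = 50) -> bool:
    """Return True when the extracted text layer looks missing or very poor.

    min_chars_per_page is an illustrative threshold (an assumption), not a
    value defined by the skill.
    """
    if page_count <= 0:
        return True
    # Count only non-whitespace characters so layout padding and
    # page-break characters do not inflate the score.
    visible = sum(1 for ch in text if not ch.isspace())
    return visible / page_count < min_chars_per_page


def extract_text_layer(pdf_path: str) -> str:
    """Run `pdftotext -layout <pdf> -` and return the text it writes to stdout."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A caller would read the page count from `pdfinfo`, pass it together with `extract_text_layer("input.pdf")` to `needs_ocr`, and route to OCR when the result is true.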
Use these as default starting points:
medical / lab report
  markdown-with-html + --table-method cluster + --image-output off
slide deck / PowerPoint export
  markdown-with-html + --image-output off
  add --table-method cluster only if the default route under-structures important tabular content
  if tables are visually obvious but missing or badly fused, treat this as a detection problem, not a Markdown formatting problem
  if the selected route already reconstructs a real table but clips leading characters at column boundaries, treat that as a boundary-splitting defect, not a missing-table failure
narrative / article / letter
  start with markdown or text
  use markdown-with-html only if structure clearly matters
table-heavy business / finance PDF
  start with markdown-with-html
  add --table-method cluster when rows or columns flatten
scanned / image-heavy PDF
  OCR first, then convert with opendataloader-pdf
mixed-layout PDF
  prefer markdown-with-html
  validate one easy section and one hard section before accepting output
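The starting points above can be kept as a small lookup table so the choice stays explicit and easy to override. This is a sketch only: the class keys, the `Route` type, and `starting_route` are hypothetical names introduced for illustration, while the formats and flags are copied from the heuristics above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Route:
    # None means the heuristics above name no starting format for this class.
    output_format: Optional[str]
    flags: Tuple[str, ...] = ()
    ocr_first: bool = False


# Keys are illustrative labels (assumptions), not identifiers used by the skill.
STARTING_ROUTES = {
    "medical": Route(
        "markdown-with-html",
        ("--table-method", "cluster", "--image-output", "off"),
    ),
    "slides": Route("markdown-with-html", ("--image-output", "off")),
    "narrative": Route("markdown"),
    "table-heavy": Route("markdown-with-html"),
    "scanned": Route(None, ocr_first=True),
    "mixed": Route("markdown-with-html"),
}


def starting_route(doc_class: str) -> Route:
    """Return the starting route for a document class.

    Unknown classes raise instead of falling back to a fixed default,
    so the analysis step cannot be skipped silently.
    """
    try:
        return STARTING_ROUTES[doc_class]
    except KeyError:
        raise ValueError(f"unclassified document: {doc_class!r}") from None
```

These entries are starting points only; conditional escalations such as adding `--table-method cluster` for flattened tables still depend on inspecting the first output.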
Pick the output that best matches the document and the user's goal.
markdown-with-html
Use by default when the user wants Markdown and fidelity matters.
Prefer this for tables, medical reports, slides, mixed-layout PDFs, and anything likely to break in pure Markdown.
markdown
Use only when clean plain Markdown matters more than layout fidelity.
html
Use when visual structure matters more than LLM readability.
text
Use for quick linear extraction, narrative documents, or when structure is unimportant.
json
Use when downstream machine processing matters more than human readability.
docx
Use when the user wants editable office output and layout reconstruction matters.
Use OpenDataLoader as the default route.
Preferred defaults:
For Markdown output with fidelity priority:
-f markdown-with-html
For medical PDFs:
add --table-method cluster
For table-heavy PDFs:
add --table-method cluster
For slide decks:
start without --table-method cluster
add it only after a structure check shows meaningful improvement
if a pseudo-table is already collapsed inside one detected row, changing only the Markdown flavor usually will not fix it
if the active engine build recovers the pseudo-table structure, prefer fixing residual boundary artifacts before escalating to hybrid/full mode
For conversions where images are not requested:
add --image-output off
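The defaults above can be assembled into an argument list before anything is run. This is a hedged sketch: `build_args` is a hypothetical helper, and using `opendataloader-pdf` as the executable name with the input path as the final argument is an assumption to check against the installed CLI's help output. Only the flags themselves come from the text above.

```python
from typing import List


def build_args(
    input_pdf: str,
    fmt: str = "markdown-with-html",
    cluster_tables: bool = False,
    want_images: bool = False,
) -> List[str]:
    """Assemble a command line from the flags named in this section.

    The executable name and argument order are assumptions, not confirmed
    by the skill text.
    """
    args = ["opendataloader-pdf", "-f", fmt]
    if cluster_tables:
        # Medical and table-heavy PDFs; slide decks only after a structure check.
        args += ["--table-method", "cluster"]
    if not want_images:
        args += ["--image-output", "off"]
    args.append(input_pdf)
    return args


# Slide deck: first pass without clustering, second pass only if the
# structure check shows a meaningful improvement.
first_pass = build_args("deck.pdf")
second_pass = build_args("deck.pdf", cluster_tables=True)
```

Building the list separately from running it makes it easy to log the exact route chosen for each document and to compare two passes side by side.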
For slide decks, medical reports, and structure-sensitive PDFs: prefer validating both the