skills/autumnsgrove/pdf/SKILL.md
Comprehensive PDF manipulation, extraction, and generation with support for text extraction, form filling, merging, splitting, annotations, and creation. Use when working with .pdf files for: (1) Extracting text and tables, (2) Filling PDF forms, (3) Merging/splitting PDFs, (4) Creating PDFs programmatically, (5) Adding watermarks/annotations, (6) PDF metadata management
npx skillsauth add aiskillstore/marketplace pdf
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Comprehensive guide for working with PDF files in Python, covering extraction, manipulation, creation, and advanced operations using progressive disclosure for efficiency.
Extract and manipulate PDF content:
Install required libraries:
pip install pypdf pdfplumber reportlab PyMuPDF pdf2image pytesseract pillow
For detailed installation instructions including system dependencies, see:
pypdf: Basic operations (merge, split, rotate, metadata)
pdfplumber: Advanced text/table extraction with layout awareness
reportlab: Create PDFs from scratch (reports, invoices, documents)
PyMuPDF (fitz): Advanced manipulation, annotations, compression
pdf2image: Convert PDF pages to images (requires poppler)
pytesseract: OCR for scanned documents (requires tesseract)
# Basic text extraction with pypdf
from pypdf import PdfReader

reader = PdfReader("document.pdf")
for page in reader.pages:
    text = page.extract_text()
    print(text)

# Layout-aware extraction with pdfplumber
import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        text = page.extract_text()
        words = page.extract_words()  # With positioning
        print(text)

# Extract text from a region of the first page
with pdfplumber.open("document.pdf") as pdf:
    page = pdf.pages[0]
    bbox = (0, 0, 612, 100)  # x0, top, x1, bottom (origin at top-left)
    header = page.crop(bbox).extract_text()
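The hard-coded box above assumes a US Letter page that is 612 points wide. As a minimal sketch, the header and footer boxes could instead be derived from the page dimensions. `header_bbox` and `footer_bbox` are hypothetical helper names, and the 10% default is an arbitrary assumption rather than anything this skill prescribes.

```python
def header_bbox(page_width, page_height, fraction=0.1):
    """Return (x0, top, x1, bottom) covering the top `fraction` of a page."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return (0, 0, page_width, page_height * fraction)


def footer_bbox(page_width, page_height, fraction=0.1):
    """Return (x0, top, x1, bottom) covering the bottom `fraction` of a page."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return (0, page_height * (1 - fraction), page_width, page_height)
```

With pdfplumber this would be used as `page.crop(header_bbox(page.width, page.height)).extract_text()`.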
For detailed text extraction methods including OCR fallback and encoding handling, see:
import pdfplumber

with pdfplumber.open("report.pdf") as pdf:
    for page in pdf.pages:
        tables = page.extract_tables()
        for table in tables:
            print(table)

# Custom detection settings (`page` is a page from an open pdfplumber PDF)
table_settings = {
    "vertical_strategy": "lines",
    "horizontal_strategy": "lines",
    "snap_tolerance": 3,
}
tables = page.extract_tables(table_settings=table_settings)
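Each table returned by `extract_tables()` is a list of rows, and cells may be `None` or carry stray whitespace. The following is a rough sketch of one possible cleaning step, assuming the first non-blank row is the header. `clean_table` is a hypothetical helper, not part of pdfplumber.

```python
def clean_table(table):
    """Turn raw rows (lists of str or None) into dicts keyed by the header row."""
    rows = [
        [(cell or "").strip() for cell in row]
        for row in table
        if row and any((cell or "").strip() for cell in row)  # drop blank rows
    ]
    if not rows:
        return []
    header, *body = rows  # assumption: first non-blank row holds column names
    return [dict(zip(header, row)) for row in body]
```

Tables whose header spans several rows, or that have merged cells, would need more handling than this sketch provides.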
For detailed table extraction strategies and data cleaning, see:
import fitz  # PyMuPDF

# Fill a form field by name
doc = fitz.open("form.pdf")
for page in doc:
    for widget in page.widgets():
        if widget.field_name == "name":
            widget.field_value = "John Doe"
            widget.update()
doc.save("filled.pdf")
doc.close()

# List every field name and type (useful for debugging)
doc = fitz.open("form.pdf")
for page in doc:
    for widget in page.widgets():
        print(f"{widget.field_name}: {widget.field_type_string}")
doc.close()
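When filling several fields at once, it can help to compare the intended values against the field names the listing loop reports before writing anything. This is a small sketch of that check; `plan_form_fill` is a hypothetical helper that works on plain Python data and makes no PyMuPDF calls.

```python
def plan_form_fill(available_fields, values):
    """Split `values` into fields present in the form and names that are not."""
    available = set(available_fields)
    to_fill = {name: value for name, value in values.items() if name in available}
    unknown = sorted(name for name in values if name not in available)
    return to_fill, unknown
```

`available_fields` would be collected from `widget.field_name` in the loop above. A non-empty `unknown` list usually points to a typo or a differently named field.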
For form filling, flattening, and debugging, see:
# PdfMerger was removed in pypdf 5.0; PdfWriter.append() does the same job
from pypdf import PdfWriter

writer = PdfWriter()
for pdf in ["file1.pdf", "file2.pdf", "file3.pdf"]:
    writer.append(pdf)
writer.write("merged.pdf")
writer.close()

# Merge selected page ranges
writer = PdfWriter()
writer.append("doc1.pdf", pages=(0, 3))  # First 3 pages
writer.append("doc2.pdf")  # All pages
writer.write("compiled.pdf")
writer.close()
# Split into one file per page
from pypdf import PdfReader, PdfWriter

reader = PdfReader("document.pdf")
for i, page in enumerate(reader.pages):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f"page_{i+1}.pdf", 'wb') as f:
        writer.write(f)
For merging with bookmarks and splitting by size, see:
# Simple PDF with the canvas API
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter

c = canvas.Canvas("output.pdf", pagesize=letter)
c.setFont("Helvetica", 12)
c.drawString(50, 750, "Hello, World!")
c.save()

# Structured document with platypus
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet

doc = SimpleDocTemplate("report.pdf")
styles = getSampleStyleSheet()
story = []
story.append(Paragraph("Report Title", styles['Title']))
story.append(Spacer(1, 12))
story.append(Paragraph("Content here", styles['BodyText']))
doc.build(story)

# Styled table (append `table` to a story to render it)
from reportlab.platypus import Table, TableStyle
from reportlab.lib import colors

data = [
    ['Product', 'Quantity', 'Price'],
    ['Widget A', '10', '$50'],
    ['Widget B', '5', '$75']
]
table = Table(data)
table.setStyle(TableStyle([
    ('BACKGROUND', (0, 0), (-1, 0), colors.grey),
    ('GRID', (0, 0), (-1, -1), 1, colors.black)
]))
For complete PDF creation workflows including images, multi-column layouts, and custom fonts, see:
For practical examples:
from pypdf import PdfReader, PdfWriter

# Read metadata
reader = PdfReader("document.pdf")
metadata = reader.metadata  # None if the PDF has no info dictionary
if metadata:
    print(f"Title: {metadata.get('/Title')}")
    print(f"Author: {metadata.get('/Author')}")

# Update metadata
writer = PdfWriter()
for page in reader.pages:
    writer.add_page(page)
writer.add_metadata({
    '/Title': 'New Title',
    '/Author': 'John Doe'
})
with open("updated.pdf", 'wb') as f:
    writer.write(f)

# Encrypt: call encrypt() before write(). AES needs the cryptography or
# pycryptodome package to be installed.
writer.encrypt(
    user_password="user123",
    owner_password="owner456",
    algorithm="AES-256"
)
with open("encrypted.pdf", 'wb') as f:
    writer.write(f)
For detailed security operations and comprehensive metadata management, see:
from pdf2image import convert_from_path
import pytesseract

images = convert_from_path("scanned.pdf")
for i, image in enumerate(images):
    text = pytesseract.image_to_string(image)
    print(f"Page {i+1}:\n{text}")

# Multiple languages (each tesseract language pack must be installed)
text = pytesseract.image_to_string(image, lang='eng+fra+deu')
For searchable PDF creation and OCR preprocessing, see:
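import fitz  # PyMuPDF

doc = fitz.open("document.pdf")
for page in doc:
    rect = page.rect
    center = fitz.Point(rect.width / 2, rect.height / 2)
    # insert_textbox() accepts rotate only in multiples of 90 degrees,
    # so rotate the text 45 degrees around the page center with morph.
    page.insert_textbox(
        fitz.Rect(0, center.y - 50, rect.width, center.y + 50),
        "CONFIDENTIAL",
        fontsize=50,
        align=fitz.TEXT_ALIGN_CENTER,
        morph=(center, fitz.Matrix(45)),
        fill_opacity=0.3,  # insert_textbox() has no plain `opacity` argument
        color=(0.7, 0.7, 0.7),
    )
doc.save("watermarked.pdf")
doc.close()

# Common annotations (`rect` is a fitz.Rect and `point` a fitz.Point on the page)
page.add_highlight_annot(rect)  # Highlight
page.add_text_annot(point, "Note")  # Text note
page.add_underline_annot(rect)  # Underline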
For stamps and image watermarks, see:
# Rotate every page by 90 degrees
from pypdf import PdfReader, PdfWriter

reader = PdfReader("document.pdf")
writer = PdfWriter()
for page in reader.pages:
    page.rotate(90)
    writer.add_page(page)
with open("rotated.pdf", 'wb') as f:
    writer.write(f)
# Extract embedded images
import fitz  # PyMuPDF

doc = fitz.open("document.pdf")
for page_num in range(len(doc)):
    page = doc[page_num]
    for img_index, img in enumerate(page.get_images()):
        xref = img[0]
        base_image = doc.extract_image(xref)
        # Embedded images are not always PNG; use the reported extension
        ext = base_image["ext"]
        with open(f"image_{page_num}_{img_index}.{ext}", "wb") as f:
            f.write(base_image["image"])
doc.close()
# Build a PDF from images, one image per page
from PIL import Image
from reportlab.pdfgen import canvas

c = canvas.Canvas("output.pdf")
for img_path in ["img1.jpg", "img2.jpg"]:
    img = Image.open(img_path)
    c.setPageSize(img.size)
    c.drawImage(img_path, 0, 0, width=img.width, height=img.height)
    c.showPage()
c.save()
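The snippet above passes pixel dimensions straight through as PDF points, which amounts to assuming 72 DPI. A 300 DPI scan would come out roughly four times larger per side than its physical size. Here is a minimal sketch of the conversion, using the fact that one point is 1/72 inch. `page_size_points` is a hypothetical helper, and the DPI has to come from the image itself or from the caller.

```python
def page_size_points(width_px, height_px, dpi=72):
    """Convert pixel dimensions to PDF points (1 point = 1/72 inch)."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (width_px * 72 / dpi, height_px * 72 / dpi)
```

The result could be passed to `c.setPageSize(...)` and to the `width` and `height` arguments of `c.drawImage(...)`. When a file records its resolution, Pillow typically exposes it as `img.info.get("dpi")`, though many images omit it.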
For detailed page operations, see:
# Compress and clean up a PDF
import fitz  # PyMuPDF

doc = fitz.open("large.pdf")
doc.save(
    "optimized.pdf",
    garbage=4,
    deflate=True,
    clean=True
)
doc.close()
Process large PDFs in chunks:
from pypdf import PdfReader
import gc

reader = PdfReader("large.pdf")
for i, page in enumerate(reader.pages):
    text = page.extract_text()
    # Process text
    if i % 10 == 0:
        gc.collect()
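The loop above still walks the document one page at a time. If the goal is to handle fixed-size batches, for example to write intermediate results after each batch, the page indexes could be grouped first. This is a sketch only; `page_chunks` is a hypothetical helper and the batch size of 10 simply mirrors the interval used above.

```python
def page_chunks(total_pages, chunk_size=10):
    """Yield (start, stop) index pairs that together cover range(total_pages)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, total_pages, chunk_size):
        yield start, min(start + chunk_size, total_pages)
```

A caller would iterate `for start, stop in page_chunks(len(reader.pages))` and process `reader.pages[i]` for each `i` in `range(start, stop)`.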
Always handle encryption and errors:
from pypdf import PdfReader

password = "secret"  # Placeholder; supply the real password

try:
    reader = PdfReader("document.pdf")
    if reader.is_encrypted:
        reader.decrypt(password)
    for page in reader.pages:
        text = page.extract_text()
except Exception as e:
    print(f"Error: {e}")
Detect and handle scanned documents:
import fitz  # PyMuPDF

doc = fitz.open("document.pdf")
text = doc[0].get_text()
if not text.strip():
    # Use OCR for scanned document
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path("document.pdf")
    text = pytesseract.image_to_string(images[0])
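The check above looks only at the first page, so a document with a scanned appendix or a text cover page would be misjudged. One possible refinement, sketched here under the assumption that a page with very little extracted text is probably an image, is to decide per page. `pages_needing_ocr` is a hypothetical helper and the 20-character threshold is an arbitrary starting point.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return indexes of pages whose extracted text is too short to trust."""
    return [
        i
        for i, text in enumerate(page_texts)
        if len((text or "").strip()) < min_chars
    ]
```

`page_texts` would be the per-page output of `get_text()` or `extract_text()`. Only the returned pages then need to go through `pdf2image` and `pytesseract`.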
For comprehensive best practices, common pitfalls, and troubleshooting, see:
Scanned Documents: Text extraction returns empty for scanned PDFs. Use OCR (pytesseract).
Table Detection: Tables not detected correctly. Adjust table_settings strategies.
Encrypted PDFs: Operations fail. Check and decrypt with password first.
Form Fields: Can't find field names. Use debug helper to list all fields.
Memory Issues: Large PDFs cause crashes. Process in chunks with garbage collection.
Encoding Issues: Special characters corrupted. Handle with UTF-8 encoding explicitly.
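For the encoding pitfall, one hedged approach is to normalize the extracted text and always name the encoding when writing it out. The sketch below uses only the standard library. `normalize_extracted_text` and `save_text` are hypothetical helper names, and NFKC is an assumption about what "clean" means here: it folds ligatures and non-breaking spaces, but it also rewrites characters such as superscript digits.

```python
import unicodedata


def normalize_extracted_text(text):
    """Fold compatibility characters (ligatures, non-breaking spaces) to plain forms."""
    return unicodedata.normalize("NFKC", text)


def save_text(text, path):
    """Write normalized text as UTF-8 regardless of the platform default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(normalize_extracted_text(text))
```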
For detailed solutions and debugging strategies, see:
Text Extraction:
pypdf - page.extract_text()
pdfplumber - page.extract_text() + page.extract_words()
Table Extraction:
pdfplumber - page.extract_tables()
PDF Creation:
reportlab - canvas.Canvas() or SimpleDocTemplate()
Advanced Operations:
PyMuPDF (fitz) - forms, annotations, compression
OCR:
pytesseract + pdf2image
Merging/Splitting:
pypdf - PdfWriter() (PdfMerger() in pypdf releases before 5.0)
The skill includes helper scripts for common operations:
# See scripts directory for utilities
python scripts/pdf_helper.py --help
Comprehensive References:
Practical Examples:
When working with PDFs:
doc.close()
For production use, always implement proper error handling, validate inputs, and test with various PDF types and versions.