skills/document-generation-pdf/SKILL.md
Generate, fill, and assemble PDF documents at scale. Handles legal forms, contracts, invoices, certificates. Supports form filling (pdf-lib), template rendering (Puppeteer, LaTeX), digital signatures (DocuSign), and document assembly. Use for legal tech, HR automation, invoice generation. Activate on "PDF generation", "form filling", "document automation", "digital signatures". NOT for simple PDF viewing, basic file conversion, or OCR text extraction.
npx skillsauth add curiositech/windags-skills document-generation-pdf
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Expert in generating, filling, and assembling PDF documents programmatically for legal, HR, and business workflows.
✅ Use for:
❌ NOT for:
| Feature | pdf-lib | Puppeteer | LaTeX |
|---------|---------|-----------|-------|
| Form filling | ✅ Native | ❌ Complex | ❌ No |
| Template rendering | ❌ No | ✅ HTML/CSS | ✅ Templates |
| Performance (1000 PDFs) | 5s | 60s | 30s |
| File size | Small | Medium | Small |
| Signature fields | ✅ Yes | ❌ No | ❌ No |
| Best for | Government forms | Invoices, reports | Academic papers |
Timeline:
Decision tree:
Need to fill existing form? → pdf-lib
Need complex layouts? → Puppeteer (HTML/CSS)
Need academic formatting? → LaTeX
Need to merge PDFs? → pdf-lib
Need digital signatures? → pdf-lib + DocuSign API
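The decision tree above can be sketched as a small lookup function. This is only an illustration: the `PdfRequirements` flags and the `choosePdfTool` name are hypothetical, and the order in which the questions are checked (signatures first, then form filling and merging) is an assumption, since the tree itself does not rank them.

```typescript
// Hypothetical requirement flags; the names are illustrative, not from any library.
interface PdfRequirements {
  fillExistingForm?: boolean;
  mergePdfs?: boolean;
  digitalSignatures?: boolean;
  complexLayout?: boolean;
  academicFormatting?: boolean;
}

type PdfTool = 'pdf-lib' | 'pdf-lib + DocuSign API' | 'Puppeteer' | 'LaTeX';

// Walk the decision tree; the priority order is an assumption.
function choosePdfTool(req: PdfRequirements): PdfTool {
  if (req.digitalSignatures) return 'pdf-lib + DocuSign API';
  if (req.fillExistingForm || req.mergePdfs) return 'pdf-lib';
  if (req.academicFormatting) return 'LaTeX';
  if (req.complexLayout) return 'Puppeteer';
  // Nothing matched: fall back to the lightest tool (assumed default).
  return 'pdf-lib';
}
```

A job that needs both form filling and a complex new layout would resolve to pdf-lib here; in practice such a job may need both tools, one per document.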
Novice thinking: "I'll use Puppeteer for everything, it's versatile"
Problem: 12x slower, 10x more memory, can't preserve form fields.
Wrong approach:
// ❌ Puppeteer for simple form filling (SLOW!)
import puppeteer from 'puppeteer';
async function fillForm(data: FormData): Promise<Buffer> {
const browser = await puppeteer.launch();
const page = await browser.newPage();
// Load PDF in browser
await page.goto(`file://${pdfPath}`);
// Somehow fill form fields? (hacky)
await page.evaluate((data) => {
// Can't easily access PDF form fields from DOM
// Would need to convert PDF → HTML first
}, data);
const pdf = await page.pdf();
await browser.close();
return pdf;
}
Why wrong:
Correct approach:
// ✅ pdf-lib for form filling (FAST!)
import { readFile } from 'node:fs/promises';
import { PDFDocument } from 'pdf-lib';

// Input shape inferred from the fields used below (shadows the DOM FormData type in this module)
interface FormData {
  name: string;
  caseNumber: string;
  dob: string;
  hasPriorConvictions: boolean;
  state: string;
}

async function fillForm(templatePath: string, data: FormData): Promise<Uint8Array> {
  // Load existing PDF form
  const existingPdfBytes = await readFile(templatePath);
  const pdfDoc = await PDFDocument.load(existingPdfBytes);
  // Get form
  const form = pdfDoc.getForm();
  // Fill text fields (field names must match the names defined in the template)
  form.getTextField('applicant_name').setText(data.name);
  form.getTextField('case_number').setText(data.caseNumber);
  form.getTextField('date_of_birth').setText(data.dob);
  // Fill checkboxes
  if (data.hasPriorConvictions) {
    form.getCheckBox('prior_convictions').check();
  }
  // Fill dropdowns
  form.getDropdown('state').select(data.state);
  // Flatten form (make non-editable)
  form.flatten();
  // Save
  return await pdfDoc.save();
}
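The filling code above assumes every value is present. A minimal sketch of a pre-fill check follows; `PetitionData`, `REQUIRED_TEXT_FIELDS`, and `findMissingFields` are hypothetical names, and which fields count as required is an assumption based on the text fields and dropdown used in the example.

```typescript
// Hypothetical input shape mirroring the values used in the form-filling example.
interface PetitionData {
  name?: string;
  caseNumber?: string;
  dob?: string;
  state?: string;
  hasPriorConvictions?: boolean;
}

// Assumed set of required values; adjust to the actual form.
const REQUIRED_TEXT_FIELDS = ['name', 'caseNumber', 'dob', 'state'] as const;

// Return the keys whose values are absent or blank, in declaration order.
function findMissingFields(data: PetitionData): string[] {
  return REQUIRED_TEXT_FIELDS.filter((key) => {
    const value = data[key];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

Running this before the PDF is loaded lets a batch job reject incomplete records early and report which values are missing.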
Performance comparison (1000 PDFs): pdf-lib takes about 5s and Puppeteer about 60s, per the feature comparison table above.
Timeline context:
Problem: Users can edit filled forms, causing data inconsistencies.
Wrong approach:
// ❌ Don't flatten - form stays editable
const pdfDoc = await PDFDocument.load(existingPdfBytes);
const form = pdfDoc.getForm();
form.getTextField('name').setText('John Doe');
const pdfBytes = await pdfDoc.save();
// User can open PDF and change "John Doe" to anything!
Why wrong:
Correct approach:
// ✅ Flatten form after filling
const pdfDoc = await PDFDocument.load(existingPdfBytes);
const form = pdfDoc.getForm();
form.getTextField('name').setText('John Doe');
// Flatten (convert fields to static text)
form.flatten();
const pdfBytes = await pdfDoc.save();
// User can't edit filled values ✅
When NOT to flatten:
Novice thinking: "HTML → PDF is easy with Puppeteer"
Problem: Content splits mid-sentence across pages.
Wrong approach:
// ❌ No page break control
const html = `
<div class="contract">
<h1>Mutual Agreement</h1>
<p>Long paragraph that might split across pages...</p>
<section>
<h2>Terms and Conditions</h2>
<ol>
<li>Term 1 that could get cut off...</li>
<li>Term 2...</li>
</ol>
</section>
</div>
`;
await page.setContent(html);
const pdf = await page.pdf({ format: 'A4' });
// Result: Ugly page breaks in middle of sections
Correct approach:
// ✅ Explicit page break control with CSS
const html = `
<style>
@media print {
.page-break { page-break-after: always; }
.avoid-break { page-break-inside: avoid; }
h1, h2, h3 {
page-break-after: avoid;
page-break-inside: avoid;
}
section {
page-break-inside: avoid;
}
}
</style>
<div class="contract">
<section class="avoid-break">
<h1>Mutual Agreement</h1>
<p>This entire section stays together...</p>
</section>
<div class="page-break"></div>
<section class="avoid-break">
<h2>Terms and Conditions</h2>
<ol>
<li>Term 1</li>
<li>Term 2</li>
</ol>
</section>
</div>
`;
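// Load the HTML into the page before printing it
await page.setContent(html, { waitUntil: 'networkidle0' });
const pdf = await page.pdf({
  format: 'A4',
  printBackground: true,
  margin: {
    top: '1in',
    right: '1in',
    bottom: '1in',
    left: '1in'
  }
});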
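When the contract text comes from data rather than a hand-written template, the same `avoid-break` and `page-break` classes can be applied while building the HTML. The sketch below is one possible shape; `ContractSection`, `escapeHtml`, and `renderContractHtml` are hypothetical names, and it assumes the print stylesheet defining those two classes is added separately.

```typescript
// Hypothetical section model; not part of Puppeteer or any library.
interface ContractSection {
  heading: string;
  paragraphs: string[];
  startOnNewPage?: boolean;
}

// Escape text so user-supplied content cannot inject markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap each section in .avoid-break and emit a .page-break divider on request.
function renderContractHtml(sections: ContractSection[]): string {
  const body = sections
    .map((section, index) => {
      // A break before the very first section would only produce a blank page.
      const pageBreak =
        section.startOnNewPage && index > 0 ? '<div class="page-break"></div>\n' : '';
      const paragraphs = section.paragraphs
        .map((p) => `  <p>${escapeHtml(p)}</p>`)
        .join('\n');
      return `${pageBreak}<section class="avoid-break">\n  <h2>${escapeHtml(section.heading)}</h2>\n${paragraphs}\n</section>`;
    })
    .join('\n');
  return `<div class="contract">\n${body}\n</div>`;
}
```

A section longer than one page cannot be kept together, so `avoid-break` is best applied to short units such as a heading plus its first paragraph.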
CSS print properties:
page-break-before: always - Force new page before element
page-break-after: always - Force new page after element
page-break-inside: avoid - Keep element together
Problem: Signature fields aren't clickable in generated PDFs.
Wrong approach:
// ❌ Add signature as image (not a real signature field)
const pdfDoc = await PDFDocument.load(existingPdfBytes);
const signatureImage = await pdfDoc.embedPng(signaturePngBytes);
const page = pdfDoc.getPage(0);
page.drawImage(signatureImage, {
x: 100,
y: 100,
width: 200,
height: 50
});
await pdfDoc.save();
// Not a real signature field - can't be signed digitally
Why wrong:
Correct approach 1: Create a placeholder field for the signature (for DocuSign)
// ✅ Reserve a named form field where the signature will go
import { PDFDocument, rgb } from 'pdf-lib';
const pdfDoc = await PDFDocument.load(existingPdfBytes);
const form = pdfDoc.getForm();
// pdf-lib's form API has no method for creating true signature fields,
// so a named text field serves as the placeholder
const signatureField = form.createTextField('applicant_signature');
// Border styling is passed together with the position in addToPage
signatureField.addToPage(pdfDoc.getPage(0), {
  x: 100,
  y: 100,
  width: 200,
  height: 50,
  borderWidth: 1,
  borderColor: rgb(0, 0, 0)
});
const pdfBytes = await pdfDoc.save();
// Do not flatten this document: flattening removes the placeholder field.
// Whether the signing service maps the field to a signature tab depends on how that service is configured.
Correct approach 2: DocuSign API integration
// ✅ Send to DocuSign for e-signature
import { ApiClient, EnvelopesApi } from 'docusign-esign';
async function sendForSignature(
  pdfBytes: Uint8Array,
  signerEmail: string,
  signerName: string,
  accountId: string,
  accessToken: string // OAuth access token, obtained separately
) {
  const apiClient = new ApiClient();
  // Demo (sandbox) environment; production uses a different base path
  apiClient.setBasePath('https://demo.docusign.net/restapi');
  apiClient.addDefaultHeader('Authorization', `Bearer ${accessToken}`);
  const envelopesApi = new EnvelopesApi(apiClient);
  const envelope = {
    emailSubject: 'Please sign: Expungement Petition',
    documents: [{
      documentBase64: Buffer.from(pdfBytes).toString('base64'),
      name: 'Petition.pdf',
      fileExtension: 'pdf',
      documentId: '1'
    }],
    recipients: {
      signers: [{
        email: signerEmail,
        name: signerName,
        recipientId: '1',
        tabs: {
          signHereTabs: [{
            documentId: '1',
            pageNumber: '1',
            xPosition: '100',
            yPosition: '100'
          }]
        }
      }]
    },
    status: 'sent' // 'sent' delivers immediately; 'created' saves a draft
  };
  return await envelopesApi.createEnvelope(accountId, { envelopeDefinition: envelope });
}
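One detail worth checking when combining the two approaches: pdf-lib places content using an origin at the bottom-left of the page, so `x: 100, y: 100` in pdf-lib and `xPosition: '100', yPosition: '100'` in a tab definition need not refer to the same spot. The sketch below converts a pdf-lib style box to top-left based tab coordinates. It assumes the signing API measures from the top-left corner in 72 DPI units, which should be confirmed against the DocuSign documentation; `BottomLeftBox` and `toTopLeftOrigin` are hypothetical names.

```typescript
// A rectangle in pdf-lib style coordinates: origin at bottom-left, units in points.
interface BottomLeftBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert to top-left based offsets, returned as strings like the tab fields in the example.
// Assumption: the target API measures yPosition downward from the top edge of the page.
function toTopLeftOrigin(
  box: BottomLeftBox,
  pageHeight: number
): { xPosition: string; yPosition: string } {
  // Distance from the top of the page to the top edge of the box.
  const top = pageHeight - (box.y + box.height);
  return {
    xPosition: String(Math.round(box.x)),
    yPosition: String(Math.round(top))
  };
}
```

For a US Letter page (792 points tall), the 200 by 50 box at `x: 100, y: 100` from the earlier example maps to `xPosition: '100'`, `yPosition: '642'`. In pdf-lib the page height is available from `page.getSize()`.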
Problem: Sensitive legal documents stored in plain text.
Wrong approach:
// ❌ Save PDF to disk unencrypted
const pdfBytes = await pdfDoc.save();
await fs.writeFile(`./documents/${caseId}.pdf`, pdfBytes);
// Sensitive data accessible to anyone with file system access
Why wrong:
Correct approach 1: Encrypt at rest
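// ✅ Encrypt PDF with user password
// NOTE: upstream pdf-lib has no encryption support, and save() does not accept password options.
// This sketch assumes an encryption-capable fork (for example pdf-lib-plus-encrypt) that adds
// an encrypt() method; check the fork's documentation for the exact signature.
// Alternatively, encrypt the saved file with an external tool such as qpdf.
await pdfDoc.encrypt({
  userPassword: generateSecurePassword(), // must be delivered to the recipient through a separate channel
  ownerPassword: process.env.PDF_OWNER_PASSWORD,
  permissions: {
    printing: 'highResolution',
    modifying: false,
    copying: false,
    annotating: false,
    fillingForms: false,
    contentAccessibility: true,
    documentAssembly: false
  }
});
const pdfBytes = await pdfDoc.save();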
await fs.writeFile(`./documents/${caseId}.pdf`, pdfBytes);
Correct approach 2: Store in encrypted storage (S3 with SSE)
// ✅ Upload to S3 with server-side encryption
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';
const s3Client = new S3Client({ region: 'us-east-1' });
await s3Client.send(new PutObjectCommand({
Bucket: 'expungement-documents',
Key: `cases/${caseId}/petition.pdf`,
Body: pdfBytes,
ServerSideEncryption: 'AES256',
Metadata: {
caseId: caseId,
documentType: 'petition',
generatedAt: new Date().toISOString()
},
ACL: 'private' // Not publicly accessible
}));
// Store S3 URL in database (encrypted)
await db.documents.insert({
case_id: caseId,
s3_url: encrypt(`s3://expungement-documents/cases/${caseId}/petition.pdf`),
created_at: new Date()
});
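The snippet above calls an `encrypt()` helper that is not defined anywhere in this skill. One way it could be written is with AES-256-GCM from Node's built-in `crypto` module, as sketched below. The helper names, the dot-separated payload layout, and the explicit `key` parameter are assumptions (the call above passes only the URL); key management, for example loading the 32-byte key from a secrets manager, is out of scope here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical helper: encrypt a string with AES-256-GCM.
// Output layout (an assumption): base64(iv).base64(authTag).base64(ciphertext)
function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh 96-bit nonce for every call
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((part) => part.toString('base64')).join('.');
}

// Hypothetical inverse helper: throws if the payload or key is wrong.
function decrypt(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload
    .split('.')
    .map((part) => Buffer.from(part, 'base64'));
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

Because the nonce is random, encrypting the same URL twice yields different strings, so the encrypted column cannot be used for equality lookups; `case_id` remains the lookup key.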
Compliance requirements:
□ Form fields filled correctly (test with validation)
□ Forms flattened after filling (non-editable)
□ Page breaks controlled (no mid-sentence splits)
□ Signature fields created (for DocuSign/Adobe Sign)
□ PDFs encrypted at rest (S3 SSE or user password)
□ Access logged (who viewed/downloaded)
□ Auto-deletion scheduled (retention policy)
□ Fonts embedded (cross-platform compatibility)
□ File size optimized (<5MB per document)
□ Batch generation tested (1000+ PDFs)
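Two of the checklist items can be checked mechanically on the bytes returned by `pdfDoc.save()` or `page.pdf()`. The sketch below verifies the `%PDF-` file header and the size budget; `checkPdfOutput` and `MAX_PDF_BYTES` are hypothetical names, and reading "5MB" as 5 × 1024 × 1024 bytes is an assumption.

```typescript
// Size budget from the checklist, interpreted as 5 MiB (an assumption).
const MAX_PDF_BYTES = 5 * 1024 * 1024;

interface OutputCheck {
  ok: boolean;
  problems: string[];
}

// Sanity-check generated output before storing or sending it.
function checkPdfOutput(bytes: Uint8Array, maxBytes: number = MAX_PDF_BYTES): OutputCheck {
  const problems: string[] = [];
  // A PDF file starts with the marker "%PDF-" followed by a version number.
  let header = '';
  for (let i = 0; i < Math.min(5, bytes.length); i++) {
    header += String.fromCharCode(bytes[i]);
  }
  if (header !== '%PDF-') {
    problems.push('missing %PDF- header');
  }
  // The checklist asks for files strictly under the limit.
  if (bytes.length >= maxBytes) {
    problems.push(`file is ${bytes.length} bytes, limit is ${maxBytes}`);
  }
  return { ok: problems.length === 0, problems };
}
```

This does not validate the PDF structure or confirm that fonts are embedded; those items still need a real PDF parser or a manual check on another platform.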
| Scenario | Appropriate? |
|----------|--------------|
| Fill 50+ government forms | ✅ Yes - automate with pdf-lib |
| Generate invoices from template | ✅ Yes - Puppeteer from HTML |
| Create certificates at scale | ✅ Yes - pdf-lib or LaTeX |
| Assemble multi-doc packets | ✅ Yes - pdf-lib merge |
| View PDFs in browser | ❌ No - use pdf.js or browser |
| Edit PDF by hand | ❌ No - use Adobe Acrobat |
| Extract text with OCR | ❌ No - use Tesseract/Textract |
/references/pdf-lib-guide.md - Form filling, field types, flattening, encryption
/references/puppeteer-templates.md - HTML templates, page breaks, styling for print
/references/document-assembly.md - Merging PDFs, packet creation, watermarks
scripts/form_filler.ts - Fill PDF forms from JSON data, batch processing
scripts/document_assembler.ts - Merge multiple PDFs, add cover pages, watermarks
This skill guides: PDF generation | Form filling | Document automation | Digital signatures | pdf-lib | Puppeteer | LaTeX | DocuSign