skills/asmayaseen/working-with-documents/SKILL.md
Creates and edits Office documents: Word (.docx), PDF, and PowerPoint (.pptx). Use when working with document creation, PDF manipulation, presentation generation, tracked changes, or converting between formats.
npx skillsauth add aiskillstore/marketplace working-with-documents
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
| Format | Read | Create | Edit |
|--------|------|--------|------|
| DOCX | pandoc, python-docx | docx-js | OOXML (unpack/edit/pack) |
| PDF | pdfplumber, pypdf | reportlab | pypdf (merge/split) |
| PPTX | markitdown | html2pptx | OOXML (unpack/edit/pack) |
# Convert to markdown (preserves structure)
pandoc document.docx -o output.md
# With tracked changes visible
pandoc --track-changes=all document.docx -o output.md
Use docx-js (JavaScript):
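const fs = require('fs');
const { Document, Packer, Paragraph, TextRun } = require('docx');

// Build a one-section document containing a single bold paragraph
const doc = new Document({
  sections: [{
    children: [
      new Paragraph({
        children: [
          new TextRun({ text: "Hello World", bold: true }),
        ],
      }),
    ],
  }],
});

// Serialize the document to a buffer and write it to disk
Packer.toBuffer(doc).then(buffer => {
  fs.writeFileSync("output.docx", buffer);
});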
# 1. Unpack
python ooxml/scripts/unpack.py document.docx unpacked/
# 2. Edit the XML under unpacked/ (main content is in word/document.xml)
# Key files:
# - word/document.xml (main content)
# - word/comments.xml (comments)
# - word/media/ (images)
# 3. Pack
python ooxml/scripts/pack.py unpacked/ edited.docx
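The unpack/edit/pack cycle works because a .docx or .pptx file is a ZIP archive of XML parts. The sketch below shows the idea using only Python's standard `zipfile` module. It is not the skill's `unpack.py` or `pack.py`, which may do more (for example, pretty-printing the XML); the function names `unpack_ooxml` and `pack_ooxml` are hypothetical.

```python
import zipfile
from pathlib import Path


def unpack_ooxml(package_path, out_dir):
    """Extract every part of an OOXML package (.docx/.pptx) into out_dir."""
    with zipfile.ZipFile(package_path) as archive:
        archive.extractall(out_dir)


def pack_ooxml(src_dir, package_path):
    """Zip the files under src_dir back into a single OOXML package."""
    src = Path(src_dir)
    files = sorted(p for p in src.rglob("*") if p.is_file())
    with zipfile.ZipFile(package_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Part names inside the package always use forward slashes
            archive.write(path, path.relative_to(src).as_posix())
```

Calling `unpack_ooxml("document.docx", "unpacked/")`, editing `unpacked/word/document.xml`, and then calling `pack_ooxml("unpacked/", "edited.docx")` follows the same three steps as the commands above.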
Tracked changes XML pattern:
<!-- Deletion -->
<w:del><w:r><w:delText>old text</w:delText></w:r></w:del>
<!-- Insertion -->
<w:ins><w:r><w:t>new text</w:t></w:r></w:ins>
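To review tracked changes without opening Word, the same pattern can be read back out of `word/document.xml`. The following is a minimal sketch using the standard library's `xml.etree.ElementTree`; the helper name `list_tracked_changes` and the sample fragment are illustrative. Real documents usually also carry `w:id`, `w:author`, and `w:date` attributes on `w:ins` and `w:del`, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w:" prefix in document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def list_tracked_changes(xml_text):
    """Return (kind, text) pairs for each <w:ins> / <w:del>, in document order."""
    root = ET.fromstring(xml_text)
    changes = []
    for element in root.iter():
        if element.tag == f"{{{W_NS}}}ins":
            text = "".join(t.text or "" for t in element.iterfind(".//w:t", NS))
            changes.append(("insert", text))
        elif element.tag == f"{{{W_NS}}}del":
            text = "".join(t.text or "" for t in element.iterfind(".//w:delText", NS))
            changes.append(("delete", text))
    return changes


# Illustrative fragment wrapping the pattern shown above
sample = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    "<w:del><w:r><w:delText>old text</w:delText></w:r></w:del>"
    "<w:ins><w:r><w:t>new text</w:t></w:r></w:ins>"
    "</w:p></w:body></w:document>"
)
print(list_tracked_changes(sample))
# → [('delete', 'old text'), ('insert', 'new text')]
```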
import pdfplumber

# Extract text
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        print(page.extract_text())

# Extract tables
with pdfplumber.open("document.pdf") as pdf:
    for page in pdf.pages:
        tables = page.extract_tables()
        for table in tables:
            for row in table:
                print(row)
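Printed rows are hard to reuse, so one option is to serialize each extracted table to CSV. This sketch assumes the shape pdfplumber returns from `extract_tables()`: each table is a list of rows, and each row is a list of strings with `None` for empty cells. The helper name `table_to_csv` and the sample data are hypothetical.

```python
import csv
import io


def table_to_csv(table):
    """Serialize one table (a list of rows of cell values) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        # Replace empty (None) cells with "" and trim stray whitespace
        writer.writerow([(cell or "").strip() for cell in row])
    return buffer.getvalue()


# Hypothetical data shaped like one entry of page.extract_tables()
sample_table = [["Item", "Qty"], ["Widget ", None], ["Gadget", "3"]]
print(table_to_csv(sample_table))
```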
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.lib.styles import getSampleStyleSheet

doc = SimpleDocTemplate("output.pdf", pagesize=letter)
styles = getSampleStyleSheet()
story = [
    Paragraph("Report Title", styles['Title']),
    Paragraph("Body text goes here.", styles['Normal']),
]
doc.build(story)
from pypdf import PdfReader, PdfWriter

# Merge: append every page of each input to a single writer
writer = PdfWriter()
for pdf_file in ["doc1.pdf", "doc2.pdf"]:
    reader = PdfReader(pdf_file)
    for page in reader.pages:
        writer.add_page(page)
with open("merged.pdf", "wb") as output:
    writer.write(output)

# Split: write each page to its own single-page file
reader = PdfReader("input.pdf")
for i, page in enumerate(reader.pages):
    writer = PdfWriter()
    writer.add_page(page)
    with open(f"page_{i+1}.pdf", "wb") as output:
        writer.write(output)
# Extract text
pdftotext input.pdf output.txt
pdftotext -layout input.pdf output.txt # Preserve layout
# Merge with qpdf
qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf
# Split pages
qpdf input.pdf --pages . 1-5 -- pages1-5.pdf
# Convert to markdown
python -m markitdown presentation.pptx
Use html2pptx workflow:
# Create thumbnails for validation
python scripts/thumbnail.py output.pptx --cols 4
# 1. Unpack
python ooxml/scripts/unpack.py presentation.pptx unpacked/
# Key files:
# - ppt/slides/slide1.xml, slide2.xml, etc.
# - ppt/notesSlides/ (speaker notes)
# - ppt/media/ (images)
# 2. Edit XML
# 3. Validate
python ooxml/scripts/validate.py unpacked/ --original presentation.pptx
# 4. Pack
python ooxml/scripts/pack.py unpacked/ edited.pptx
# Duplicate, reorder, delete slides
python scripts/rearrange.py template.pptx output.pptx 0,3,3,5,7
# Creates: slide 0, slide 3 (twice), slide 5, slide 7
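The index list is easy to misread, so here is a small model of its semantics as described in the comment above: indices are 0-based, order is preserved, and repeating an index duplicates that slide. This is only an illustration on plain Python lists, not the code in `scripts/rearrange.py`; the function name `rearrange` and the stand-in deck are hypothetical.

```python
def rearrange(slides, order_spec):
    """Build a new slide sequence from a comma-separated list of 0-based indices."""
    indices = [int(part) for part in order_spec.split(",") if part.strip()]
    for index in indices:
        if not 0 <= index < len(slides):
            raise IndexError(
                f"slide index {index} is out of range (0-{len(slides) - 1})"
            )
    return [slides[index] for index in indices]


deck = [f"slide{n}" for n in range(8)]  # stand-in for an 8-slide template
print(rearrange(deck, "0,3,3,5,7"))
# → ['slide0', 'slide3', 'slide3', 'slide5', 'slide7']
```

Slides whose indices are left out of the list (1, 2, 4, and 6 here) are dropped from the output.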
# DOCX/PPTX to PDF
soffice --headless --convert-to pdf document.docx
# PDF to images
pdftoppm -jpeg -r 150 document.pdf page
# Creates: page-1.jpg, page-2.jpg, etc. (numbers are zero-padded when the PDF has 10 or more pages, e.g. page-01.jpg)
# DOCX to Markdown
pandoc document.docx -o output.md
import pytesseract
from pdf2image import convert_from_path

# Render each page of the scanned PDF to an image, then OCR it
images = convert_from_path('scanned.pdf')
text = ""
for image in images:
    text += pytesseract.image_to_string(image)
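OCR is slow compared with reading an embedded text layer, so it can be worth running it only on pages where ordinary extraction finds little or nothing. The sketch below assumes the per-page strings come from something like `page.extract_text()`, which may yield `None` or an empty string for scanned pages. The helper name `pages_needing_ocr` and the `min_chars` threshold are hypothetical and should be tuned to the documents at hand.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 0-based indices of pages whose extracted text is missing or too short."""
    return [
        index
        for index, text in enumerate(page_texts)
        if len((text or "").strip()) < min_chars
    ]


# Hypothetical per-page results, e.g. [page.extract_text() for page in pdf.pages]
texts = [
    "Quarterly report for the finance team",
    None,
    "   ",
    "Totals by region and product line",
]
print(pages_needing_ocr(texts))
# → [1, 2]
```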
Pick 3-5 colors that work together:
| Palette | Colors |
|---------|--------|
| Classic Blue | Navy #1C2833, Slate #2E4053, Silver #AAB7B8 |
| Teal & Coral | Teal #5EA8A7, Coral #FE4447, White #FFFFFF |
| Black & Gold | Gold #BF9A4A, Black #000000, Cream #F4F6F6 |
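One way to sanity-check that a text color is readable on a background color from a palette is the WCAG contrast ratio, where 4.5:1 is the usual AA minimum for body text and 3:1 for large text. The source does not prescribe this check; the sketch below is an optional aid, and the function names are illustrative.

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' to a tuple of three 0-255 integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def relative_luminance(color):
    """Relative luminance of an sRGB color, following the WCAG 2.x definition."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(ch) for ch in hex_to_rgb(color))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


# Gold text on a black background, from the Black & Gold palette
print(round(contrast_ratio("#BF9A4A", "#000000"), 2))
```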
Arial, Helvetica, Times New Roman, Georgia, Verdana, Tahoma, Trebuchet MS, Courier New, Impact
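# Python
pip install pypdf pdfplumber reportlab python-docx openpyxl
# Python packages for the markitdown and OCR examples
pip install "markitdown[pptx]" pytesseract pdf2image
# System tools
apt-get install pandoc poppler-utils libreoffice
# System tools for the qpdf and OCR examples
apt-get install qpdf tesseract-ocr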
# Node.js (for docx-js)
npm install docx
Run: python scripts/verify.py
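The contents of `scripts/verify.py` are not shown here. As a rough stand-in, a dependency check along these lines can confirm that the Python packages and command-line tools from the install commands are present. The module and command lists are assumptions drawn from those commands (note that the `python-docx` package is imported as `docx`), and `find_missing` is a hypothetical name.

```python
import importlib.util
import shutil

# Importable module names and executables drawn from the install commands.
# The python-docx package is imported as "docx".
PYTHON_MODULES = ["pypdf", "pdfplumber", "reportlab", "docx", "openpyxl"]
COMMANDS = ["pandoc", "pdftotext", "pdftoppm", "soffice"]


def find_missing(modules=PYTHON_MODULES, commands=COMMANDS):
    """Return the modules and executables that cannot be found on this machine."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    missing_commands = [c for c in commands if shutil.which(c) is None]
    return missing_modules, missing_commands


if __name__ == "__main__":
    modules, commands = find_missing()
    print("Missing Python modules:", modules or "none")
    print("Missing commands:", commands or "none")
```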
working-with-spreadsheets - Excel file handling
building-nextjs-apps - Frontend for document uploads