skills/sanitize/SKILL.md
Detect and redact PII from text files. Supports 15 categories including credit cards, SSNs, emails, API keys, addresses, and more — with zero dependencies.
npx skillsauth add tusosos/manus-knowledge-base sanitize

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Detect and redact personally identifiable information (PII) from text files.
- Use --output FILE to write sanitized output to a file.
- --json and --preview are safe: they do NOT print raw PII values to stdout.
- The entity map (*.entity-map.json) is written only when --output is used. Do NOT read the entity map file.

Scans files for PII (credit cards, SSNs, emails, phone numbers, API keys, IP addresses, mailing addresses, dates of birth, passport numbers, driver's license numbers, bank routing numbers, medical license numbers, and insurance member IDs) and replaces each instance with a numbered placeholder like [CREDIT_CARD_1].
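The numbered-placeholder replacement described above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/sanitize.py: the two regexes, the `PATTERNS` table, and the `redact` function are assumptions made for the example, and the real script's detection rules are not shown in this document.

```python
import re
from collections import defaultdict

# Hypothetical patterns for illustration only; not the script's actual rules.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a numbered placeholder.

    Returns (clean_text, entity_map), where entity_map maps each
    placeholder back to the raw value it replaced.
    """
    counters = defaultdict(int)
    entity_map = {}

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            # Number placeholders per category: [EMAIL_1], [EMAIL_2], ...
            counters[label] += 1
            placeholder = f"[{label}_{counters[label]}]"
            entity_map[placeholder] = match.group(0)
            return placeholder

        text = pattern.sub(substitute, text)
    return text, entity_map


clean, mapping = redact("Contact jane@example.com or bob@example.org, SSN 123-45-6789.")
print(clean)  # Contact [EMAIL_1] or [EMAIL_2], SSN [SSN_1].
```

In this sketch every occurrence gets its own number; whether the real script reuses a placeholder for repeated values is not stated here.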
python scripts/sanitize.py patient-notes.txt --output clean.txt
python scripts/sanitize.py notes.md --preview
python scripts/sanitize.py report.txt --json --output clean.txt
python scripts/sanitize.py log.txt --categories ssn,credit_card,email --output clean.txt
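To illustrate why --json can be safe to print while the entity map stays on disk, here is a hedged sketch of how such a split might work. The function `write_outputs`, the summary fields, and the `<output>.entity-map.json` sidecar name are all assumptions for the example; the document only states that the map matches *.entity-map.json and is written when --output is used.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def write_outputs(clean_text, entity_map, output_path):
    """Write sanitized text plus a sidecar entity map.

    Returns a summary that contains counts only, never raw values.
    """
    output_path = Path(output_path)
    output_path.write_text(clean_text, encoding="utf-8")

    # Assumed naming: "<output>.entity-map.json" next to the output file.
    sidecar = output_path.with_name(output_path.name + ".entity-map.json")
    sidecar.write_text(json.dumps(entity_map, indent=2), encoding="utf-8")

    # Placeholders look like "[SSN_1]"; drop the brackets and trailing counter.
    counts = Counter(p.strip("[]").rsplit("_", 1)[0] for p in entity_map)
    return {
        "output": str(output_path),
        "total": len(entity_map),
        "by_category": dict(counts),
    }


with tempfile.TemporaryDirectory() as tmp:
    summary = write_outputs(
        "SSN [SSN_1], card [CREDIT_CARD_1]",
        {"[SSN_1]": "123-45-6789", "[CREDIT_CARD_1]": "4111 1111 1111 1111"},
        Path(tmp) / "clean.txt",
    )
    print(json.dumps(summary["by_category"]))
```

Only the summary would go to stdout in this design; the raw values exist solely in the sidecar file, which is why the notes above say not to read it.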
See references/SUPPORTED_PII.md for the full list with detection methods and false positive mitigation.
| Category | Pattern type | Example |
|---|---|---|
| credit_card | Luhn-validated 13-19 digits | 4111 1111 1111 1111 |
| ssn | 3-2-4 digit groups | 123-45-6789 |
| cvv | Keyword-anchored 3-4 digits | CVV: 123 |
| expiry_date | Keyword-anchored MM/YY | expiry 01/30 |
| api_key | Provider prefix patterns | sk-abc..., ghp_..., AKIA... |
| email | Standard email format | user@example.com |
| phone | US/intl phone numbers | +1 (555) 123-4567 |
| ip_address | IPv4 addresses | 192.168.1.100 |
| date_of_birth | Keyword-anchored dates | DOB: 03/15/1985 |
| passport | Keyword-anchored alphanumeric | Passport: AB1234567 |
| drivers_license | Keyword-anchored alphanumeric | DL: D12345678 |
| bank_routing | Keyword-anchored 9 digits | routing: 021000021 |
| address | Street + city/state/zip | 742 Evergreen Terrace Dr, Springfield, IL 62704 |
| medical_license | Keyword-anchored license ID | License: CA-MD-8827341 |
| insurance_id | Keyword-anchored member/policy ID | Member ID: BCB-2847193 |
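
Two of the pattern types in the table, Luhn validation and keyword anchoring, can be sketched as below. This is an illustration under stated assumptions rather than the script's own code: `luhn_valid` and `CVV_PATTERN` are hypothetical names, and the accepted keywords (CVV, CVC) are a guess based on the single example in the table.

```python
import re


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# Keyword-anchored: a bare 3-4 digit number only counts when a keyword precedes it.
CVV_PATTERN = re.compile(r"\b(?:CVV|CVC)\b\s*:?\s*(\d{3,4})\b", re.IGNORECASE)

print(luhn_valid("4111 1111 1111 1111"))        # True
print(luhn_valid("4111 1111 1111 1112"))        # False
print(CVV_PATTERN.search("CVV: 123").group(1))  # 123
print(CVV_PATTERN.search("room 123"))           # None
```

The checksum filters out random digit runs that merely look like card numbers, and the keyword anchor plays the same role for short values such as CVVs, which would otherwise match almost any small number in a file.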
--json and --preview modes strip raw PII values from output. The entity map (containing raw PII to placeholder mappings) is only written to a sidecar file on disk when --output is used.

Built by AgentWard, the open-source permission control plane for AI agents.