skills/sanitize/SKILL.md
Audit all tracked files for leaked references to external proprietary code
You are a read-only audit agent. Your job is to scan every tracked file in the repository for references to external proprietary implementations that should have been removed during sanitization. You do NOT modify any files.
Run this audit after sanitizing a repo to remove references to external codebases studied as reference material: squashed histories, rewritten comments, sensitive docs moved to gitignored directories.
The project's code attribution rules are defined in the root CLAUDE.md and/or ~/.claude/CLAUDE.md. Read both before starting. These define:
If no attribution rules exist in CLAUDE.md, ask the user what terms to scan for before proceeding.
Read CLAUDE.md (project root) and ~/.claude/CLAUDE.md (global). Extract:
docs/_internal/)
`git ls-files`
This is the audit scope. Only tracked files. Never audit gitignored directories.
For each hard-ban pattern, grep all tracked files (case-insensitive where appropriate):
git ls-files -z | xargs -0 grep -n -i -e '<pattern>'
Filter out allowed exceptions. Every remaining hit is severity HARD.
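The scan-and-filter step above could be sketched in Python roughly as follows. This is a minimal illustration, not the skill's own tooling: `Finding`, `tracked_files`, `scan_text`, and the sample pattern strings are hypothetical names chosen for the example, and the real hard-ban patterns and allowed exceptions come from CLAUDE.md.

```python
import re
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line_no: int
    text: str
    severity: str


def tracked_files() -> list[str]:
    """Return tracked paths; -z keeps paths containing spaces intact."""
    raw = subprocess.run(
        ["git", "ls-files", "-z"], check=True, capture_output=True
    ).stdout
    return [p.decode("utf-8", "replace") for p in raw.split(b"\0") if p]


def scan_text(path, text, patterns, allowed=()):
    """Flag lines matching a hard-ban pattern, skipping allowed exceptions."""
    banned = [re.compile(p, re.IGNORECASE) for p in patterns]
    exempt = [re.compile(a, re.IGNORECASE) for a in allowed]
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not any(b.search(line) for b in banned):
            continue
        if any(e.search(line) for e in exempt):
            continue  # allowed exception: not reported
        findings.append(Finding(path, line_no, line.strip(), "HARD"))
    return findings


if __name__ == "__main__":
    # Hypothetical pattern and exception, for illustration only.
    for path in tracked_files():
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                body = fh.read()
        except OSError:
            continue  # e.g. a submodule entry or unreadable file
        for hit in scan_text(path, body, [r"acmeengine"], [r"acmeengine-sys"]):
            print(f"{hit.path}:{hit.line_no}: {hit.text}")
```

Note that the exception check here is per line, which is coarse: a line containing both an allowed mention and a banned one would be skipped, so allowed-exception patterns should be kept narrow.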
Also scan for:
FPrefix* or TPrefix* for C++ codebases)

Scan for indirect attribution patterns:

- "inspired by", "based on", "adapted from", "ported from" (without self-contained technical justification)
- "matches the", "follows the", "equivalent to" (in proximity to language/framework names of the external codebase)

Every hit needs context review. Severity SOFT.
- `git ls-files` output
- `git log --format='%B'`: apply same hard-ban patterns
- `Cargo.toml`, `package.json`, etc.) match intended license

Present findings as a structured table:
# Sanitization Audit Report
**Repo:** <repo name>
**Date:** <ISO date>
**Tracked files scanned:** <count>
**Commit messages scanned:** <count>
## Summary
| Severity | Count |
|----------|-------|
| HARD | N |
| SOFT | N |
| CLEAN | (if zero findings) |
## Findings
### HARD: <short title>
**File:** `path/to/file.rs:42`
**Match:** `the offending text`
**Suggestion:** <replacement text or "delete line">
### SOFT: <short title>
**File:** `path/to/file.rs:99`
**Match:** `the text in context`
**Assessment:** <why this might or might not be a problem>
## Commands Run
<list every grep command and its output for reproducibility>
If everything is clean, confirm with the exact commands run and their zero-match output.
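The report layout above can be generated from collected findings. The Python sketch below is illustrative only: the `Finding` fields and both function names are hypothetical, and the content of the CLEAN row is an assumption, since the template only says it appears when there are zero findings.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str  # "HARD" or "SOFT"
    title: str
    location: str  # e.g. "path/to/file.rs:42"
    match: str
    note: str      # suggestion (HARD) or assessment (SOFT)


def render_summary(findings):
    """Build the Summary table; missing severities count as zero."""
    counts = Counter(f.severity for f in findings)
    lines = [
        "## Summary",
        "| Severity | Count |",
        "|----------|-------|",
        f"| HARD | {counts['HARD']} |",
        f"| SOFT | {counts['SOFT']} |",
    ]
    if not findings:
        # Assumption: the CLEAN row is emitted only when nothing was found.
        lines.append("| CLEAN | yes |")
    return "\n".join(lines)


def render_finding(f):
    """Render one entry of the Findings section."""
    label = "Suggestion" if f.severity == "HARD" else "Assessment"
    return "\n".join(
        [
            f"### {f.severity}: {f.title}",
            f"**File:** `{f.location}`",
            f"**Match:** `{f.match}`",
            f"**{label}:** {f.note}",
        ]
    )
```

Keeping rendering separate from scanning means the same findings list can also feed the "Commands Run" section or a machine-readable export.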
~/.claude/CLAUDE.md. Global config is not in the repo.
C++ mention in FFI build docs is legitimate. The same mention in a design comment comparing your approach to an external one is a leak.