skills/extract-rules/skills/extract-rules/SKILL.md
Extract project-specific coding rules and domain knowledge from existing codebase, generating markdown documentation for AI agents. Use when onboarding a new project, after code review discussions about coding style, or when coding conventions need documenting. Also consider running after sessions where coding preferences were discussed or corrected (--from-conversation), or after PRs with significant review feedback (--from-pr).
npx skillsauth add hiroro-work/claude-plugins extract-rulesInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Analyzes existing codebase to identify what Claude would get wrong without project-specific guidance, extracting coding rules and domain knowledge as structured markdown documentation for AI agents.
/extract-rules # Extract rules from codebase (initial)
/extract-rules --update # Re-scan and add new patterns (preserve existing)
/extract-rules --restructure # Re-analyze, reorganize structure, merge existing rules
/extract-rules --from-conversation # Extract from current session (latest)
/extract-rules --from-conversation <session-id> # Extract from a specific session
/extract-rules --from-pr 123 # PR in current repo
/extract-rules --from-pr owner/repo#123 # PR in another repo (URL form also accepted)
/extract-rules --from-pr 100..110 # PR range (current repo)
/extract-rules --from-pr owner/repo#100..110 # PR range (another repo)
/extract-rules --compact # Compact all over-threshold rules files (output_dir/**/*.md)
/extract-rules --compact path/to/file.md ... # Compact specific files (caller passes explicit paths)
# Multiple specs allowed (space-separated) → cross-analysis detects org-wide principles
Settings file: extract-rules.local.md (YAML frontmatter only, no markdown body)
.claude/extract-rules.local.md (takes precedence)~/.claude/extract-rules.local.md| Setting | Default | Description |
|---------|---------|-------------|
| target_dirs | ["."] | Analysis target directories |
| exclude_dirs | [".git", ".claude"] | Exclude directories (in addition to .gitignore) |
| exclude_patterns | [] | Exclude file patterns (e.g., *.generated.ts, *.d.ts) |
| output_dir | .claude/rules | Output directory for rule files (.md and .local.md). This directory is inside Claude Code's .claude/rules/** recursive auto-load scope, so every file written here is injected into context on session start |
| examples_output_dir | .claude/rules-extras | Output directory for .examples.md files. Defaults to a sibling directory outside .claude/rules/** so examples are not auto-loaded into context on session start. Set to output_dir (or any path under output_dir) to opt examples back into auto-load |
| staging_output_dir | .claude/rules-staging | Output directory for staged project-level patterns extracted in incremental modes (--from-conversation / --from-pr). On 1st observation, project-level patterns land here; on 2nd observation (matched in a later incremental run or by --update), they are promoted to <output_dir>/project.md and removed from staging. Defaults to a sibling directory outside .claude/rules/** so staged candidates are not auto-loaded into context on session start. Set to output_dir (or any path under output_dir) to opt staging back into auto-load. Language / framework / integration patterns bypass staging and land directly in their respective .local.md files (gating is scoped to project-level patterns) |
| language | ja | Report language (e.g., ja) |
| split_output | true | Separate Principles (.md) and patterns (.local.md) |
| resolve_references | true | Resolve file references during restructure |
| compaction_threshold | 40000 | Char count threshold for --compact mode (file is compacted if char count exceeds this). Set to a very large number (e.g. 99999999) to opt out of compaction. The default 40000 matches Claude Code's per-file warning threshold (40k chars, observed in Claude Code 2.1.x) — firing the gate exactly at the warning matches the user's visible signal that the file needs attention (buffer 0). To restore the prior 32000 default (80% buffer for subsequent rule additions — preventive trigger before the warning fires), set compaction_threshold: 32000 explicitly |
| min_cluster_size | 3 | Minimum related-bullet cluster size for --compact mode's consolidation detection. The subagent emits consolidation_proposals only when a cluster has at least this many related bullets. Set to a very large number (e.g. 99999999) to disable consolidation while keeping compaction (matches the compaction_threshold opt-out sentinel convention) |
---
target_dirs:
- .
exclude_dirs:
- .git
- .claude
exclude_patterns:
- "*.generated.ts"
output_dir: .claude/rules
examples_output_dir: .claude/rules-extras
staging_output_dir: .claude/rules-staging
language: ja
split_output: true
resolve_references: true
compaction_threshold: 40000
min_cluster_size: 3
---
Three output directories are involved: output_dir for rule files (.md / .local.md), examples_output_dir for .examples.md files, and staging_output_dir for staged 1st-observation project-level patterns from incremental modes. The default values place rule files under Claude Code's .claude/rules/** recursive auto-load scope and place examples / staging in sibling directories outside that scope, so examples and staged candidates do not consume context on session start. The paths: frontmatter on rule files is preserved as a human-facing category-scope hint — its loader-side semantics is not empirically verified, and the actual auto-load boundary is determined by directory placement only.
Default (split_output: true):
.claude/rules/ # output_dir (inside auto-load scope)
├── languages/
│ ├── typescript.md # Principles only (portable)
│ └── typescript.local.md # Project-specific patterns only
├── frameworks/
│ ├── react.md # Principles only (portable)
│ └── react.local.md # Project-specific patterns only
└── project.md # Always single file (no split)
.claude/rules-extras/ # examples_output_dir (outside auto-load scope)
├── languages/
│ └── typescript.examples.md # Examples for both
├── frameworks/
│ └── react.examples.md # Examples for both
└── project.examples.md # Examples
.claude/rules-staging/ # staging_output_dir (outside auto-load scope)
└── project.staging.local.md # 1st-observation project-level candidates (incremental modes only)
Staging holds project-level 1-shot pattern candidates written by a single --from-conversation / --from-pr invocation; the next incremental run promotes a re-observed candidate to canonical and removes it from staging.
Principles (portable across projects) and Project-specific patterns (local) are separated by default. This enables organizational rule sharing and AI-driven merge across projects.
Hybrid mode (split_output: false):
.claude/rules/ # output_dir (inside auto-load scope)
├── languages/
│ └── typescript.md # Principles + Project-specific patterns
├── frameworks/
│ └── react.md # Principles + Project-specific patterns
└── project.md # Domain, architecture, conventions
.claude/rules-extras/ # examples_output_dir (outside auto-load scope)
├── languages/
│ └── typescript.examples.md # Examples
├── frameworks/
│ └── react.examples.md # Examples
└── project.examples.md # Examples
.claude/rules-staging/ # staging_output_dir (outside auto-load scope)
└── project.staging.local.md # 1st-observation project-level candidates (incremental modes only)
Layered frameworks (Rails, Django, Spring, etc.): When a framework has distinct architectural layers, generate layer-specific files:
<framework>.md — Cross-layer rules (no paths: or broad scope)<framework>-<layer>.md — Layer-specific rules with scoped paths: (e.g., app/models/**).local.md counterpartsIntegration libraries (Inertia, Pundit, Devise, Turbo, etc.): When integration libraries are detected alongside a layered framework:
integrations/<framework>-<integration>.md — Integration-specific rulesintegrations/ directoryrender inertia: vs Laravel: Inertia::render()).local.md counterpartsExample output with integrations (split mode — each category also gets a .examples.md under examples_output_dir):
.claude/rules/ # output_dir
├── languages/
│ └── ruby.md / ruby.local.md
├── frameworks/
│ ├── rails.md / rails.local.md
│ ├── rails-controllers.md / .local.md
│ └── rails-models.md / .local.md
├── integrations/
│ ├── rails-inertia.md / .local.md
│ └── rails-pundit.md / .local.md
└── project.md
.claude/rules-extras/ # examples_output_dir
├── languages/
│ └── ruby.examples.md
├── frameworks/
│ ├── rails.examples.md
│ ├── rails-controllers.examples.md
│ └── rails-models.examples.md
├── integrations/
│ ├── rails-inertia.examples.md
│ └── rails-pundit.examples.md
└── project.examples.md
Format switching: Run --restructure after changing split_output setting to switch between split and hybrid formats.
Check arguments to determine mode:
--update → Update Mode (Step U1-U6)--restructure → Restructure Mode (Step R1-R5)--from-conversation [session-id] → Conversation Extraction Mode (Step C1-C5)--compact [<paths>] → Compaction Mode (Step CP1-CP5)--from-pr <number|owner/repo#number|range> [...] → PR Review Extraction Mode (Step P1-P5)Search for extract-rules.local.md:
.claude/extract-rules.local.md~/.claude/extract-rules.local.mdPriority:
Extract settings (target_dirs, exclude_dirs, exclude_patterns, output_dir, examples_output_dir, staging_output_dir, language, split_output, resolve_references, compaction_threshold) from the config file. See Configuration section above for defaults.
language resolution: skill config → Claude Code settings (~/.claude/settings.json language field) → default ja
Detect project language and framework:
1. Detect languages by config files (package.json, tsconfig.json, pyproject.toml, go.mod, Cargo.toml, Gemfile, pom.xml, etc.) and file extensions (.ts/.tsx, .py, .go, .rb, etc.)
2. Detect frameworks by their config files (e.g., next.config.*, playwright.config.*) and dependencies in package manifests.
3. Detect architectural layers (for layered frameworks):
If a framework has distinct layers with separate directories (e.g., Rails: app/models/, app/controllers/; Django: models.py, views.py), detect them for layer-specific rule files. Only split when corresponding directories actually exist.
4. Detect integration libraries (for layered frameworks):
Read references/integration-criteria.md for detection rules and classification criteria.
Output: List of detected languages, frameworks, architectural layers, and integration libraries
Collect target files for analysis:
git ls-files (respects .gitignore). If not a git repo, fall back to Glob with manual exclusions from settings.target_dirs, exclude_dirs, exclude_patterns, and detected language extensionsRead references/extraction-criteria.md before proceeding to understand the classification criteria. The core question for every pattern is: "Would Claude produce something different without knowing this?" — extract only what fills the gap between Claude's general knowledge and this project's actual conventions.
For each detected language, framework, and integration library:
1.5. Separate integration-specific patterns (for layered frameworks with integrations):
See references/integration-criteria.md "Pattern routing" section.
Classify each pattern (see references/extraction-criteria.md):
For general style patterns:
For project-specific patterns:
signature - brief context (2-5 words)Apply AI judgment to determine which patterns meet the extraction criteria (see references/extraction-criteria.md)
Determine appropriate detection methods based on language and project structure.
Also analyze non-code documentation:
Extract explicit coding rules and guidelines from these documents.
Deduplication check: Read any files under .claude/rules/ to build a set of already-documented rules. Rules extracted in Step 4 that overlap with these existing rules should be skipped to avoid duplication. Note: CLAUDE.md is NOT a deduplication source — rules should exist in .claude/rules/ even if also mentioned in CLAUDE.md, because rule files are portable across projects via merge-rules. This check applies to all modes (Full Extraction, Update, Conversation, PR Review).
Read references/security.md before generating output to ensure sensitive information is not included.
Check if output_dir exists
--restructure to reorganize, --update to add new patterns, or delete the directory manually to start fresh."output_dir. Also create examples_output_dir if it differs from output_dir and does not exist yet (when both resolve to the same path the single directory created above is reused).Generate rule files per category. Rule files (<name>.md and <name>.local.md) are written under output_dir; <name>.examples.md files are written under examples_output_dir (default: .claude/rules-extras — outside Claude Code's .claude/rules/** auto-load scope, so examples do not consume context on session start).
languages/<lang>.md for language-specific rules (under output_dir)frameworks/<framework>.md for framework-specific rules (under output_dir)project.md for project-specific rules (under output_dir)<framework>.md (cross-layer) + <framework>-<layer>.md per detected layer with scoped paths:references/integration-criteria.md "Output structure" section.By default (split_output: true): Generate 3 files per category (except project which gets 2):
<output_dir>/<name>.md — ## Principles only (portable), with paths: frontmatter<output_dir>/<name>.local.md — ## Project-specific patterns only (local), with the same paths: frontmatter as its <name>.md counterpart (paths: is retained as a human-facing category-scope hint; loader-side semantics is not empirically verified, and auto-load is determined by directory placement only)<examples_output_dir>/<name>.examples.md — Examples for both. Whether the file is auto-loaded depends on examples_output_dir's placement relative to .claude/rules/**: with the default .claude/rules-extras it is outside auto-load scope; with examples_output_dir set to output_dir (or any path under it) it is auto-loadedpaths: independently (applies to both .md and .local.md). Cross-layer files (<framework>.md / <framework>.local.md) use no paths: or broad scope as they apply across all layers.When split_output: false: Generate single hybrid file per category under output_dir, and the matching <name>.examples.md under examples_output_dir.
Rule file format (hybrid example):
---
paths:
- "**/*.ts"
- "**/*.tsx"
---
# TypeScript Rules
## Principles
- FP only (no classes, pure functions, composition over inheritance)
- Strict null handling (no non-null assertions, explicit narrowing required)
- Barrel exports required (re-export from index.ts per directory)
## Project-specific patterns
- `RefOrNull<T extends { id: string }> = T | { id: null }` - nullable relationships
- `pathFor(page) + url()` - Page Object navigation pair
- `useAuthClient()` returns `{ user, login, logout }` - auth hook interface
## Examples
When in doubt: ../../rules-extras/languages/typescript.examples.md
(The path above assumes default settings — output_dir: .claude/rules and examples_output_dir: .claude/rules-extras. See references/examples-format.md § Reference Section in Rule Files for the relative-path computation under non-default settings.)
Format guidelines:
For Principles section:
Principle name (hint1, hint2, hint3)For Project-specific patterns section:
`signature` - brief contextuseAuth() → { user, login, logout } (not full implementation)For .examples.md files: Read references/examples-format.md for file structure, Good/Bad contrast guidelines, and the reference section format. Each rule file with a corresponding .examples.md must end with a ## Examples reference section (see the reference for format). ### titles must match the corresponding rule name exactly — do not translate or rephrase.
paths patterns by category:
**/*.ts, **/*.tsx**/*.py**/*.tsx, **/*.jsxpaths: to layers where the integration is used
(e.g., Inertia in controllers: app/controllers/**)After generating all rule files, verify no sensitive information was included:
[0-9a-fA-F]{20,}[A-Za-z0-9+/=]{40,}(key|token|secret|password|credential)\s*[:=]\s*["'][^"']+(internal|staging|localhost:[0-9]+)API_KEY_REDACTED) and warn the userNote: This check applies to all modes that generate or update rule files (Full Extraction, Update, Restructure, Conversation Extraction). Also check .examples.md files — they contain actual code from the codebase and may include sensitive information.
Display analysis summary. See references/report-templates.md for format.
When --update is specified, re-scan the codebase and add new patterns while preserving existing rules.
Staging awareness: Update Mode reads the staging file under staging_output_dir (when present) and promotes any staged project-level patterns that re-match against fresh code observations to canonical (<output_dir>/project.md, the single hybrid file for project-level patterns), removing them from staging. Update Mode does not write new entries to staging — un-matched new patterns from this run land directly in canonical as today (operator-driven Update is treated as explicit "land this now" intent). See references/conversation-mode.md § Mode interaction summary for the full per-mode staging behavior.
Operational note: After a dependency's major-version bump, run --update so the Step U3 staleness check flags removed symbols. The check only scans inline `symbol` in .local.md's ## Project-specific patterns — .examples.md is not auto-scanned, so manually review it for the affected framework(s). Stale samples there propagate via reviewers trusting project examples as authoritative. --restructure is not a substitute here: it reorganizes files without running the staleness check.
Load settings from extract-rules.local.md (same as Step 1 in Full Extraction Mode)
Check if output directory exists (default: .claude/rules/)
split_output: true and hybrid files exist (.md files containing both ## Principles and ## Project-specific patterns): warn that hybrid files were found — recommend running --restructure to migrate to split formatsplit_output: false and .local.md files exist: warn that orphaned .local.md files were found — recommend deleting orphaned files manually or running --restructureLoad existing rule files to understand current rules (load <output_dir>/<name>.md, <output_dir>/<name>.local.md, and <examples_output_dir>/<name>.examples.md when split). When examples_output_dir does not yet exist (e.g. legacy projects where examples were co-located under output_dir), fall back to loading <output_dir>/<name>.examples.md so existing examples are not invisible to the merge step; --restructure can subsequently migrate them to examples_output_dir. Additionally load <staging_output_dir>/project.staging.local.md if present — this file is required for the Step U4 staging-match branch. Skip silently if the staging file does not yet exist (no incremental run has populated it yet).
Execute Step 2-5 from Full Extraction Mode:
Before adding new rules, check existing project-specific patterns for staleness:
## Project-specific patterns sections:
split_output: true: from .local.md filessplit_output: false: from ## Project-specific patterns sections in .md files`symbol`), verify the symbol still exists in the codebase using Grep
`pathFor() + url()`), check each symbol individuallyThis prevents rule files from growing indefinitely as the codebase evolves.
For each extracted principle/pattern:
Check if already exists: Compare with existing rules (check both shared and local files if split_output: true). Evaluate the branches below in order, first match wins (same evaluate-in-order discipline as references/conversation-mode.md § Step C5's "Check for duplicates and route per category" step):
.md file (use AI judgment: case-insensitive, synonyms). For example, `useAuth() → { user, login, logout }` - auth hook interface is a duplicate of Auth hook interface (useAuth) in ## Principles. Skip patterns that already exist as Principles.<staging_output_dir>/project.staging.local.md per the staging-match criterion (see references/conversation-mode.md § Step C5's "staging-match criterion" paragraph), schedule a promote — append to <output_dir>/project.md (the single hybrid file for project-level patterns) in Step U5, then delete the matched entry from staging in Step U5 (move-atomicity: canonical-first, staging-delete-second). Update Mode does not write new staging entries; un-matched project-level patterns land directly in canonical.Preserve manual edits: Do not modify existing rules
## Principles section## Project-specific patterns sectionsplit_output: true: Principles go to <output_dir>/<name>.md, patterns go to <output_dir>/<name>.local.md. Create missing files with proper frontmatter.<output_dir>/project.md: always append to the single file.examples.md: Resolve the target path via examples_output_dir (<examples_output_dir>/<name>.examples.md). Create the file (and any missing parent directories under examples_output_dir) when absent. Follow the common generation procedure in references/examples-format.md to add examples for each new rule. Per references/conversation-mode.md § Step C5's "Update .examples.md" step ("examples-on-canonical-only"), both direct canonical appends and promotes from staging count as canonical writes and trigger examples; Update Mode has no staging-only path, so every new rule in this step gets an .examples.md entry.<output_dir>/project.md (the single hybrid file for project-level patterns, per item 5) and, after verifying the canonical write, Edit <staging_output_dir>/project.staging.local.md to remove the corresponding bullet. If the staging-delete Edit fails because the bullet is no longer uniquely matchable, leave the duplicate — the next session's canonical-match skip resolves it. Update Mode does not write new staging entries.Run Security Self-Check (same as Step 6.5) on new/updated files, including the staging file if any staging-delete edits landed in Step U5 (the staging file was rewritten by the staging-delete Edit).
Report what was added per file. Also report any stale rules found in Step U3. Include canonical_skip_count and promoted_count (Update Mode never increments staged_count because it does not write new staging entries). See references/report-templates.md for format.
When --restructure is specified, re-analyze the codebase to determine the optimal file structure, then merge existing rule content into the new structure. Use this when the project has evolved (new frameworks, architectural changes), when split_output settings change, or after updating the extract-rules skill itself.
Note: Restructure Mode does NOT run the Step U3 staleness check — use --update first so stale symbols are flagged for manual review (see the Update Mode operational note for the post-major-version-bump workflow).
output_dir exists → Error if not: "Run /extract-rules first to initialize rule files."<output_dir>/**/<name>.md and <output_dir>/**/<name>.local.md (rule files), plus <examples_output_dir>/**/<name>.examples.md (examples files). When examples_output_dir differs from output_dir, also scan <output_dir>/**/<name>.examples.md to pick up legacy co-located examples written by older runs; treat such legacy files as candidates to migrate during Step R4.Execute Step 2-5 from Full Extraction Mode to determine the ideal file structure.
Skip this step if resolve_references is false. Default is true.
Scan existing rule content (loaded in R1) for file references (Markdown links, text references like "See <path>", @path references), resolve them, extract rules from referenced files, and merge into the R1 snapshot. Rules from references are treated as existing rules (take priority on conflict in R4). See references/resolve-references.md for detailed processing steps.
Compare old and new file structures, display planned changes (Keep/New/Remove per file), and wait for user confirmation before proceeding. If references were resolved in R2.5, include the number of rules extracted from referenced files in the plan display so the user understands where additional rules came from.
project.md as fallback; preserve custom sections in the most relevant filesplit_output setting (handle hybrid ↔ split transitions), deduplicate.examples.md: Write .examples.md files to <examples_output_dir>/<name>.examples.md, following the same structure changes as rule files. When R1 picked up legacy <output_dir>/<name>.examples.md files (co-located with rule files from older runs), move them to the new location under examples_output_dir and remove the legacy copies after the new file is written. Generate new .examples.md for categories that didn't have one (see references/examples-format.md).Run Security Self-Check (same as Step 6.5) on all generated files.
Report structural changes, content merge summary, unmatched rules, and reference resolution results. See references/report-templates.md for format.
When --from-conversation is specified, extract rules from the full conversation history stored in session .jsonl files. The heavy processing (jsonl parsing, analysis, rule writing) is delegated to a subagent to keep the main context clean.
Load settings from extract-rules.local.md (same as Step 1 in Full Extraction Mode)
Check if output directory exists (default: .claude/rules/)
Locate the session file:
pwd)/ and . with - (leading - is kept)
/Users/hiropon/Sources/github.com/myproject → -Users-hiropon-Sources-github-com-myproject~/.claude/projects/<encoded-path>/<session-id>.jsonlSelect the target session:
<session-id> argument is provided: use ~/.claude/projects/<encoded-path>/<session-id>.jsonl.jsonl file in the directory (by ls -t)
Spawn a subagent using the Agent tool. The subagent performs all heavy processing (C3–C5) and returns a summary of what was added. Read references/conversation-mode.md for the full subagent instructions (Steps C3–C5).
Include in the agent prompt:
output_dir, examples_output_dir, and staging_output_dir paths, plus split_output / language settings (the subagent must write rule files under output_dir, .examples.md files under examples_output_dir, and staging entries under staging_output_dir)canonical_files: list of existing rule file paths for canonical-match deduplication — include both rule files under output_dir and .examples.md files under examples_output_dirstaging_files: list of existing staging file paths for staging-match detection — include the project-level staging file under staging_output_dir (gating is scoped to project-level patterns)references/conversation-mode.mdAfter the subagent completes, report the results to the user.
When --compact is specified, compact over-threshold rules files so they stay below Claude Code's per-file warning threshold (40k chars in Claude Code 2.1.x). Target file selection, char-count check, and threshold filtering all happen inside this mode — callers (e.g. dev-workflow Step 11) invoke --compact without file arguments and let this mode resolve the target set.
Use the Pattern A iteration loop convention (sibling to verify-diff / publicity-review / skill-review): the Skill wrapper runs in the main thread, a subagent performs the compaction analysis, the main thread applies the resulting mechanical_edits, and a fenced JSON return contract is emitted for caller dispatch. Per-file outer loop with max_iterations = 2 (default).
Load settings from extract-rules.local.md (same as Step 1 in Full Extraction Mode). compaction_threshold (default 40000) is the filter applied below in step 3. min_cluster_size (default 3) gates consolidation detection inside Step CP2 — it does not affect target resolution here
Check output_dir exists. If not, emit {"status": "error", "reason": "output directory not found"} and stop
Resolve targets:
Read the file content and measure its char count via the Read output length (do not use Bash(wc -m) — Read length matches Claude Code's char-count metric, while wc -c reports bytes which diverge for multi-byte content). All explicit paths join the Step CP2 target set, regardless of char count: under-threshold paths still enter Step CP2 so the consolidation pass can run on them — mechanical_edits will be empty for under-threshold files (the convergence check at (d) terminates them immediately), but consolidation_proposals may still be emitted. The widened skipped-below-threshold status (set at Step CP2 (d), not here — (f) merely records what (d) chose) labels these files. Explicit-paths mode accepts paths under either output_dir or examples_output_dir — callers needing to compact .examples.md files when examples_output_dir != output_dir must use this modeGlob <output_dir>/**/*.md (covers .md / .local.md uniformly — and also .examples.md if any are still co-located under output_dir from legacy runs, since the glob does not distinguish by extension). For each file, Read and measure its char count; collect entries with char count > compaction_threshold into the target set. Note: discovery mode does not surface sub-threshold files in files_processed (they are silently filtered out). This asymmetry is intentional: an unargumented --compact invocation reports only files that actually crossed the threshold, while caller-passed paths report every path the caller named so the caller can correlate input to output. Trade-off: clusters inside small (under-threshold) files are not detected by discovery mode; to scan them for consolidation, invoke --compact <path> (or --compact <path1> <path2> ...) with explicit paths. Discovery scope: this branch scans output_dir only — when examples_output_dir differs from output_dir (including the default .claude/rules-extras configuration), .examples.md files under examples_output_dir are not discovered automatically. Rationale: .examples.md files outside .claude/rules/** are already exempt from Claude Code auto-load, so they do not consume session-start context and the compaction priority is correspondingly lower; the explicit-paths route preserves the ability to compact them on demandCache the per-file Read content keyed by path for reuse in Step CP2 (a) iter 1's dispatch payload — avoids a second Read of the same file before the first subagent dispatch
If the target set is empty (no paths resolved at all — empty explicit-paths argument or zero discovery hits), emit {"status": "no-actionable", "compaction_threshold": <int>, "min_cluster_size": <int>, "files_processed": [], "reason": "no targets resolved"} and stop
Pre-register per-file tasks — before entering the per-file outer loop, TaskCreate one task per file in the target set (e.g. compact: <path>); the target set now includes both over-threshold and under-threshold explicit paths (under-threshold paths enter the loop so the consolidation pass can run on them). Mark each task in_progress (via TaskUpdate) before its first dispatch and completed after the per-file loop terminates (regardless of per_file_status outcome — converged / partial / unresolved / error / skipped-below-threshold all flip the row to completed; the outcome is carried in the per-file record, not the task status). Per-iter progress within a file is tracked inline within this Step (no per-iter task) because the iter count is bounded at max_iterations = 2. Where the Task tools are unavailable (e.g. the VSCode extension, or Claude Code before v2.1.142), use the equivalent TodoWrite operations instead — the status values and pre-register semantics are identical; allowed-tools grants both.
For each file in the target set, run the per-file iteration loop. max_iterations = 2 by default (compaction is judgment-heavy; two passes give the subagent a chance to refine its first attempt before declaring partial). Under-threshold files terminate at iter 1's (d) convergence check (chars_after ≤ compaction_threshold is already true), so their loop effectively runs once for consolidation detection only.
(a) Read & dispatch (per-iter): On iter 1, reuse the cached content from Step CP1 step 3 — chars_before is that cache entry's char count (avoids re-reading the same file). On iter i ≥ 2, re-Read the target file so the subagent operates on the post-prior-iter content. Spawn an Agent (subagent_type: general-purpose) with the dispatch prompt assembled from these --- LABEL --- sections (same fence convention as verify-diff Step 3 dispatch):
--- TARGET FILE ---: absolute path + full current content--- COMPACTION HEURISTICS ---: the four heuristics enumerated in references/compaction-mode.md § Heuristics (class-level extension merge / similar-entry merge / example reference extraction / one-shot incident dropout) — emit into mechanical_edits / structural_notes--- CONSOLIDATION HEURISTICS ---: the four heuristics enumerated in references/compaction-mode.md § Consolidation heuristics — emit into consolidation_proposals only, gated by the resolved min_cluster_size--- TARGET CHARS ---: the resolved compaction_threshold--- MIN CLUSTER SIZE ---: the resolved min_cluster_size integer--- ITER INFO ---: current iter number (1 or 2), max_iter (2). On iter 2, also include a one-line summary of what iter 1 applied (the count of mechanical_edits landed and the iter-1 chars_after figure) so the subagent can plan an additional pass. Note: consolidation_proposals are collected from iter 1 only and the subagent should not re-emit them on iter 2--- COMPACTOR PROMPT ---: the subagent instructions, including the mechanical_edits old_string uniqueness convention (1–3 lines of surrounding context, per the verify-diff convention) and the two-heuristic-set / distinct-output-array routing (see references/compaction-mode.md § Contract). Include the body verbatim from references/compaction-mode.md--- RESPONSE FORMAT ---: the fenced JSON schema the subagent must emit (per-iter response, not the top-level skill return shape)(b) Parse: parse the subagent's fenced JSON response. Evaluate in this order, first match wins (same evaluate-in-order discipline as verify-diff § (b) Parse & apply):
status: "error", reason: "verdict parse failure"mechanical_edits, structural_notes, consolidation_proposals) are missing, values are not arrays, or any entry fails its expected shape: each mechanical_edits entry needs non-empty string file, old_string, new_string; each structural_notes entry needs non-empty string file, description, rationale; each consolidation_proposals entry needs non-empty string file, non-empty cluster_bullets array (each item with non-empty string line_range and snippet), merged_principle object with non-empty string name and text, and non-empty replacements array (each item with non-empty string line_range, strategy ∈ {"delete", "cross_ref"}, and — when strategy: "cross_ref" — non-empty string cross_ref_text) → terminate with per-file status: "error", reason: "verdict schema violation". Validating entry shape here prevents a malformed entry from crashing downstream consumers (Edit calls for mechanical_edits, caller-side rendering for consolidation_proposals). For forward-compat with older subagent prompts that do not emit consolidation_proposals, the main thread treats a missing consolidation_proposals key (not present in the JSON) as an empty array; only an explicitly non-array value triggers the schema-violation pathi ≥ 2) — the (remaining_edits_count, structural_notes_count) multiset matches iter i − 1's same multiset (the subagent is not making forward progress) → terminate with per-file status: "unresolved", reason: "no progress between iters"(c) Apply (per-iter): this phase has two sub-phases — (c1) mechanical_edits apply (compaction heuristics 1–4), followed by (c2) consolidation_proposals main-thread synthesis (iter 1 only; consolidation heuristics 1–4). Both sub-phases share the iter-level applied_edits_count counter.
(c1) mechanical_edits apply: for each entry in mechanical_edits, re-Read the target file (so old_string matches the current contents after any earlier edit in this iter), then call Edit. Scope rail: before each Edit, verify the entry's file equals the target file's path; if not, skip that entry (no working-tree write) and record the rejected path. This mirrors verify-diff Auto-derive A2 (c) Scope rail. If old_string is not found, skip that entry — this is the expected no-op fallback for overlapping edits emitted from the same iter-1 snapshot. Increment the iter-level applied_edits_count only for entries whose Edit call succeeded.
(c2) consolidation_proposals main-thread synthesis (iter 1 only): for each cluster in consolidation_proposals (process clusters sequentially — cluster A complete before cluster B begins, so cluster B's bullet extraction reads the post-cluster-A working-tree state). Per § consolidation_proposals schema in references/compaction-mode.md, the subagent does not emit mechanical_edits for these proposals — main thread synthesizes the Edit calls from the proposal's cluster_bullets + merged_principle + replacements fields. Per-cluster procedure:
cluster_bullets[i], use snippet (≤120 chars prefix, canonical tail-truncate / no ellipsis / leading bullet prefix preserved form per schema) as a byte-level prefix-match seed against current file lines. Extract the full bullet body excluding the trailing newline — the surrounding \n is preserved on disk by Edit's in-place replacement semantics for steps 4 (insertion) and 5 (cross_ref), and the trailing \n is appended explicitly to old_string by step 5's delete strategy. Tie-breaker (best-effort): if multiple lines prefix-match, use cluster_bullets[i].line_range as the authoritative selector. Note that (c1)'s mechanical_edits apply earlier in the same iter may have shifted line numbers since the subagent's iter-1 snapshot — line_range is best-effort against the post-(c1) file; if the snippet collides with multiple lines AND line_range no longer resolves to a prefix-matching line, treat the bullet as unresolvable and let the resulting Edit no-op-skip via the standard verbatim-not-found fallback (no wrong-line edit lands). Comparison is byte-level — do not interpret backticks or regex metacharacters. Multi-line bullet: if line_range.M > L, extract lines L through M inclusive as the full bullet body (multi-line old_string — same trailing-newline-excluded convention; the final line's \n is omitted).references/compaction-mode.md § consolidation_proposals schema's replacements paragraph, the subagent may emit both a {strategy: "delete"} and a {strategy: "cross_ref"} entry for the same line_range when the choice is ambiguous. Main thread prefers cross_ref over delete to preserve incident pointers per § Preservation rules (iii)–(iv); ignore the delete entry when both are present.Edit with old_string = cluster_bullets[0] full bullet, new_string = - + merged_principle.text + \n + the original bullet body. This inserts the merged principle immediately above the first cluster bullet.replacements[j] entries from step 3 (a single chosen strategy per line_range after the cross_ref-over-delete preference is applied; cluster_bullets[i] without a matching replacements[] entry — e.g., cluster_bullets[0] when the subagent only emitted replacement strategies for i ≥ 1 — is left in place as the anchor for step 4's insertion and is not edited here). For each selected replacements[j], join back to the corresponding cluster_bullets[i] via line_range to obtain the full bullet body, then:
strategy: "cross_ref": Edit with old_string = full bullet, new_string = - + cross_ref_textstrategy: "delete": Edit with old_string = full bullet + trailing \n, new_string = ""The same scope rail and no-op-fallback semantics from (c1) apply: any entry whose file does not match the dispatched target path is skipped (recorded as rejected); any old_string not found in the current content is skipped (treated as overlapping-edit no-op fallback). Increment applied_edits_count for each successful Edit — the counter is shared between (c1) and (c2), so a post-(c) applied_edits_count > 0 means at least one compaction edit or consolidation edit landed.
(d) Per-iter convergence check: re-Read the target file to measure chars_after_iter_i. If chars_after_iter_i ≤ compaction_threshold, the file's compaction work is complete; terminate the loop. The per-file status is then skipped-below-threshold when the cumulative applied_edits_count across iters is 0 (no compaction-or-consolidation edits ever landed — this can only happen when the file was already at-or-below threshold on entry, since otherwise (d)'s convergence check would have been false and the loop would have continued to iter 2 or terminated via (e) as partial), or converged when the cumulative applied_edits_count > 0 (one or more edits — compaction-mechanical or consolidation-synthesized — landed and the file is now at-or-below threshold). Per (c1) + (c2), applied_edits_count aggregates both edit classes, so the convergence check is uniform across compaction-only / consolidation-only / mixed runs
(e) Continue or terminate: if i < max_iterations and not converged, proceed to iter i + 1 (back to (a)). If i == max_iterations and not converged, terminate the loop with per-file status: "partial" (the file was reduced but did not reach the threshold)
(f) Per-file record: at file completion, aggregate:
path, chars_before, chars_after (the latest measured), iterations_usedapplied_edits_count (sum across iters)structural_notes — captured from iter 1 only (treat iter 1 as the source of truth; iter 2 re-runs the heuristics on already-modified content and may return drifted notes — same inferred_intent persistence discipline as verify-diff). If iter 1 produced no parseable verdict (terminated via the (b) error paths), structural_notes is []consolidation_proposals — same iter-1-only discipline as structural_notes above. Iter 2's consolidation_proposals_count is ignored (the subagent should not re-emit them, and the main thread does not consume them if returned). If iter 1 produced no parseable verdict, consolidation_proposals is []per_file_status ∈ {converged, partial, unresolved, error, skipped-below-threshold}. Set by (d) (converged or skipped-below-threshold per the threshold-vs-applied-edits discrimination), (e) (partial), or (b) (error / unresolved). The skipped-below-threshold value's semantic is widened: it now means "compaction skipped because the file was already at-or-below threshold (no compaction-or-consolidation edits landed — cumulative applied_edits_count == 0), but Step CP2 still ran the per-file dispatch and any consolidation_proposals / structural_notes may be present"below_threshold = chars_after ≤ compaction_thresholdreason (set only when per_file_status ∈ {error, unresolved}; omitted otherwise — including for converged / partial / skipped-below-threshold)Important: consolidation_proposals are auto-applied by the main thread synthesis sub-phase (c2) — the subagent still emits them as detection-only output (the subagent does not call Edit itself, per the analysis-only / file-write contract in references/compaction-mode.md § Forbidden tool calls and § consolidation_proposals schema's Materialization disposition), and the main thread synthesizes the corresponding Edit calls from the cluster description. The consolidation_proposals array in the per-file record is therefore now the applied-cluster trace (surfaced alongside the resulting file-content change), not a caller-judgment note. structural_notes remain not applied by this mode — they are surfaced as caller-judgment notes (the caller, e.g. dev-workflow Step 11 user-gate, decides whether to act). This matches the skill-review semantic for structural notes.
Run Security Self-Check (same as Step 6.5 in Full Extraction Mode) on all modified files. If any sensitive content is detected, revert the file via Bash(git checkout HEAD -- <path>) and record the file in files_processed with the following fixed shape (overrides the per-file record produced by Step CP2 (f)):
path: the reverted file's pathper_file_status: "error"reason: "security check failed"applied_edits_count: 0 (the revert wiped this file's landed edits — they no longer exist on disk)iterations_used: the count of iters whose subagent dispatch returned a verdict before the revert (carry over from Step CP2 (f))structural_notes: carry over from Step CP2 (f) (iter-1 captured notes survive the revert because they are caller-judgment notes about the file's prose, not edits that were wiped)consolidation_proposals: carry over from Step CP2 (f) (same reasoning as structural_notes — cluster proposals are not file edits)chars_before: the pre-Step-CP2 measurement (carry over from Step CP2 (f))chars_after: the post-revert measurement, which equals chars_before since the revert restored the file to its pre-edit statebelow_threshold: recomputed against the post-revert chars_after (so this matches whatever the file's threshold relation was before Step CP2 ran)Emit a single fenced JSON block at the end of the response, matching the schema:
{
"status": "compacted" | "no-actionable" | "error",
"compaction_threshold": <int>,
"min_cluster_size": <int>,
"files_processed": [
{
"path": "<abs-path>",
"chars_before": <int>,
"chars_after": <int>,
"iterations_used": <int>,
"applied_edits_count": <int>,
"structural_notes": [
{"description": "<str>", "rationale": "<str>"}
],
"consolidation_proposals": [
{
"cluster_bullets": [{"line_range": "<L:M>", "snippet": "<str>"}],
"merged_principle": {"name": "<str>", "text": "<str>"},
"replacements": [
{"line_range": "<L:M>", "strategy": "delete"},
{"line_range": "<L:M>", "strategy": "cross_ref", "cross_ref_text": "<str>"}
]
}
],
"per_file_status": "converged" | "partial" | "unresolved" | "error" | "skipped-below-threshold",
"below_threshold": <bool>,
"reason": "<optional, required when per_file_status=error or unresolved>"
}
],
"reason": "<optional, required when top-level status=error>"
}
Top-level status mapping (3-way OR — fires when any of mechanical_edits / consolidation_proposals / structural_notes is non-empty on any file):
compacted: at least one file in files_processed has applied_edits_count > 0 OR non-empty consolidation_proposals[] OR non-empty structural_notes[]no-actionable: the target set was empty, or every file satisfies all three of (applied_edits_count == 0, empty consolidation_proposals[], empty structural_notes[]) — including all-error cases where iter-1 verdicts failed before notes / proposals could be collectederror: top-level dispatch error (e.g. settings load failure, output directory missing). Per-file dispatch errors do not propagate to the top — they appear inside files_processed with per_file_status: "error" under top-level status: "compacted" (when at least one other file applied edits or produced notes / proposals) or status: "no-actionable" (when no file produced any actionable output)reason enum (closed list — callers may switch on these values deterministically):
reason (set when per_file_status ∈ {error, unresolved}):
"verdict parse failure" — subagent response had no fenced JSON block or failed to parse (Step CP2 (b) #1)"verdict schema violation" — required keys missing, values not arrays, or entry shape failed (Step CP2 (b) #2)"no progress between iters" — divergence check on iter i ≥ 2 matched the prior iter's multiset (Step CP2 (b) #3)"security check failed" — Step CP3 detected sensitive content and reverted the filereason (set when status == "error"):
"output directory not found" — Step CP1 step 2 directory check failed"no targets resolved" — used with status: "no-actionable" from Step CP1 step 4 (top-level reason is optional in no-actionable; this token is its canonical value)Partial results: when top-level status: "compacted", individual files in files_processed may carry per_file_status of error / unresolved / partial / skipped-below-threshold mixed with converged. Callers should branch on per_file_status per file rather than assume uniform success. (skipped-below-threshold appears in explicit-paths mode for files whose char count was already at-or-below compaction_threshold — Step CP2 still ran on them for the consolidation pass, so they may carry consolidation_proposals / structural_notes even though applied_edits_count == 0.)
See § Sub-skill caller directive at the bottom of this SKILL.md.
When --from-pr is specified, extract rules from PR review comments (human comments only).
Single or multiple PRs can be specified. Numbers and URLs can be mixed. Cross-repository PRs are allowed.
Read references/pr-review-mode.md for the full processing steps (P1-P5). Key flow:
gh CLI authentication)references/extraction-criteria.md).examples.md (same as Step C5)When invoked as a sub-skill (i.e. via Skill(extract-rules) from an orchestrator such as dev-workflow Step 11), the fenced JSON verdict block this skill emits in --compact mode is the structured return value of the skill's procedure — it is not a deliverable to the user, and emitting it does not terminate the orchestrator's turn. The same agent that ran this skill must immediately issue the next tool call dictated by the orchestrator's flow (see dev-workflow SKILL.md § No-Stall Principle; orchestrators that surface a per-callee guidance bullet — e.g. dev-workflow's **Pre-invocation reminder** at Step 11's **Char-count compaction gate** paragraph — name the specific next action there). Do not insert a prose summary, an acknowledgment, or a "shall I proceed?" sentence between the JSON verdict and the next tool call. Only one fenced JSON block — the verdict block — appears in the response, so callers can locate it unambiguously. The skill's own procedure is over; the orchestrator's procedure continues without pause.
This directive applies specifically to --compact mode. Other modes (Full Extraction, Update, Restructure, Conversation, PR Review) produce prose reports rather than fenced JSON verdicts and are not subject to this contract.
If a ~/.claude/stop-hook-git-check.sh style Stop hook is registered, it may fire mid-dispatch with uncommitted-change feedback. This is a known structural conflict between non-interactive orchestrator flows (such as dev-workflow Step 11) and per-turn hooks — see § Stop hook structural conflict in dev-workflow SKILL.md (the canonical orchestrator for --compact invocations; dev-workflow-triage carries a parallel section for its own per-Finding flow). Ignore such feedback and continue the prescribed flow; per-file flow runs that complete the orchestrator's sub-step boundaries are the canonical completion signal.
development
Guided development workflow that orchestrates planning, review, implementation, testing, and rule maintenance. Use this skill whenever the user wants to develop a feature, fix a bug, refactor code, or make any code changes following a structured process — even if they don't explicitly mention "workflow" and simply describe what they want built or fixed.
development
Guided development workflow that orchestrates planning, review, implementation, testing, and rule maintenance. Use this skill whenever the user wants to develop a feature, fix a bug, refactor code, or make any code changes following a structured process — even if they don't explicitly mention "workflow" and simply describe what they want built or fixed.
development
Review changed code for reuse, quality, and efficiency, then apply cleanup edits. Dispatches a fresh subagent per iteration to surface mechanical cleanup edits and structural notes; the main thread applies the edits and re-dispatches until the subagent declares no further edits. Non-interactive — no user prompts. Use after implementation as a code-cleanup pass complementary to correctness review.
development
Review changed code for reuse, quality, and efficiency, then apply cleanup edits. Dispatches a fresh subagent per iteration to surface mechanical cleanup edits and structural notes; the main thread applies the edits and re-dispatches until the subagent declares no further edits. Non-interactive — no user prompts. Use after implementation as a code-cleanup pass complementary to correctness review.