dot_claude/skills/refactoring-legacy-code/SKILL.md
Use when modernizing, migrating, or restructuring existing code - systematic approach to safe refactoring through characterization tests, dependency analysis, strangler fig migration, and incremental transformation; ensures no behavior changes without test coverage first | 既存コードのモダナイゼーション、移行、再構築時に使用 - 特性テスト、依存関係分析、ストラングラーフィグ移行、段階的変換による安全なリファクタリングの体系的アプローチ。テストカバレッジなしでの動作変更を防止
npx skillsauth add lv416e/dotfiles refactoring-legacy-codeInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
Big-bang rewrites fail. Refactoring without tests creates new bugs. Scope creep turns a "quick cleanup" into a six-month project.
Core principle: ALWAYS write characterization tests before changing legacy code. Understand current behavior before changing it.
Violating the letter of this process is violating the spirit of refactoring.
NO REFACTORING WITHOUT CHARACTERIZATION TESTS FIRST
If you haven't captured existing behavior in tests, you cannot change the code.
No exceptions:
Use for ANY code modernization work:
Use this ESPECIALLY when:
Don't skip when:
You MUST complete each phase before proceeding to the next.
BEFORE touching ANY code:
Map the Dependency Graph
# Example: find all consumers of a module
grep -r "import.*from.*legacy-module" --include="*.ts" src/
grep -r "require.*legacy-module" --include="*.js" src/
# Example: find all callers of a function
grep -rn "legacyFunction\(" --include="*.py" .
Identify the Blast Radius
Catalog Code Smells and Prioritize
Not all smells are equal. Prioritize by impact:
| Priority | Smell | Why | |----------|-------|-----| | Critical | Shared mutable state | Causes race conditions, impossible to reason about | | Critical | No separation of concerns | Changes ripple everywhere | | High | God classes/functions (500+ lines) | Untestable, incomprehensible | | High | Circular dependencies | Prevents modularization | | Medium | Copy-paste duplication | Bugs fixed in one copy, not others | | Medium | Primitive obsession | Stringly-typed code hides bugs | | Low | Naming conventions | Confusing but functional | | Low | Formatting inconsistency | Cosmetic, fix with tooling |
Understand Before Changing
Define the Target State
REQUIRED before any code changes. No exceptions.
Characterization tests capture CURRENT behavior, not DESIRED behavior. They answer: "What does this code actually do?"
Write Tests for Current Behavior
# Characterization test: capture what the code DOES, not what it SHOULD do
def test_legacy_price_calculator_with_negative_quantity():
# Legacy code silently returns 0 for negative quantities
# This might be a bug, but it's CURRENT behavior
result = calculate_price(item="widget", quantity=-5, price=10.00)
assert result == 0 # Captures actual behavior
def test_legacy_price_calculator_with_none_price():
# Legacy code treats None as 0 - probably wrong, but current behavior
result = calculate_price(item="widget", quantity=5, price=None)
assert result == 0 # Captures actual behavior
Cover All Code Paths
Use Approval Testing for Complex Output
For code with complex output (HTML, reports, serialized data):
def test_legacy_report_generator():
result = generate_report(sample_data)
# First run: manually approve the output as "golden"
# Subsequent runs: compare against golden file
assert result == load_golden_file("report_golden.txt")
Test the Boundaries, Not the Internals
Verify Tests Catch Changes
Characterization test coverage must be sufficient before proceeding.
Choose your pattern. Big-bang is NOT an option.
Wrap legacy code. Route new calls through new code. Gradually migrate old calls. Remove legacy when empty.
Phase A: Legacy handles everything
[All Traffic] → [Legacy System]
Phase B: New code wraps legacy, intercepts some paths
[All Traffic] → [Facade/Router]
├── [New Code] (migrated paths)
└── [Legacy Code] (remaining paths)
Phase C: All paths migrated
[All Traffic] → [New Code]
[Legacy Code] ← (dead, remove it)
Implementation:
// Step 1: Create facade that delegates to legacy
class UserServiceFacade {
private legacy = new LegacyUserService();
getUser(id: string): User {
return this.legacy.getUser(id); // Pass-through initially
}
}
// Step 2: Migrate one method at a time
class UserServiceFacade {
private legacy = new LegacyUserService();
private modern = new ModernUserService();
getUser(id: string): User {
return this.modern.getUser(id); // Migrated!
}
updateUser(id: string, data: UserData): void {
return this.legacy.updateUser(id, data); // Not yet migrated
}
}
// Step 3: When all methods migrated, remove facade and legacy
For internal modules you can't wrap with a facade:
// Step 1: Extract interface from legacy
interface DataStore {
get(key: string): Promise<unknown>;
set(key: string, value: unknown): Promise<void>;
}
// Step 2: Legacy implements interface
class LegacyFileStore implements DataStore { /* ... */ }
// Step 3: New implementation
class ModernDatabaseStore implements DataStore { /* ... */ }
// Step 4: Swap via configuration
const store: DataStore = config.useModernStore
? new ModernDatabaseStore()
: new LegacyFileStore();
function processOrder(order: Order): Result {
if (featureFlags.isEnabled('modern-order-processing', order.userId)) {
return modernProcessOrder(order);
}
return legacyProcessOrder(order);
}
Roll out to 1% of users, then 10%, 50%, 100%. Rollback instantly if issues arise.
One transformation at a time. Run tests after EVERY change.
Make ONE Change
Run ALL Tests
# After every single change
npm test # or pytest, or cargo test, or whatever
Commit Frequently
git commit -m "Extract validation logic from OrderProcessor.process()"
git commit -m "Rename UserManager to UserRepository"
git commit -m "Replace deprecated crypto.createCipher with crypto.createCipheriv"
Common Safe Transformations
These are mechanical and should not change behavior:
| Transformation | Risk | Verify | |---------------|------|--------| | Rename (variable, function, class) | Low | Tests pass, grep for old name | | Extract method/function | Low | Tests pass, behavior identical | | Inline method/function | Low | Tests pass | | Move to different file/module | Medium | Tests pass, imports updated | | Extract interface | Low | Tests pass, no behavior change | | Replace inheritance with composition | Medium | Tests pass, behavior identical | | Replace deprecated API | Medium | Tests pass, read migration guide completely | | Change data structure | High | Tests pass, check serialization/persistence |
Parallel Implementation for High-Risk Changes
When transformation risk is high:
def process_payment(order):
legacy_result = legacy_process_payment(order)
modern_result = modern_process_payment(order)
if legacy_result != modern_result:
log.error(f"MISMATCH: legacy={legacy_result}, modern={modern_result}")
# Use legacy result until mismatch rate is 0%
return legacy_result
return modern_result
Remove Dead Code
Update Documentation
Verify End-to-End
Retrospective
If you catch yourself thinking:
ALL of these mean: STOP. Return to Phase 2.
If scope is growing: You're in scope creep. Finish the current refactoring. File tickets for new work. ONE thing at a time.
| Excuse | Reality | |--------|---------| | "Code is too messy to test" | That's exactly why you need tests. Use characterization tests at the boundary. | | "Rewrite will be faster" | Rewrites take 3-10x longer than estimated. You'll re-discover every edge case the hard way. | | "Tests slow us down" | Tests slow you down 10 minutes now. No tests slow you down 10 days later. | | "I know what this code does" | You know what you THINK it does. Characterization tests reveal what it ACTUALLY does. | | "Small change, no test needed" | Small changes compound. One untested change invites another. | | "We'll migrate all at once over the weekend" | Big-bang migrations fail. Always. No exceptions. | | "Legacy code is too coupled to test" | Use seams: subclass-and-override, extract-and-override, parameterize constructor. | | "Refactoring while fixing bugs saves time" | Mixing refactoring with behavior changes makes both harder to verify. Separate them. | | "The old tests are enough" | Old tests test old behavior. Characterization tests verify you understand CURRENT behavior. | | "Feature flags add complexity" | Feature flags add CONTROLLED complexity. Big-bang adds UNCONTROLLED chaos. | | "Nobody understands this code anyway" | That's the reason to go slow, not fast. |
When migrating frameworks (e.g., class components to hooks, jQuery to React, Python 2 to 3):
# Example: find React class components
grep -rn "extends React.Component\|extends Component" --include="*.tsx" src/
# Example: React class to function component codemod
npx react-codemod rename-unsafe-lifecycles ./src
Before touching code, understand the dependency graph:
Step 1: Map imports/requires
module-a → module-b → module-c
→ module-d
module-e → module-b
Step 2: Identify the refactoring order
Leaves first: module-c, module-d (no dependents to break)
Then: module-b (update after c and d are stable)
Then: module-a, module-e (consumers of b)
Step 3: Identify circular dependencies (DANGER)
module-a → module-b → module-a ← BREAK THIS FIRST
Rules:
// Before: God module
// user-management.ts (2000 lines)
export function createUser() { /* ... */ }
export function authenticateUser() { /* ... */ }
export function getUserProfile() { /* ... */ }
export function updateUserPreferences() { /* ... */ }
export function sendUserNotification() { /* ... */ }
export function generateUserReport() { /* ... */ }
// After: Separated by concern
// users/creation.ts
export function createUser() { /* ... */ }
// users/authentication.ts
export function authenticateUser() { /* ... */ }
// users/profile.ts
export function getUserProfile() { /* ... */ }
export function updateUserPreferences() { /* ... */ }
// notifications/user-notifications.ts
export function sendUserNotification() { /* ... */ }
// reports/user-reports.ts
export function generateUserReport() { /* ... */ }
# Before: deprecated API
import warnings
with warnings.catch_warnings():
warnings.simplefilter("error", DeprecationWarning)
# If this throws, you have deprecated API usage
result = legacy_function()
# Systematic replacement
# 1. Find all usages
# 2. Replace one at a time
# 3. Test after each
# 4. Grep again to confirm zero remaining
| Phase | Key Activities | Success Criteria | |-------|---------------|------------------| | 1. Reconnaissance | Map dependencies, catalog smells, define target | Understand blast radius and end state | | 2. Characterization Tests | Test current behavior at boundaries | Intentional change causes test failure | | 3. Migration Strategy | Choose pattern (strangler fig, branch by abstraction) | Written plan with incremental steps | | 4. Execute | One change at a time, test after each, commit frequently | All tests pass after every change | | 5. Cleanup | Remove dead code, update docs, verify end-to-end | No legacy code remaining, all tests green |
This skill requires using:
Complementary skills:
Sometimes Phase 1 reveals:
These are valid outcomes. Not every piece of legacy code needs refactoring. The reconnaissance phase exists to prevent wasted effort.
But: "Too hard to test" is never a valid reason to skip refactoring. It's a reason to invest more in characterization tests.
From refactoring projects:
development
Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.
testing
Use when creating new skills, editing existing skills, or verifying skills work before deployment - applies TDD to process documentation by testing with subagents before writing, iterating until bulletproof against rationalization | 新しいスキルの作成、既存スキルの編集、またはデプロイ前にスキルが機能するか検証する際に使用 - プロセスドキュメントにTDDを適用し、記述前にサブエージェントでテストし、合理化に対して堅牢になるまで反復
development
Use when design is complete and you need detailed implementation tasks for engineers with zero codebase context - creates comprehensive implementation plans with exact file paths, complete code examples, and verification steps assuming engineer has minimal domain knowledge | 設計が完了し、コードベースの知識がゼロのエンジニア向けに詳細な実装タスクが必要な場合に使用 - 正確なファイルパス、完全なコード例、検証ステップを含む包括的な実装計画を作成。エンジニアの領域知識が最小限であることを前提
tools
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.