Adoption

Agent Skills are supported by leading AI development tools.

VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory VS Code Gemini CLI GitHub Goose Amp Cursor Claude Code Letta OpenCode Claude OpenAI Codex Factory

rshankras/characterization-test-generator

Name: characterization-test-generator
Author: rshankras

skills/testing/characterization-test-generator/SKILL.md

npx skillsauth add rshankras/claude-code-apple-skills characterization-test-generator

Clean

TrivyContainer and dependency vulnerability scanner

Clean

SemgrepStatic code analysis for vulnerabilities

Clean

mcp-scan (Snyk)Model Context Protocol security validation

Skipped

Snyk (dep)Open source security scanning

Skipped

Socket.devSupply chain security analysis

Skipped

VirusTotalMulti-engine malware detection

Skipped

CrowdStrikeAdvanced threat intelligence

Skipped

OSV-ScannerOpen Source Vulnerability database check

Skipped

OWASP Dep-Check

Characterization Test Generator

Generate tests that document what existing code actually does — not what it should do. These tests capture current behavior so you can refactor with confidence, especially when using AI to modify code.

When This Skill Activates

Use this skill when the user:

Says "I need to refactor this" or "help me refactor"
Wants to "add tests before changing code"
Asks for "characterization tests" or "golden master tests"
Says "I want to safely modify this class/module"
Mentions "legacy code" or "untested code"
Wants AI to refactor but is worried about breaking things

Why This Matters for AI-Generated Code

AI refactoring is powerful but risky:

AI doesn't remember why code works a certain way
AI may "improve" code and break subtle behavior
Characterization tests freeze current behavior as a regression suite
If any test fails after refactoring, you know something changed

Process

Phase 1: Discover Code Under Test

Glob: **/*.swift (in the target module)
Grep: "class |struct |enum |protocol |func " (public API surface)

Identify:

[ ] Public types and their public methods
[ ] Input types and return types
[ ] Side effects (network, disk, UserDefaults, notifications)
[ ] Dependencies (injected or created internally)
[ ] State mutations (properties that change)

Phase 2: Classify Behavior

For each public method, classify:

| Category | Example | Test Strategy | |----------|---------|---------------| | Pure computation | calculate(a, b) -> c | Input/output pairs | | State mutation | addItem(_:) modifies array | Before/after state checks | | Async operation | fetchData() async -> [Item] | Mock dependencies, verify results | | Side effect | save() writes to disk | Verify mock was called | | Event emission | delegate?.didUpdate() | Capture delegate calls | | Error path | Throws on invalid input | Verify correct error type |

Phase 3: Generate Tests

Test Naming Convention

Characterization tests should be clearly labeled:

import Testing
@testable import YourApp

@Suite("Characterization: ItemManager")
struct ItemManagerCharacterizationTests {
    // Tests document CURRENT behavior, not ideal behavior
}

Template: Pure Computation

@Test("current behavior: calculates total with tax")
func calculatesTotal() {
    let calculator = PriceCalculator()

    let result = calculator.total(subtotal: 100.0, taxRate: 0.08)

    // Document actual behavior — even if the rounding seems wrong
    #expect(result == 108.0)
}

Template: State Mutation

@Test("current behavior: addItem appends and sorts by date")
func addItemSortsbyDate() {
    let manager = ItemManager()
    let older = Item(title: "A", date: .distantPast)
    let newer = Item(title: "B", date: .now)

    manager.addItem(older)
    manager.addItem(newer)

    // Document actual ordering behavior
    #expect(manager.items.first?.title == "A")
    #expect(manager.items.last?.title == "B")
    #expect(manager.items.count == 2)
}

Template: Async with Dependencies

@Test("current behavior: loadItems returns cached data when offline")
func loadItemsOffline() async throws {
    let mockNetwork = MockNetworkClient(shouldFail: true)
    let mockCache = MockCache(items: [Item.sample])
    let service = ItemService(network: mockNetwork, cache: mockCache)

    let items = try await service.loadItems()

    // Document: falls back to cache when network fails
    #expect(items.count == 1)
    #expect(items.first?.title == Item.sample.title)
}

Template: Side Effects

@Test("current behavior: save writes to UserDefaults")
func saveWritesToDefaults() {
    let defaults = MockUserDefaults()
    let settings = SettingsManager(defaults: defaults)

    settings.setTheme(.dark)
    settings.save()

    // Document: saves as string, not as enum raw value
    #expect(defaults.lastSetValue as? String == "dark")
    #expect(defaults.lastSetKey == "app_theme")
}

Template: Error Paths

@Test("current behavior: throws on empty title")
func throwsOnEmptyTitle() {
    let validator = ItemValidator()

    #expect(throws: ValidationError.emptyField("title")) {
        try validator.validate(Item(title: "", date: .now))
    }
}

Template: Edge Cases

@Test("current behavior: handles nil optional gracefully")
func handlesNilOptional() {
    let parser = DataParser()

    let result = parser.parse(data: nil)

    // Document: returns empty array on nil, doesn't crash
    #expect(result.isEmpty)
}

@Test("current behavior: handles empty collection")
func handlesEmptyCollection() {
    let aggregator = StatsAggregator()

    let stats = aggregator.compute(values: [])

    // Document: returns zeroes, not NaN or crash
    #expect(stats.average == 0.0)
    #expect(stats.count == 0)
}

Phase 4: Fill in Actual Values

This is the key step. For each test:

Read the source code to understand what the method actually returns
Trace the logic for the given inputs
Write the assertion with the actual value, even if it seems wrong

// ❌ Wrong — this is what you WANT it to do
#expect(result == expectedCorrectValue)

// ✅ Right — this is what it ACTUALLY does
#expect(result == actualCurrentValue)  // Note: off-by-one, but current behavior

Add comments for surprising behavior:

// CHARACTERIZATION: This returns 11 not 10 due to inclusive range.
// Don't "fix" this until intentionally changing behavior.
#expect(range.count == 11)

Phase 5: Verify and Lock

Run all characterization tests — they must ALL pass
If any fail, adjust assertions to match actual behavior
Tag them so they're easy to find later:

@Suite("Characterization: ItemManager")
@Tag(.characterization)
struct ItemManagerCharacterizationTests { ... }

// Define the tag
extension Tag {
    @Tag static var characterization: Self
}

Run selectively:

# Run only characterization tests
xcodebuild test -scheme YourApp \
  -only-testing "YourAppTests/ItemManagerCharacterizationTests"

Output Format

## Characterization Tests Generated

**Module**: [Module name]
**Classes tested**: [List]
**Tests generated**: [Count]

### Coverage Summary

| Class | Methods Covered | Edge Cases | Notes |
|-------|----------------|------------|-------|
| ItemManager | 5/7 | 3 | 2 private methods skipped |
| PriceCalculator | 3/3 | 2 | All public API covered |

### Files Created
- `Tests/CharacterizationTests/ItemManagerCharacterizationTests.swift`
- `Tests/CharacterizationTests/PriceCalculatorCharacterizationTests.swift`

### Surprising Behaviors Found
- `PriceCalculator.total()` rounds DOWN, not to nearest cent
- `ItemManager.sort()` is unstable — equal dates may reorder

### Ready to Refactor
All [X] characterization tests passing. Safe to refactor with AI.
Run `xcodebuild test` after each change to verify no behavior changed.

When NOT to Use This

Code is already well-tested (check coverage first)
You're deleting the code entirely (no need to characterize)
The code is trivially simple (single-line computed properties)
You want to change the behavior (use tdd-bug-fix instead)

References

Michael Feathers, Working Effectively with Legacy Code — coined "characterization test"
generators/test-generator/ — for standard test generation (not characterization)
testing/tdd-refactor-guard/ — pre-refactor checklist that uses these tests

rshankras/characterization-test-generator

skills/testing/characterization-test-generator/SKILL.md

Generates tests that capture current behavior of existing code before refactoring. Use when you need a safety net before AI-assisted refactoring or modifying legacy code.

172 stars

development

Updated Apr 11, 2026

$ install --global

skillsauth

npx skillsauth add rshankras/claude-code-apple-skills characterization-test-generator

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners Passed

Scanners in report

Clean

TrivyContainer and dependency vulnerability scanner

95%

Clean

SemgrepStatic code analysis for vulnerabilities

95%

Clean

mcp-scan (Snyk)Model Context Protocol security validation

95%

Skipped

Snyk (dep)Open source security scanning

50%

Skipped

Socket.devSupply chain security analysis

50%

Skipped

VirusTotalMulti-engine malware detection

50%

Skipped

CrowdStrikeAdvanced threat intelligence

50%

Skipped

OSV-ScannerOpen Source Vulnerability database check

50%

Skipped

OWASP Dep-Check

50%

Last scanned: Apr 11, 2026, 10:44 PM4.4s1 file scanned

SKILL.md

name:: characterization-test-generator
description:: Generates tests that capture current behavior of existing code before refactoring. Use when you need a safety net before AI-assisted refactoring or modifying legacy code.
allowed-tools:: [Read, Write, Edit, Glob, Grep, Bash, AskUserQuestion]

Characterization Test Generator

When This Skill Activates

Use this skill when the user:

Says "I need to refactor this" or "help me refactor"
Wants to "add tests before changing code"
Asks for "characterization tests" or "golden master tests"
Says "I want to safely modify this class/module"
Mentions "legacy code" or "untested code"
Wants AI to refactor but is worried about breaking things

Why This Matters for AI-Generated Code

AI refactoring is powerful but risky:

AI doesn't remember why code works a certain way
AI may "improve" code and break subtle behavior
Characterization tests freeze current behavior as a regression suite
If any test fails after refactoring, you know something changed

Process

Phase 1: Discover Code Under Test

Glob: **/*.swift (in the target module)
Grep: "class |struct |enum |protocol |func " (public API surface)

Identify:

[ ] Public types and their public methods
[ ] Input types and return types
[ ] Side effects (network, disk, UserDefaults, notifications)
[ ] Dependencies (injected or created internally)
[ ] State mutations (properties that change)

Phase 2: Classify Behavior

For each public method, classify:

Phase 3: Generate Tests

Test Naming Convention

Characterization tests should be clearly labeled:

import Testing
@testable import YourApp

@Suite("Characterization: ItemManager")
struct ItemManagerCharacterizationTests {
    // Tests document CURRENT behavior, not ideal behavior
}

Template: Pure Computation

@Test("current behavior: calculates total with tax")
func calculatesTotal() {
    let calculator = PriceCalculator()

    let result = calculator.total(subtotal: 100.0, taxRate: 0.08)

    // Document actual behavior — even if the rounding seems wrong
    #expect(result == 108.0)
}

Template: State Mutation

@Test("current behavior: addItem appends and sorts by date")
func addItemSortsbyDate() {
    let manager = ItemManager()
    let older = Item(title: "A", date: .distantPast)
    let newer = Item(title: "B", date: .now)

    manager.addItem(older)
    manager.addItem(newer)

    // Document actual ordering behavior
    #expect(manager.items.first?.title == "A")
    #expect(manager.items.last?.title == "B")
    #expect(manager.items.count == 2)
}

Template: Async with Dependencies

@Test("current behavior: loadItems returns cached data when offline")
func loadItemsOffline() async throws {
    let mockNetwork = MockNetworkClient(shouldFail: true)
    let mockCache = MockCache(items: [Item.sample])
    let service = ItemService(network: mockNetwork, cache: mockCache)

    let items = try await service.loadItems()

    // Document: falls back to cache when network fails
    #expect(items.count == 1)
    #expect(items.first?.title == Item.sample.title)
}

Template: Side Effects

@Test("current behavior: save writes to UserDefaults")
func saveWritesToDefaults() {
    let defaults = MockUserDefaults()
    let settings = SettingsManager(defaults: defaults)

    settings.setTheme(.dark)
    settings.save()

    // Document: saves as string, not as enum raw value
    #expect(defaults.lastSetValue as? String == "dark")
    #expect(defaults.lastSetKey == "app_theme")
}

Template: Error Paths

@Test("current behavior: throws on empty title")
func throwsOnEmptyTitle() {
    let validator = ItemValidator()

    #expect(throws: ValidationError.emptyField("title")) {
        try validator.validate(Item(title: "", date: .now))
    }
}

Template: Edge Cases

@Test("current behavior: handles nil optional gracefully")
func handlesNilOptional() {
    let parser = DataParser()

    let result = parser.parse(data: nil)

    // Document: returns empty array on nil, doesn't crash
    #expect(result.isEmpty)
}

@Test("current behavior: handles empty collection")
func handlesEmptyCollection() {
    let aggregator = StatsAggregator()

    let stats = aggregator.compute(values: [])

    // Document: returns zeroes, not NaN or crash
    #expect(stats.average == 0.0)
    #expect(stats.count == 0)
}

Phase 4: Fill in Actual Values

This is the key step. For each test:

Read the source code to understand what the method actually returns
Trace the logic for the given inputs
Write the assertion with the actual value, even if it seems wrong

// ❌ Wrong — this is what you WANT it to do
#expect(result == expectedCorrectValue)

// ✅ Right — this is what it ACTUALLY does
#expect(result == actualCurrentValue)  // Note: off-by-one, but current behavior

Add comments for surprising behavior:

// CHARACTERIZATION: This returns 11 not 10 due to inclusive range.
// Don't "fix" this until intentionally changing behavior.
#expect(range.count == 11)

Phase 5: Verify and Lock

Run all characterization tests — they must ALL pass
If any fail, adjust assertions to match actual behavior
Tag them so they're easy to find later:

@Suite("Characterization: ItemManager")
@Tag(.characterization)
struct ItemManagerCharacterizationTests { ... }

// Define the tag
extension Tag {
    @Tag static var characterization: Self
}

Run selectively:

# Run only characterization tests
xcodebuild test -scheme YourApp \
  -only-testing "YourAppTests/ItemManagerCharacterizationTests"

Output Format

## Characterization Tests Generated

**Module**: [Module name]
**Classes tested**: [List]
**Tests generated**: [Count]

### Coverage Summary

| Class | Methods Covered | Edge Cases | Notes |
|-------|----------------|------------|-------|
| ItemManager | 5/7 | 3 | 2 private methods skipped |
| PriceCalculator | 3/3 | 2 | All public API covered |

### Files Created
- `Tests/CharacterizationTests/ItemManagerCharacterizationTests.swift`
- `Tests/CharacterizationTests/PriceCalculatorCharacterizationTests.swift`

### Surprising Behaviors Found
- `PriceCalculator.total()` rounds DOWN, not to nearest cent
- `ItemManager.sort()` is unstable — equal dates may reorder

### Ready to Refactor
All [X] characterization tests passing. Safe to refactor with AI.
Run `xcodebuild test` after each change to verify no behavior changed.

When NOT to Use This

Code is already well-tested (check coverage first)
You're deleting the code entirely (no need to characterize)
The code is trivially simple (single-line computed properties)
You want to change the behavior (use tdd-bug-fix instead)

References

Michael Feathers, Working Effectively with Legacy Code — coined "characterization test"
generators/test-generator/ — for standard test generation (not characterization)
testing/tdd-refactor-guard/ — pre-refactor checklist that uses these tests

Related Skills

rshankras/external-purchases

development

Community

US web checkout via the StoreKit External Purchase Link entitlement — currently 0% Apple commission (litigation ongoing), how to ship it safely, and how to architect for a commission flip so a future ruling is a config change, not a rewrite. Use when adding external purchase links, weighing web checkout vs IAP, or planning US-storefront pricing strategy.

562SKILL.mdUpdated Jul 14, 2026

rshankras/external-purchases

Security Scans

mcp-scan — Pending Scan

Semgrep — Pending Scan

Trivy — Pending Scan

OWASP — Pending Scan

VirusTotal — Pending Scan

rshankras/bundles-and-licensing

tools

VerifiedTrustedCommunity

Revenue beyond the single-app price tag — own-app bundles, Family Sharing as a conversion lever, cross-developer bundles & suites, and institutional licensing via Group Purchases / Apple School & Business Manager. Use when a developer has multiple apps, a subscription worth sharing, complementary indie partners, or school/clinic/business buyers.

562SKILL.mdUpdated Jul 14, 2026

rshankras/bundles-and-licensing

rshankras/accessibility-audit

testing

VerifiedTrustedCommunity

Run a structured accessibility audit on an iOS/macOS app — automated XCUITest audits, Accessibility Inspector, manual VoiceOver/Dynamic Type passes, and App Store Accessibility Nutrition Label evaluation. Use before release, when preparing Nutrition Label declarations, or for EU Accessibility Act compliance.

562SKILL.mdUpdated Jul 14, 2026

rshankras/accessibility-audit

rshankras/store-growth-audit

tools

VerifiedTrustedCommunity

Stage-by-stage audit of an app's App Store growth machinery against a 54-item P0–P9 playbook — every item scored from an App Store Connect MCP call, a codebase check, or an explicit question to the user, then routed to the skill or command that fixes it. Read-only on App Store Connect. Use for a growth audit or scorecard, a pre-launch growth plan, a quarterly re-audit, or "which growth levers am I missing."

562SKILL.mdUpdated Jul 14, 2026

rshankras/store-growth-audit

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Need help? View full Cowork setup guide →

Install manually

Choose your platform

# Clone the repo
git clone https://github.com/rshankras/claude-code-apple-skills.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-apple-skills/skills/testing/characterization-test-generator ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

rshankras/claude-code-apple-skills

172 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT