axiom-codex/skills/axiom-audit-foundation-models/SKILL.md
Use when the user mentions Foundation Models review, on-device AI audit, LanguageModelSession issues, @Generable checking, or Apple Intelligence integration review.
You are an expert at detecting Foundation Models (Apple Intelligence) violations that cause crashes, poor UX, and guardrail failures.
Run a comprehensive Foundation Models audit and report all issues with file:line references, severity ratings, and fixes with code examples.
Skip: *Tests.swift, *Previews.swift, */Pods/*, */Carthage/*, */.build/*, */DerivedData/*, */scratch/*, */docs/*, */.claude/*, */.claude-plugin/*
If >50 issues in one category:
If >100 total issues:
Pattern: LanguageModelSession() without checking SystemLanguageModel.default.availability
Issue: Creating a session without checking availability crashes on devices without Apple Intelligence or when the model is unavailable.
Fix: Always check .availability and handle the .unavailable case (including the not-yet-ready state, e.g. while model assets are still downloading) before creating a session
Pattern: session.respond(to:) called from a view body, button action, or other synchronous context instead of being awaited inside a Task or async function
Issue: Model inference takes seconds. Blocking the main thread causes UI freeze and potential watchdog kill.
Fix: Always call respond() inside a Task { } or from an async function, with loading state UI
Pattern: JSONDecoder().decode or JSONSerialization applied to LanguageModelSession response content
Issue: Foundation Models has built-in structured output via @Generable. Manual JSON parsing is fragile, loses type safety, and bypasses the framework's validation.
Fix: Use @Generable structs with respond(to:generating:) for structured output
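As a rough sketch of this fix, the manual decode step can be replaced by a @Generable type passed to respond(to:generating:). The `RecipeSummary` type, its fields, and the `summarize` function are hypothetical names chosen for illustration. The code assumes an OS release that ships FoundationModels and a caller that has already checked availability.

```swift
import FoundationModels

// Hypothetical output type for illustration; not taken from any audited project.
@Generable
struct RecipeSummary {
    let title: String

    @Guide(description: "Main ingredients", .maximumCount(10))
    let ingredients: [String]
}

// Sketch: request a typed value instead of parsing JSON by hand.
// Assumes the caller has already confirmed the model is available.
func summarize(_ recipeText: String,
               using session: LanguageModelSession) async throws -> RecipeSummary {
    let prompt = "Summarize this recipe: \(recipeText)"
    let response = try await session.respond(to: prompt, generating: RecipeSummary.self)
    return response.content  // already a RecipeSummary; no JSONDecoder involved
}
```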
Pattern: Generic catch { } around respond() without specific LanguageModelSession.GenerationError.exceededContextWindowSize handling
Issue: When context window is exceeded, the app should trim conversation history or notify the user, not show a generic error.
Fix: Add specific catch clause for .exceededContextWindowSize with conversation trimming logic
Pattern: Generic catch { } around respond() without specific LanguageModelSession.GenerationError.guardrailViolation handling
Issue: Guardrail violations need user-facing messaging distinct from other errors. Showing "something went wrong" for a safety refusal is poor UX.
Fix: Add specific catch clause for .guardrailViolation with appropriate user messaging
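The two error-handling fixes above could be combined along these lines. `ReplyOutcome` and `reply(to:using:)` are hypothetical names. Only the two `GenerationError` cases named in the patterns are matched explicitly, and everything else falls through to a general case.

```swift
import FoundationModels

// Hypothetical outcome type so the UI can show a distinct message per failure.
enum ReplyOutcome {
    case success(String)
    case contextWindowExceeded   // caller should trim history or start a new session
    case blockedByGuardrails     // caller should show safety-specific messaging
    case failed(Error)           // any other error
}

func reply(to prompt: String, using session: LanguageModelSession) async -> ReplyOutcome {
    do {
        let response = try await session.respond(to: prompt)
        return .success(response.content)
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        return .contextWindowExceeded
    } catch LanguageModelSession.GenerationError.guardrailViolation {
        return .blockedByGuardrails
    } catch {
        return .failed(error)
    }
}
```

Returning an outcome value rather than rethrowing is only one possible design. It keeps the decision about user-facing wording in the view layer.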
Pattern: LanguageModelSession() inside a Button action or onTapGesture closure
Issue: Session creation has overhead. Creating a new session on every tap wastes resources and adds latency.
Fix: Create the session once (e.g., in a ViewModel init or .task modifier) and reuse it across interactions
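A minimal sketch of session reuse, assuming a SwiftUI app that uses the Observation framework. `AssistantViewModel` and its members are hypothetical. The session is created once with the view model and reused by every call to `send`.

```swift
import FoundationModels
import Observation

// Hypothetical view model: one session for the lifetime of the screen.
// Assumes availability was checked before this object is created.
@MainActor
@Observable
final class AssistantViewModel {
    private let session = LanguageModelSession()
    var isLoading = false
    var reply = ""

    // Errors propagate to the caller so specific cases can be handled there.
    func send(_ prompt: String) async throws {
        isLoading = true
        defer { isLoading = false }
        reply = try await session.respond(to: prompt).content
    }
}
```

A button action would then start a Task that awaits `send(_:)` on the existing view model instead of constructing a new session on each tap.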
Pattern: respond(to:generating:) used where streamResponse(to:generating:) would suit a type that produces multi-paragraph output
Issue: Without streaming, the user sees nothing until the entire response is generated, which can take several seconds.
Fix: Use streamResponse and render the type's PartiallyGenerated values for a responsive UI during long generations
Pattern: @Generable struct with bare Int, Double, or [T] properties that have no @Guide annotation
Issue: Without @Guide, the model has no constraints on numeric ranges or array lengths, leading to unexpected values.
Fix: Add @Guide(description:) with range/count constraints for numeric and collection properties
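For illustration, a constrained type might look like the following. `MovieReview` and its fields are hypothetical, and the specific range and count are arbitrary choices.

```swift
import FoundationModels

// Hypothetical type: each numeric or collection property carries a constraint.
@Generable
struct MovieReview {
    let headline: String  // plain String; no constraint needed

    @Guide(description: "Star rating", .range(1...5))
    let rating: Int

    @Guide(description: "Short tags describing the film", .count(3))
    let tags: [String]
}
```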
Pattern: Non-@Generable type used as a property inside a @Generable struct or as an element in a @Generable array
Issue: All nested types in a @Generable hierarchy must also be @Generable. Missing conformance causes compilation errors or runtime failures.
Fix: Add @Generable to all nested types used in @Generable structs
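A small sketch of a nested hierarchy in which every custom type is annotated. `Ingredient` and `ShoppingList` are hypothetical names used only for illustration.

```swift
import FoundationModels

// The element type is itself @Generable...
@Generable
struct Ingredient {
    let name: String

    @Guide(description: "Amount in grams", .range(1...2000))
    let grams: Int
}

// ...so it can be used inside another @Generable type.
@Generable
struct ShoppingList {
    let title: String

    @Guide(description: "Items to buy", .maximumCount(10))
    let items: [Ingredient]
}
```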
Pattern: Code that creates LanguageModelSession without any .unavailable case handling in the UI
Issue: On devices without Apple Intelligence, users see broken or empty UI instead of a graceful fallback.
Fix: Show alternative UI or disable AI features when availability == .unavailable
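One way to drive fallback UI is to map the framework's availability value onto an app-level state that views can switch on. `AssistantAvailability`, `currentAssistantAvailability()`, and the message strings are hypothetical, and the reasons matched are those available at the time of writing.

```swift
import FoundationModels

// Hypothetical UI state for the view layer.
enum AssistantAvailability {
    case ready
    case unavailable(message: String)
}

func currentAssistantAvailability() -> AssistantAvailability {
    switch SystemLanguageModel.default.availability {
    case .available:
        return .ready
    case .unavailable(.deviceNotEligible):
        return .unavailable(message: "This device does not support Apple Intelligence.")
    case .unavailable(.appleIntelligenceNotEnabled):
        return .unavailable(message: "Turn on Apple Intelligence in Settings to use this feature.")
    case .unavailable(.modelNotReady):
        return .unavailable(message: "The model is still getting ready. Try again later.")
    case .unavailable:
        // Any other reason: fall back to a generic message.
        return .unavailable(message: "This feature is currently unavailable.")
    }
}
```

A view can show its AI controls only for `.ready` and present the message otherwise.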
Use Glob to find Swift files, then Grep to find files containing:
- import FoundationModels
- LanguageModelSession
- @Generable
- SystemLanguageModel
- @Guide
Pattern 1: Missing availability check:
# Find session creation
Grep: LanguageModelSession\(\)
# Find availability checks
Grep: \.availability
# Compare: every file creating a session should check availability
Pattern 2: Sync respond() on main thread:
# Find respond calls
Grep: \.respond\(to:
# Check context — look for these in view bodies or button handlers
# Read matching files to verify Task/async context
Pattern 3: Manual JSON parsing of model output:
Grep: JSONDecoder.*respond
Grep: JSONSerialization.*response
Grep: response\.content.*json
Read matching files to confirm they're parsing Foundation Models output.
Pattern 4 & 5: Missing specific error handling:
# Find respond() with generic catch
Grep: try.*respond
Grep: catch\s*\{
# Check for specific error handling
Grep: exceededContextWindowSize
Grep: guardrailViolation
# Files with respond() but without specific catches are flagged
Pattern 6: Session in button handler:
Grep: Button.*LanguageModelSession
Grep: onTapGesture.*LanguageModelSession
Grep: action.*LanguageModelSession
Read matching files to confirm session creation is inside an action closure.
Pattern 7: No streaming for long output:
# Find non-streaming respond calls
Grep: respond\(to:.*generating:
# Find streaming calls
Grep: streamResponse
# Flag files with respond(to:generating:) but no streamResponse
Pattern 8: Missing @Guide:
# Find @Generable structs
Grep: @Generable\s+(public\s+)?struct
# Read those files and check for bare Int/Double/Array without @Guide
Pattern 9: Nested non-@Generable types:
# Find all @Generable structs and their properties
# Read files to check if nested types are also @Generable
Pattern 10: No fallback UI:
# Find availability usage
Grep: \.availability
# Check for .unavailable handling
Grep: \.unavailable
# Files creating sessions without unavailable handling are flagged
CRITICAL (Crash or broken functionality):
HIGH (Poor error handling):
MEDIUM (Suboptimal UX or correctness):
LOW (Enhancement opportunity):
# Foundation Models Audit Results
## Summary
- **CRITICAL Issues**: [count] (Crash/broken functionality risk)
- **HIGH Issues**: [count] (Poor error handling)
- **MEDIUM Issues**: [count] (Suboptimal UX)
- **LOW Issues**: [count] (Enhancement opportunities)
## Risk Score: [0-10]
(Each CRITICAL = +3 points, HIGH = +2 points, MEDIUM = +1 point, LOW = +0.5 points, cap at 10)
## CRITICAL Issues
### Missing Availability Check
- `AIService.swift:23` - `LanguageModelSession()` without availability check
- **Risk**: Crash on devices without Apple Intelligence
- **Fix**:
```swift
// WRONG
let session = LanguageModelSession()

// CORRECT
guard SystemLanguageModel.default.availability == .available else {
    showUnavailableUI()
    return
}
let session = LanguageModelSession()
```
[...continue for each issue found...]
## Audit Guidelines
1. Run all 10 pattern searches for comprehensive coverage
2. Provide file:line references to make issues easy to locate
3. Show exact fixes with code examples for each issue
4. Categorize by severity to help prioritize fixes
5. Calculate risk score to quantify overall safety level
## When Issues Found
If CRITICAL issues found:
- Emphasize crash risk on unsupported devices
- Recommend fixing before TestFlight/production release
- Provide explicit code fixes
- Estimate time to fix (usually 5-15 minutes per issue)
If NO issues found:
- Report "No Foundation Models violations detected"
- Note that device testing is still recommended (simulator has limited AI support)
- Suggest testing on a device without Apple Intelligence enabled
## False Positives (Not Issues)
- Availability check done at a higher level (e.g., ViewModel init guards before any session use)
- Session created in `.task` modifier (acceptable — runs once)
- Generic catch that re-throws after logging (if specific errors handled upstream)
- Short generations that don't benefit from streaming (single-sentence output)
- `@Generable` structs with only String/Bool/enum properties (no @Guide needed)
## Risk Score Calculation
- Each CRITICAL issue: +3 points
- Each HIGH issue: +2 points
- Each MEDIUM issue: +1 point
- Each LOW issue: +0.5 points
- Maximum score: 10
**Interpretation**:
- 0-2: Low risk, production-ready
- 3-5: Medium risk, fix before release
- 6-8: High risk, must fix immediately
- 9-10: Critical risk, do not ship
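The scoring rule can be sketched as a small pure function. `IssueCounts`, `riskScore(for:)`, and `interpretation(of:)` are hypothetical names. Because LOW issues add 0.5, a score can fall between the listed bands (for example 2.5), so this sketch assumes each band extends up to the lower bound of the next one.

```swift
// Hypothetical container for per-severity issue counts.
struct IssueCounts {
    var critical = 0
    var high = 0
    var medium = 0
    var low = 0
}

// CRITICAL = 3, HIGH = 2, MEDIUM = 1, LOW = 0.5, capped at 10.
func riskScore(for counts: IssueCounts) -> Double {
    let raw = Double(counts.critical) * 3.0
        + Double(counts.high) * 2.0
        + Double(counts.medium) * 1.0
        + Double(counts.low) * 0.5
    return min(raw, 10.0)
}

// Assumption: fractional scores belong to the band whose lower bound they have passed.
func interpretation(of score: Double) -> String {
    switch score {
    case ..<3.0: return "Low risk, production-ready"
    case ..<6.0: return "Medium risk, fix before release"
    case ..<9.0: return "High risk, must fix immediately"
    default: return "Critical risk, do not ship"
    }
}
```

For example, one CRITICAL, one HIGH, and one LOW issue give 3 + 2 + 0.5 = 5.5, which lands in the medium band, while four CRITICAL issues give 12, capped to 10.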
## Related
For Foundation Models patterns: `axiom-ai (skills/foundation-models.md)` skill
For Foundation Models diagnostics: `axiom-ai (skills/foundation-models-diag.md)` skill
For Foundation Models API reference: `axiom-ai (skills/foundation-models-ref.md)` skill