skills/coreml/SKILL.md
Integrate Core ML models in iOS apps for on-device machine learning inference. Covers model loading (.mlmodel, .mlpackage, .mlmodelc), predictions with auto-generated classes and MLFeatureProvider, compute unit configuration (CPU, GPU, Neural Engine), MLTensor, VNCoreMLRequest, MLComputePlan, multi-model pipelines, and deployment strategies. Use when loading Core ML models, making predictions, configuring compute units, or profiling model performance.
Load, configure, and run Core ML models in iOS apps. This skill covers the Swift side: model loading, prediction, MLTensor, profiling, and deployment. Target iOS 26+ with Swift 6.3, backward-compatible to iOS 14 unless noted.
Scope boundary: Python-side model conversion, optimization (quantization, palettization, pruning), and framework selection live in the
apple-on-device-ai skill. This skill owns Swift integration only.
See references/coreml-swift-integration.md for complete code patterns including actor-based caching, batch inference, image preprocessing, and testing.
When you add a .mlmodel or .mlpackage to an app target, Xcode generates a Swift
class with typed input/output. Use this whenever possible.
import CoreML
let config = MLModelConfiguration()
config.computeUnits = .all
let model = try MyImageClassifier(configuration: config)
Load from a URL when the model is downloaded at runtime or stored outside the bundle.
let modelURL = Bundle.main.url(
    forResource: "MyModel", withExtension: "mlmodelc"
)!
let model = try MLModel(contentsOf: modelURL, configuration: config)
Load models without blocking the main thread. Prefer this for large models.
let model = try await MLModel.load(
    contentsOf: modelURL,
    configuration: config
)
Compile a .mlpackage or .mlmodel to .mlmodelc on device. Useful for
models downloaded from a server. Do this once per model version, not on every
launch.
let compiledURL = try await MLModel.compileModel(at: packageURL)
let model = try await MLModel.load(contentsOf: compiledURL, configuration: config)
Cache the compiled URL -- recompiling on every launch is a bug. Copy
compiledURL to a persistent location (e.g., Application Support). When
reviewing runtime-loaded models, call out both facts together: async
MLModel.compileModel(at:) is iOS 16+, and compiled models must be cached so the
app does not recompile on every launch.
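One way to structure the compile-once rule is sketched below. The helper name `cachedCompiledModelURL`, its parameters, and the `<name>-<version>.mlmodelc` naming scheme are assumptions for illustration, not Core ML API. The compile step is injected as a closure so the caching logic can be exercised without a real model.

```swift
import Foundation

/// Returns the URL of a cached compiled model, compiling only on a cache miss.
/// Hypothetical helper: `compile` stands in for `MLModel.compileModel(at:)`.
func cachedCompiledModelURL(
    for sourceURL: URL,
    name: String,
    version: String,
    cacheDirectory: URL,
    compile: @Sendable (URL) async throws -> URL
) async throws -> URL {
    let fileManager = FileManager.default
    // Key the cache by model name and version (an assumed naming scheme).
    let destination = cacheDirectory
        .appendingPathComponent("\(name)-\(version).mlmodelc", isDirectory: true)

    // Cache hit: reuse the compiled model from an earlier launch.
    if fileManager.fileExists(atPath: destination.path) {
        return destination
    }

    // Cache miss: compile once, then move the result out of the temporary location.
    try fileManager.createDirectory(at: cacheDirectory, withIntermediateDirectories: true)
    let temporaryURL = try await compile(sourceURL)
    try fileManager.moveItem(at: temporaryURL, to: destination)
    return destination
}
```

On iOS 16+, a caller would pass `{ try await MLModel.compileModel(at: $0) }` as `compile` and a directory under Application Support as `cacheDirectory`, then hand the returned URL to `MLModel.load(contentsOf:configuration:)`. The sketch does not guard against two callers compiling the same model concurrently.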
MLModelConfiguration controls compute units, GPU access, and model parameters.
| Value | Uses | When to Choose |
|---|---|---|
| .all | CPU + GPU + Neural Engine | Default. Let the system decide. |
| .cpuOnly | CPU | Deterministic tests, CPU-only fallbacks, or constrained work after profiling shows accelerator policy, contention, thermal state, or energy budget is the limiting factor. |
| .cpuAndGPU | CPU + GPU | Need GPU but model has ops unsupported by ANE. |
| .cpuAndNeuralEngine (iOS 16+) | CPU + Neural Engine | Best energy efficiency for compatible models. |
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
// Optional fallback for constrained work after profiling and policy review.
// Left commented out so it does not override the setting above:
// config.computeUnits = .cpuOnly
let config = MLModelConfiguration()
config.computeUnits = .all
config.allowLowPrecisionAccumulationOnGPU = true // faster, slight precision loss
The generated class provides typed input/output structs.
let model = try MyImageClassifier(configuration: config)
let input = MyImageClassifierInput(image: pixelBuffer)
let output = try model.prediction(input: input)
print(output.classLabel) // "golden_retriever"
print(output.classLabelProbs) // ["golden_retriever": 0.95, ...]
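A classifier's probability dictionary is unordered, so showing the best few labels needs a sort. The sketch below is one way to do that. `topPredictions(from:count:)` is a hypothetical helper, and it assumes the generated output exposes a `[String: Double]` dictionary like `classLabelProbs` above; the actual property name and type depend on the model.

```swift
/// Returns the `k` most probable labels, highest probability first.
/// Ties are broken alphabetically so the order is deterministic.
func topPredictions(
    from probabilities: [String: Double],
    count k: Int
) -> [(label: String, probability: Double)] {
    probabilities
        .sorted { lhs, rhs in
            lhs.value != rhs.value ? lhs.value > rhs.value : lhs.key < rhs.key
        }
        .prefix(max(0, k)) // clamp so a negative count yields an empty result
        .map { (label: $0.key, probability: $0.value) }
}
```

With the classifier above, `topPredictions(from: output.classLabelProbs, count: 3)` would give the three highest-scoring labels.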
Use when inputs are dynamic or not known at compile time.
let inputFeatures = try MLDictionaryFeatureProvider(dictionary: [
    "image": MLFeatureValue(pixelBuffer: pixelBuffer),
    "confidence_threshold": MLFeatureValue(double: 0.5),
])
let output = try model.prediction(from: inputFeatures)
let label = output.featureValue(for: "classLabel")?.stringValue
MLModel.prediction(...) is synchronous. In async pipelines, keep model loading
async, then run prediction from an actor or non-main task without adding await
to the prediction call.
let output = try model.prediction(from: inputFeatures)
Process multiple inputs in one call for better throughput.
let batchInputs = try MLArrayBatchProvider(array: inputs.map { input in
    try MLDictionaryFeatureProvider(dictionary: ["image": MLFeatureValue(pixelBuffer: input)])
})
let batchOutput = try model.predictions(fromBatch: batchInputs)
for i in 0..<batchOutput.count {
    let result = batchOutput.features(at: i)
    print(result.featureValue(for: "classLabel")?.stringValue ?? "unknown")
}
Use predictions(fromBatch:) when batching without explicit
MLPredictionOptions. Use predictions(from:options:) only when passing both an
MLBatchProvider and MLPredictionOptions; predictions(from:) by itself is
not the no-options batch API.
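For large input sets, a single batch holds every input in memory at once. A possible approach, not taken from the source, is to split the inputs into fixed-size chunks and call predictions(fromBatch:) once per chunk. `chunked(into:)` below is a hypothetical helper, not part of the standard library or Core ML, and a suitable chunk size depends on the model and device.

```swift
extension Array {
    /// Splits the array into consecutive chunks of at most `size` elements.
    /// The final chunk may be shorter than `size`.
    func chunked(into size: Int) -> [[Element]] {
        precondition(size > 0, "chunk size must be positive")
        return stride(from: 0, to: count, by: size).map { start in
            // Swift.min is spelled out to avoid resolving to Array.min().
            Array(self[start..<Swift.min(start + size, count)])
        }
    }
}
```

Each chunk can then be wrapped in its own MLArrayBatchProvider, as in the batch example above, so peak memory is bounded by the chunk size rather than the full input count.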
Use MLState for models that maintain state across predictions (sequence models,
LLMs, audio accumulators). Create state once and pass it to each prediction call.
let state = model.makeState()
// Each synchronous prediction carries forward the internal model state
for frame in audioFrames {
    let input = try MLDictionaryFeatureProvider(dictionary: [
        "audio_features": MLFeatureValue(multiArray: frame)
    ])
    let output = try model.prediction(from: input, using: state)
    let classification = output.featureValue(for: "label")?.stringValue
}
MLState is Sendable, but Sendable does not make one state safe for
concurrent inference. Predictions using the same state must be serialized; do
not read or write state buffers while a prediction is in flight. Call
model.makeState() for each independent concurrent stream. If you need
MLPredictionOptions, iOS 18+ also provides the async
prediction(from:using:options:) overload; the same one-in-flight-per-state rule
still applies.
MLTensor is a Swift-native multidimensional array for pre/post-processing.
Operations run lazily -- call await tensor.shapedArray(of:) to materialize results.
import CoreML
// Creation
let tensor = MLTensor([1.0, 2.0, 3.0, 4.0])
let zeros = MLTensor(zeros: [3, 224, 224], scalarType: Float.self)
// Reshaping
let reshaped = tensor.reshaped(to: [2, 2])
// Math operations
let softmaxed = tensor.softmax(alongAxis: -1)
let centered = tensor - tensor.mean()
// Interop with MLShapedArray / MLMultiArray
let shaped = await tensor.shapedArray(of: Float.self)
let multiArray = try MLMultiArray(shaped)
let shapedAgain = MLShapedArray<Float>(multiArray)
Do not invent MLTensor APIs for statistics or bridging. Avoid examples such as
MLTensor(multiArray), tensor.std(), tensor.standardDeviation(), direct
lazy-buffer access, or synchronous extraction; perform unsupported DSP/statistics
outside the tensor pipeline or with source-confirmed tensor operations.
MLMultiArray is the primary data exchange type for non-image model inputs and
outputs. Use it when the auto-generated class expects array-type features.
// Create a 3D array: [batch, sequence, features]
let array = try MLMultiArray(shape: [1, 128, 768], dataType: .float32)
// Write values
for i in 0..<128 {
    array[[0, i, 0] as [NSNumber]] = NSNumber(value: Float(i))
}
// Read values
let value = array[[0, 0, 0] as [NSNumber]].floatValue
let data: [Float] = [1.0, 2.0, 3.0]
let shaped = MLShapedArray(scalars: data, shape: [3])
let fromShaped = try MLMultiArray(shaped)
See references/coreml-swift-integration.md for advanced MLMultiArray patterns including NLP tokenization and audio feature extraction.
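To make the multi-index subscripting above concrete, the sketch below shows how an index such as [0, i, 0] maps to a flat element offset in a contiguous row-major layout. This is plain Swift for illustration, and both function names are hypothetical. A real MLMultiArray reports its own `strides`, which can differ from this contiguous assumption, so read that property rather than assuming the layout.

```swift
/// Row-major (C-order) strides for a shape: the last axis varies fastest.
func rowMajorStrides(for shape: [Int]) -> [Int] {
    var strides = [Int](repeating: 1, count: shape.count)
    var running = 1
    // Walk from the innermost axis outward, accumulating the element count.
    for axis in shape.indices.reversed() {
        strides[axis] = running
        running *= shape[axis]
    }
    return strides
}

/// Flat element offset of a multi-dimensional index for the given strides.
func linearOffset(of index: [Int], strides: [Int]) -> Int {
    precondition(index.count == strides.count, "index rank must match strides")
    return zip(index, strides).reduce(0) { $0 + $1.0 * $1.1 }
}
```

For the [1, 128, 768] shape used above, the contiguous strides are [98304, 768, 1], so index [0, 5, 0] sits at element offset 3840.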
Image models expect CVPixelBuffer input. Use CGImage conversion for photos
from the camera or photo library. Vision's VNCoreMLRequest handles this
automatically; manual conversion is needed only for direct MLModel prediction.
import CoreGraphics
import CoreVideo

/// Renders a CGImage into a new 32ARGB pixel buffer of the given size.
func createPixelBuffer(from cgImage: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
    var pixelBuffer: CVPixelBuffer?
    let attrs: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true,
    ]
    // Check the return status instead of ignoring it.
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
        kCVPixelFormatType_32ARGB, attrs as CFDictionary, &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }

    CVPixelBufferLockBaseAddress(buffer, [])
    // Unlock on every exit path, including the early return below.
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }

    guard let context = CGContext(
        data: CVPixelBufferGetBaseAddress(buffer),
        width: width, height: height,
        bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
        space: CGColorSpaceCreateDeviceRGB(),
        bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
    ) else { return nil }
    // Drawing into the full rect scales the image to width x height.
    context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return buffer
}
For additional preprocessing patterns (normalization, center-cropping), see references/coreml-swift-integration.md.
Chain models when preprocessing or postprocessing requires a separate model.
// Sequential inference: preprocessor -> main model -> postprocessor
let preprocessed = try preprocessor.prediction(from: rawInput)
let mainOutput = try mainModel.prediction(from: preprocessed)
let finalOutput = try postprocessor.prediction(from: mainOutput)
For Xcode-managed pipelines, use the pipeline model type in the .mlpackage.
Each sub-model runs on its optimal compute unit.
Use Vision to run Core ML image models with automatic image preprocessing (resizing, normalization, color space, orientation).
import Vision
import CoreML
let model = try MLModel(contentsOf: modelURL, configuration: config)
let request = CoreMLRequest(model: .init(model))
let results = try await request.perform(on: cgImage)
if let classification = results.first as? ClassificationObservation {
print("\(classification.identifier): \(classification.confidence)")
}
let vnModel = try VNCoreMLModel(for: model)
let request = VNCoreMLRequest(model: vnModel) { request, error in
    guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
    for observation in results {
        let label = observation.labels.first?.identifier ?? "unknown"
        let confidence = observation.labels.first?.confidence ?? 0
        let boundingBox = observation.boundingBox // normalized coordinates
        print("\(label): \(confidence) at \(boundingBox)")
    }
}
request.imageCropAndScaleOption = .scaleFill
let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)
try handler.perform([request])
For complete Vision framework patterns (text recognition, barcode detection, document scanning), see the
vision-framework skill.
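The boundingBox values in the legacy request above are normalized to 0...1 with the origin at the lower-left corner of the image. The sketch below shows the arithmetic for mapping such a box to pixel coordinates with a top-left origin, as used by UIKit-style drawing. `pixelRect(fromNormalized:imageWidth:imageHeight:)` is a hypothetical helper written for illustration; Vision also ships `VNImageRectForNormalizedRect` for the scaling step.

```swift
import Foundation

/// Converts a normalized, lower-left-origin box into a pixel rect
/// with a top-left origin.
func pixelRect(
    fromNormalized box: CGRect,
    imageWidth: Int,
    imageHeight: Int
) -> CGRect {
    let width = CGFloat(imageWidth)
    let height = CGFloat(imageHeight)
    return CGRect(
        x: box.minX * width,
        // Flip vertically: the box's top edge (maxY) becomes the new y origin.
        y: (1 - box.maxY) * height,
        width: box.width * width,
        height: box.height * height
    )
}
```

The flip matters only when the destination coordinate space has a top-left origin. Core Graphics contexts that keep the lower-left origin need the scaling alone.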
Inspect which compute device each operation will use before running predictions.
let computePlan = try await MLComputePlan.load(
    contentsOf: modelURL, configuration: config
)
guard case let .program(program) = computePlan.modelStructure else { return }
guard let mainFunction = program.functions["main"] else { return }
for operation in mainFunction.block.operations {
    let deviceUsage = computePlan.deviceUsage(for: operation)
    let estimatedCost = computePlan.estimatedCost(of: operation)
    print("\(operation.operatorName): \(String(describing: deviceUsage?.preferred))")
}
Use the Core ML instrument template in Instruments to profile the model.
Run outside the debugger for accurate results (Xcode: Product > Profile).
| Strategy | Pros | Cons |
|---|---|---|
| Bundle in app | Instant availability, works offline | Increases app download size |
| Background Assets | Preferred for large or updateable model assets | Requires asset-pack setup |
| On-demand resources | Smaller initial download for existing ODR apps | Legacy technology; prefer Background Assets for new work |
| CloudKit / server | Maximum flexibility | Requires network, longer setup |

- Bundle a precompiled .mlmodelc to skip on-device compilation.
- For downloaded .mlmodel or .mlpackage files, compile once with MLModel.compileModel(at:), move the resulting .mlmodelc out of Core ML's temporary location, and cache it by model version.

For Background Assets, make the asset pack locally available, resolve the model
URL, then load the compiled model with MLModel.load(contentsOf:configuration:).
// Existing On-Demand Resources project
let request = NSBundleResourceRequest(tags: ["ml-model-v2"])
try await request.beginAccessingResources()
let modelURL = Bundle.main.url(forResource: "LargeModel", withExtension: "mlmodelc")!
let model = try await MLModel.load(contentsOf: modelURL, configuration: config)
// Call request.endAccessingResources() when done
- Use .all by default. Consider .cpuOnly only when profiling or app policy shows that accelerator contention, thermal state, energy budget, deterministic testing, or a legitimate background execution constraint makes CPU the right tradeoff.
- Avoid loading multiple MLModel instances from the same compiled model. Use an actor to provide shared access.
- Observe UIApplication.didReceiveMemoryWarningNotification and release cached models when under pressure.

See references/coreml-swift-integration.md for an actor-based model manager with lifecycle-aware loading and cache eviction.
DON'T: Load models on the main thread.
DO: Use MLModel.load(contentsOf:configuration:) async API or load on a background actor.
Why: Large models can take seconds to load, freezing the UI.
DON'T: Recompile .mlpackage to .mlmodelc on every app launch.
DO: Compile once with MLModel.compileModel(at:) and cache the compiled URL persistently.
Why: Compilation is expensive. Cache the .mlmodelc in Application Support.
DON'T: Hardcode .cpuOnly unless you have a specific reason.
DO: Use .all and let the system choose the optimal compute unit.
Why: .all enables Neural Engine and GPU, which are faster and more energy-efficient.
DON'T: Claim GPU or Neural Engine are categorically unavailable for all
background-adjacent work.
DO: Treat background execution as policy-, mode-, contention-, thermal-, and
energy-dependent, and profile the actual workload on device.
Why: Apps may be suspended, throttled, or limited by their background mode;
.cpuOnly is a tradeoff, not a universal requirement.
DON'T: Ignore MLFeatureValue type mismatches between input and model expectations.
DO: Match types exactly -- use MLFeatureValue(pixelBuffer:) for images, not raw data.
Why: Type mismatches cause cryptic runtime crashes or silent incorrect results.
DON'T: Create a new MLModel instance for every prediction.
DO: Load once and reuse. Use an actor to manage the model lifecycle.
Why: Model loading allocates significant memory and compute resources.
DON'T: Skip error handling for model loading and prediction.
DO: Catch errors and provide fallback behavior when the model fails.
Why: Models can fail to load on older devices or when resources are constrained.
DON'T: Assume all operations run on the Neural Engine.
DO: Use MLComputePlan (iOS 17.4+) to verify device dispatch per operation.
Why: Unsupported operations fall back to CPU, which may bottleneck the pipeline.
DON'T: Process images manually before passing to Vision + Core ML.
DO: Use CoreMLRequest (iOS 18+) or VNCoreMLRequest (legacy) to let Vision handle preprocessing.
Why: Vision handles orientation, scaling, and pixel format conversion correctly.
- MLModelConfiguration.computeUnits set appropriately for use case
- Vision (CoreMLRequest iOS 18+ or VNCoreMLRequest) used for correct preprocessing
- MLComputePlan checked to verify compute device dispatch (iOS 17.4+)
- Model conversion and optimization handled in the apple-on-device-ai skill