axiom-codex/skills/axiom-vision/SKILL.md
Use when implementing ANY computer vision feature — image analysis, pose detection, person segmentation, subject lifting, text recognition, barcode scanning.
`npx skillsauth add charleswiltgen/axiom axiom-vision`
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
You MUST use this skill for ANY computer vision work using the Vision framework.
| Symptom / Task | Reference |
|----------------|-----------|
| Subject segmentation, lifting | See skills/vision-framework.md |
| Hand/body pose detection | See skills/vision-framework.md |
| Text recognition (OCR) | See skills/vision-framework.md |
| Barcode/QR code detection | See skills/vision-framework.md |
| Document scanning | See skills/vision-framework.md |
| DataScannerViewController | See skills/vision-framework.md |
| Structured document extraction (iOS 26+) | See skills/vision-framework.md |
| Isolate object excluding hand | See skills/vision-framework.md |
| Vision framework API reference | See skills/vision-ref.md |
| Visual Intelligence integration (iOS 26+) | See skills/vision-ref.md |
| Subject not detected | See skills/vision-diag.md |
| Hand/body pose missing landmarks | See skills/vision-diag.md |
| Low confidence observations | See skills/vision-diag.md |
| UI freezing during processing | See skills/vision-diag.md |
| Coordinate conversion bugs | See skills/vision-diag.md |
| Text not recognized / wrong chars | See skills/vision-diag.md |
| Barcode not detected | See skills/vision-diag.md |
| DataScanner blank / no items | See skills/vision-diag.md |
| Document edges not detected | See skills/vision-diag.md |
digraph vision {
start [label="Computer vision task" shape=ellipse];
what [label="What do you need?" shape=diamond];
start -> what;
what -> "skills/vision-framework.md" [label="implement feature"];
what -> "skills/vision-ref.md" [label="API reference"];
what -> "skills/vision-ref.md" [label="Visual Intelligence"];
what -> "skills/vision-diag.md" [label="something broken"];
}
Implementation (skills/vision-framework.md):
Diagnostics (skills/vision-diag.md):
| Thought | Reality |
|---------|---------|
| "Vision framework is just a request/handler pattern" | Vision has coordinate conversion, confidence thresholds, and performance gotchas. vision-framework.md covers them. |
| "I'll handle text recognition without the skill" | VNRecognizeTextRequest has fast/accurate modes and language-specific settings. vision-framework.md has the patterns. |
| "Subject segmentation is straightforward" | Instance masks have HDR compositing and hand-exclusion patterns. vision-framework.md covers complex scenarios. |
| "Visual Intelligence is just the camera API" | Visual Intelligence is a system-level feature requiring IntentValueQuery and SemanticContentDescriptor. vision-ref.md has the integration section. |
| "I'll just process on the main thread" | Vision blocks the UI on older devices. Users on an iPhone 12 will experience a frozen app. Adding a background queue takes about 15 minutes. |
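
As a rough illustration of the gotchas listed above (recognition level, confidence thresholds, coordinate conversion, and keeping work off the main thread), here is a minimal sketch using `VNRecognizeTextRequest`. It is not taken from vision-framework.md. The names `RecognizedLine`, `viewRect(forNormalized:in:)`, and `recognizeText(in:displaySize:minimumConfidence:completion:)`, the `0.5` threshold, and the `"en-US"` language are all assumptions made for the example. It also assumes the image is drawn to fill `displaySize` exactly, and that the code is compiled in Swift 5 language mode.

```swift
import Vision
import CoreGraphics
import Dispatch

/// Hypothetical result type: one recognized line with its box in
/// top-left-origin display coordinates.
struct RecognizedLine {
    let text: String
    let confidence: Float
    let rect: CGRect
}

/// Vision bounding boxes are normalized (0...1) with a lower-left origin.
/// This flips the y-axis and scales into a top-left-origin space of `size`.
func viewRect(forNormalized box: CGRect, in size: CGSize) -> CGRect {
    CGRect(x: box.minX * size.width,
           y: (1 - box.maxY) * size.height,
           width: box.width * size.width,
           height: box.height * size.height)
}

/// Runs text recognition on a background queue and calls `completion`
/// on the main queue. `minimumConfidence` is an illustrative threshold.
func recognizeText(in image: CGImage,
                   displaySize: CGSize,
                   minimumConfidence: Float = 0.5,
                   completion: @escaping (Result<[RecognizedLine], Error>) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        let request = VNRecognizeTextRequest()
        request.recognitionLevel = .accurate      // .fast trades accuracy for speed
        request.recognitionLanguages = ["en-US"]  // assumed language for this sketch

        let handler = VNImageRequestHandler(cgImage: image, options: [:])
        let result: Result<[RecognizedLine], Error>
        do {
            // perform(_:) is synchronous, which is why it runs off the main thread.
            try handler.perform([request])
            let lines = (request.results ?? []).compactMap { observation -> RecognizedLine? in
                // Keep only the best candidate, and only if it clears the threshold.
                guard let best = observation.topCandidates(1).first,
                      best.confidence >= minimumConfidence else { return nil }
                return RecognizedLine(
                    text: best.string,
                    confidence: best.confidence,
                    rect: viewRect(forNormalized: observation.boundingBox, in: displaySize))
            }
            result = .success(lines)
        } catch {
            result = .failure(error)
        }
        DispatchQueue.main.async { completion(result) }
    }
}
```

A caller would pass a `CGImage` plus the size of the view showing it, then update the UI inside `completion`. If the image is letterboxed or aspect-filled, the conversion also needs the corresponding offset and scale, which this sketch leaves out.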
User: "How do I detect hand pose in an image?"
→ See skills/vision-framework.md
User: "Isolate a subject but exclude the user's hands"
→ See skills/vision-framework.md
User: "How do I read text from an image?"
→ See skills/vision-framework.md
User: "Scan QR codes with the camera"
→ See skills/vision-framework.md
User: "Subject detection isn't working"
→ See skills/vision-diag.md
User: "Text recognition returns wrong characters"
→ See skills/vision-diag.md
User: "Show me VNDetectHumanBodyPoseRequest examples"
→ See skills/vision-ref.md
User: "How do I make my app work with Visual Intelligence?"
→ See skills/vision-ref.md
User: "RecognizeDocumentsRequest API reference"
→ See skills/vision-ref.md