software-mansion-labs/on-device-ai

Name: on-device-ai
Author: software-mansion-labs

skills/react-native-best-practices/references/on-device-ai/SKILL.md

npx skillsauth add software-mansion-labs/skills on-device-ai

React Native ExecuTorch

Software Mansion's production patterns for on-device AI in React Native and Expo using React Native ExecuTorch.

Targets the current published API (v0.10.x). Load at most one reference file per question. For hook signatures, model constants, or config options not covered here, webfetch the matching page from docs.swmansion.com/react-native-executorch.

Decision Tree

```text
What does the feature need?
│
├── Generate / chat with text?
│   └── useLLM                                          → see llm.md
│       ├── Plain chat → standard useLLM
│       ├── Image + text input → useLLM with a VLM model (LFM2_VL_*)
│       ├── Tool / function calling → configure with toolsConfig
│       └── Structured JSON output → getStructuredOutputPrompt
│
├── Understand or transform images?
│   ├── What is in this image? → useClassification      → see vision.md
│   ├── Where are the objects? → useObjectDetection     → see vision.md
│   ├── Per-pixel class → useSemanticSegmentation       → see vision.md
│   ├── Per-instance segmentation → useInstanceSegmentation → see vision.md
│   ├── Human pose keypoints → usePoseEstimation        → see vision.md
│   ├── Read text from image → useOCR / useVerticalOCR  → see vision.md
│   ├── Apply artistic style → useStyleTransfer         → see vision.md
│   ├── Generate image from prompt → useTextToImage     → see vision.md
│   └── Embed image as vector → useImageEmbeddings      → see vision.md
│
├── Speech / audio?
│   ├── Transcribe speech → useSpeechToText             → see speech.md
│   ├── Synthesize speech → useTextToSpeech             → see speech.md
│   └── Detect speech segments → useVAD                 → see speech.md
│
├── Text utilities?
│   ├── Embed text as vector → useTextEmbeddings        → see vision.md
│   ├── Count or inspect tokens → useTokenizer          → see setup.md
│   └── Redact PII from text → usePrivacyFilter         → see setup.md
│
├── Full RAG pipeline (retrieval + generation + vector store)?
│   └── react-native-rag (sibling library)              → see setup.md
│
└── Custom `.pte` model not covered by a dedicated hook?
    └── useExecutorchModule                             → see setup.md
```
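
The tree can also be read as a plain lookup from feature need to hook and reference file. The sketch below encodes a subset of that mapping in TypeScript. The hook and file names are copied from the tree; the key names (`chat`, `transcribeSpeech`, and so on), the `Route` type, and the `routeFor` helper are hypothetical labels made up for this illustration and are not part of React Native ExecuTorch.

```typescript
// Illustrative lookup table for the decision tree.
// Hook names and reference files come from the tree; the keys are invented.
type Route = {
  hook: string;
  reference: "llm.md" | "vision.md" | "speech.md" | "setup.md";
};

const ROUTES = {
  chat: { hook: "useLLM", reference: "llm.md" },
  classifyImage: { hook: "useClassification", reference: "vision.md" },
  detectObjects: { hook: "useObjectDetection", reference: "vision.md" },
  readText: { hook: "useOCR", reference: "vision.md" },
  embedImage: { hook: "useImageEmbeddings", reference: "vision.md" },
  transcribeSpeech: { hook: "useSpeechToText", reference: "speech.md" },
  synthesizeSpeech: { hook: "useTextToSpeech", reference: "speech.md" },
  detectSpeechSegments: { hook: "useVAD", reference: "speech.md" },
  // Text embeddings are documented in vision.md, next to image embeddings.
  embedText: { hook: "useTextEmbeddings", reference: "vision.md" },
  countTokens: { hook: "useTokenizer", reference: "setup.md" },
  redactPii: { hook: "usePrivacyFilter", reference: "setup.md" },
  customModel: { hook: "useExecutorchModule", reference: "setup.md" },
} as const;

type Need = keyof typeof ROUTES;

// Returns the hook to reach for and the single reference file to load.
function routeFor(need: Need): Route {
  return ROUTES[need];
}

console.log(routeFor("transcribeSpeech"));
```

Because each need resolves to exactly one reference file, a lookup like this fits the "load at most one reference file per question" guidance.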

Critical Rules

Call initExecutorch() at app entry, before any other API. The library does not bundle a network/file layer — you must register a resource-fetcher adapter (ExpoResourceFetcher for Expo, BareResourceFetcher for bare RN). Any hook called before initialization throws ResourceFetcherAdapterNotInitialized.
Check isReady before calling forward / generate / transcribe. All hooks load asynchronously. Inference before the model is ready throws ModuleNotLoaded.
Interrupt LLM generation before unmounting. Unmounting while isGenerating is true crashes. Call llm.interrupt() and wait for isGenerating === false before navigating away.
Use quantized model variants on mobile. Full-precision variants exceed device memory on most phones. Every supported model ships a _QUANTIZED variant — prefer it unless you've measured otherwise.
Audio for speech-to-text and VAD must be 16 kHz mono. Mismatched sample rates produce silently garbled transcriptions. Decode with new AudioContext({ sampleRate: 16000 }).
Audio from text-to-speech is 24 kHz. Create the playback context with new AudioContext({ sampleRate: 24000 }).
The New Architecture (Fabric) is required. Old architecture is unsupported. Expo Go is unsupported — use a custom dev build (npx expo prebuild). iOS release builds need a real device (the simulator lacks the Metal APIs ExecuTorch relies on).
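
The readiness and unmount rules above can be expressed as two small guards. This is a minimal sketch, not the library's API: `LlmHandle` is a hypothetical interface that mirrors only the three members named in the rules (`isReady`, `isGenerating`, `interrupt`), and `canStartGeneration` and `requestLeave` are helper names invented here. The real `useLLM` return value has more members; check the official docs for its exact shape.

```typescript
// Hypothetical minimal shape: only the members the rules above mention.
interface LlmHandle {
  isReady: boolean;
  isGenerating: boolean;
  interrupt: () => void;
}

// True only when the model has finished loading and nothing is in flight.
function canStartGeneration(llm: LlmHandle): boolean {
  return llm.isReady && !llm.isGenerating;
}

// Call when the user tries to leave the screen.
// If a generation is active, request an interrupt and report "not safe yet";
// the caller should try again once isGenerating has become false.
function requestLeave(llm: LlmHandle): boolean {
  if (llm.isGenerating) {
    llm.interrupt();
    return false;
  }
  return true;
}

// Usage with a stand-in object (a real app would pass the hook's return value).
const demoLlm: LlmHandle = {
  isReady: true,
  isGenerating: true,
  interrupt: () => console.log("interrupt requested"),
};
console.log(requestLeave(demoLlm)); // false: generation was still active
```

In a component, `requestLeave` would typically be called from whatever navigation guard the app already uses, with the actual navigation deferred until the hook re-renders with `isGenerating === false`.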

Minimal Setup

```tsx
// App.tsx (Expo)
import { initExecutorch } from 'react-native-executorch';
import { ExpoResourceFetcher } from 'react-native-executorch-expo-resource-fetcher';

initExecutorch({ resourceFetcher: ExpoResourceFetcher });
```

```tsx
// App.tsx (bare React Native)
import { initExecutorch } from 'react-native-executorch';
import { BareResourceFetcher } from 'react-native-executorch-bare-resource-fetcher';

initExecutorch({ resourceFetcher: BareResourceFetcher });
```

Full setup, Metro config for bundled .pte files, custom adapters, model-loading strategies, and error handling: see setup.md.

Hook Quick Reference

| Hook | Purpose | Reference |
|---|---|---|
| useLLM | Text generation, chat, tool calling, VLM | llm.md |
| useClassification | Image categorisation | vision.md |
| useObjectDetection | Bounding-box detection (YOLO26, RF-DETR, SSDLite) | vision.md |
| useSemanticSegmentation | Per-pixel class segmentation | vision.md |
| useInstanceSegmentation | Per-instance segmentation | vision.md |
| usePoseEstimation | COCO 17-keypoint human pose | vision.md |
| useStyleTransfer | Artistic image filters | vision.md |
| useTextToImage | Stable Diffusion image generation | vision.md |
| useImageEmbeddings | CLIP image embeddings | vision.md |
| useOCR | Horizontal text OCR | vision.md |
| useVerticalOCR | Vertical text OCR (experimental, CJK) | vision.md |
| useTextEmbeddings | Sentence embeddings for similarity / RAG | vision.md |
| useSpeechToText | Whisper transcription (batch + streaming) | speech.md |
| useTextToSpeech | Kokoro TTS (batch + streaming, phoneme input) | speech.md |
| useVAD | FSMN voice activity detection | speech.md |
| useTokenizer | HuggingFace-compatible tokenization | setup.md |
| usePrivacyFilter | On-device PII / privacy redaction | setup.md |
| useExecutorchModule | Custom .pte model inference | setup.md |

Every hook also has a non-React Module counterpart (e.g. LLMModule.fromModelName(...), ClassificationModule.fromModelName(...)) for use outside React components.

Common Pitfalls

| Symptom | Likely cause | Fix |
|---|---|---|
| ResourceFetcherAdapterNotInitialized | initExecutorch not called | Call it at app entry with an adapter |
| ModuleNotLoaded | Inference before model finished loading | Gate calls on isReady |
| MemoryAllocationFailed on launch | Model too large for device | Switch to _QUANTIZED variant or smaller parameter count |
| App crashes on screen navigation | Unmount during active generation | llm.interrupt() and await isGenerating === false |
| Whisper produces garbled text | Wrong sample rate | Decode audio at 16 kHz mono |
| TTS output sounds chipmunked | Playback context at wrong rate | Create AudioContext({ sampleRate: 24000 }) |
| Build fails on iOS simulator (release) | Simulator lacks Metal APIs | Build release on real device |
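
To make the two sample-rate pitfalls concrete, the sketch below shows what "16 kHz mono" means for a raw PCM buffer and why a wrong playback rate sounds sped up. It is an illustration only: the recommended path is still to decode with an `AudioContext` created at the target rate. The helper names are invented for this example, and the resampler is a naive linear interpolation with no anti-aliasing filter, so treat it as a teaching aid rather than production audio code.

```typescript
const STT_SAMPLE_RATE = 16000; // required input rate for speech-to-text and VAD
const TTS_SAMPLE_RATE = 24000; // output rate of text-to-speech

// Average equal-length channels into a single mono channel.
function downmixToMono(channels: Float32Array[]): Float32Array {
  const first = channels[0];
  if (first === undefined) return new Float32Array(0);
  const mono = new Float32Array(first.length);
  for (const channel of channels) {
    for (let i = 0; i < first.length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

// Naive linear-interpolation resampler (no low-pass filter, so downsampling can alias).
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate) return input.slice();
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

// How long a buffer lasts when played by a context running at contextRate.
function playbackSeconds(sampleCount: number, contextRate: number): number {
  return sampleCount / contextRate;
}

// One second of TTS output played through a 48 kHz context lasts only half a
// second, which is the sped-up "chipmunk" effect from the table.
console.log(playbackSeconds(TTS_SAMPLE_RATE, TTS_SAMPLE_RATE)); // 1
console.log(playbackSeconds(TTS_SAMPLE_RATE, 48000)); // 0.5
```

The same arithmetic runs in reverse for speech-to-text: a 48 kHz recording handed over as if it were 16 kHz looks three times longer and slower to the model, which is one way transcriptions end up garbled without any error being thrown.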

Full error code list and recovery patterns: setup.md.

References

| File | When to read |
|---|---|
| llm.md | useLLM functional + managed modes, tool calling, structured output (JSON Schema / Zod), interrupting, vision-language models, generation config |
| vision.md | Image classification, object detection, semantic + instance segmentation, pose estimation, OCR (horizontal + vertical), style transfer, text-to-image, image + text embeddings |
| speech.md | Speech-to-text (Whisper batch + streaming with timestamps), text-to-speech (Kokoro batch + streaming, phoneme input, voice catalogue), voice activity detection, audio sample-rate requirements |
| setup.md | initExecutorch, Expo / bare resource-fetcher adapters, model loading strategies, Metro config, error codes and recovery, useExecutorchModule for custom .pte models, useTokenizer, usePrivacyFilter, full model catalogue |

External Resources

Official docs: https://docs.swmansion.com/react-native-executorch
API reference: https://docs.swmansion.com/react-native-executorch/docs/api-reference
Source: https://github.com/software-mansion/react-native-executorch
Pre-exported models: https://huggingface.co/software-mansion

Build on-device AI features in React Native and Expo apps with React Native ExecuTorch. Use when adding AI to a mobile app without cloud dependencies: chatbots, image classification, object detection, OCR, semantic or instance segmentation, style transfer, image generation, pose estimation, speech-to-text, text-to-speech, voice activity detection, semantic search with embeddings, tokenization, privacy / PII redaction, or vision-language image understanding. Also use when the user mentions offline / on-device / privacy AI, reducing cloud cost or latency, or managing ML models. Covers initExecutorch and every hook (useLLM, useClassification, useObjectDetection, useOCR, useSemanticSegmentation, useInstanceSegmentation, useStyleTransfer, useTextToImage, useImageEmbeddings, usePoseEstimation, useSpeechToText, useTextToSpeech, useVAD, useTextEmbeddings, useTokenizer, usePrivacyFilter, useExecutorchModule), tool calling, structured output, VLMs, Expo and bare resource-fetcher adapters, and error handling.

213 stars

tools

Updated Jun 10, 2026

npx skillsauth add software-mansion-labs/skills on-device-ai

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | What it checks | Status | % |
|---|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jun 10, 2026, 5:53 AM (17.5s, 5 files scanned)

Install manually

```shell
# Clone the repo
git clone https://github.com/software-mansion-labs/skills.git

# Make sure the global Claude Code skills folder exists, then copy the skill into it
mkdir -p ~/.claude/skills
cp -r skills/skills/react-native-best-practices/references/on-device-ai ~/.claude/skills/
```

Claude Code Skills — official skills path docs.

Repository: software-mansion-labs/skills (213 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT