Capacitor plugin for ONNX Runtime inference on Android, iOS and Web.
2.0.0 removes the plugin-side model cache. The plugin no longer downloads, validates, or stores model files — it is now a thin wrapper around ONNX Runtime sessions. The host app owns model storage and provides bytes (web) or a filesystem path (native).
- `LoadModelInput` no longer accepts `url`, `sha256`, `forceRedownload`, or `timeoutMs`. Pass either `filePath` (iOS/Android) or `modelBuffer: Uint8Array` (web).
- `LoadModelResult` no longer includes `status` (`cache_hit` / `downloaded`).
- Methods `clearModel` and `clearAllCache` have been removed. `release(modelId, version)` still releases the in-memory ORT session.
- `CapacitorOnnxWeb.setWebConfig` no longer accepts `cacheStorage`, only `wasmPath`.
- Error codes `NETWORK_ERROR`, `INTEGRITY_ERROR`, and `MODEL_INTEGRITY_ERROR` are no longer reachable.
Before:
await CapacitorOnnx.loadModel({
modelId: 'demo-model',
version: '1.0.0',
url: 'https://example.com/model.onnx',
sha256: 'abc...',
});

After (native, iOS/Android):
// Download/cache the model in your app code, e.g. via @capacitor/filesystem.
// Then pass the absolute or file:// path to the plugin.
await CapacitorOnnx.loadModel({
modelId: 'demo-model',
version: '1.0.0',
filePath: '/data/user/0/com.app/files/models/demo-model-1.0.0.onnx',
});

After (web):
const response = await fetch('https://example.com/model.onnx');
const modelBuffer = new Uint8Array(await response.arrayBuffer());
await CapacitorOnnx.loadModel({
modelId: 'demo-model',
version: '1.0.0',
modelBuffer,
});

Passing modelBuffer on iOS/Android or filePath on web rejects with MODEL_INVALID. The Capacitor bridge serializes Uint8Array inefficiently (as base64 or a number array), so native callers must always use filesystem paths.
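Since the wrong combination is only rejected once the call reaches the plugin, a host app may prefer to fail fast in its own code. The following is a minimal sketch of such a guard. `Platform`, `ModelSource`, `buildLoadModelInput`, and the local `LoadModelInput` shape are hypothetical names that mirror the fields used in the examples above; they are not part of the plugin's API.

```typescript
// Hypothetical helper types; they only mirror the fields shown in this README.
type Platform = 'web' | 'ios' | 'android';

type ModelSource =
  | { kind: 'path'; filePath: string }
  | { kind: 'bytes'; modelBuffer: Uint8Array };

interface LoadModelInput {
  modelId: string;
  version: string;
  filePath?: string;
  modelBuffer?: Uint8Array;
}

// Builds the loadModel argument and throws early when the source kind
// does not match the platform (bytes on web, filesystem path on native).
function buildLoadModelInput(
  platform: Platform,
  modelId: string,
  version: string,
  source: ModelSource,
): LoadModelInput {
  if (platform === 'web') {
    if (source.kind !== 'bytes') {
      throw new Error('web requires modelBuffer (Uint8Array), not filePath');
    }
    return { modelId, version, modelBuffer: source.modelBuffer };
  }
  if (source.kind !== 'path') {
    throw new Error(`${platform} requires filePath, not modelBuffer`);
  }
  return { modelId, version, filePath: source.filePath };
}
```

The result can be passed straight to `CapacitorOnnx.loadModel(...)`, with the platform taken from `Capacitor.getPlatform()`.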
pnpm add @cantoo/capacitor-onnx
pnpm cap sync android
pnpm cap sync ios

pnpm cap sync android registers the plugin automatically; no manual MainActivity edits are required. The host app must satisfy:
- `minSdk` ≥ 24 (Android 7.0).
- `compileSdk` ≥ 34.
- JDK ≥ 17 on the build machine. The plugin targets Java 17 bytecode (`sourceCompatibility` / `targetCompatibility` / `kotlinOptions.jvmTarget = '17'`), so any newer JDK (e.g. 21) also works; 17 is just the floor.
The com.microsoft.onnxruntime:onnxruntime-android dependency is bundled by the plugin's build.gradle — you do not need to add it yourself. Tune execution providers and threading through sessionOptions (see docs/android-optimization.md).
iOS supports both CocoaPods (default for Capacitor apps) and Swift Package Manager.
CocoaPods (recommended for Capacitor apps). pnpm cap sync ios registers the plugin automatically: the generated Podfile picks up CantooCapacitorOnnx.podspec from node_modules/@cantoo/capacitor-onnx, and pod install resolves onnxruntime-objc transitively. No manual Xcode steps are required.
Swift Package Manager (alternative). If the host app prefers SPM, skip the Podfile entry and add the plugin as a local package in Xcode (Package Dependencies → +, pointing to node_modules/@cantoo/capacitor-onnx). Xcode resolves onnxruntime-swift-package-manager transitively. Add the CapacitorOnnx product to the App target.
Requirements either way:
- Minimum deployment target: iOS 14.
- The native bridge is registered automatically via `CapacitorOnnxPlugin.m`; no additional Swift code is required.
onnxruntime-web requires the page to be served as a cross-origin isolated context — without it the multi-threaded WASM backend falls back (or fails) and SharedArrayBuffer is unavailable. The host page must be served with the following response headers:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
Plus, any cross-origin asset the page loads (model files, WASM artifacts, fonts, images) needs Cross-Origin-Resource-Policy: cross-origin (or same-site) on its response, otherwise it will be blocked under COEP. CDN/Storage hosting your .onnx artifacts must also send permissive CORS headers (Access-Control-Allow-Origin).
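To make the page-level requirement concrete, here is a small sketch that checks a set of response headers against the two values listed above. `missingIsolationHeaders` is a hypothetical helper, not something the plugin ships, and it checks only the exact values named in this README.

```typescript
// The two headers the host page must send, with the values given above.
const REQUIRED_ISOLATION_HEADERS: Record<string, string> = {
  'cross-origin-opener-policy': 'same-origin',
  'cross-origin-embedder-policy': 'require-corp',
};

// Returns the (lower-cased) names of required headers that are absent
// or carry a different value. Header names are compared case-insensitively.
function missingIsolationHeaders(headers: Record<string, string>): string[] {
  const normalized = new Map<string, string>(
    Object.entries(headers).map(
      ([name, value]): [string, string] => [name.toLowerCase(), value.trim().toLowerCase()],
    ),
  );
  return Object.entries(REQUIRED_ISOLATION_HEADERS)
    .filter(([name, expected]) => normalized.get(name) !== expected)
    .map(([name]) => name);
}
```

A check like this could run in a deployment smoke test against the headers returned for the host page, for example by feeding it `Object.fromEntries(response.headers)` after a `fetch`.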
For Web-only hosts (without Capacitor), import from the dedicated Web entrypoint and configure the WASM path before loadModel:
import { CapacitorOnnxWeb } from '@cantoo/capacitor-onnx/web';
CapacitorOnnxWeb.setWebConfig({
wasmPath: '/ort-wasm/',
});

Symptoms of missing isolation/CORS: SharedArrayBuffer is not defined, NetworkError when fetching .wasm, or models silently downgrading to single-threaded execution.
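Because the single-threaded downgrade can be silent, it may help to check isolation at runtime before loading a model. The sketch below relies on the standard `crossOriginIsolated` global and on the presence of `SharedArrayBuffer`. `IsolationEnv` and `canUseWasmThreads` are hypothetical names, and the environment is passed in so the function can be exercised outside a browser.

```typescript
// Minimal view of the globals this check needs.
interface IsolationEnv {
  crossOriginIsolated?: boolean;
  SharedArrayBuffer?: unknown;
}

// True only when the context is cross-origin isolated AND SharedArrayBuffer
// exists, which is what the multi-threaded WASM backend depends on.
function canUseWasmThreads(
  env: IsolationEnv = globalThis as unknown as IsolationEnv,
): boolean {
  return env.crossOriginIsolated === true && typeof env.SharedArrayBuffer !== 'undefined';
}
```

Calling `canUseWasmThreads()` in the page and logging a warning when it returns `false` turns a silent performance regression into a visible one.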
The package exports:
- `CapacitorOnnx`
- `CapacitorOnnxWeb` (from `@cantoo/capacitor-onnx/web` for non-Capacitor hosts)
- TypeScript interfaces from `definitions`
| Method | Signature | Purpose | Notes |
|---|---|---|---|
| `loadModel` | `(input: LoadModelInput) => Promise<LoadModelResult>` | Creates an ONNX Runtime session from the model bytes (web) or file path (native), and optionally warms it up. Must be called once per `modelId`+`version` before `run`. | Native: pass `filePath` (absolute path or `file://` URI). Web: pass `modelBuffer: Uint8Array`. Pass `warmupInputs` (a `Record<string, RawTensor>` keyed by model input name) to pay first-inference cost upfront, and `sessionOptions` to pick the execution provider / thread counts. The result includes `executionProviderUsed`. |
| `run` | `(input: RunInput) => Promise<RunResult>` | Runs inference on a previously loaded session. | Pass `inputs` as a `Record<string, RawTensor>` keyed by the model's ONNX input names. Calls to the same `modelId`+`version` are serialized by a per-session lock; different models run in parallel. Returns `{ outputs, latencyMs }`, where `outputs` is keyed by the model's output names. Pre/post-processing is the consumer's responsibility. |
| `release` | `(input: ReleaseModelInput) => Promise<void>` | Releases the in-memory ONNX session for the given `modelId`+`version`. | Use to free RAM/GPU memory when you are done with a model. The host app is responsible for managing model files on disk. |
Type definitions for every input/result (e.g. LoadModelInput, RawTensor, SessionOptionsInput, PluginError) live in src/definitions.ts.
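Since input tensors are plain objects, a shape or dtype mistake only surfaces once `run` is called. The sketch below shows one way to validate a tensor beforehand. The local `RawTensor` is a stand-in limited to the `type` / `dims` / `data` fields that appear in this README (the real definition lives in `src/definitions.ts` and may differ). The native dtype sets follow the platform support described in this README, and the web set is an assumption. `checkTensor` is a hypothetical helper.

```typescript
type TensorType = 'float32' | 'int64' | 'int32' | 'bool' | 'uint8' | 'float16' | 'uint32';
type Platform = 'web' | 'ios' | 'android';

// Stand-in for the plugin's RawTensor; only the fields shown in this README.
interface RawTensor {
  type: TensorType;
  dims: number[];
  data: ArrayLike<number>;
}

const NATIVE_COMMON: TensorType[] = ['float32', 'int64', 'int32', 'uint8'];

const SUPPORTED: Record<Platform, ReadonlySet<TensorType>> = {
  android: new Set<TensorType>([...NATIVE_COMMON, 'bool']),
  ios: new Set<TensorType>(NATIVE_COMMON), // no bool on iOS
  // Assumption: web accepts every type listed in this README.
  web: new Set<TensorType>([...NATIVE_COMMON, 'bool', 'float16', 'uint32']),
};

// Returns null when the tensor looks valid, otherwise a human-readable reason.
function checkTensor(platform: Platform, name: string, tensor: RawTensor): string | null {
  if (!SUPPORTED[platform].has(tensor.type)) {
    return `${name}: type ${tensor.type} is not supported on ${platform}`;
  }
  if (tensor.dims.some((d) => !Number.isInteger(d) || d < 0)) {
    return `${name}: dims must be non-negative integers`;
  }
  // The element count implied by dims must match the data length.
  const expected = tensor.dims.reduce((acc, d) => acc * d, 1);
  if (tensor.data.length !== expected) {
    return `${name}: dims imply ${expected} elements, got ${tensor.data.length}`;
  }
  return null;
}
```

Running `checkTensor` over every entry of `inputs` before calling `run` gives an error message that names the offending input.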
import { Capacitor } from '@capacitor/core';
import { CapacitorOnnx } from '@cantoo/capacitor-onnx';
async function loadDemoModel() {
if (Capacitor.getPlatform() === 'web') {
const response = await fetch('https://example.com/model.onnx');
const modelBuffer = new Uint8Array(await response.arrayBuffer());
await CapacitorOnnx.loadModel({
modelId: 'demo-model',
version: '1.0.0',
modelBuffer,
});
return;
}
// On iOS/Android, the host app is responsible for downloading
// the model to the filesystem (e.g. via @capacitor/filesystem).
await CapacitorOnnx.loadModel({
modelId: 'demo-model',
version: '1.0.0',
filePath: '/absolute/path/to/model.onnx',
});
}
await loadDemoModel();
const { outputs } = await CapacitorOnnx.run({
modelId: 'demo-model',
version: '1.0.0',
inputs: {
input_values: {
type: 'float32',
dims: [1, 16000],
data: [/* normalized audio samples */],
},
attention_mask: {
type: 'int64',
dims: [1, 16000],
data: [/* 1s for real samples, 0s for padding */],
},
},
});
const logits = outputs.logits;
console.log(logits.dims, logits.data.length);
await CapacitorOnnx.release({ modelId: 'demo-model', version: '1.0.0' });

- `loadModel` supports optional `warmupInputs: Record<string, RawTensor>` to pre-run the session with sample tensors keyed by model input name (e.g. `{ input_values: { type: 'float32', dims: [1, 16000], data: [...] } }`). Warmup is skipped when `warmupInputs` is omitted.
- `loadModel` returns `executionProviderUsed` with the provider that was actually initialized.
- Web provider selection supports `sessionOptions.executionProvider` with `auto`, `wasm`, `webgpu`, `webnn` plus native aliases (`cpu` / `nnapi` / `coreml` mapped to `wasm` in Web).
- In Web `auto` mode, provider resolution tries accelerated providers first (`webgpu`, `webnn`) and falls back to `wasm`.
- iOS provider mapping: `cpu` → CPU, `nnapi` / `coreml` → CoreML, `auto` → CoreML with CPU fallback, web providers (`wasm` / `webgpu` / `webnn`) → CPU.
- `run` takes `inputs` keyed by ONNX input name and returns every model output in `outputs` keyed by ONNX output name. Android accepts `float32`, `int64`, `int32`, `bool`, `uint8`; iOS accepts the same set except `bool` (the ONNX Runtime Obj-C API exposes no bool tensor type). `float16` / `uint32` are web-only. Unsupported types are rejected on native with a structured error.
- Output shape & dtype: each `RunResult.outputs` tensor carries the shape and dtype ORT materialized: Web reads `ort.Tensor.dims` / `.type`, Android reads `OnnxTensor.info.shape` / `.type`, iOS reads `tensorTypeAndShapeInfo().shape` / `.elementType`. No heuristic, no symbolic dims (`-1`) in the result.
- Errors are normalized with structured fields (`code`, `message`, `retryable`, `correlationId`, `details`).
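The structured error fields lend themselves to a small guard in calling code. The following is a sketch only: `PluginErrorLike`, `asPluginError`, and `shouldRetry` are hypothetical names, the field types are assumptions (the authoritative `PluginError` type is in `src/definitions.ts`), and it assumes the rejected value exposes these fields directly.

```typescript
// Assumed shape of a normalized error; field names come from the list above.
interface PluginErrorLike {
  code: string;
  message: string;
  retryable: boolean;
  correlationId?: string;
  details?: unknown;
}

// Narrows an unknown rejection value to the structured shape, or returns null.
function asPluginError(err: unknown): PluginErrorLike | null {
  if (typeof err !== 'object' || err === null) return null;
  const { code, message, retryable, correlationId, details } = err as Record<string, unknown>;
  if (typeof code !== 'string' || typeof message !== 'string' || typeof retryable !== 'boolean') {
    return null;
  }
  return {
    code,
    message,
    retryable,
    correlationId: typeof correlationId === 'string' ? correlationId : undefined,
    details,
  };
}

// Retry only structured errors flagged retryable, up to maxAttempts tries.
function shouldRetry(err: unknown, attempt: number, maxAttempts = 3): boolean {
  const pluginError = asPluginError(err);
  return pluginError !== null && pluginError.retryable && attempt < maxAttempts;
}
```

In a `catch` block around `loadModel` or `run`, `shouldRetry(err, attempt)` decides whether to loop again, and `asPluginError(err)?.correlationId` can be attached to logs.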
- Testing scripts and validation flow: docs/testing-scripts.md
MIT