[iOS-SDK] TTS integration + ONNX runtime integration to run TTS models #43
base: main
Conversation
- Introduced new extensions for LLM and voice modules in RunAnywhereSDK, enhancing modularity and service creation.
- Implemented LLMModuleFactory and VoiceModuleFactory for streamlined service instantiation based on available modules.
- Added protocols for LLMService and SpeechToTextService to standardize module interactions.
- Created comprehensive configuration structures for LLM and voice modules, improving flexibility and usability.
- Established a ModuleIntegrationHelper for downloading models with progress tracking and managing module lifecycles.
- Documented module development guidelines to assist future integrations and ensure consistency across modules.

- Introduced the SherpaONNXTTS module, including core components such as SherpaONNXTTSService, SherpaONNXConfiguration, SherpaONNXModelManager, SherpaONNXDownloadStrategy, and SherpaONNXWrapper.
- Implemented a robust model registration and download strategy for managing TTS models and their dependencies.
- Established a comprehensive configuration structure for the TTS engine, allowing for flexible model management and synthesis options.
- Enhanced VoiceCapabilityService to support dynamic loading of the SherpaONNXTTS service based on configuration.
- Documented module development guidelines and integration patterns for future reference and consistency.

- Introduced `build_frameworks.sh` to automate the cloning and building of Sherpa-ONNX XCFrameworks.
- Added `Package.resolved` to manage dependencies for the SherpaONNXTTS module.
- Updated `Package.swift` to include binary targets for the newly built XCFrameworks.
- Created a comprehensive `README.md` for module setup, features, and integration instructions.
- Implemented a module map for C++ interop and added an Objective-C++ bridge header and implementation for seamless integration with the Sherpa-ONNX C API.
- Cleaned up the project structure and ensured adherence to SOLID principles for maintainability and scalability.

…TTS module
- Created a comprehensive `NEXT_STEPS.md` file outlining completed tasks and immediate next steps for the SherpaONNXTTS module.
- Updated `Package.swift` to include public headers and C++ settings for better integration with the Objective-C++ bridge.
- Introduced `SherpaONNXBridge.mm` for the Objective-C++ implementation, facilitating seamless interaction with the Sherpa-ONNX C API.
- Added unit tests in `SherpaONNXTTSTests.swift` to validate service initialization, configuration, model types, and error handling.

…odule
- Created `BUILD_DOCUMENTATION.md` detailing the end-to-end process for building and integrating the Sherpa-ONNX TTS module with the RunAnywhere Swift SDK.
- Updated `Package.swift` to support newer platform versions and include the `SherpaONNXBridge` target for improved integration.
- Introduced `SherpaONNXBridge.h` and `SherpaONNXBridge.mm` for Objective-C++ bridging to the Sherpa-ONNX C API.
- Enhanced `SherpaONNXWrapper.swift` to utilize the new bridge, improving TTS functionality and performance.
- Added XCFrameworks for `onnxruntime` and `sherpa-onnx`, ensuring multi-platform support and optimized builds.

…rocessing capabilities
- Introduced the LLMSwift module for LLM integration, including adapter and service implementations.
- Added the WhisperKitTranscription module for speech-to-text functionality, featuring a custom download strategy and transcription service.
- Updated project configuration to include new dependencies and removed obsolete references to WhisperKit.
- Enhanced documentation for both modules, detailing installation, usage, and error handling.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.

… code clarity
- Added LLM.swift as a dependency from GitHub to enhance LLM integration.
- Removed legacy text-to-voice handling from WhisperKitAdapter and WhisperKitService, simplifying the codebase to focus solely on speech-to-text functionality.
- Cleaned up comments in WhisperKitService to reflect the current implementation without legacy references.
- Updated project documentation to reflect changes in module structure and dependencies.

- Updated VoiceAssistantView to display the current TTS model from the view model instead of a static label.
- Enhanced VoiceAssistantViewModel by adding a new property for the TTS model.
- Improved SherpaONNXWrapper to use consistent property names for voice identifiers and streamlined audio playback management in SherpaONNXTTSService.
- Refactored SherpaONNXDownloadStrategy to simplify the download process and improve error handling.

- Introduced new TTS models for SherpaONNX, including Kitten TTS Nano and VITS English US, with detailed metadata and download information.
- Updated FrameworkRecommender to include performance ratings for SherpaONNX, enhancing model selection capabilities.
- Modified LLMFramework to support SherpaONNX as a new framework type, ensuring proper handling for text-to-voice functionality.
@coderabbitai review
✅ Actions performed
Review triggered.
Walkthrough
Migrates the iOS example from remote packages to local modules, introduces Sherpa-ONNX TTS and WhisperKit transcription modules, and expands the SDK with module infrastructure, TTS provider selection, and new enums/configs. Adds the LLMSwift package, an extensive Sherpa-ONNX bridge and service, download strategies, tests, scripts, and documentation. Minor UI and model list updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
actor User
participant App as RunAnywhereAI App
participant VCS as VoiceCapabilityService
participant SDK as RunAnywhereSDK
participant SysTTS as SystemTextToSpeechService
participant Sherpa as SherpaONNXTTSService
Note over App,VCS: TTS selection based on VoiceTTSConfig
User->>App: Start voice session
App->>VCS: findTTSService(for: VoiceTTSConfig)
alt provider = sherpaONNX
VCS->>SDK: isModuleAvailable("SherpaONNXTTS.SherpaONNXTTSService")
alt available
VCS->>Sherpa: init()
VCS-->>App: SherpaONNXTTSService
else not available/fails
VCS->>SysTTS: init()
VCS-->>App: System TTS (fallback)
end
else provider = system or nil
VCS->>SysTTS: init()
VCS-->>App: System TTS
end
App->>+Sherpa: synthesize(text, options)
Sherpa-->>-App: audio Data / Stream
    App-->>User: Playback
```
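To make the selection flow concrete, here is a minimal Swift sketch of the logic the diagram above describes. It is illustrative only: the `TTSService` protocol name, the throwing initializer, and the exact SDK call shapes are assumptions, not the SDK's actual API.

```swift
// Sketch of the TTS selection/fallback flow (names approximate the diagram, not the real API).
func makeTTSService(for config: VoiceTTSConfig?) -> TTSService {
    if config?.provider == .sherpaONNX,
       RunAnywhereSDK.shared.isModuleAvailable("SherpaONNXTTS.SherpaONNXTTSService"),
       let sherpa = try? SherpaONNXTTSService() {
        return sherpa   // module is linked and initialized successfully
    }
    // System provider, nil config, a missing module, or a failed init all fall back here.
    return SystemTextToSpeechService()
}
```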

```mermaid
sequenceDiagram
autonumber
participant Service as SherpaONNXTTSService
participant SDK as RunAnywhereSDK
participant ModelMgr as SherpaONNXModelManager
participant DL as DownloadManager
participant Wrapper as SherpaONNXWrapper
participant Bridge as SherpaONNXBridge
Note over Service,SDK: Initialization and model setup
Service->>SDK: registerModuleDownloadStrategy(SherpaONNXDownloadStrategy)
Service->>ModelMgr: registerModels()
ModelMgr->>SDK: registerModuleModels(models)
Service->>SDK: getModelLocalPath(for: modelId)
alt not downloaded
Service->>DL: download(modelId) with progress
DL-->>Service: completion
Service->>SDK: getModelLocalPath(for: modelId)
end
Service->>Wrapper: init(configuration)
Wrapper->>Bridge: initWithModelPath(..., modelType, ...)
Bridge-->>Wrapper: ready
    Wrapper-->>Service: ready
```
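And a corresponding sketch of the initialization path in the second diagram; method names and signatures are approximations of the SDK surface, shown only to orient readers.

```swift
// Sketch: register, resolve, download if needed, then bring up the native wrapper.
func prepareSherpaTTS(modelId: String) async throws -> SherpaONNXWrapper {
    let sdk = RunAnywhereSDK.shared
    sdk.registerModuleDownloadStrategy(SherpaONNXDownloadStrategy())
    SherpaONNXModelManager().registerModels()   // forwards ModelInfo entries via registerModuleModels

    var path = await sdk.getModelLocalPath(for: modelId)
    if path == nil {
        // Hypothetical download call; the real API reports progress to the caller.
        try await sdk.downloadModel(modelId)
        path = await sdk.getModelLocalPath(for: modelId)
    }
    guard let modelPath = path else { throw VoiceError.modelNotFound(modelId) }

    // The wrapper drives the Objective-C++ bridge (initWithModelPath:modelType:...).
    return try SherpaONNXWrapper(configuration: SherpaONNXConfiguration(modelPath: modelPath))
}
```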
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 41
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitService.swift (4)
17-22: Streaming thresholds use bytes as if they were samples (0.5s/0.1s are off by 4×).
minAudioLength and contextOverlap are compared to Data.byteCount. Fix by using bytes-per-sample.
Apply this diff:
```diff
-    private var audioAccumulator = Data()
-    private let minAudioLength = 8000 // 500ms at 16kHz
-    private let contextOverlap = 1600 // 100ms overlap for context
+    private var audioAccumulator = Data()
+    private let sampleRate = 16_000
+    private let bytesPerSample = MemoryLayout<Float>.size // adjust if using Int16 pipeline
+    private var minAudioBytes: Int { (sampleRate / 2) * bytesPerSample } // 500ms
+    private var contextOverlapBytes: Int { (sampleRate / 10) * bytesPerSample } // 100ms
```

```diff
-        // Process when we have enough audio (500ms)
-        if audioBuffer.count >= minAudioLength {
+        // Process when we have enough audio (500ms)
+        if audioBuffer.count >= minAudioBytes {
```

```diff
-        // Keep last 100ms for context continuity
-        audioBuffer = Data(audioBuffer.suffix(contextOverlap))
+        // Keep last 100ms for context continuity
+        audioBuffer = Data(audioBuffer.suffix(contextOverlapBytes))
```

Also applies to: 339-341, 382-384
318-331: Streaming early-return bug: after initializing, the function returns and never processes audio.
The return inside the guard's else exits the Task. Restructure to proceed after lazy init.
Apply this diff:
```diff
-        // Ensure WhisperKit is loaded
-        guard let whisperKit = self.whisperKit else {
-            if self.isInitialized {
-                // Already initialized, but whisperKit is nil
-                throw VoiceError.serviceNotInitialized
-            } else {
-                // Not initialized, try to initialize with default model
-                try await self.initialize(modelPath: nil)
-                guard self.whisperKit != nil else {
-                    throw VoiceError.serviceNotInitialized
-                }
-            }
-            return
-        }
+        // Ensure WhisperKit is loaded (lazily initialize, then continue processing)
+        if self.whisperKit == nil {
+            if self.isInitialized {
+                throw VoiceError.serviceNotInitialized
+            }
+            try await self.initialize(modelPath: nil)
+        }
+        guard let whisperKit = self.whisperKit else {
+            throw VoiceError.serviceNotInitialized
+        }
```
342-345: Same raw Data→Float32 assumption in streaming path.
Repeat of the format issue above; convert Int16 to Float or use a definitive format.
Apply this diff in both places:
```diff
-        let floatArray = audioBuffer.withUnsafeBytes { buffer in
-            Array(buffer.bindMemory(to: Float.self))
-        }
+        let floatArray: [Float]
+        if audioBuffer.count % MemoryLayout<Float>.size == 0 {
+            floatArray = audioBuffer.withUnsafeBytes { buf in
+                Array(buf.bindMemory(to: Float.self))
+            }
+        } else {
+            let i16 = audioBuffer.withUnsafeBytes { buf in
+                Array(buf.bindMemory(to: Int16.self))
+            }
+            floatArray = i16.map { Float($0) / 32768.0 }
+        }
```

Also applies to: 390-393
156-169: Honor caller options for task and language; don't force English transcription.
For non-English or translation, current behavior is incorrect.
Apply this diff:
```diff
-        let decodingOptions = DecodingOptions(
-            task: .transcribe,
-            language: "en", // Force English to avoid language detection issues
+        let decodingOptions = DecodingOptions(
+            task: (options.task == .translate ? .translate : .transcribe),
+            language: options.language.rawValue,
             temperature: 0.0,
             temperatureFallbackCount: 1,
             sampleLength: 224,
             usePrefillPrompt: false,
-            detectLanguage: false, // Force English instead of auto-detect
+            detectLanguage: false,
             skipSpecialTokens: true,
             withoutTimestamps: true,
             compressionRatioThreshold: 2.4,
             logProbThreshold: -1.0,
             noSpeechThreshold: noSpeechThresh
         )
```

If you support "auto" language in options, set language to nil and detectLanguage = true accordingly.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitDownloadStrategy.swift (1)
118-126: Set request timeouts and use request-based download to avoid hangs.
URLSession.shared.download(from:) uses default timeouts. Use URLRequest with a timeout.
Apply this diff in both places:
```diff
-        let (localURL, response) = try await URLSession.shared.download(from: fileURL)
+        var req = URLRequest(url: fileURL)
+        req.timeoutInterval = 60
+        let (localURL, response) = try await URLSession.shared.download(for: req)
```

Also applies to: 178-185
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantViewModel.swift (2)
82-90: Observer token is dropped immediately; model changes won’t be observed
`addObserver(forName:...)` returns a token you must retain. As written, the token is discarded, so no notifications will ever fire.
Fix with Combine (you already import it) to auto-manage lifetimes:
```diff
@@
-        // Listen for model changes
-        NotificationCenter.default.addObserver(
-            forName: Notification.Name("ModelLoaded"),
-            object: nil,
-            queue: .main
-        ) { [weak self] notification in
-            Task { @MainActor in
-                self?.updateModelInfo()
-            }
-        }
+        // Listen for model changes
+        NotificationCenter.default
+            .publisher(for: Notification.Name("ModelLoaded"))
+            .receive(on: RunLoop.main)
+            .sink { [weak self] _ in
+                self?.updateModelInfo()
+            }
+            .store(in: &cancellables)
```

And add storage:
```diff
@@ class VoiceAssistantViewModel: ObservableObject {
@@     private let audioCapture = AudioCapture()
+    private var cancellables = Set<AnyCancellable>()
```
318-324: Stop capture and tear down on pipeline errors to avoid resource leaks
On `.pipelineError`, audio capture continues and the task/pipeline aren't torn down.

```diff
 case .pipelineError(let error):
     errorMessage = error.localizedDescription
     sessionState = .error(error.localizedDescription)
     isProcessing = false
     isListening = false
     logger.error("Pipeline error: \(error)")
+    // Ensure resources are released on failure
+    audioCapture.stopContinuousCapture()
+    pipelineTask?.cancel()
+    pipelineTask = nil
+    voicePipeline = nil
```
sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftService.swift (1)
239-291: Streaming token limit increments twice per token
tokenCount is incremented twice, halving the effective maxTokens. Increment once.
```diff
-    var tokenCount = 0
+    var tokenCount = 0
@@
-    for await token in response {
-        tokenCount += 1
+    for await token in response {
+        tokenCount += 1
@@
-        // Check token limit (approximate - actual tokenization may differ)
-        tokenCount += 1
-        if tokenCount >= maxTokens {
+        // Check token limit (approximate - actual tokenization may differ)
+        if tokenCount >= maxTokens {
             break
         }
```
examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj (2)
147-163: Drop LLM from packageProductDependencies (LLMSwift replaces it).
Avoid linking both. If LLM is still needed, remove LLMSwift instead.
```diff
                 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */,
-                543028452E442716008361DD /* LLM */,
                 548CA0762E56D0DC0061CCF5 /* FluidAudioDiarization */,
                 5479377D2E57DF7600CB9251 /* LLMSwift */,
                 54760D382E57E06100A03191 /* WhisperKitTranscription */,
                 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */,
```
68-89: Remove explicit LLM linking from the Frameworks build phase
LLMSwift's Swift-PM package already brings in LLM.swift transitively; keeping both risks duplicate symbols. In examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj, delete the `543028462E442716008361DD /* LLM in Frameworks */` entry from the PBXFrameworksBuildPhase file list.
🧹 Nitpick comments (94)
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Package.resolved (3)
1-77: Consider dropping this nested Package.resolved (use only a single top-level lockfile).
Keeping a lockfile per submodule often causes churn and conflicting pins across Xcode/SwiftPM versions. Prefer a single root Package.resolved or ignore lockfiles in nested modules.
If you decide to remove it from this module:
```diff
-{
-  "pins" : [
-    ...
-  ],
-  "version" : 2
-}
```
4-74: Prefer constraining versions in Package.swift over relying on Package.resolved.For library packages, clients ignore your lockfile. Encode your semver policy explicitly in Package.swift (e.g., .upToNextMinor for bugfix-only) so consumers resolve within intended bounds.
4-74: Remove unused dependencies: SherpaONNXTTS doesn't import Alamofire, DeviceKit, Files, GRDB.swift, Pulse, swift-asn1, swift-crypto, or ZIPFoundation—remove them from Package.swift.
.gitignore (1)
52-55: Resolve LFS vs ignore conflict for XCFrameworks; also remove redundant EXTERNAL entry and keep a placeholder.Currently, XCFrameworks are ignored here while the module’s .gitattributes attempts to store them via Git LFS—only one policy should exist. If the intent is “do not commit frameworks; build locally,” keep this ignore and add a placeholder exception; also drop the redundant EXTERNAL/sherpa-onnx entry because EXTERNAL/ is already ignored later.
Apply:
```diff
-# SherpaONNX TTS - Large binary frameworks (use setup_frameworks.sh to build)
-sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/
-EXTERNAL/sherpa-onnx/
+# SherpaONNX TTS - Large binary frameworks (use setup_frameworks.sh to build)
+sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/
+!sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/.gitkeep
```
sdk/runanywhere-swift/Modules/SherpaONNXTTS/.gitattributes (1)
1-4: Align LFS patterns with repo policy (ignored vs tracked binaries).If XCFrameworks remain ignored, this LFS config is moot and confusing. If you plan to track binaries, narrow the scope to explicit frameworks to avoid sweeping other artifacts.
Option A (recommended if binaries stay ignored): remove this file entirely.
Option B (track specific frameworks via LFS):
```diff
-*.xcframework filter=lfs diff=lfs merge=lfs -text
-*.a filter=lfs diff=lfs merge=lfs -text
-XCFrameworks/** filter=lfs diff=lfs merge=lfs -text
+# Track only the shipped XCFrameworks
+XCFrameworks/SherpaONNXFramework.xcframework/** filter=lfs diff=lfs merge=lfs -text
+XCFrameworks/ONNXRuntimeFramework.xcframework/** filter=lfs diff=lfs merge=lfs -text
```
.github/pull_request_template.md (2)
12-13: Capitalize product/language names; minor wording polish.Use “MacBook” and “Swift” for consistency.
```diff
-- [ ] Tested on Macbook if swift changes
-- [ ] Tested on Tablet/iPad if swift changes
+- [ ] Tested on MacBook (if Swift changes)
+- [ ] Tested on iPad/Tablet (if Swift changes)
```
28-29: Prompt for TTS evidence (audio) alongside UI screenshots.Given TTS, ask submitters for short audio samples and device details.
```diff
-## Screenshots - Attach all the relevant UI changes screenshots for iOS/Android and MacOS/Tablet/large screen sizes
--
+## Screenshots & Media
+- Attach relevant UI screenshots for iOS/Android and macOS/iPad (large screens).
+- For TTS changes, attach short audio samples (or links) and note device model, iOS version, sample rate, and latency.
```
sdk/runanywhere-swift/Sources/RunAnywhere/Core/Models/Framework/FrameworkModality.swift (1)
45-46: Modality mapping for SherpaONNX is correct and matches intent.
Primary and supported modalities are set to textToVoice; `isVoiceFramework` already covers both voice directions.
Optional: co-locate `.sherpaONNX` with other voice frameworks in the same switch group for readability.

```diff
-        case .whisperKit, .openAIWhisper:
-            return .voiceToText
-        case .sherpaONNX:
-            return .textToVoice
+        case .whisperKit, .openAIWhisper:
+            return .voiceToText
+        case .sherpaONNX:
+            return .textToVoice
```

Also applies to: 68-69
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitService.swift (2)
191-196: Remove per-call model enumeration; it’s expensive and noisy.Fetching available models on every transcription adds latency and log spam. Gate behind DEBUG and/or log once at init.
Apply this diff:
```diff
-        do {
-            let availableModels = try await WhisperKit.fetchAvailableModels()
-            logger.info(" Available models: \(availableModels)")
-        } catch {
-            logger.info(" Could not fetch available models: \(error)")
-        }
+        #if DEBUG
+        do {
+            let availableModels = try await WhisperKit.fetchAvailableModels()
+            logger.debug("Available models: \(availableModels)")
+        } catch {
+            logger.debug("Could not fetch available models: \(error)")
+        }
+        #endif
```
128-136: Padding with random noise makes outputs nondeterministic. Consider deterministic dither or zeros.Random noise complicates testing and reproducibility.
Replace with a fixed very low-amplitude dither (e.g., a repeating sequence) or zeros. I can provide a deterministic generator if desired.
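If a deterministic generator is wanted, a minimal sketch (the amplitude and pattern are placeholders, not the module's current values):

```swift
// Pad with a fixed, very low-amplitude repeating dither so padded audio
// (and therefore transcription output) is reproducible across runs.
func deterministicDitherPad(sampleCount: Int, amplitude: Float = 1e-4) -> [Float] {
    let pattern: [Float] = [amplitude, -amplitude, amplitude / 2, -amplitude / 2]
    return (0..<sampleCount).map { pattern[$0 % pattern.count] }
}
```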
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitAdapter.swift (1)
35-57: Adapter caching flow looks good; small resilience suggestion.
Looks solid. Consider marking cleanupStaleCache as @MainActor (or move cache state into an actor) to avoid races if adapters are used across threads.
sdk/runanywhere-swift/Modules/LLMSwift/Package.resolved (2)
76-82: Pre-release swift-syntax pin may require a newer toolchain than 5.9.Pinned to 602.0.0-prerelease-2025-08-11; with tools 5.9 this could fail. Confirm your CI/Xcode version, or pin LLM.swift to a revision that resolves to a Swift 5.9-compatible swift-syntax.
Run CI with Xcode showing swiftc -version, or update the dependency pin accordingly. I can help pick a compatible rev.
1-95: Consider not committing Package.resolved for library-style modules.Package.resolved is best for apps; libraries should allow clients to resolve. Keeping it may force consumers onto your exact graph.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitDownloadStrategy.swift (3)
57-70: Base URL derivation: handle non-/resolve/main/ URLs more robustly.If the provided URL points directly to a file blob or a different branch/tag, current logic falls back to a fixed repo. Consider parsing owner/repo/path and preserving branch/tag if present.
100-105: Creating analytics/weights subdirs unconditionally.Not harmful, but only weights/ exists in your file lists; consider creating subdirs lazily per needed file path.
216-233: mapToHuggingFacePath(): unconditional dropLast() can mis-map IDs without a hash suffix.If modelId has no trailing hash, dropLast removes a real token. Only drop when a suffix matches a known hash pattern.
I can push a regex-based variant if helpful.
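For reference, one possible shape of that regex-based variant (a sketch; the hash length and helper name are assumptions, not the module's code):

```swift
import Foundation

// Only strip the trailing component when it looks like a hex hash suffix,
// e.g. "whisper-base-a1b2c3d4" -> "whisper-base"; "whisper-base" is left untouched.
func stripHashSuffix(from modelId: String) -> String {
    let parts = modelId.split(separator: "-").map(String.init)
    guard let last = parts.last,
          last.range(of: "^[0-9a-fA-F]{6,}$", options: .regularExpression) != nil else {
        return modelId
    }
    return parts.dropLast().joined(separator: "-")
}
```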
sdk/runanywhere-swift/Modules/SherpaONNXTTS/TEAM_WORKFLOW.md (3)
160-169: Cache key should include the sherpa-onnx ref for deterministic CI.Hashing only setup_frameworks.sh risks stale caches when upstream changes. Include a pinned tag/commit (or env) in the key.
Add to the example:
```diff
 env:
   SHERPA_ONNX_REF: vX.Y.Z # or a commit SHA
 - uses: actions/cache@v3
   with:
     path: EXTERNAL/sherpa-onnx
-    key: sherpa-onnx-${{ hashFiles('**/setup_frameworks.sh') }}
+    key: sherpa-onnx-${{ env.SHERPA_ONNX_REF }}-${{ hashFiles('**/setup_frameworks.sh') }}
```
106-120: Strengthen the Git LFS guidance for existing binaries.If binaries were ever committed without LFS, devs will need migration to avoid bloating history.
Augment with:
```bash
# For repos that previously committed binaries:
git lfs migrate import --include="*.xcframework,*.a"
```
211-215: Make “pin to specific sherpa-onnx commit/tag” actionable.Add an explicit example of how to set and propagate the ref used by scripts and CI to avoid accidental upgrades.
Suggested addition:
```bash
# In setup/build scripts
: "${SHERPA_ONNX_REF:=vX.Y.Z}"
git fetch --tags
git checkout "$SHERPA_ONNX_REF"
```
thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md (5)
41-45: Remove or use the sampleRate initializer param.Sherpa-ONNX exposes sample rate from the engine; passing it in here is misleading unless it configures resampling. Either wire it to config or drop it.
98-111: Verify model-type fields against the actual C API (likely mismatches).
"kitten" looks invalid; common TTS configs are vits/kokoro/etc. Field names like `config.model.kitten.*` may not exist.
168-176: Import the correct module name in Swift.
import SherpaONNXFrameworkmay be incorrect if the module map exportsSherpaONNXBridge. Import should match the module.modulemap “module” name.
333-346: Avoid Data→Array→Data copies for volume; use Accelerate.For large buffers this double copy is costly; vDSP scales in-place efficiently.
Example:
import Accelerate private func applyVolume(to audioData: Data, volume: Float) -> Data { guard volume != 1.0 else { return audioData } var out = Data(count: audioData.count) audioData.withUnsafeBytes { inBuf in out.withUnsafeMutableBytes { outBuf in let n = audioData.count / MemoryLayout<Float>.size vDSP_vsmul(inBuf.bindMemory(to: Float.self).baseAddress!, 1, [volume], outBuf.bindMemory(to: Float.self).baseAddress!, 1, vDSP_Length(n)) } } return out }
428-434: module.modulemap: consider ‘explicit’ module and header placement.Mark the module explicit and ensure the header path matches the packaged layout to avoid ambiguous imports when combined with other ObjC++ modules.
-module SherpaONNXBridge { +explicit module SherpaONNXBridge { header "SherpaONNXBridge.h" export * }sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftTemplateResolver.swift (1)
15-21: Minor: avoid repeated string scanning.Cache lowercased filename once (already done) and consider a lookup table or ordered rules to simplify maintenance.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/README.md (2)
45-51: Avoid static line-number references in docs.“Garbled output detection (lines 435-477)” will drift. Describe the behavior, not the line range, or link to a symbol.
89-91: Confirm and pin dependency/version constraints.Verify the minimum OS versions and “WhisperKit 0.10.2+” are accurate for this PR branch; consider pinning an exact tag in Package.swift examples.
sdk/runanywhere-swift/Modules/SherpaONNXTTS/build_frameworks.sh (1)
49-57: Pre-flight dependency checks improve UX.Fail early if required tools are missing (git, cmake, xcodebuild).
command -v cmake >/dev/null || { echo -e "${RED}❌ cmake not found${NC}"; exit 1; } xcodebuild -version >/dev/null 2>&1 || { echo -e "${RED}❌ Xcode CLTs not found${NC}"; exit 1; }sdk/runanywhere-swift/Modules/WhisperKitTranscription/Package.resolved (1)
1-121: Remove leaf-module Package.resolved files
Detected Package.resolved in module folders under sdk/runanywhere-swift/Modules (SherpaONNXTTS, FluidAudioDiarization, LLMSwift, WhisperKitTranscription). Retain only the root sdk/runanywhere-swift/Package.resolved to reduce churn and avoid lockfile conflicts.sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Models/SherpaONNXModelManager.swift (3)
25-100: Cache model definitions and pin URLs by revision.
- Avoid rebuilding arrays repeatedly; keep a cached list.
- “resolve/main” is mutable; prefer immutable, revision-pinned URLs for reproducible downloads.
- private func createModelDefinitions() -> [ModelInfo] { - return [ + private lazy var modelsCache: [ModelInfo] = { + [ // Kitten ... ModelInfo( - downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/model.onnx"), + downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/<REV>/model.onnx"), ... alternativeDownloadURLs: [ - URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/voices.json"), + URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/<REV>/voices.json"), ... ].compactMap { $0 } ), // Repeat for other models... - ] - } + ] + }()Also: if these files are required assets (not alternates), align naming with your DownloadStrategy (e.g., “additionalFiles”) to avoid misinterpretation.
102-105: Don’t regenerate to search; use the cache or registry.Rebuilding the array for each lookup is wasteful. Use modelsCache.
- func getModel(by id: String) -> ModelInfo? { - return createModelDefinitions().first { $0.id == id } - } + func getModel(by id: String) -> ModelInfo? { + return modelsCache.first { $0.id == id } + }
107-114: Implement a basic device-aware selector.Small heuristic beats a hardcoded ID and prevents oversized models on constrained devices.
- func selectOptimalModel() -> String { - // TODO: Implement device capability detection - // Consider available memory, CPU performance, etc. - // For now, return the smallest model - return "sherpa-kitten-nano-v0.1" - } + func selectOptimalModel() -> String { + let mem = ProcessInfo.processInfo.physicalMemory + if mem >= 3_000_000_000 { return "sherpa-kokoro-en-v0.19" } + return "sherpa-kitten-nano-v0.1" + }sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXBridge/SherpaONNXBridge.h (1)
59-61: Preferinvalidateand make it idempotent.Name conveys lifecycle intent better than
destroy; ensure multiple calls are safe.-- (void)destroy; +- (void)invalidate;Update implementation accordingly.
thoughts/shared/plans/sherpa_onnx_tts_complete_plan.md (1)
434-443: Turn performance targets into CI checks.Add simple benchmarks or smoke tests to fail PRs when RTF/memory regress beyond thresholds.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleCore.swift (3)
63-74: De-dupe model registration and avoid silent failures.Guard duplicate IDs or let RegistryService upsert; also prefer structured logging over print.
- guard let registry = serviceContainer.modelRegistry as? RegistryService else { - print("[RunAnywhereSDK] Failed to register module models: Registry service not available") + guard let registry = serviceContainer.modelRegistry as? RegistryService else { + // TODO: Inject logger; avoid print in public API. return } - for model in models { - registry.registerModel(model) - } + let unique = Dictionary(grouping: models, by: { $0.id }).compactMap { $0.value.first } + unique.forEach { registry.registerModel($0) }
76-85: Async not needed here.Method is synchronous; consider dropping async to avoid misleading callers.
- public func getModelLocalPath(for modelId: String) async -> URL? { + public func getModelLocalPath(for modelId: String) -> URL? { guard let model = serviceContainer.modelRegistry.getModel(by: modelId) else { return nil } return model.localPath }
37-45: Avoid duplicating cache-clearing logic.Delegate to the file manager’s clearModuleCache to keep one source of truth.
- public func clearModuleCache(moduleId: String) throws { - let baseFolder = serviceContainer.fileManager.getBaseFolder() - if let cacheFolder = try? baseFolder.subfolder(named: "Cache"), - let moduleFolder = try? cacheFolder.subfolder(named: moduleId) { - try moduleFolder.delete() - } - } + public func clearModuleCache(moduleId: String) throws { + try serviceContainer.fileManager.clearModuleCache(moduleId) + }sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXBridge/SherpaONNXBridge.mm (4)
24-26: Remove unused headers
<vector>and<string>aren’t used.-#include <vector> -#include <string>
156-163: Clamp speed and harden speaker validationAvoid out-of-range values reaching the C API; clamp speed and guard zero speakers.
// Validate speaker ID - if (speakerId < 0 || speakerId >= _numSpeakers) { + if (_numSpeakers <= 0 || speakerId < 0 || speakerId >= _numSpeakers) { NSLog(@"[SherpaONNXBridge] Invalid speaker ID: %ld (max: %d)", (long)speakerId, _numSpeakers - 1); speakerId = 0; // Default to first speaker } + // Clamp speed to a sane range [0.25, 4.0] + float clampedSpeed = fmaxf(0.25f, fminf(speed, 4.0f)); + // Generate audio const SherpaOnnxGeneratedAudio *audio = SherpaOnnxOfflineTtsGenerate( tts, [text UTF8String], (int32_t)speakerId, - speed + clampedSpeed );Also applies to: 165-171
181-183: Use size_t for byte count (overflow-safe)Avoid implicit signed-to-unsigned conversion and potential overflow on large buffers.
- NSData *audioData = [NSData dataWithBytes:audio->samples - length:audio->n * sizeof(float)]; + size_t byteCount = (size_t)audio->n * sizeof(float); + NSData *audioData = [NSData dataWithBytes:audio->samples length:byteCount];
276-283: Reset cached properties on destroyMinor hygiene: reset
_sampleRate/_numSpeakersafter destroyingtts.if (tts) { SherpaOnnxDestroyOfflineTts(tts); tts = nullptr; } + _sampleRate = 0; + _numSpeakers = 0;examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift (1)
10-11: Conditionally import optional modulesPrevents build issues when these modules aren’t present in some configurations (e.g., CI, non-target platforms).
-import LLMSwift -import WhisperKitTranscription +#if canImport(LLMSwift) +import LLMSwift +#endif +#if canImport(WhisperKitTranscription) +import WhisperKitTranscription +#endifexamples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved (1)
70-74: LLM.swift pins are consistent across the repo
Both the example and SDK Package.resolved files reference https://github.com/eastriverlee/LLM.swift at revision 4c4e909ac4758c628c9cd263a0c25b6edff5526d.
Optional: pin LLM.swift to a semantic version tag in your Package.swift manifest to prevent drift.sdk/runanywhere-swift/Modules/WhisperKitTranscription/Package.swift (1)
6-11: Platform matrix OK; consider documenting why iOS 16+/macOS 13+ are requiredMatches WhisperKit’s requirements. Add a brief comment to prevent regressions.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift (1)
66-66: Use a fallback label when TTS model name is empty.Mirror the LLM badge behavior to avoid showing a blank value before the view model is ready.
-ModelBadge(icon: "speaker.wave.2", label: "TTS", value: viewModel.ttsModel, color: .purple) +ModelBadge(icon: "speaker.wave.2", label: "TTS", value: viewModel.ttsModel.isEmpty ? "Loading..." : viewModel.ttsModel, color: .purple)Also applies to: 268-268
sdk/runanywhere-swift/Sources/RunAnywhere/Capabilities/Compatibility/Services/FrameworkRecommender.swift (1)
161-163: Sherpa-ONNX scoring hooks added—LGTM.Scores are consistent with adjacent framework ranges. Consider a minor bonus in calculateFormatScore for (.sherpaONNX, .onnx) to reflect native format preference, if you find selection too neutral across ONNX-capable frameworks.
Also applies to: 195-197, 228-230, 261-263
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/module.modulemap (1)
1-4: Consider declaring language requirements for the bridge.If the bridge header pulls ObjC/C++ (likely given .mm usage), declaring requirements reduces miscompilation risks across toolchains.
module SherpaONNXTTSBridge { + requires objc, cplusplus header "Internal/Bridge/SherpaONNXBridge.h" export * }If the header is pure C with extern "C" guards, this change is optional; otherwise it helps ensure correct compilation modes.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantViewModel.swift (3)
22-22: Keep ttsModel in sync with actual TTS configThe UI-facing
ttsModelis hardcoded and never updated when you build the pipeline. Set it based on the selected provider/voice to avoid drift.Apply after you construct config (see next comment’s diff) or set it alongside selected voice/model:
- @Published var ttsModel: String = "SherpaONNX" + @Published var ttsModel: String = "SherpaONNX"And later (after config creation):
self.ttsModel = "SherpaONNX • expr-voice-2-f"
142-154: Avoid hardcoding TTS model/voice; expose configuration and clamp rateHardcoded IDs will break on devices without those assets and make A/B testing hard. Surface these as inputs or settings, and clamp rate to valid provider bounds to prevent undefined behavior.
Suggested refactor in-place:
- let config = ModularPipelineConfig( + let selectedModelId = "sherpa-kitten-nano-v0.1" // TODO: load from settings/user selection + let selectedVoice = "expr-voice-2-f" // TODO: load from settings/user selection + let selectedRate: Float = max(0.5, min(2.0, 1.0)) // clamp [0.5, 2.0] (verify provider bounds) + let config = ModularPipelineConfig( components: [.vad, .stt, .llm, .tts], vad: VADConfig(), stt: VoiceSTTConfig(modelId: whisperModelName), llm: VoiceLLMConfig(modelId: "default", systemPrompt: "You are a helpful voice assistant. Keep responses concise and conversational."), - tts: VoiceTTSConfig.sherpaONNX( - modelId: "sherpa-kitten-nano-v0.1", - voice: "expr-voice-2-f", - rate: 1.0 - ) + tts: VoiceTTSConfig.sherpaONNX( + modelId: selectedModelId, + voice: selectedVoice, + rate: selectedRate + ) ) + self.ttsModel = "SherpaONNX • \(selectedVoice)"Please verify the acceptable rate range for Sherpa-ONNX on iOS and whether model/voice IDs match the download strategy. If desired, I can wire this to a settings store.
83-83: Prefer a typed notification nameAvoid stringly-typed
"ModelLoaded". Defineextension Notification.Name { static let modelLoaded = Notification.Name("ModelLoaded") }and use.modelLoadedfor safety and discoverability.sdk/runanywhere-swift/Modules/SherpaONNXTTS/NEXT_STEPS.md (2)
133-138: Reality-check performance targets per device classTargets like “TTFT <100ms” and “RTF >10x” may vary widely by model/device. Consider stating them as goals and adding a simple benchmarking harness (os_signpost + metrics) to validate.
127-130: Avoid committing large frameworks; prefer CI artifact cachingBeyond Git LFS, consider excluding
XCFrameworks/from VCS and producing them via CI with cached artifacts to keep the repo lean.sdk/runanywhere-swift/Modules/LLMSwift/README.md (3)
41-51: Include adapter registration context and orderMention this should be called early (e.g., app launch) before any model loads to avoid fallback adapters.
95-115: Document cancellation and backpressure behaviorAdd a note on whether generation calls are cancelable, thread-safe, and how many concurrent generations the service supports.
165-171: Clarify defaults are examples, not guaranteesContext length, history limit, timeout, and memory estimation may vary by model/hardware. Rephrase as “defaults (configurable)” and link to the knobs.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/VoiceError.swift (2)
3-10: Consider Sendable/I18N and richer contextIf
VoiceErrorcrosses concurrency boundaries, adoptSendable(or document it doesn’t). Also consider localizable strings and attaching context (e.g., sample rate/channels) tounsupportedAudioFormat.Example:
-public enum VoiceError: LocalizedError { +public enum VoiceError: LocalizedError { case serviceNotInitialized case modelNotFound(String) case transcriptionFailed(Error) case insufficientMemory - case unsupportedAudioFormat + case unsupportedAudioFormat(expectedHz: Int, expectedChannels: Int)
11-24: Expose failureReason/recoverySuggestion for user guidanceAdd
failureReason/recoverySuggestionto improve UX messages (e.g., suggest closing background apps on low memory).sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftError.swift (2)
3-9: Preserve underlying cause for generation failuresCarrying only a String drops root-cause details. Include an optional underlying Error to aid debugging and telemetry. If concurrency requires it later, consider documenting Sendable constraints.
-public enum LLMSwiftError: LocalizedError { +public enum LLMSwiftError: LocalizedError { case modelLoadFailed case initializationFailed - case generationFailed(String) + case generationFailed(String, underlying: Error? = nil) case templateResolutionFailed(String)
10-21: Include underlying error in description (when available)Small improvement to surface details in logs while keeping messages user-friendly.
- case .generationFailed(let reason): - return "Generation failed: \(reason)" + case .generationFailed(let reason, let underlying): + let detail = underlying.map { " (\($0.localizedDescription))" } ?? "" + return "Generation failed: \(reason)\(detail)"sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md (4)
35-47: Add language to fenced code block (markdownlint MD040).Specify a language for the "Directory Layout" fence to satisfy linters and improve rendering.
-``` +```text Modules/YourModule/ ├── Package.swift # SPM package definition ...--- `100-111`: **Avoid nil URL in example code.** URL(string:) can return nil; tighten the example to a guaranteed URL to reduce copy/paste footguns. ```diff - downloadURL: URL(string: "https://example.com/model.bin"), + downloadURL: URL(string: "https://example.com/model.bin")!, // safe in docs, or show: + // guard let url = URL(string: "https://example.com/model.bin") else { return }
151-153: Avoid top-level registration calls in library targets.Top-level code (registerModuleDownloadStrategy) in SPM libraries can execute at import time and is discouraged. Show it inside init() or initialize().
-// Register strategy -sdk.registerModuleDownloadStrategy(YourDownloadStrategy()) +// Register strategy during service setup (e.g., in init or initialize) +sdk.registerModuleDownloadStrategy(YourDownloadStrategy())
251-273: Replace emphasis-as-heading (MD036) with proper headings.Conform to markdownlint and improve skimmability.
-**Option A: Async Init (FluidAudioDiarization style)** +#### Option A: Async Init (FluidAudioDiarization style) ... -**Option B: Two-Phase (SherpaONNXTTS style)** +#### Option B: Two-Phase (SherpaONNXTTS style)sdk/runanywhere-swift/Modules/SherpaONNXTTS/README.md (3)
162-170: Add language to the architecture tree code fence (MD040).Specify a language (text) for better rendering and to satisfy linters.
-``` +```text SherpaONNXTTS/ ├── Sources/ ...--- `49-72`: **Clarify audio format returned by synthesize.** Document PCM format (e.g., 16‑bit little‑endian, mono, sample rate) or provide an AudioBuffer/AVAudioPCMBuffer return type to reduce integration ambiguity. Would you like a snippet showing returning AVAudioPCMBuffer and an example AVAudioEngine player? --- `84-97`: **Streaming usage: show cancellation/backpressure handling.** Add an example of cancelling the stream and noting per-chunk size to guide implementers integrating with audio queues. </blockquote></details> <details> <summary>sdk/runanywhere-swift/Sources/RunAnywhere/Capabilities/Voice/Services/VoiceCapabilityService.swift (1)</summary><blockquote> `156-166`: **Log provider and modelId for better diagnosis.** Include provider/modelId in the debug log to trace selection decisions. ```diff - logger.debug("Finding TTS service") + logger.debug("Finding TTS service (provider=\(ttsConfig?.provider.rawValue ?? "nil"), modelId=\(ttsConfig?.modelId ?? "nil"))")sdk/runanywhere-swift/Sources/RunAnywhere/Public/Models/Voice/VoiceTTSConfig.swift (1)
33-47: Consider input validation helpers.Optional: add static clamps for rate/pitch/volume (e.g., 0.5–2.0, 0–1) to keep configs sane across providers.
sdk/runanywhere-swift/Modules/SherpaONNXTTS/setup_frameworks.sh (4)
40-48: Remove $? check; rely on pipefail.With pipefail, a failing curl|tar will exit non‑zero. Simplify and handle errors uniformly.
- curl -L "$DOWNLOAD_URL" | tar -xz -C "$XCFRAMEWORKS_DIR" - - if [ $? -eq 0 ]; then - echo "✅ Successfully downloaded pre-built frameworks!" - exit 0 - else - echo "❌ Download failed. Falling back to local build..." - fi + if curl -L "$DOWNLOAD_URL" | tar -xz -C "$XCFRAMEWORKS_DIR"; then + echo "✅ Successfully downloaded pre-built frameworks!" + exit 0 + fi + echo "❌ Download failed. Falling back to local build..."
75-77: Branch name safety.Upstream default may be main; use the default remote branch to avoid failures on repos that renamed master→main.
- git pull origin master + git pull --ff-only origin "$(git remote show origin | awk '/HEAD branch/ {print $NF}')"
91-93: Avoid hard-coding onnxruntime path/version.The sherpa-onnx build layout/version can change. Prefer a find-based copy with validation.
-cp -R "build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework" "$XCFRAMEWORKS_DIR/" +ONNXRT_SRC="$(fd -t d -a '^onnxruntime\.xcframework$' build-ios | head -n1)" +cp -R "$ONNXRT_SRC" "$XCFRAMEWORKS_DIR/"
95-110: Add checksum print to aid cache debugging.Printing shas helps teams verify identical artifacts.
echo "✅ Framework setup completed successfully!" # Show framework sizes echo "📊 Framework sizes:" du -sh "$XCFRAMEWORKS_DIR"/* + echo "🔐 Checksums:" + (cd "$XCFRAMEWORKS_DIR" && shasum -a 256 -b sherpa-onnx.xcframework/Info.plist onnxruntime.xcframework/Info.plist) + echo "" echo "🎉 SherpaONNX TTS is ready to use!"sdk/runanywhere-swift/Modules/SherpaONNXTTS/Tests/SherpaONNXTTSTests/SherpaONNXTTSTests.swift (3)
55-70: Avoid hard-coding model IDs in testsTie optimal-model assertions to registered content rather than a literal string to reduce brittleness across future catalog changes.
- let kittenModel = manager.getModel(by: "sherpa-kitten-nano-v0.1") + let kittenModel = manager.getModel(by: "sherpa-kitten-nano-v0.1") XCTAssertNotNil(kittenModel) - XCTAssertEqual(kittenModel?.id, "sherpa-kitten-nano-v0.1") + XCTAssertEqual(kittenModel?.id, "sherpa-kitten-nano-v0.1") @@ - let optimalModel = manager.selectOptimalModel() - XCTAssertEqual(optimalModel, "sherpa-kitten-nano-v0.1") + let optimalModel = manager.selectOptimalModel() + XCTAssertEqual(optimalModel, kittenModel?.id)
27-35: Cover all cases via CaseIterable to catch future enum additionsIterate SherpaONNXModelType.allCases to ensure new cases get tested automatically.
- let modelTypes: [SherpaONNXModelType] = [.kitten, .kokoro, .vits, .matcha, .piper] + let modelTypes = SherpaONNXModelType.allCases
74-82: Skip async init when frameworks are absentProactively skip rather than implicitly succeed without initialize(); makes intent explicit.
func testServiceInitializationAsync() async throws { - // This test would require XCFrameworks to be built - // For now, we just test that the service can be created + // This test would require XCFrameworks to be built + try XCTSkipIf(true, "Sherpa-ONNX XCFrameworks not available in CI yet") let service = SherpaONNXTTSService() XCTAssertNotNil(service) // Initialization would fail without frameworks // So we don't call initialize() in this test }sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+LLMModules.swift (1)
80-103: Heuristic selection should be case-insensitive and consider file extensionsModel IDs may vary in case; checking by extension is more robust.
- if modelId.contains("mlx") && sdk.isMLXAvailable { + if modelId.lowercased().contains("mlx") && sdk.isMLXAvailable { return await sdk.createModuleLLMService(.mlx) } - if modelId.contains("gguf") && sdk.isLLMSwiftAvailable { + if modelId.lowercased().contains("gguf") && sdk.isLLMSwiftAvailable { return await sdk.createModuleLLMService(.llmSwift) }sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Models/SherpaONNXDownloadStrategy.swift (1)
89-96: Robustness: verify required files in addition to a markerA simple marker can become stale; optionally validate expected filenames exist.
sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftService.swift (5)
57-66: Duplicate readiness guardSecond guard duplicates the nil-check right above; remove it for clarity.
- // Validate model readiness with a simple test prompt - logger.info("🧪 Validating model readiness with test prompt") - guard let llm = self.llm else { - throw FrameworkError( - framework: .llamaCpp, - underlying: LLMSwiftError.modelLoadFailed, - context: "Failed to initialize LLM.swift with model at \(modelPath)" - ) - } + // Validate model readiness with a simple test prompt + logger.info("🧪 Validating model readiness with test prompt (skipped)")
129-137: Don’t log full prompts in production logsReduce risk of PII leakage; log a bounded prefix.
- logger.info("📝 Full prompt being sent to LLM:") - logger.info("---START PROMPT---") - logger.info("\(fullPrompt)") - logger.info("---END PROMPT---") + logger.info("📝 Prompt (prefix 512 chars): \(fullPrompt.prefix(512))")
333-337: Implement generation options or document placeholdersapplyGenerationOptions currently does nothing; either wire supported parameters or mark unsupported explicitly.
318-322: Context memory estimate: tie to actual context lengthIf max context differs from 2048, adjust estimate accordingly.
1-5: Preferimport osoverimport os.logwithLoggerMinor consistency nit.
-import os.log +import ossdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Public/SherpaONNXConfiguration.swift (1)
65-79: Consider using ByteCount for clarity of memory sizesestimatedMemoryUsage is an Int; consider documenting units or using a typealias for bytes.
sdk/runanywhere-swift/Modules/SherpaONNXTTS/BUILD_DOCUMENTATION.md (2)
86-92: Avoid hardcoding ONNX Runtime versioned paths in manual copy instructionsThe docs pin
onnxruntime.xcframeworkunder.../ios-onnxruntime/1.17.1/..., while later steps use a wildcard. This is brittle and will break on version bumps.Suggested edit:
- cp -R /path/to/sherpa-onnx/build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework XCFrameworks/ + cp -R /path/to/sherpa-onnx/build-ios/ios-onnxruntime/*/onnxruntime.xcframework XCFrameworks/
25-26: Clarify actual platform support vs. bridge availabilityYou advertise multi-platform (iOS, macOS, tvOS, watchOS), but the bridge and binary frameworks are gated only for iOS/macOS in code. Call this out explicitly to prevent confusion for tvOS/watchOS integrators.
Would you like me to PR a short “Platform Support” section noting that tvOS/watchOS builds are currently not supported by the Sherpa bridge and binaries?
Also applies to: 123-128
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Bridge/SherpaONNXWrapper.swift (2)
100-151: Streaming implementation chunks raw bytes with float-size assumptionsChunking uses
MemoryLayout<Float>.sizeand a fixed 16kHz without confirming the actual sample format/rate returned by the bridge. This risks producing malformed chunks and drift.
- Derive chunk size from
sampleRate()and the bridge’s actual sample format (e.g., f32 vs s16), or expose metadata from the bridge.- If the bridge returns raw PCM, document the format and consider emitting WAV-framed chunks for consumers that expect containerized audio.
Also applies to: 131-144
66-98: Pitch and volume arguments ignoredYou accept
pitchandvolumebut do not pass them to the bridge. If unsupported, document this and consider applying gain/pitch-shift client-side or dropping the parameters from this API.sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+VoiceModules.swift (1)
37-39: Prefer unified logging over printUse the SDK’s logging mechanism (os.Logger or a shared logger) instead of
Also applies to: 75-77, 128-138, 165-177
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Public/SherpaONNXTTSService.swift (2)
124-151: Initialize path: swallow errors and race on voice setYou set
currentVoiceby queuingsetVoicein a fire-and-forgetTaskand ignore errors. If this fails, the service state is inconsistent.Either make
currentVoicesetter async or document that it’s best-effort and log any failure explicitly.
6-9: Optional: Make service discoverable via Objective‑C if you keep reflectionIf you choose reflection over a registry, mark the class
@objc(SherpaONNXTTSService)and inherit fromNSObjectto allowNSClassFromStringdiscovery, and expose aninit()or static factory the loader can call.I can generate a small factory shim if you want to keep runtime discovery without direct imports.
Also applies to: 376-381, 413-425
examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj (2)
245-253: Confirm no duplication between root package and per-module packages.You reference "../../../sdk/runanywhere-swift" and the individual module packages. Ensure the root package does not also expose LLMSwift/WhisperKitTranscription/SherpaONNXTTS to avoid duplicate targets.
705-716: Stabilize local package paths.Relative paths are fragile if the project is moved. Prefer anchoring paths to $(SRCROOT) via workspace-level SPM or consolidating under a single aggregator Package.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift (2)
182-187: Return a stable, deterministic ordering.Makes UI/debug output predictable.
- public func getAllModules() -> [ModuleConfiguration] { - queue.sync { - Array(registeredModules.values) - } - } + public func getAllModules() -> [ModuleConfiguration] { + queue.sync { + registeredModules.values.sorted { $0.moduleId < $1.moduleId } + } + }
70-77: Mark update check as TODO or surface “not implemented.”Stub returning false can hide real update availability.
```
/* Begin PBXBuildFile section */
		541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
		543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
		54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
		54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
		5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
		547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };
```
🛠️ Refactor suggestion
Delete PBXBuildFile for LLM to complete the migration.
Leaving the PBXBuildFile will keep stale references.
```diff
-		543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
```

📝 Committable suggestion
```
/* Begin PBXBuildFile section */
		541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
		54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
		54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
		5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
		547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };
```
🤖 Prompt for AI Agents
In examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj around
lines 9 to 15, the PBXBuildFile entry for "LLM in Frameworks" remains and must
be removed to avoid stale references; delete the entire line
"543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile;
productRef = 543028452E442716008361DD /* LLM */; };" and ensure any commas or
separators in the surrounding PBXBuildFile section remain syntactically correct
(adjust trailing/leading commas or line endings) so the project.pbxproj stays
valid.
```swift
// Sherpa Kitten TTS Nano - Smallest, fastest TTS model (fixed URL)
ModelInfo(
    id: "sherpa-kitten-nano-v0.1",
    name: "Kitten TTS Nano (Sherpa-ONNX)",
    format: .onnx,
    downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/kitten_tts_nano_v0_1.onnx"),
    estimatedMemory: 50_000_000, // 50MB
    contextLength: 0, // Not applicable for TTS models
    downloadSize: 25_000_000, // ~25MB
    compatibleFrameworks: [.sherpaONNX],
    preferredFramework: .sherpaONNX,
    metadata: ModelInfoMetadata(
        tags: ["tts", "sherpa-onnx", "lightweight"],
        description: "Lightweight neural TTS with 8 expressive voices"
    ),
    supportsThinking: false
),
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
ONNX TTS entries look correct; verify URLs and registry uniqueness.
Both models are well-specified. Please verify:
- URLs resolve and sizes roughly match downloadSize.
- Model IDs are unique across predefinedModels.
- Sherpa-ONNX path requires no auxiliary assets (e.g., speaker metadata) for these two models.
Run this script to check reachability/size and duplicate IDs:
Also applies to: 188-205
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Validate the two ONNX URLs resolve and report a Content-Length
for url in \
"https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/kitten_tts_nano_v0_1.onnx" \
"https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx"
do
echo "Checking: $url"
code=$(curl -sIL -o /dev/null -w '%{http_code}' "$url")
size=$(curl -sIL "$url" | rg -i '^content-length:' | awk '{print $2}' | tr -d '\r')
echo "HTTP $code, Content-Length: ${size:-unknown}"
done
# 2) Detect duplicate model IDs in ModelListViewModel.swift
file="examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift"
echo "Scanning for duplicate IDs in $file"
rg -nP '\bid:\s*"([^"]+)"' "$file" | sed -E 's/.*id:\s*"([^"]+)".*/\1/' | sort | uniq -d | awk 'NF{print "Duplicate ID: "$0}' || trueLength of output: 2295
🏁 Script executed:
#!/bin/bash
sed -n '185,220p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swiftLength of output: 1645
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Kitten siblings non-onnx:"
curl -s https://huggingface.co/api/models/KittenML/kitten-tts-nano-0.1 | jq -r '.siblings[].rfilename' | grep -v '\.onnx$' || echo "None"
echo "Piper siblings in en/en_US/lessac/medium non-onnx:"
curl -s https://huggingface.co/api/models/rhasspy/piper-voices | jq -r '.siblings[].rfilename' | grep '^en/en_US/lessac/medium/' | grep -v '\.onnx$' || echo "None"Length of output: 678
Include auxiliary assets for ONNX TTS models
URLs resolve (Kitten ~23.8 MB; Piper ~63.2 MB) and model IDs are unique. Both repos expose non-ONNX files—Kitten: voices.npz; Piper: en_US-lessac-medium.onnx.json—which may be required at runtime. Confirm their necessity and extend your model definitions to download or document these assets.
🤖 Prompt for AI Agents
In
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift
around lines 170 to 187, the ONNX TTS model entry for "sherpa-kitten-nano-v0.1"
omits auxiliary assets (e.g., voices.npz) that the repo exposes and may be
required at runtime; update the ModelInfo for Kitten (and any Piper entries) to
either include these auxiliary asset URLs in a new metadata.assets or
downloadURLs array and mark them as required, or add explicit documentation
fields noting which auxiliary files are needed and where to fetch them; ensure
the model IDs remain unique, add estimated sizes for the assets, and update any
download/installer logic to fetch and validate these auxiliary files alongside
the .onnx file.
| // 2. Implement service protocol | ||
| public class MyModuleService: YourServiceProtocol { | ||
| private let sdk = RunAnywhereSDK.shared | ||
|
|
||
| public init() { | ||
| // Register models, strategies, etc. | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix protocol/type naming inconsistency in samples.
Quick Start shows MyModuleService: YourServiceProtocol, while Step-by-Step uses YourModuleService: YourProtocol. Pick one and use it consistently to avoid confusion.
Apply:
- public class YourModuleService: YourProtocol {
+ public class YourModuleService: YourServiceProtocol {Also applies to: 85-91
🤖 Prompt for AI Agents
In sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md around lines 18-25
(and also applicable to lines 85-91), the sample class and protocol/type names
are inconsistent (MyModuleService: YourServiceProtocol vs YourModuleService:
YourProtocol); pick one naming convention and make both samples identical.
Replace occurrences so both snippets use the same service and protocol names
(for example MyModuleService implementing YourServiceProtocol) and update any
related references in those line ranges to match exactly.
| // LLM.swift dependency - using latest from GitHub main branch | ||
| .package(url: "https://github.com/eastriverlee/LLM.swift", branch: "main"), | ||
| // Reference to main SDK for protocols | ||
| .package(path: "../../"), | ||
| ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid tracking a moving target; pin LLM.swift to a tag or revision.
Depending on branch "main" risks breakages; you already pinned a revision in Package.resolved. Pin here too.
Apply this diff (using the resolved commit):
- .package(url: "https://github.com/eastriverlee/LLM.swift", branch: "main"),
+ .package(url: "https://github.com/eastriverlee/LLM.swift", revision: "4c4e909ac4758c628c9cd263a0c25b6edff5526d"),🤖 Prompt for AI Agents
In sdk/runanywhere-swift/Modules/LLMSwift/Package.swift around lines 19 to 23,
the LLM.swift dependency is pinned to branch "main" which is a moving target;
replace the branch specifier with the exact revision (commit hash) recorded in
Package.resolved. Edit the .package(...) entry to use
.revision("<<COMMIT_HASH_FROM_Package.resolved>>") (or the appropriate tag
string if Package.resolved shows a tag) instead of branch: "main", then run
swift package resolve to verify and commit the updated Package.swift.
| public class LLMSwiftService: LLMService { | ||
| private var llm: LLM? | ||
| private var modelPath: String? | ||
| private var _modelInfo: LoadedModelInfo? | ||
| // Removed context property - no longer using Context type | ||
| private let hardwareConfig: HardwareConfiguration? | ||
| private let logger = Logger(subsystem: "com.runanywhere.RunAnywhereAI", category: "LLMSwiftService") | ||
| private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expose via Obj‑C and provide a public initializer for reflection-based creation
To work with NSClassFromString, LLMSwiftService must be an NSObject subclass with a public init.
-public class LLMSwiftService: LLMService {
+@objc(LLMSwiftService)
+public class LLMSwiftService: NSObject, LLMService {
@@
- init(hardwareConfig: HardwareConfiguration? = nil) {
- self.hardwareConfig = hardwareConfig
- }
+ public override init() {
+ self.hardwareConfig = nil
+ super.init()
+ }
+
+ public init(hardwareConfig: HardwareConfiguration? = nil) {
+ self.hardwareConfig = hardwareConfig
+ super.init()
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public class LLMSwiftService: LLMService { | |
| private var llm: LLM? | |
| private var modelPath: String? | |
| private var _modelInfo: LoadedModelInfo? | |
| // Removed context property - no longer using Context type | |
| private let hardwareConfig: HardwareConfiguration? | |
| private let logger = Logger(subsystem: "com.runanywhere.RunAnywhereAI", category: "LLMSwiftService") | |
| private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService") | |
| @objc(LLMSwiftService) | |
| public class LLMSwiftService: NSObject, LLMService { | |
| private var llm: LLM? | |
| private var modelPath: String? | |
| private var _modelInfo: LoadedModelInfo? | |
| private let hardwareConfig: HardwareConfiguration? | |
| private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService") | |
| public override init() { | |
| self.hardwareConfig = nil | |
| super.init() | |
| } | |
| public init(hardwareConfig: HardwareConfiguration? = nil) { | |
| self.hardwareConfig = hardwareConfig | |
| super.init() | |
| } | |
| // … rest of implementation … | |
| } |
| /// Register a module | ||
| public func register(_ configuration: ModuleConfiguration) { | ||
| queue.async(flags: .barrier) { | ||
| self.registeredModules[configuration.moduleId] = configuration | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Make registration synchronous to guarantee visibility upon return.
async(flags: .barrier) returns before the write completes, causing racy reads immediately after register.
- public func register(_ configuration: ModuleConfiguration) {
- queue.async(flags: .barrier) {
- self.registeredModules[configuration.moduleId] = configuration
- }
- }
+ public func register(_ configuration: ModuleConfiguration) {
+ queue.sync(flags: .barrier) {
+ self.registeredModules[configuration.moduleId] = configuration
+ }
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Register a module | |
| public func register(_ configuration: ModuleConfiguration) { | |
| queue.async(flags: .barrier) { | |
| self.registeredModules[configuration.moduleId] = configuration | |
| } | |
| } | |
| /// Register a module | |
| public func register(_ configuration: ModuleConfiguration) { | |
| queue.sync(flags: .barrier) { | |
| self.registeredModules[configuration.moduleId] = configuration | |
| } | |
| } |
🤖 Prompt for AI Agents
In
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift
around lines 175 to 180, the register(_:) method currently uses
queue.async(flags: .barrier) which returns before the write completes and can
cause racy reads; change it to perform the write synchronously (use
queue.sync(flags: .barrier)) so the registration is visible to callers on
return, and guard against potential deadlock by ensuring register is never
called from the same serial/target queue (or use a re-entrant-safe mechanism)
before switching to sync.
| /// Create Sherpa-ONNX TTS if module is available | ||
| private func createSherpaONNXTTS() -> (any TextToSpeechService)? { | ||
| // Check if SherpaONNXTTS module is available | ||
| let className = "SherpaONNXTTS.SherpaONNXTTSService" | ||
|
|
||
| guard NSClassFromString(className) != nil else { | ||
| print("[RunAnywhereSDK] SherpaONNXTTS module not found. Add it to your app dependencies.") | ||
| return nil | ||
| } | ||
|
|
||
| // Module should register itself and provide factory when imported | ||
| // For now, return nil - actual instantiation will be handled by the module | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Dynamic module discovery cannot instantiate pure-Swift classes
NSClassFromString("SherpaONNXTTS.SherpaONNXTTSService") will return nil for a Swift class that doesn’t inherit from NSObject or have @objc. Even if the class is found, you return nil instead of constructing the service, so you’ll always fall back to System TTS.
Options:
- Adopt a small registry: the module calls
RunAnywhereSDK.registerTTSFactory { SherpaONNXTTSService(sdk: .shared) }at load; this extension pulls from the registry. - Or make
SherpaONNXTTSService@objc(SherpaONNXTTSService)and subclassNSObject, then reflect and instantiate, and callawait service.initialize().
Example registry sketch:
+public typealias TTSFactory = () async -> (any TextToSpeechService)
+private var ttsFactories: [String: TTSFactory] = [:]
+public static func registerTTSFactory(name: String, factory: @escaping TTSFactory) { ttsFactories[name] = factory }
+public static func resolveTTSFactory(name: String) -> TTSFactory? { ttsFactories[name] }Then:
- return createSherpaONNXTTS()
+ if let f = RunAnywhereSDK.resolveTTSFactory(name: "SherpaONNXTTS") { return await f() }
+ return nilAlso applies to: 100-116, 123-139
| // Fallback to system TTS | ||
| print("[VoiceModuleFactory] Using System TTS") | ||
| return SystemTextToSpeechService() | ||
| } | ||
|
|
||
| /// Create TTS service based on configuration | ||
| public static func createTTSService(from config: VoiceTTSConfig) async -> any TextToSpeechService { | ||
| let sdk = RunAnywhereSDK.shared | ||
|
|
||
| switch config.provider { | ||
| case .system: | ||
| return SystemTextToSpeechService() | ||
|
|
||
| case .sherpaONNX: | ||
| if sdk.isSherpaONNXTTSAvailable { | ||
| // Module is available but needs proper instantiation | ||
| // For now, fallback to system until module provides factory | ||
| print("[VoiceModuleFactory] Sherpa-ONNX module detected, awaiting factory implementation") | ||
| } | ||
| // Fallback to system TTS | ||
| return SystemTextToSpeechService() | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
❓ Verification inconclusive
Factory returns System TTS even when module is present
You detect the Sherpa module then still return SystemTextToSpeechService(). This hides the new feature.
Wire the actual instantiation (via registry or reflection as above) and initialize the service before returning it.
Implement Sherpa-ONNX TTS instantiation in createTTSService
In RunAnywhereSDK+VoiceModules.swift’s createTTSService(from:), the .sherpaONNX case always falls back to SystemTextToSpeechService()—even when sdk.isSherpaONNXTTSAvailable is true. Wire up the actual Sherpa-ONNX implementation (via the registry/reflection mechanism) and return it instead of the system fallback.
| func synthesizeStream(text: String, rate: Float, pitch: Float, volume: Float) -> AsyncThrowingStream<Data, Error> { | ||
| AsyncThrowingStream { continuation in | ||
| Task { | ||
| do { | ||
| // For streaming, we need to chunk the text | ||
| let sentences = splitIntoSentences(text) | ||
|
|
||
| for sentence in sentences { | ||
| guard let audioChunk = bridge?.synthesizeText( | ||
| sentence, | ||
| speakerId: getCurrentSpeakerId(), | ||
| speed: rate | ||
| ) else { | ||
| throw SherpaONNXError.synthesisFailure("Stream synthesis failed") | ||
| } | ||
|
|
||
| continuation.yield(applyVolume(to: audioChunk, volume: volume)) | ||
| } | ||
|
|
||
| continuation.finish() | ||
| } catch { | ||
| continuation.finish(throwing: error) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Serialize access to the native TTS handle.
The plan calls out non-thread-safety, but the stream method can concurrently call into bridge. Wrap all native calls on a serial queue or an actor.
Apply a serial queue:
final class SherpaONNXWrapper {
- private var bridge: SherpaONNXBridge?
+ private var bridge: SherpaONNXBridge?
+ private let ttsQueue = DispatchQueue(label: "com.runanywhere.sherpa.tts")
@@
- guard let audioData = bridge.synthesizeText(
+ let audioData = try await withCheckedThrowingContinuation { cont in
+ ttsQueue.async {
+ let data = bridge.synthesizeText(
text,
speakerId: speakerId,
speed: rate
- ) else {
+ )
+ guard let data else { cont.resume(throwing: SherpaONNXError.synthesisFailure("Failed to generate audio")); return }
+ cont.resume(returning: data)
+ }
+ }Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md around lines 220
to 246, the streaming method calls into the non-thread-safe native `bridge` from
potentially concurrent contexts; serialize all access to `bridge` by dispatching
synthesizeText calls onto a dedicated serial queue or by routing them through an
actor, await the serialized call result before yielding to the continuation, and
capture any thrown errors to finish the continuation with that error; ensure
continuation.yield/finish are invoked from the Task context after the serialized
bridge call completes and propagate errors from the bridge back to the caller.
| Create `build_sherpa_onnx.sh`: | ||
| ```bash | ||
| #!/bin/bash | ||
|
|
||
| # Build Sherpa-ONNX XCFrameworks for iOS | ||
|
|
||
| set -e | ||
|
|
||
| SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" | ||
| PROJECT_ROOT="$SCRIPT_DIR/../.." | ||
| EXTERNAL_DIR="$PROJECT_ROOT/EXTERNAL" | ||
| MODULE_DIR="$PROJECT_ROOT/sdk/runanywhere-swift/Modules/SherpaONNXTTS" | ||
|
|
||
| echo "🔨 Building Sherpa-ONNX XCFrameworks..." | ||
|
|
||
| # Clone if not exists | ||
| if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then | ||
| echo "📥 Cloning sherpa-onnx..." | ||
| git clone https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx" | ||
| fi | ||
|
|
||
| # Build | ||
| cd "$EXTERNAL_DIR/sherpa-onnx" | ||
| echo "🏗️ Building for iOS..." | ||
| ./build-ios.sh | ||
|
|
||
| # Copy frameworks | ||
| echo "📦 Copying XCFrameworks..." | ||
| mkdir -p "$MODULE_DIR/XCFrameworks" | ||
| cp -r build-ios/sherpa-onnx.xcframework "$MODULE_DIR/XCFrameworks/" | ||
| cp -r build-ios/onnxruntime.xcframework "$MODULE_DIR/XCFrameworks/" | ||
|
|
||
| echo "✅ Build complete!" | ||
| echo "📍 Frameworks located at: $MODULE_DIR/XCFrameworks/" | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Pin the upstream ref and harden the build script.
Use a fixed tag/commit and stricter shell options to avoid accidental upgrades and silent failures.
-#!/bin/bash
-set -e
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
@@
-if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
+REF="${SHERPA_ONNX_REF:-vX.Y.Z}" # set a default or pass in env
+if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
echo "📥 Cloning sherpa-onnx..."
- git clone https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
+ git clone --depth 1 --branch "$REF" https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
else
- echo "📥 Pulling latest changes..."
- cd "$EXTERNAL_DIR/sherpa-onnx"
- git pull
+ cd "$EXTERNAL_DIR/sherpa-onnx"
+ git fetch --tags --depth 1 origin "$REF"
+ git checkout -f "$REF"
fiCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md around lines 351
to 385, the new build_sherpa_onnx.sh script lacks a pinned upstream ref and is
brittle; update the script to use strict shell options (set -euo pipefail) and
pin the repository to a specific tag or commit (use git clone --branch <TAG> or
clone then git checkout <COMMIT>), verify the clone succeeded and the expected
build script (build-ios.sh) exists and is executable before running it, fail
with clear error messages if checkout/clone or build steps fail, and add checks
that the expected XCFramework output paths exist before copying (exit nonzero
with a helpful message if not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review continued from previous batch...
| public func registerAndDownloadModel( | ||
| name: String, | ||
| url: URL, | ||
| framework: LLMFramework, | ||
| additionalFiles: [URL] = [] | ||
| ) async throws -> URL { | ||
| // Create model info with additional files | ||
| let model = sdk.addModelFromURL( | ||
| name: name, | ||
| url: url, | ||
| framework: framework, | ||
| estimatedSize: nil | ||
| ) | ||
|
|
||
| // Download main model and additional files | ||
| return try await downloadModelWithProgress(model.id) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
additionalFiles is unused. Implement or drop from API.
Current docs promise behavior that isn’t implemented.
Option A (remove now, reintroduce when ready):
- /// - additionalFiles: Additional files to download
+ /// - additionalFiles: (Removed) Additional files to download
/// - Returns: Local path to downloaded model
public func registerAndDownloadModel(
name: String,
url: URL,
- framework: LLMFramework,
- additionalFiles: [URL] = []
+ framework: LLMFramework
) async throws -> URL {
// Create model info with additional files
let model = sdk.addModelFromURL(
name: name,
url: url,
framework: framework,
estimatedSize: nil
)
- // Download main model and additional files
+ // Download main model
return try await downloadModelWithProgress(model.id)
}Option B (implement): loop and register extra assets via the SDK’s API (tell me the method names and I’ll wire it up).
🏁 Script executed:
#!/bin/bash
# Search for definition of addModelFromURL to inspect its signature
rg -nP --type swift 'func addModelFromURL' -C3
# Search for any usage of the `additionalFiles` parameter across the codebase
rg -n --type swift 'additionalFiles' -C3Length of output: 3907
Remove unused additionalFiles parameter
The additionalFiles argument is never consumed—drop it from the method signature and documentation.
- /// - additionalFiles: Additional files to download
+ /// - additionalFiles: (removed)
/// - Returns: Local path to downloaded model
public func registerAndDownloadModel(
name: String,
url: URL,
- framework: LLMFramework,
- additionalFiles: [URL] = []
+ framework: LLMFramework
) async throws -> URL {
- // Download main model and additional files
+ // Download main model
return try await downloadModelWithProgress(model.id)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public func registerAndDownloadModel( | |
| name: String, | |
| url: URL, | |
| framework: LLMFramework, | |
| additionalFiles: [URL] = [] | |
| ) async throws -> URL { | |
| // Create model info with additional files | |
| let model = sdk.addModelFromURL( | |
| name: name, | |
| url: url, | |
| framework: framework, | |
| estimatedSize: nil | |
| ) | |
| // Download main model and additional files | |
| return try await downloadModelWithProgress(model.id) | |
| } | |
| /// - additionalFiles: (removed) | |
| /// - Returns: Local path to downloaded model | |
| public func registerAndDownloadModel( | |
| name: String, | |
| url: URL, | |
| framework: LLMFramework | |
| ) async throws -> URL { | |
| // Create model info with additional files | |
| let model = sdk.addModelFromURL( | |
| name: name, | |
| url: url, | |
| framework: framework, | |
| estimatedSize: nil | |
| ) | |
| // Download main model | |
| return try await downloadModelWithProgress(model.id) | |
| } |
🤖 Prompt for AI Agents
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift
lines 52-68: the additionalFiles parameter is unused and should be removed from
the method signature and any public documentation; update the function signature
to remove additionalFiles, remove any references to it in the implementation (no
other code changes needed inside since it was unused), update all call sites and
tests to call registerAndDownloadModel(name:url:framework:) without the extra
argument, and update API docs/comments to reflect the new signature.
Description
This pull request introduces Text-to-Speech (TTS) capabilities to the iOS SDK. It achieves this by integrating the ONNX Runtime, allowing the SDK to execute various TTS models. This integration offers the following key features:
Type of Change
Testing
Labels
Please add the appropriate label(s):
iOS SDK- Changes to iOS/Swift SDKAndroid SDK- Changes to Android/Kotlin SDKiOS Sample- Changes to iOS example appAndroid Sample- Changes to Android example appChecklist
Summary by CodeRabbit
New Features
Documentation
Chores