@sanchitmonga22 sanchitmonga22 commented Aug 21, 2025

Description

This pull request introduces Text-to-Speech (TTS) capabilities to the iOS SDK by integrating ONNX Runtime, which lets the SDK execute a variety of TTS models. The integration offers the following key features:

  • ONNX Runtime Integration: Enables the execution of pre-trained TTS models using the ONNX (Open Neural Network Exchange) format.
  • TTS Functionality: Provides the SDK with the ability to synthesize spoken audio from text input.
  • Modular Design: The changes are structured to be easily maintainable and extensible for future TTS model integrations.
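
Taken together, the features above suggest usage along these lines. This is a hedged sketch: the type names (`VoiceTTSConfig`, `RunAnywhereSDK`) appear elsewhere in this PR, but the exact signatures, the factory-method name, and the model identifier are assumptions.

```swift
import RunAnywhereSDK   // module names per this PR; treat as illustrative
import SherpaONNXTTS

// Sketch: select the on-device Sherpa-ONNX provider via config, then
// synthesize audio from text.
func speak(_ text: String) async throws -> Data {
    // `sherpaONNX(modelId:)` factory and the model id are assumed names.
    let config = VoiceTTSConfig.sherpaONNX(modelId: "kitten-tts-nano")
    let tts = try await RunAnywhereSDK.shared.findTTSService(for: config)
    return try await tts.synthesize(text, options: .init())
}
```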

Type of Change

  • Bug fix
  • New feature
  • Documentation update
  • Refactoring

Testing

  • Tests pass locally
  • Added/updated tests for changes

Labels

Please add the appropriate label(s):

  • iOS SDK - Changes to iOS/Swift SDK
  • Android SDK - Changes to Android/Kotlin SDK
  • iOS Sample - Changes to iOS example app
  • Android Sample - Changes to Android example app

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Documentation updated (if needed)

Summary by CodeRabbit

  • New Features

    • Added on-device Sherpa‑ONNX TTS with multiple voices, streaming synthesis, and playback controls; selectable via settings. UI now shows the active TTS model. Added new downloadable TTS models (Kitten, VITS, Kokoro, Matcha).
    • Introduced modular voice/LLM architecture with runtime provider selection; new LLMSwift backend option.
    • Packaged WhisperKit transcription as a module.
  • Documentation

    • Added module development guide, Sherpa‑ONNX TTS build/setup docs, and module READMEs.
  • Chores

    • Enhanced PR template with platform testing checkboxes and screenshots section.
    • Expanded .gitignore for large binary frameworks.

- Introduced new extensions for LLM and voice modules in RunAnywhereSDK, enhancing modularity and service creation.
- Implemented LLMModuleFactory and VoiceModuleFactory for streamlined service instantiation based on available modules.
- Added protocols for LLMService and SpeechToTextService to standardize module interactions.
- Created comprehensive configuration structures for LLM and voice modules, improving flexibility and usability.
- Established a ModuleIntegrationHelper for downloading models with progress tracking and managing module lifecycles.
- Documented module development guidelines to assist future integrations and ensure consistency across modules.
- Introduced the SherpaONNXTTS module, including core components such as SherpaONNXTTSService, SherpaONNXConfiguration, SherpaONNXModelManager, SherpaONNXDownloadStrategy, and SherpaONNXWrapper.
- Implemented a robust model registration and download strategy for managing TTS models and their dependencies.
- Established a comprehensive configuration structure for the TTS engine, allowing for flexible model management and synthesis options.
- Enhanced VoiceCapabilityService to support dynamic loading of the SherpaONNXTTS service based on configuration.
- Documented module development guidelines and integration patterns for future reference and consistency.
- Introduced `build_frameworks.sh` to automate the cloning and building of Sherpa-ONNX XCFrameworks.
- Added `Package.resolved` to manage dependencies for the SherpaONNXTTS module.
- Updated `Package.swift` to include binary targets for the newly built XCFrameworks.
- Created a comprehensive `README.md` for module setup, features, and integration instructions.
- Implemented module map for C++ interop and added Objective-C++ bridge header and implementation for seamless integration with the Sherpa-ONNX C API.
- Cleaned up the project structure and ensured adherence to SOLID principles for maintainability and scalability.
…TTS module

- Created a comprehensive `NEXT_STEPS.md` file outlining completed tasks and immediate next steps for the SherpaONNXTTS module.
- Updated `Package.swift` to include public headers and C++ settings for better integration with the Objective-C++ bridge.
- Introduced `SherpaONNXBridge.mm` for Objective-C++ implementation, facilitating seamless interaction with the Sherpa-ONNX C API.
- Added unit tests in `SherpaONNXTTSTests.swift` to validate service initialization, configuration, model types, and error handling.
@sanchitmonga22 sanchitmonga22 changed the title Smonga/tts handling [iOS-SDK] TTS integration + ONNX runtime integration to run TTS models Aug 22, 2025
…odule

- Created `BUILD_DOCUMENTATION.md` detailing the end-to-end process for building and integrating the Sherpa-ONNX TTS module with the RunAnywhere Swift SDK.
- Updated `Package.swift` to support newer platform versions and include the `SherpaONNXBridge` target for improved integration.
- Introduced `SherpaONNXBridge.h` and `SherpaONNXBridge.mm` for Objective-C++ bridging to the Sherpa-ONNX C API.
- Enhanced `SherpaONNXWrapper.swift` to utilize the new bridge, improving TTS functionality and performance.
- Added XCFrameworks for `onnxruntime` and `sherpa-onnx`, ensuring multi-platform support and optimized builds.
@sanchitmonga22 sanchitmonga22 changed the base branch from smonga/voice_refactor to main August 22, 2025 00:44
…rocessing capabilities

- Introduced LLMSwift module for LLM integration, including adapter and service implementations.
- Added WhisperKitTranscription module for speech-to-text functionality, featuring a custom download strategy and service for transcription.
- Updated project configuration to include new dependencies and removed obsolete references to WhisperKit.
- Enhanced documentation for both modules, detailing installation, usage, and error handling.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.
… code clarity

- Added LLM.swift as a dependency from GitHub to enhance LLM integration.
- Removed legacy text-to-voice handling from WhisperKitAdapter and WhisperKitService, simplifying the codebase to focus solely on speech-to-text functionality.
- Cleaned up comments in WhisperKitService to reflect current implementation without legacy references.
- Updated project documentation to reflect changes in module structure and dependencies.
- Updated VoiceAssistantView to display the current TTS model from the view model instead of a static label.
- Enhanced VoiceAssistantViewModel by adding a new property for the TTS model.
- Improved SherpaONNXWrapper to use consistent property names for voice identifiers and streamlined audio playback management in SherpaONNXTTSService.
- Refactored SherpaONNXDownloadStrategy to simplify the download process and improve error handling.
- Introduced new TTS models for SherpaONNX, including Kitten TTS Nano and VITS English US, with detailed metadata and download information.
- Updated FrameworkRecommender to include performance ratings for SherpaONNX, enhancing model selection capabilities.
- Modified LLMFramework to support SherpaONNX as a new framework type, ensuring proper handling for text-to-voice functionality.
@sanchitmonga22

@coderabbitai review

coderabbitai bot commented Aug 30, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot commented Aug 30, 2025

Walkthrough

Migrates iOS example from remote packages to local modules, introduces Sherpa-ONNX TTS and WhisperKit transcription modules, and expands SDK with module infrastructure, TTS provider selection, and new enums/configs. Adds LLMSwift package, extensive Sherpa-ONNX bridge and service, download strategies, tests, scripts, and documentation. Minor UI and model list updates.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Repo metadata**<br>`/.github/pull_request_template.md`, `/.gitignore` | PR template adds testing checkboxes and screenshots section. Gitignore adds Sherpa-ONNX XCFramework paths and external source directory. |
| **Xcode project integration (example app)**<br>`examples/ios/RunAnywhereAI/.../project.pbxproj`, `.../project.xcworkspace/.../Package.resolved`, `.../App/RunAnywhereAIApp.swift` | Switches remote WhisperKit/LLM to local packages: SherpaONNXTTS, WhisperKitTranscription, LLMSwift. Updates build phases/dependencies and imports. Package.resolved hash tweak. |
| **Example app features**<br>`.../Features/Models/ModelListViewModel.swift`, `.../Features/Voice/VoiceAssistantView.swift`, `.../Features/Voice/VoiceAssistantViewModel.swift` | Adds two ONNX TTS models. ModelBadge now shows the dynamic TTS model. ViewModel adds `ttsModel` and switches TTS config to the SherpaONNX factory. |
| **LLMSwift module**<br>`sdk/runanywhere-swift/Modules/LLMSwift/*` | New SwiftPM package exporting the LLMSwift target with deps on LLM.swift and RunAnywhereSDK. Adds error type, service refactor to LLMSwiftError, a template resolver utility, README, and lockfile. |
| **Sherpa-ONNX TTS module (SwiftPM)**<br>`sdk/runanywhere-swift/Modules/SherpaONNXTTS/*` | New SPM package for SherpaONNXTTS with binary XCFramework targets and ObjC++ bridge (header/mm), Swift wrapper, public configuration/error types, service implementation, model manager, download strategy, module map, tests, build/setup scripts, LFS attrs, and documentation. |
| **WhisperKit Transcription module**<br>`sdk/runanywhere-swift/Modules/WhisperKitTranscription/*` | New SPM package for transcription: public error type, adapter tweaks (remove text-to-voice), public download strategy, service logging changes, re-export file, README, and lockfile. |
| **SDK module infrastructure (public extensions)**<br>`sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/*`, `.../RunAnywhereSDK+Voice.swift`, `.../RunAnywhereSDK+LLMModules.swift` | Adds module discovery/availability, storage and cache API, model registry/download helpers, lifecycle protocols, registry, voice and LLM module factories/creators; adjusts the internal TTS lookup call site. |
| **SDK voice and framework enums/routing**<br>`.../Core/Models/Framework/LLMFramework.swift`, `.../Core/Models/Framework/FrameworkModality.swift`, `.../Capabilities/Compatibility/Services/FrameworkRecommender.swift`, `.../Capabilities/Voice/Services/VoiceCapabilityService.swift`, `.../Public/Models/Voice/VoiceTTSConfig.swift` | Adds `sherpaONNX` framework case and display name, maps modality to textToVoice, updates recommender scores, changes TTS discovery to accept a config and dynamically load Sherpa TTS, and extends VoiceTTSConfig with provider/modelId and factory methods. |
| **Documentation and plans**<br>`sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md`, `thoughts/shared/plans/*` | Adds module development guide and Sherpa-ONNX integration/bridge implementation plans. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant App as RunAnywhereAI App
  participant VCS as VoiceCapabilityService
  participant SDK as RunAnywhereSDK
  participant SysTTS as SystemTextToSpeechService
  participant Sherpa as SherpaONNXTTSService

  Note over App,VCS: TTS selection based on VoiceTTSConfig
  User->>App: Start voice session
  App->>VCS: findTTSService(for: VoiceTTSConfig)
  alt provider = sherpaONNX
    VCS->>SDK: isModuleAvailable("SherpaONNXTTS.SherpaONNXTTSService")
    alt available
      VCS->>Sherpa: init()
      VCS-->>App: SherpaONNXTTSService
    else not available/fails
      VCS->>SysTTS: init()
      VCS-->>App: System TTS (fallback)
    end
  else provider = system or nil
    VCS->>SysTTS: init()
    VCS-->>App: System TTS
  end
  App->>+Sherpa: synthesize(text, options)
  Sherpa-->>-App: audio Data / Stream
  App-->>User: Playback
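
The fallback selection in the diagram above can be sketched in Swift. Type and method names follow the diagram; the exact signatures and the system-service initializer are assumptions.

```swift
// Hypothetical sketch of VoiceCapabilityService.findTTSService(for:),
// mirroring the fallback logic in the diagram above.
func findTTSService(for config: VoiceTTSConfig?) -> TextToSpeechService {
    if config?.provider == .sherpaONNX,
       RunAnywhereSDK.shared.isModuleAvailable("SherpaONNXTTS.SherpaONNXTTSService"),
       let sherpa = try? SherpaONNXTTSService() {
        return sherpa
    }
    // provider == .system, nil config, or Sherpa unavailable / failed to init
    return SystemTextToSpeechService()
}
```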
sequenceDiagram
  autonumber
  participant Service as SherpaONNXTTSService
  participant SDK as RunAnywhereSDK
  participant ModelMgr as SherpaONNXModelManager
  participant DL as DownloadManager
  participant Wrapper as SherpaONNXWrapper
  participant Bridge as SherpaONNXBridge

  Note over Service,SDK: Initialization and model setup
  Service->>SDK: registerModuleDownloadStrategy(SherpaONNXDownloadStrategy)
  Service->>ModelMgr: registerModels()
  ModelMgr->>SDK: registerModuleModels(models)
  Service->>SDK: getModelLocalPath(for: modelId)
  alt not downloaded
    Service->>DL: download(modelId) with progress
    DL-->>Service: completion
    Service->>SDK: getModelLocalPath(for: modelId)
  end
  Service->>Wrapper: init(configuration)
  Wrapper->>Bridge: initWithModelPath(..., modelType, ...)
  Bridge-->>Wrapper: ready
  Wrapper-->>Service: ready
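
The model-setup flow in the second diagram can be sketched similarly. All names besides those shown in the diagram are assumptions, including the error case and configuration initializer.

```swift
// Hypothetical sketch of SherpaONNXTTSService model setup, following the
// diagram above: register the download strategy and models, ensure the
// model exists locally (downloading if needed), then initialize the wrapper.
func prepare(modelId: String) async throws {
    RunAnywhereSDK.shared.registerModuleDownloadStrategy(SherpaONNXDownloadStrategy())
    modelManager.registerModels()
    var path = RunAnywhereSDK.shared.getModelLocalPath(for: modelId)
    if path == nil {
        try await downloadManager.download(modelId) { progress in
            // progress reporting hook (assumed API shape)
        }
        path = RunAnywhereSDK.shared.getModelLocalPath(for: modelId)
    }
    guard let modelPath = path else { throw SherpaONNXError.modelNotFound } // assumed error case
    wrapper = try SherpaONNXWrapper(configuration: .init(modelPath: modelPath))
}
```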

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

A rabbit taps the build drum, light and quick,
Sherpa sings in ONNX, voices pick.
WhisperKit listens, LLMSwift thinks—
Modules dance with tidy links.
Scripts fetch frameworks, caches bloom;
Tap-tap, compile—new tunes resume. 🐇🎶



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 41

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (10)
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitService.swift (4)

17-22: Streaming thresholds use bytes as if they were samples (0.5s/0.1s are off by 4×).

minAudioLength and contextOverlap are expressed in samples but compared against Data.count, which is in bytes. Fix by scaling with bytes-per-sample.

Apply this diff:

-    private var audioAccumulator = Data()
-    private let minAudioLength = 8000  // 500ms at 16kHz
-    private let contextOverlap = 1600   // 100ms overlap for context
+    private var audioAccumulator = Data()
+    private let sampleRate = 16_000
+    private let bytesPerSample = MemoryLayout<Float>.size // adjust if using Int16 pipeline
+    private var minAudioBytes: Int { (sampleRate / 2) * bytesPerSample }     // 500ms
+    private var contextOverlapBytes: Int { (sampleRate / 10) * bytesPerSample } // 100ms
-                        // Process when we have enough audio (500ms)
-                        if audioBuffer.count >= minAudioLength {
+                        // Process when we have enough audio (500ms)
+                        if audioBuffer.count >= minAudioBytes {
-                            // Keep last 100ms for context continuity
-                            audioBuffer = Data(audioBuffer.suffix(contextOverlap))
+                            // Keep last 100ms for context continuity
+                            audioBuffer = Data(audioBuffer.suffix(contextOverlapBytes))

Also applies to: 339-341, 382-384
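
To make the 4× discrepancy concrete, a standalone check (assuming 16 kHz mono Float32 PCM, as the fix above does):

```swift
let sampleRate = 16_000
let bytesPerSample = MemoryLayout<Float>.size        // 4 bytes per Float32 sample

let minAudioSamples = sampleRate / 2                 // 500 ms of audio = 8_000 samples
let minAudioBytes = minAudioSamples * bytesPerSample // = 32_000 bytes

// Comparing Data.count (bytes) against the sample count 8_000 fires after
// only 8_000 / 4 = 2_000 samples, i.e. 125 ms instead of 500 ms.
print(minAudioBytes)            // 32000
print(8_000 / bytesPerSample)   // 2000
```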


318-331: Streaming early-return bug: after initializing, the function returns and never processes audio.

The return inside the guard’s else exits the Task. Restructure to proceed after lazy init.

Apply this diff:

-                    // Ensure WhisperKit is loaded
-                    guard let whisperKit = self.whisperKit else {
-                        if self.isInitialized {
-                            // Already initialized, but whisperKit is nil
-                            throw VoiceError.serviceNotInitialized
-                        } else {
-                            // Not initialized, try to initialize with default model
-                            try await self.initialize(modelPath: nil)
-                            guard self.whisperKit != nil else {
-                                throw VoiceError.serviceNotInitialized
-                            }
-                        }
-                        return
-                    }
+                    // Ensure WhisperKit is loaded (lazily initialize on first use)
+                    if self.whisperKit == nil {
+                        if self.isInitialized {
+                            // Already initialized, but whisperKit is nil
+                            throw VoiceError.serviceNotInitialized
+                        }
+                        try await self.initialize(modelPath: nil)
+                    }
+                    guard let whisperKit = self.whisperKit else {
+                        throw VoiceError.serviceNotInitialized
+                    }

342-345: Same raw Data→Float32 assumption in streaming path.

Repeat of the format issue above; convert Int16 to Float or use a definitive format.

Apply this diff in both places:

-                            let floatArray = audioBuffer.withUnsafeBytes { buffer in
-                                Array(buffer.bindMemory(to: Float.self))
-                            }
+                            let floatArray: [Float]
+                            if audioBuffer.count % MemoryLayout<Float>.size == 0 {
+                                floatArray = audioBuffer.withUnsafeBytes { buf in
+                                    Array(buf.bindMemory(to: Float.self))
+                                }
+                            } else {
+                                let i16 = audioBuffer.withUnsafeBytes { buf in
+                                    Array(buf.bindMemory(to: Int16.self))
+                                }
+                                floatArray = i16.map { Float($0) / 32768.0 }
+                            }

Also applies to: 390-393
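
The Int16 branch of the fix above boils down to this normalization, shown standalone for clarity:

```swift
// Convert signed 16-bit PCM samples to Float in [-1.0, 1.0):
// dividing by 32768 maps Int16.min to -1.0 and Int16.max to ~0.99997.
func int16ToFloat(_ samples: [Int16]) -> [Float] {
    samples.map { Float($0) / 32768.0 }
}

let floats = int16ToFloat([Int16.min, 0, Int16.max])
// [-1.0, 0.0, ~0.99997]
```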


156-169: Honor caller options for task and language; don’t force English transcription.

For non-English or translation, current behavior is incorrect.

Apply this diff:

-        let decodingOptions = DecodingOptions(
-            task: .transcribe,
-            language: "en",  // Force English to avoid language detection issues
+        let decodingOptions = DecodingOptions(
+            task: (options.task == .translate ? .translate : .transcribe),
+            language: options.language.rawValue,
             temperature: 0.0,
             temperatureFallbackCount: 1,
             sampleLength: 224,
             usePrefillPrompt: false,
-            detectLanguage: false,  // Force English instead of auto-detect
+            detectLanguage: false,
             skipSpecialTokens: true,
             withoutTimestamps: true,
             compressionRatioThreshold: 2.4,
             logProbThreshold: -1.0,
             noSpeechThreshold: noSpeechThresh
         )

If you support “auto” language in options, set language to nil and detectLanguage = true accordingly.

sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitDownloadStrategy.swift (1)

118-126: Set request timeouts and use request-based download to avoid hangs.

URLSession.shared.download(from:) uses default timeouts. Use URLRequest with a timeout.

Apply this diff in both places:

-                    let (localURL, response) = try await URLSession.shared.download(from: fileURL)
+                    var req = URLRequest(url: fileURL)
+                    req.timeoutInterval = 60
+                    let (localURL, response) = try await URLSession.shared.download(for: req)

Also applies to: 178-185

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantViewModel.swift (2)

82-90: Observer token is dropped immediately; model changes won’t be observed

addObserver(forName:...) returns a token you must retain. As written, the token is discarded, so no notifications will ever fire.

Fix with Combine (you already import it) to auto-manage lifetimes:

@@
-        // Listen for model changes
-        NotificationCenter.default.addObserver(
-            forName: Notification.Name("ModelLoaded"),
-            object: nil,
-            queue: .main
-        ) { [weak self] notification in
-            Task { @MainActor in
-                self?.updateModelInfo()
-            }
-        }
+        // Listen for model changes
+        NotificationCenter.default
+            .publisher(for: Notification.Name("ModelLoaded"))
+            .receive(on: RunLoop.main)
+            .sink { [weak self] _ in
+                self?.updateModelInfo()
+            }
+            .store(in: &cancellables)

And add storage:

@@
 class VoiceAssistantViewModel: ObservableObject {
@@
     private let audioCapture = AudioCapture()
+    private var cancellables = Set<AnyCancellable>()

318-324: Stop capture and tear down on pipeline errors to avoid resource leaks

On .pipelineError, audio capture continues and the task/pipeline aren’t torn down.

             case .pipelineError(let error):
                 errorMessage = error.localizedDescription
                 sessionState = .error(error.localizedDescription)
                 isProcessing = false
                 isListening = false
                 logger.error("Pipeline error: \(error)")
+                // Ensure resources are released on failure
+                audioCapture.stopContinuousCapture()
+                pipelineTask?.cancel()
+                pipelineTask = nil
+                voicePipeline = nil
sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftService.swift (1)

239-291: Streaming token limit increments twice per token

tokenCount is incremented twice, halving the effective maxTokens. Increment once.

-            var tokenCount = 0
+            var tokenCount = 0
@@
-                for await token in response {
-                    tokenCount += 1
+                for await token in response {
+                    tokenCount += 1
@@
-                    // Check token limit (approximate - actual tokenization may differ)
-                    tokenCount += 1
-                    if tokenCount >= maxTokens {
+                    // Check token limit (approximate - actual tokenization may differ)
+                    if tokenCount >= maxTokens {
                         break
                     }
examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj (2)

147-163: Drop LLM from packageProductDependencies (LLMSwift replaces it).

Avoid linking both. If LLM is still needed, remove LLMSwift instead.

 				541E22712E3BE21300EBF8FA /* RunAnywhereSDK */,
-				543028452E442716008361DD /* LLM */,
 				548CA0762E56D0DC0061CCF5 /* FluidAudioDiarization */,
 				5479377D2E57DF7600CB9251 /* LLMSwift */,
 				54760D382E57E06100A03191 /* WhisperKitTranscription */,
 				54509A592E57FB2E00E24F06 /* SherpaONNXTTS */,

68-89: Remove explicit LLM linking from the Frameworks build phase
LLMSwift’s Swift-PM package already brings in LLM.swift transitively; keeping both risks duplicate symbols. In examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj, delete the 543028462E442716008361DD /* LLM in Frameworks */ entry from the PBXFrameworksBuildPhase file list.

🧹 Nitpick comments (94)
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Package.resolved (3)

1-77: Consider dropping this nested Package.resolved (use only a single top-level lockfile).

Keeping a lockfile per submodule often causes churn and conflicting pins across Xcode/SwiftPM versions. Prefer a single root Package.resolved or ignore lockfiles in nested modules.

If you decide to remove it from this module:

-{
-  "pins" : [
-    ...
-  ],
-  "version" : 2
-}

4-74: Prefer constraining versions in Package.swift over relying on Package.resolved.

For library packages, clients ignore your lockfile. Encode your semver policy explicitly in Package.swift (e.g., .upToNextMinor for bugfix-only) so consumers resolve within intended bounds.


4-74: Remove unused dependencies: SherpaONNXTTS doesn’t import Alamofire, DeviceKit, Files, GRDB.swift, Pulse, swift-asn1, swift-crypto, or ZIPFoundation—remove them from Package.swift.

.gitignore (1)

52-55: Resolve LFS vs ignore conflict for XCFrameworks; also remove redundant EXTERNAL entry and keep a placeholder.

Currently, XCFrameworks are ignored here while the module’s .gitattributes attempts to store them via Git LFS—only one policy should exist. If the intent is “do not commit frameworks; build locally,” keep this ignore and add a placeholder exception; also drop the redundant EXTERNAL/sherpa-onnx entry because EXTERNAL/ is already ignored later.

Apply:

-# SherpaONNX TTS - Large binary frameworks (use setup_frameworks.sh to build)
-sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/
-EXTERNAL/sherpa-onnx/
+# SherpaONNX TTS - Large binary frameworks (use setup_frameworks.sh to build)
+sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/
+!sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/.gitkeep
sdk/runanywhere-swift/Modules/SherpaONNXTTS/.gitattributes (1)

1-4: Align LFS patterns with repo policy (ignored vs tracked binaries).

If XCFrameworks remain ignored, this LFS config is moot and confusing. If you plan to track binaries, narrow the scope to explicit frameworks to avoid sweeping other artifacts.

Option A (recommended if binaries stay ignored): remove this file entirely.

Option B (track specific frameworks via LFS):

-*.xcframework filter=lfs diff=lfs merge=lfs -text
-*.a filter=lfs diff=lfs merge=lfs -text
-XCFrameworks/** filter=lfs diff=lfs merge=lfs -text
+# Track only the shipped XCFrameworks
+XCFrameworks/SherpaONNXFramework.xcframework/** filter=lfs diff=lfs merge=lfs -text
+XCFrameworks/ONNXRuntimeFramework.xcframework/** filter=lfs diff=lfs merge=lfs -text
.github/pull_request_template.md (2)

12-13: Capitalize product/language names; minor wording polish.

Use “MacBook” and “Swift” for consistency.

-- [ ] Tested on Macbook if swift changes
-- [ ] Tested on Tablet/iPad if swift changes
+- [ ] Tested on MacBook (if Swift changes)
+- [ ] Tested on iPad/Tablet (if Swift changes)

28-29: Prompt for TTS evidence (audio) alongside UI screenshots.

Given TTS, ask submitters for short audio samples and device details.

-## Screenshots - Attach all the relevant UI changes screenshots for iOS/Android and MacOS/Tablet/large screen sizes
-- 
+## Screenshots & Media
+- Attach relevant UI screenshots for iOS/Android and macOS/iPad (large screens).
+- For TTS changes, attach short audio samples (or links) and note device model, iOS version, sample rate, and latency.
sdk/runanywhere-swift/Sources/RunAnywhere/Core/Models/Framework/FrameworkModality.swift (1)

45-46: Modality mapping for SherpaONNX is correct and matches intent.

Primary and supported modalities set to textToVoice; isVoiceFramework already covers both voice directions.

Optional: co-locate .sherpaONNX with other voice frameworks in the same switch group for readability.

-    case .whisperKit, .openAIWhisper:
-        return .voiceToText
-    case .sherpaONNX:
-        return .textToVoice
+    case .whisperKit, .openAIWhisper:
+        return .voiceToText
+    case .sherpaONNX:
+        return .textToVoice

Also applies to: 68-69

sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitService.swift (2)

191-196: Remove per-call model enumeration; it’s expensive and noisy.

Fetching available models on every transcription adds latency and log spam. Gate behind DEBUG and/or log once at init.

Apply this diff:

-        do {
-            let availableModels = try await WhisperKit.fetchAvailableModels()
-            logger.info("  Available models: \(availableModels)")
-        } catch {
-            logger.info("  Could not fetch available models: \(error)")
-        }
+        #if DEBUG
+        do {
+            let availableModels = try await WhisperKit.fetchAvailableModels()
+            logger.debug("Available models: \(availableModels)")
+        } catch {
+            logger.debug("Could not fetch available models: \(error)")
+        }
+        #endif

128-136: Padding with random noise makes outputs nondeterministic. Consider deterministic dither or zeros.

Random noise complicates testing and reproducibility.

Replace with a fixed very low-amplitude dither (e.g., a repeating sequence) or zeros. I can provide a deterministic generator if desired.

sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitAdapter.swift (1)

35-57: Adapter caching flow looks good; small resilience suggestion.

Looks solid. Consider marking cleanupStaleCache as @MainActor (or move cache state into an actor) to avoid races if adapters are used across threads.

sdk/runanywhere-swift/Modules/LLMSwift/Package.resolved (2)

76-82: Pre-release swift-syntax pin may require a newer toolchain than 5.9.

Pinned to 602.0.0-prerelease-2025-08-11; with tools 5.9 this could fail. Confirm your CI/Xcode version, or pin LLM.swift to a revision that resolves to a Swift 5.9-compatible swift-syntax.

Run CI with Xcode showing swiftc -version, or update the dependency pin accordingly. I can help pick a compatible rev.


1-95: Consider not committing Package.resolved for library-style modules.

Package.resolved is best for apps; libraries should allow clients to resolve. Keeping it may force consumers onto your exact graph.

sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitDownloadStrategy.swift (3)

57-70: Base URL derivation: handle non-/resolve/main/ URLs more robustly.

If the provided URL points directly to a file blob or a different branch/tag, current logic falls back to a fixed repo. Consider parsing owner/repo/path and preserving branch/tag if present.


100-105: Creating analytics/weights subdirs unconditionally.

Not harmful, but only weights/ exists in your file lists; consider creating subdirs lazily per needed file path.


216-233: mapToHuggingFacePath(): unconditional dropLast() can mis-map IDs without a hash suffix.

If modelId has no trailing hash, dropLast removes a real token. Only drop when a suffix matches a known hash pattern.

I can push a regex-based variant if helpful.
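Roughly, the variant would only strip a trailing component when it looks like a hex hash (heuristic thresholds are assumptions):

```swift
import Foundation

/// Strip a trailing "-<hash>" component only when it matches a hex-hash
/// pattern, so model IDs without a hash suffix are left intact.
func stripHashSuffix(from modelId: String) -> String {
    let parts = modelId.split(separator: "-").map(String.init)
    guard let last = parts.last,
          last.count >= 6,
          last.allSatisfy({ $0.isHexDigit }),
          last.contains(where: { $0.isNumber }) // avoid eating hex-only words like "face"
    else { return modelId }
    return parts.dropLast().joined(separator: "-")
}
```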

sdk/runanywhere-swift/Modules/SherpaONNXTTS/TEAM_WORKFLOW.md (3)

160-169: Cache key should include the sherpa-onnx ref for deterministic CI.

Hashing only setup_frameworks.sh risks stale caches when upstream changes. Include a pinned tag/commit (or env) in the key.

Add to the example:

env:
  SHERPA_ONNX_REF: vX.Y.Z  # or a commit SHA

- uses: actions/cache@v3
  with:
    path: EXTERNAL/sherpa-onnx
-   key: sherpa-onnx-${{ hashFiles('**/setup_frameworks.sh') }}
+   key: sherpa-onnx-${{ env.SHERPA_ONNX_REF }}-${{ hashFiles('**/setup_frameworks.sh') }}

106-120: Strengthen the Git LFS guidance for existing binaries.

If binaries were ever committed without LFS, devs will need migration to avoid bloating history.

Augment with:

# For repos that previously committed binaries:
git lfs migrate import --include="*.xcframework,*.a"

211-215: Make “pin to specific sherpa-onnx commit/tag” actionable.

Add an explicit example of how to set and propagate the ref used by scripts and CI to avoid accidental upgrades.

Suggested addition:

# In setup/build scripts
: "${SHERPA_ONNX_REF:=vX.Y.Z}"
git fetch --tags
git checkout "$SHERPA_ONNX_REF"
thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md (5)

41-45: Remove or use the sampleRate initializer param.

Sherpa-ONNX exposes sample rate from the engine; passing it in here is misleading unless it configures resampling. Either wire it to config or drop it.


98-111: Verify model-type fields against the actual C API (likely mismatches).

“kitten” looks invalid; common TTS configs are vits/kokoro/etc. Field names like config.model.kitten.* may not exist.

I can align this section to the current C API once you confirm the targeted sherpa-onnx version/tag.


168-176: Import the correct module name in Swift.

import SherpaONNXFramework may be incorrect if the module map exports SherpaONNXBridge. Import should match the module.modulemap “module” name.


333-346: Avoid Data→Array→Data copies for volume; use Accelerate.

For large buffers this double copy is costly; vDSP scales in-place efficiently.

Example:

import Accelerate

private func applyVolume(to audioData: Data, volume: Float) -> Data {
    guard volume != 1.0 else { return audioData }
    var out = Data(count: audioData.count)
    audioData.withUnsafeBytes { inBuf in
        out.withUnsafeMutableBytes { outBuf in
            let n = audioData.count / MemoryLayout<Float>.size
            vDSP_vsmul(inBuf.bindMemory(to: Float.self).baseAddress!, 1,
                       [volume],
                       outBuf.bindMemory(to: Float.self).baseAddress!, 1,
                       vDSP_Length(n))
        }
    }
    return out
}

428-434: module.modulemap: consider ‘explicit’ module and header placement.

Mark the module explicit and ensure the header path matches the packaged layout to avoid ambiguous imports when combined with other ObjC++ modules.

-module SherpaONNXBridge {
+explicit module SherpaONNXBridge {
     header "SherpaONNXBridge.h"
     export *
 }
sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftTemplateResolver.swift (1)

15-21: Minor: avoid repeated string scanning.

Cache lowercased filename once (already done) and consider a lookup table or ordered rules to simplify maintenance.
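One possible shape for the ordered-rules table (template names and needles are illustrative, not the module's actual cases):

```swift
/// First-match-wins rule table: resolution logic lives in one easily
/// extended list instead of chained if/else string scans.
enum ChatTemplate { case chatML, llama3, gemma, plain }

let templateRules: [(needle: String, template: ChatTemplate)] = [
    ("qwen", .chatML),
    ("llama-3", .llama3),
    ("gemma", .gemma)
]

func resolveTemplate(filename: String) -> ChatTemplate {
    let lower = filename.lowercased() // lowercase once, then scan in order
    return templateRules.first { lower.contains($0.needle) }?.template ?? .plain
}
```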

sdk/runanywhere-swift/Modules/WhisperKitTranscription/README.md (2)

45-51: Avoid static line-number references in docs.

“Garbled output detection (lines 435-477)” will drift. Describe the behavior, not the line range, or link to a symbol.


89-91: Confirm and pin dependency/version constraints.

Verify the minimum OS versions and “WhisperKit 0.10.2+” are accurate for this PR branch; consider pinning an exact tag in Package.swift examples.

sdk/runanywhere-swift/Modules/SherpaONNXTTS/build_frameworks.sh (1)

49-57: Pre-flight dependency checks improve UX.

Fail early if required tools are missing (git, cmake, xcodebuild).

command -v git >/dev/null || { echo -e "${RED}❌ git not found${NC}"; exit 1; }
command -v cmake >/dev/null || { echo -e "${RED}❌ cmake not found${NC}"; exit 1; }
xcodebuild -version >/dev/null 2>&1 || { echo -e "${RED}❌ Xcode CLTs not found${NC}"; exit 1; }
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Package.resolved (1)

1-121: Remove leaf-module Package.resolved files
Detected Package.resolved in module folders under sdk/runanywhere-swift/Modules (SherpaONNXTTS, FluidAudioDiarization, LLMSwift, WhisperKitTranscription). Retain only the root sdk/runanywhere-swift/Package.resolved to reduce churn and avoid lockfile conflicts.

sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Models/SherpaONNXModelManager.swift (3)

25-100: Cache model definitions and pin URLs by revision.

  • Avoid rebuilding arrays repeatedly; keep a cached list.
  • “resolve/main” is mutable; prefer immutable, revision-pinned URLs for reproducible downloads.
-    private func createModelDefinitions() -> [ModelInfo] {
-        return [
+    private lazy var modelsCache: [ModelInfo] = {
+        [
             // Kitten ...
             ModelInfo(
-                downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/model.onnx"),
+                downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/<REV>/model.onnx"),
                 ...
                 alternativeDownloadURLs: [
-                    URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/voices.json"),
+                    URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/<REV>/voices.json"),
                     ...
                 ].compactMap { $0 }
             ),
             // Repeat for other models...
-        ]
-    }
+        ]
+    }()

Also: if these files are required assets (not alternates), align naming with your DownloadStrategy (e.g., “additionalFiles”) to avoid misinterpretation.


102-105: Don’t regenerate to search; use the cache or registry.

Rebuilding the array for each lookup is wasteful. Use modelsCache.

-    func getModel(by id: String) -> ModelInfo? {
-        return createModelDefinitions().first { $0.id == id }
-    }
+    func getModel(by id: String) -> ModelInfo? {
+        return modelsCache.first { $0.id == id }
+    }

107-114: Implement a basic device-aware selector.

Small heuristic beats a hardcoded ID and prevents oversized models on constrained devices.

-    func selectOptimalModel() -> String {
-        // TODO: Implement device capability detection
-        // Consider available memory, CPU performance, etc.
-        // For now, return the smallest model
-        return "sherpa-kitten-nano-v0.1"
-    }
+    func selectOptimalModel() -> String {
+        let mem = ProcessInfo.processInfo.physicalMemory
+        if mem >= 3_000_000_000 { return "sherpa-kokoro-en-v0.19" }
+        return "sherpa-kitten-nano-v0.1"
+    }
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXBridge/SherpaONNXBridge.h (1)

59-61: Prefer invalidate and make it idempotent.

Name conveys lifecycle intent better than destroy; ensure multiple calls are safe.

-- (void)destroy;
+- (void)invalidate;

Update implementation accordingly.

thoughts/shared/plans/sherpa_onnx_tts_complete_plan.md (1)

434-443: Turn performance targets into CI checks.

Add simple benchmarks or smoke tests to fail PRs when RTF/memory regress beyond thresholds.

sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleCore.swift (3)

63-74: De-dupe model registration and avoid silent failures.

Guard duplicate IDs or let RegistryService upsert; also prefer structured logging over print.

-        guard let registry = serviceContainer.modelRegistry as? RegistryService else {
-            print("[RunAnywhereSDK] Failed to register module models: Registry service not available")
+        guard let registry = serviceContainer.modelRegistry as? RegistryService else {
+            // TODO: Inject logger; avoid print in public API.
             return
         }
-        for model in models {
-            registry.registerModel(model)
-        }
+        let unique = Dictionary(grouping: models, by: { $0.id }).compactMap { $0.value.first }
+        unique.forEach { registry.registerModel($0) }

76-85: Async not needed here.

Method is synchronous; consider dropping async to avoid misleading callers.

-    public func getModelLocalPath(for modelId: String) async -> URL? {
+    public func getModelLocalPath(for modelId: String) -> URL? {
         guard let model = serviceContainer.modelRegistry.getModel(by: modelId) else {
             return nil
         }
         return model.localPath
     }

37-45: Avoid duplicating cache-clearing logic.

Delegate to the file manager’s clearModuleCache to keep one source of truth.

-    public func clearModuleCache(moduleId: String) throws {
-        let baseFolder = serviceContainer.fileManager.getBaseFolder()
-        if let cacheFolder = try? baseFolder.subfolder(named: "Cache"),
-           let moduleFolder = try? cacheFolder.subfolder(named: moduleId) {
-            try moduleFolder.delete()
-        }
-    }
+    public func clearModuleCache(moduleId: String) throws {
+        try serviceContainer.fileManager.clearModuleCache(moduleId)
+    }
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXBridge/SherpaONNXBridge.mm (4)

24-26: Remove unused headers

<vector> and <string> aren’t used.

-#include <vector>
-#include <string>

156-163: Clamp speed and harden speaker validation

Avoid out-of-range values reaching the C API; clamp speed and guard zero speakers.

     // Validate speaker ID
-    if (speakerId < 0 || speakerId >= _numSpeakers) {
+    if (_numSpeakers <= 0 || speakerId < 0 || speakerId >= _numSpeakers) {
         NSLog(@"[SherpaONNXBridge] Invalid speaker ID: %ld (max: %d)",
               (long)speakerId, _numSpeakers - 1);
         speakerId = 0; // Default to first speaker
     }
 
+    // Clamp speed to a sane range [0.25, 4.0]
+    float clampedSpeed = fmaxf(0.25f, fminf(speed, 4.0f));
+
     // Generate audio
     const SherpaOnnxGeneratedAudio *audio = SherpaOnnxOfflineTtsGenerate(
         tts,
         [text UTF8String],
         (int32_t)speakerId,
-        speed
+        clampedSpeed
     );

Also applies to: 165-171


181-183: Use size_t for byte count (overflow-safe)

Avoid implicit signed-to-unsigned conversion and potential overflow on large buffers.

-    NSData *audioData = [NSData dataWithBytes:audio->samples
-                                        length:audio->n * sizeof(float)];
+    size_t byteCount = (size_t)audio->n * sizeof(float);
+    NSData *audioData = [NSData dataWithBytes:audio->samples length:byteCount];

276-283: Reset cached properties on destroy

Minor hygiene: reset _sampleRate/_numSpeakers after destroying tts.

     if (tts) {
         SherpaOnnxDestroyOfflineTts(tts);
         tts = nullptr;
     }
+    _sampleRate = 0;
+    _numSpeakers = 0;
examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift (1)

10-11: Conditionally import optional modules

Prevents build issues when these modules aren’t present in some configurations (e.g., CI, non-target platforms).

-import LLMSwift
-import WhisperKitTranscription
+#if canImport(LLMSwift)
+import LLMSwift
+#endif
+#if canImport(WhisperKitTranscription)
+import WhisperKitTranscription
+#endif
examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved (1)

70-74: LLM.swift pins are consistent across the repo
Both the example and SDK Package.resolved files reference https://github.com/eastriverlee/LLM.swift at revision 4c4e909ac4758c628c9cd263a0c25b6edff5526d.
Optional: pin LLM.swift to a semantic version tag in your Package.swift manifest to prevent drift.

sdk/runanywhere-swift/Modules/WhisperKitTranscription/Package.swift (1)

6-11: Platform matrix OK; consider documenting why iOS 16+/macOS 13+ are required

Matches WhisperKit’s requirements. Add a brief comment to prevent regressions.

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift (1)

66-66: Use a fallback label when TTS model name is empty.

Mirror the LLM badge behavior to avoid showing a blank value before the view model is ready.

-ModelBadge(icon: "speaker.wave.2", label: "TTS", value: viewModel.ttsModel, color: .purple)
+ModelBadge(icon: "speaker.wave.2", label: "TTS", value: viewModel.ttsModel.isEmpty ? "Loading..." : viewModel.ttsModel, color: .purple)

Also applies to: 268-268

sdk/runanywhere-swift/Sources/RunAnywhere/Capabilities/Compatibility/Services/FrameworkRecommender.swift (1)

161-163: Sherpa-ONNX scoring hooks added—LGTM.

Scores are consistent with adjacent framework ranges. Consider a minor bonus in calculateFormatScore for (.sherpaONNX, .onnx) to reflect native format preference, if you find selection too neutral across ONNX-capable frameworks.

Also applies to: 195-197, 228-230, 261-263

sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/module.modulemap (1)

1-4: Consider declaring language requirements for the bridge.

If the bridge header pulls ObjC/C++ (likely given .mm usage), declaring requirements reduces miscompilation risks across toolchains.

 module SherpaONNXTTSBridge {
+    requires objc, cplusplus
     header "Internal/Bridge/SherpaONNXBridge.h"
     export *
 }

If the header is pure C with extern "C" guards, this change is optional; otherwise it helps ensure correct compilation modes.

examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantViewModel.swift (3)

22-22: Keep ttsModel in sync with actual TTS config

The UI-facing ttsModel is hardcoded and never updated when you build the pipeline. Set it based on the selected provider/voice to avoid drift.

Set it after you construct the config (see the next comment’s diff) or alongside the selected voice/model, e.g.:

self.ttsModel = "SherpaONNX • expr-voice-2-f"

142-154: Avoid hardcoding TTS model/voice; expose configuration and clamp rate

Hardcoded IDs will break on devices without those assets and make A/B testing hard. Surface these as inputs or settings, and clamp rate to valid provider bounds to prevent undefined behavior.

Suggested refactor in-place:

-        let config = ModularPipelineConfig(
+        let selectedModelId = "sherpa-kitten-nano-v0.1" // TODO: load from settings/user selection
+        let selectedVoice = "expr-voice-2-f"            // TODO: load from settings/user selection
+        let selectedRate: Float = max(0.5, min(2.0, 1.0)) // clamp [0.5, 2.0] (verify provider bounds)
+        let config = ModularPipelineConfig(
             components: [.vad, .stt, .llm, .tts],
             vad: VADConfig(),
             stt: VoiceSTTConfig(modelId: whisperModelName),
             llm: VoiceLLMConfig(modelId: "default", systemPrompt: "You are a helpful voice assistant. Keep responses concise and conversational."),
-            tts: VoiceTTSConfig.sherpaONNX(
-                modelId: "sherpa-kitten-nano-v0.1",
-                voice: "expr-voice-2-f",
-                rate: 1.0
-            )
+            tts: VoiceTTSConfig.sherpaONNX(
+                modelId: selectedModelId,
+                voice: selectedVoice,
+                rate: selectedRate
+            )
         )
+        self.ttsModel = "SherpaONNX • \(selectedVoice)"

Please verify the acceptable rate range for Sherpa-ONNX on iOS and whether model/voice IDs match the download strategy. If desired, I can wire this to a settings store.


83-83: Prefer a typed notification name

Avoid stringly-typed "ModelLoaded". Define extension Notification.Name { static let modelLoaded = Notification.Name("ModelLoaded") } and use .modelLoaded for safety and discoverability.
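Spelled out, the extension described above:

```swift
import Foundation

extension Notification.Name {
    /// Typed replacement for the stringly-typed "ModelLoaded" name.
    static let modelLoaded = Notification.Name("ModelLoaded")
}

// Post and observe without repeating the raw string:
// NotificationCenter.default.post(name: .modelLoaded, object: nil)
```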

sdk/runanywhere-swift/Modules/SherpaONNXTTS/NEXT_STEPS.md (2)

133-138: Reality-check performance targets per device class

Targets like “TTFT <100ms” and “RTF >10x” may vary widely by model/device. Consider stating them as goals and adding a simple benchmarking harness (os_signpost + metrics) to validate.


127-130: Avoid committing large frameworks; prefer CI artifact caching

Beyond Git LFS, consider excluding XCFrameworks/ from VCS and producing them via CI with cached artifacts to keep the repo lean.

sdk/runanywhere-swift/Modules/LLMSwift/README.md (3)

41-51: Include adapter registration context and order

Mention this should be called early (e.g., app launch) before any model loads to avoid fallback adapters.


95-115: Document cancellation and backpressure behavior

Add a note on whether generation calls are cancelable, thread-safe, and how many concurrent generations the service supports.


165-171: Clarify defaults are examples, not guarantees

Context length, history limit, timeout, and memory estimation may vary by model/hardware. Rephrase as “defaults (configurable)” and link to the knobs.

sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/VoiceError.swift (2)

3-10: Consider Sendable/I18N and richer context

If VoiceError crosses concurrency boundaries, adopt Sendable (or document it doesn’t). Also consider localizable strings and attaching context (e.g., sample rate/channels) to unsupportedAudioFormat.

Example:

-public enum VoiceError: LocalizedError {
+public enum VoiceError: LocalizedError {
     case serviceNotInitialized
     case modelNotFound(String)
     case transcriptionFailed(Error)
     case insufficientMemory
-    case unsupportedAudioFormat
+    case unsupportedAudioFormat(expectedHz: Int, expectedChannels: Int)

11-24: Expose failureReason/recoverySuggestion for user guidance

Add failureReason/recoverySuggestion to improve UX messages (e.g., suggest closing background apps on low memory).

sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftError.swift (2)

3-9: Preserve underlying cause for generation failures

Carrying only a String drops root-cause details. Include an optional underlying Error to aid debugging and telemetry. If concurrency requires it later, consider documenting Sendable constraints.

-public enum LLMSwiftError: LocalizedError {
+public enum LLMSwiftError: LocalizedError {
     case modelLoadFailed
     case initializationFailed
-    case generationFailed(String)
+    case generationFailed(String, underlying: Error? = nil)
     case templateResolutionFailed(String)

10-21: Include underlying error in description (when available)

Small improvement to surface details in logs while keeping messages user-friendly.

-        case .generationFailed(let reason):
-            return "Generation failed: \(reason)"
+        case .generationFailed(let reason, let underlying):
+            let detail = underlying.map { " (\($0.localizedDescription))" } ?? ""
+            return "Generation failed: \(reason)\(detail)"
sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md (4)

35-47: Add language to fenced code block (markdownlint MD040).

Specify a language for the "Directory Layout" fence to satisfy linters and improve rendering.

-```
+```text
 Modules/YourModule/
 ├── Package.swift              # SPM package definition
 ...

100-111: Avoid nil URL in example code.

URL(string:) can return nil; tighten the example to a guaranteed URL to reduce copy/paste footguns.

-            downloadURL: URL(string: "https://example.com/model.bin"),
+            downloadURL: URL(string: "https://example.com/model.bin")!, // safe in docs, or show:
+            // guard let url = URL(string: "https://example.com/model.bin") else { return }

151-153: Avoid top-level registration calls in library targets.

Top-level code (registerModuleDownloadStrategy) in SPM libraries can execute at import time and is discouraged. Show it inside init() or initialize().

-// Register strategy
-sdk.registerModuleDownloadStrategy(YourDownloadStrategy())
+// Register strategy during service setup (e.g., in init or initialize)
+sdk.registerModuleDownloadStrategy(YourDownloadStrategy())

251-273: Replace emphasis-as-heading (MD036) with proper headings.

Conform to markdownlint and improve skimmability.

-**Option A: Async Init (FluidAudioDiarization style)**
+#### Option A: Async Init (FluidAudioDiarization style)
 ...
-**Option B: Two-Phase (SherpaONNXTTS style)**
+#### Option B: Two-Phase (SherpaONNXTTS style)
sdk/runanywhere-swift/Modules/SherpaONNXTTS/README.md (3)

162-170: Add language to the architecture tree code fence (MD040).

Specify a language (text) for better rendering and to satisfy linters.

-```
+```text
 SherpaONNXTTS/
 ├── Sources/
 ...

49-72: Clarify audio format returned by synthesize.

Document PCM format (e.g., 16‑bit little‑endian, mono, sample rate) or provide an AudioBuffer/AVAudioPCMBuffer return type to reduce integration ambiguity.

Would you like a snippet showing returning AVAudioPCMBuffer and an example AVAudioEngine player?

84-97: Streaming usage: show cancellation/backpressure handling.

Add an example of cancelling the stream and noting per-chunk size to guide implementers integrating with audio queues.

sdk/runanywhere-swift/Sources/RunAnywhere/Capabilities/Voice/Services/VoiceCapabilityService.swift (1)

156-166: Log provider and modelId for better diagnosis.

Include provider/modelId in the debug log to trace selection decisions.

-        logger.debug("Finding TTS service")
+        logger.debug("Finding TTS service (provider=\(ttsConfig?.provider.rawValue ?? "nil"), modelId=\(ttsConfig?.modelId ?? "nil"))")
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Models/Voice/VoiceTTSConfig.swift (1)

33-47: Consider input validation helpers.

Optional: add static clamps for rate/pitch/volume (e.g., 0.5–2.0, 0–1) to keep configs sane across providers.
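An illustrative sketch of such clamps as a standalone helper (the 0.5–2.0 and 0–1 ranges are assumptions to be checked against each provider's documented bounds):

```swift
/// Hypothetical clamp helpers for TTS config values; adjust ranges per provider.
enum TTSBounds {
    static func clampRate(_ rate: Float) -> Float { min(max(rate, 0.5), 2.0) }
    static func clampPitch(_ pitch: Float) -> Float { min(max(pitch, 0.5), 2.0) }
    static func clampVolume(_ volume: Float) -> Float { min(max(volume, 0.0), 1.0) }
}
```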

sdk/runanywhere-swift/Modules/SherpaONNXTTS/setup_frameworks.sh (4)

40-48: Remove $? check; rely on pipefail.

With pipefail, a failing curl|tar will exit non‑zero. Simplify and handle errors uniformly.

-    curl -L "$DOWNLOAD_URL" | tar -xz -C "$XCFRAMEWORKS_DIR"
-
-    if [ $? -eq 0 ]; then
-        echo "✅ Successfully downloaded pre-built frameworks!"
-        exit 0
-    else
-        echo "❌ Download failed. Falling back to local build..."
-    fi
+    if curl -L "$DOWNLOAD_URL" | tar -xz -C "$XCFRAMEWORKS_DIR"; then
+        echo "✅ Successfully downloaded pre-built frameworks!"
+        exit 0
+    fi
+    echo "❌ Download failed. Falling back to local build..."

75-77: Branch name safety.

Upstream default may be main; use the default remote branch to avoid failures on repos that renamed master→main.

-    git pull origin master
+    git pull --ff-only origin "$(git remote show origin | awk '/HEAD branch/ {print $NF}')"

91-93: Avoid hard-coding onnxruntime path/version.

The sherpa-onnx build layout/version can change. Prefer a find-based copy with validation.

-cp -R "build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework" "$XCFRAMEWORKS_DIR/"
+ONNXRT_SRC="$(find build-ios -type d -name 'onnxruntime.xcframework' | head -n1)"
+cp -R "$ONNXRT_SRC" "$XCFRAMEWORKS_DIR/"

95-110: Add checksum print to aid cache debugging.

Printing shas helps teams verify identical artifacts.

     echo "✅ Framework setup completed successfully!"
 
     # Show framework sizes
     echo "📊 Framework sizes:"
     du -sh "$XCFRAMEWORKS_DIR"/*
 
+    echo "🔐 Checksums:"
+    (cd "$XCFRAMEWORKS_DIR" && shasum -a 256 -b sherpa-onnx.xcframework/Info.plist onnxruntime.xcframework/Info.plist)
+
     echo ""
     echo "🎉 SherpaONNX TTS is ready to use!"
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Tests/SherpaONNXTTSTests/SherpaONNXTTSTests.swift (3)

55-70: Avoid hard-coding model IDs in tests

Tie optimal-model assertions to registered content rather than a literal string to reduce brittleness across future catalog changes.

         let kittenModel = manager.getModel(by: "sherpa-kitten-nano-v0.1")
         XCTAssertNotNil(kittenModel)
         XCTAssertEqual(kittenModel?.id, "sherpa-kitten-nano-v0.1")
@@
         let optimalModel = manager.selectOptimalModel()
-        XCTAssertEqual(optimalModel, "sherpa-kitten-nano-v0.1")
+        XCTAssertEqual(optimalModel, kittenModel?.id)

27-35: Cover all cases via CaseIterable to catch future enum additions

Iterate SherpaONNXModelType.allCases to ensure new cases get tested automatically.

-        let modelTypes: [SherpaONNXModelType] = [.kitten, .kokoro, .vits, .matcha, .piper]
+        let modelTypes = SherpaONNXModelType.allCases

74-82: Skip async init when frameworks are absent

Proactively skip rather than implicitly succeed without initialize(); makes intent explicit.

     func testServiceInitializationAsync() async throws {
         // This test would require XCFrameworks to be built
-        // For now, we just test that the service can be created
+        try XCTSkipIf(true, "Sherpa-ONNX XCFrameworks not available in CI yet")
         let service = SherpaONNXTTSService()
         XCTAssertNotNil(service)
 
         // Initialization would fail without frameworks
         // So we don't call initialize() in this test
     }
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+LLMModules.swift (1)

80-103: Heuristic selection should be case-insensitive and consider file extensions

Model IDs may vary in case; checking by extension is more robust.

-        if modelId.contains("mlx") && sdk.isMLXAvailable {
+        if modelId.lowercased().contains("mlx") && sdk.isMLXAvailable {
             return await sdk.createModuleLLMService(.mlx)
         }
 
-        if modelId.contains("gguf") && sdk.isLLMSwiftAvailable {
+        if modelId.lowercased().contains("gguf") && sdk.isLLMSwiftAvailable {
             return await sdk.createModuleLLMService(.llmSwift)
         }
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Models/SherpaONNXDownloadStrategy.swift (1)

89-96: Robustness: verify required files in addition to a marker

A simple marker can become stale; optionally validate expected filenames exist.
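A sketch of such a check (filenames in the usage comment are illustrative):

```swift
import Foundation

/// In addition to the marker file, confirm the expected model files exist on disk.
func isModelComplete(at directory: URL, requiredFiles: [String]) -> Bool {
    let fm = FileManager.default
    return requiredFiles.allSatisfy { name in
        fm.fileExists(atPath: directory.appendingPathComponent(name).path)
    }
}

// isModelComplete(at: modelDir, requiredFiles: ["model.onnx", "tokens.txt"])
```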

sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftService.swift (5)

57-66: Duplicate readiness guard

Second guard duplicates the nil-check right above; remove it for clarity.

-            // Validate model readiness with a simple test prompt
-            logger.info("🧪 Validating model readiness with test prompt")
-            guard let llm = self.llm else {
-                throw FrameworkError(
-                    framework: .llamaCpp,
-                    underlying: LLMSwiftError.modelLoadFailed,
-                    context: "Failed to initialize LLM.swift with model at \(modelPath)"
-                )
-            }
+            // Validate model readiness with a simple test prompt
+            logger.info("🧪 Validating model readiness with test prompt (skipped)")

129-137: Don’t log full prompts in production logs

Reduce risk of PII leakage; log a bounded prefix.

-            logger.info("📝 Full prompt being sent to LLM:")
-            logger.info("---START PROMPT---")
-            logger.info("\(fullPrompt)")
-            logger.info("---END PROMPT---")
+            logger.info("📝 Prompt (prefix 512 chars): \(fullPrompt.prefix(512))")

333-337: Implement generation options or document placeholders

applyGenerationOptions currently does nothing; either wire supported parameters or mark unsupported explicitly.


318-322: Context memory estimate: tie to actual context length

If max context differs from 2048, adjust estimate accordingly.


1-5: Prefer import os over import os.log with Logger

Minor consistency nit.

-import os.log
+import os
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Public/SherpaONNXConfiguration.swift (1)

65-79: Consider using ByteCount for clarity of memory sizes

estimatedMemoryUsage is an Int; consider documenting units or using a typealias for bytes.

sdk/runanywhere-swift/Modules/SherpaONNXTTS/BUILD_DOCUMENTATION.md (2)

86-92: Avoid hardcoding ONNX Runtime versioned paths in manual copy instructions

The docs pin onnxruntime.xcframework under .../ios-onnxruntime/1.17.1/..., while later steps use a wildcard. This is brittle and will break on version bumps.

Suggested edit:

- cp -R /path/to/sherpa-onnx/build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework XCFrameworks/
+ cp -R /path/to/sherpa-onnx/build-ios/ios-onnxruntime/*/onnxruntime.xcframework XCFrameworks/

25-26: Clarify actual platform support vs. bridge availability

You advertise multi-platform (iOS, macOS, tvOS, watchOS), but the bridge and binary frameworks are gated only for iOS/macOS in code. Call this out explicitly to prevent confusion for tvOS/watchOS integrators.

Would you like me to PR a short “Platform Support” section noting that tvOS/watchOS builds are currently not supported by the Sherpa bridge and binaries?

Also applies to: 123-128

sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Bridge/SherpaONNXWrapper.swift (2)

100-151: Streaming implementation chunks raw bytes with float-size assumptions

Chunking uses MemoryLayout<Float>.size and a fixed 16kHz without confirming the actual sample format/rate returned by the bridge. This risks producing malformed chunks and drift.

  • Derive chunk size from sampleRate() and the bridge’s actual sample format (e.g., f32 vs s16), or expose metadata from the bridge.
  • If the bridge returns raw PCM, document the format and consider emitting WAV-framed chunks for consumers that expect containerized audio.

Also applies to: 131-144


66-98: Pitch and volume arguments ignored

You accept pitch and volume but do not pass them to the bridge. If unsupported, document this and consider applying gain/pitch-shift client-side or dropping the parameters from this API.

sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+VoiceModules.swift (1)

37-39: Prefer unified logging over print

Use the SDK’s logging mechanism (os.Logger or a shared logger) instead of print for consistency and to avoid noisy production logs.

Also applies to: 75-77, 128-138, 165-177

sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Public/SherpaONNXTTSService.swift (2)

124-151: Initialize path: swallow errors and race on voice set

You set currentVoice by queuing setVoice in a fire-and-forget Task and ignore errors. If this fails, the service state is inconsistent.

Either make currentVoice setter async or document that it’s best-effort and log any failure explicitly.


6-9: Optional: Make service discoverable via Objective‑C if you keep reflection

If you choose reflection over a registry, mark the class @objc(SherpaONNXTTSService) and inherit from NSObject to allow NSClassFromString discovery, and expose an init() or static factory the loader can call.

I can generate a small factory shim if you want to keep runtime discovery without direct imports.

Also applies to: 376-381, 413-425

examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj (2)

245-253: Confirm no duplication between root package and per-module packages.

You reference "../../../sdk/runanywhere-swift" and the individual module packages. Ensure the root package does not also expose LLMSwift/WhisperKitTranscription/SherpaONNXTTS to avoid duplicate targets.


705-716: Stabilize local package paths.

Relative paths are fragile if the project is moved. Prefer anchoring paths to $(SRCROOT) via workspace-level SPM or consolidating under a single aggregator Package.

sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift (2)

182-187: Return a stable, deterministic ordering.

Makes UI/debug output predictable.

-    public func getAllModules() -> [ModuleConfiguration] {
-        queue.sync {
-            Array(registeredModules.values)
-        }
-    }
+    public func getAllModules() -> [ModuleConfiguration] {
+        queue.sync {
+            registeredModules.values.sorted { $0.moduleId < $1.moduleId }
+        }
+    }

70-77: Mark update check as TODO or surface “not implemented.”

Stub returning false can hide real update availability.

Comment on lines 9 to 15
/* Begin PBXBuildFile section */
541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };

🛠️ Refactor suggestion

Delete PBXBuildFile for LLM to complete the migration.

Leaving the PBXBuildFile will keep stale references.

-		543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/* Begin PBXBuildFile section */
541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };
/* Begin PBXBuildFile section */
541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };
🤖 Prompt for AI Agents
In examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj around
lines 9 to 15, the PBXBuildFile entry for "LLM in Frameworks" remains and must
be removed to avoid stale references; delete the entire line
"543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile;
productRef = 543028452E442716008361DD /* LLM */; };" and ensure any commas or
separators in the surrounding PBXBuildFile section remain syntactically correct
(adjust trailing/leading commas or line endings) so the project.pbxproj stays
valid.

Comment on lines +170 to +187
// Sherpa Kitten TTS Nano - Smallest, fastest TTS model (fixed URL)
ModelInfo(
id: "sherpa-kitten-nano-v0.1",
name: "Kitten TTS Nano (Sherpa-ONNX)",
format: .onnx,
downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/kitten_tts_nano_v0_1.onnx"),
estimatedMemory: 50_000_000, // 50MB
contextLength: 0, // Not applicable for TTS models
downloadSize: 25_000_000, // ~25MB
compatibleFrameworks: [.sherpaONNX],
preferredFramework: .sherpaONNX,
metadata: ModelInfoMetadata(
tags: ["tts", "sherpa-onnx", "lightweight"],
description: "Lightweight neural TTS with 8 expressive voices"
),
supportsThinking: false
),


💡 Verification agent

🧩 Analysis chain

ONNX TTS entries look correct; verify URLs and registry uniqueness.

Both models are well-specified. Please verify:

  • URLs resolve and sizes roughly match downloadSize.
  • Model IDs are unique across predefinedModels.
  • Sherpa-ONNX path requires no auxiliary assets (e.g., speaker metadata) for these two models.

Run this script to check reachability/size and duplicate IDs:

Also applies to: 188-205


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Validate the two ONNX URLs resolve and report a Content-Length
for url in \
  "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/kitten_tts_nano_v0_1.onnx" \
  "https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx"
do
  echo "Checking: $url"
  code=$(curl -sIL -o /dev/null -w '%{http_code}' "$url")
  size=$(curl -sIL "$url" | rg -i '^content-length:' | awk '{print $2}' | tr -d '\r')
  echo "HTTP $code, Content-Length: ${size:-unknown}"
done

# 2) Detect duplicate model IDs in ModelListViewModel.swift
file="examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift"
echo "Scanning for duplicate IDs in $file"
rg -nP '\bid:\s*"([^"]+)"' "$file" | sed -E 's/.*id:\s*"([^"]+)".*/\1/' | sort | uniq -d | awk 'NF{print "Duplicate ID: "$0}' || true

Length of output: 2295


🏁 Script executed:

#!/bin/bash
sed -n '185,220p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift

Length of output: 1645


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Kitten siblings non-onnx:"
curl -s https://huggingface.co/api/models/KittenML/kitten-tts-nano-0.1 | jq -r '.siblings[].rfilename' | grep -v '\.onnx$' || echo "None"

echo "Piper siblings in en/en_US/lessac/medium non-onnx:"
curl -s https://huggingface.co/api/models/rhasspy/piper-voices | jq -r '.siblings[].rfilename' | grep '^en/en_US/lessac/medium/' | grep -v '\.onnx$' || echo "None"

Length of output: 678


Include auxiliary assets for ONNX TTS models
URLs resolve (Kitten ~23.8 MB; Piper ~63.2 MB) and model IDs are unique. Both repos expose non-ONNX files (Kitten: voices.npz; Piper: en_US-lessac-medium.onnx.json) that may be required at runtime. Confirm whether they are needed and extend the model definitions to download, or at least document, these assets.

🤖 Prompt for AI Agents
In
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift
around lines 170 to 187, the ONNX TTS model entry for "sherpa-kitten-nano-v0.1"
omits auxiliary assets (e.g., voices.npz) that the repo exposes and may be
required at runtime; update the ModelInfo for Kitten (and any Piper entries) to
either include these auxiliary asset URLs in a new metadata.assets or
downloadURLs array and mark them as required, or add explicit documentation
fields noting which auxiliary files are needed and where to fetch them; ensure
the model IDs remain unique, add estimated sizes for the assets, and update any
download/installer logic to fetch and validate these auxiliary files alongside
the .onnx file.

Comment on lines +18 to +25
// 2. Implement service protocol
public class MyModuleService: YourServiceProtocol {
private let sdk = RunAnywhereSDK.shared

public init() {
// Register models, strategies, etc.
}
}

⚠️ Potential issue

Fix protocol/type naming inconsistency in samples.

Quick Start shows MyModuleService: YourServiceProtocol, while Step-by-Step uses YourModuleService: YourProtocol. Pick one and use it consistently to avoid confusion.

Apply:

- public class YourModuleService: YourProtocol {
+ public class YourModuleService: YourServiceProtocol {

Also applies to: 85-91

🤖 Prompt for AI Agents
In sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md around lines 18-25
(and also applicable to lines 85-91), the sample class and protocol/type names
are inconsistent (MyModuleService: YourServiceProtocol vs YourModuleService:
YourProtocol); pick one naming convention and make both samples identical.
Replace occurrences so both snippets use the same service and protocol names
(for example MyModuleService implementing YourServiceProtocol) and update any
related references in those line ranges to match exactly.

Comment on lines +19 to +23
// LLM.swift dependency - using latest from GitHub main branch
.package(url: "https://github.com/eastriverlee/LLM.swift", branch: "main"),
// Reference to main SDK for protocols
.package(path: "../../"),
],

🛠️ Refactor suggestion

Avoid tracking a moving target; pin LLM.swift to a tag or revision.

Depending on branch "main" risks breakages; you already pinned a revision in Package.resolved. Pin here too.

Apply this diff (using the resolved commit):

-        .package(url: "https://github.com/eastriverlee/LLM.swift", branch: "main"),
+        .package(url: "https://github.com/eastriverlee/LLM.swift", revision: "4c4e909ac4758c628c9cd263a0c25b6edff5526d"),
🤖 Prompt for AI Agents
In sdk/runanywhere-swift/Modules/LLMSwift/Package.swift around lines 19 to 23,
the LLM.swift dependency is pinned to branch "main" which is a moving target;
replace the branch specifier with the exact revision (commit hash) recorded in
Package.resolved. Edit the .package(...) entry to use
.revision("<<COMMIT_HASH_FROM_Package.resolved>>") (or the appropriate tag
string if Package.resolved shows a tag) instead of branch: "main", then run
swift package resolve to verify and commit the updated Package.swift.

Comment on lines 6 to +11
public class LLMSwiftService: LLMService {
private var llm: LLM?
private var modelPath: String?
private var _modelInfo: LoadedModelInfo?
// Removed context property - no longer using Context type
private let hardwareConfig: HardwareConfiguration?
private let logger = Logger(subsystem: "com.runanywhere.RunAnywhereAI", category: "LLMSwiftService")
private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService")

⚠️ Potential issue

Expose via Obj‑C and provide a public initializer for reflection-based creation

To work with NSClassFromString, LLMSwiftService must be an NSObject subclass with a public init.

-public class LLMSwiftService: LLMService {
+@objc(LLMSwiftService)
+public class LLMSwiftService: NSObject, LLMService {
@@
-    init(hardwareConfig: HardwareConfiguration? = nil) {
-        self.hardwareConfig = hardwareConfig
-    }
+    public override init() {
+        self.hardwareConfig = nil
+        super.init()
+    }
+
+    public init(hardwareConfig: HardwareConfiguration? = nil) {
+        self.hardwareConfig = hardwareConfig
+        super.init()
+    }
📝 Committable suggestion


Suggested change
public class LLMSwiftService: LLMService {
private var llm: LLM?
private var modelPath: String?
private var _modelInfo: LoadedModelInfo?
// Removed context property - no longer using Context type
private let hardwareConfig: HardwareConfiguration?
private let logger = Logger(subsystem: "com.runanywhere.RunAnywhereAI", category: "LLMSwiftService")
private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService")
@objc(LLMSwiftService)
public class LLMSwiftService: NSObject, LLMService {
private var llm: LLM?
private var modelPath: String?
private var _modelInfo: LoadedModelInfo?
private let hardwareConfig: HardwareConfiguration?
private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService")
public override init() {
self.hardwareConfig = nil
super.init()
}
public init(hardwareConfig: HardwareConfiguration? = nil) {
self.hardwareConfig = hardwareConfig
super.init()
}
// … rest of implementation …
}

Comment on lines +175 to +180
/// Register a module
public func register(_ configuration: ModuleConfiguration) {
queue.async(flags: .barrier) {
self.registeredModules[configuration.moduleId] = configuration
}
}

🛠️ Refactor suggestion

Make registration synchronous to guarantee visibility upon return.

async(flags: .barrier) returns before the write completes, causing racy reads immediately after register.

-    public func register(_ configuration: ModuleConfiguration) {
-        queue.async(flags: .barrier) {
-            self.registeredModules[configuration.moduleId] = configuration
-        }
-    }
+    public func register(_ configuration: ModuleConfiguration) {
+        queue.sync(flags: .barrier) {
+            self.registeredModules[configuration.moduleId] = configuration
+        }
+    }
📝 Committable suggestion


Suggested change
/// Register a module
public func register(_ configuration: ModuleConfiguration) {
queue.async(flags: .barrier) {
self.registeredModules[configuration.moduleId] = configuration
}
}
/// Register a module
public func register(_ configuration: ModuleConfiguration) {
queue.sync(flags: .barrier) {
self.registeredModules[configuration.moduleId] = configuration
}
}
🤖 Prompt for AI Agents
In
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift
around lines 175 to 180, the register(_:) method currently uses
queue.async(flags: .barrier) which returns before the write completes and can
cause racy reads; change it to perform the write synchronously (use
queue.sync(flags: .barrier)) so the registration is visible to callers on
return, and guard against potential deadlock by ensuring register is never
called from the same serial/target queue (or use a re-entrant-safe mechanism)
before switching to sync.

Comment on lines +31 to +44
/// Create Sherpa-ONNX TTS if module is available
private func createSherpaONNXTTS() -> (any TextToSpeechService)? {
// Check if SherpaONNXTTS module is available
let className = "SherpaONNXTTS.SherpaONNXTTSService"

guard NSClassFromString(className) != nil else {
print("[RunAnywhereSDK] SherpaONNXTTS module not found. Add it to your app dependencies.")
return nil
}

// Module should register itself and provide factory when imported
// For now, return nil - actual instantiation will be handled by the module
return nil
}

🛠️ Refactor suggestion

⚠️ Potential issue

Dynamic module discovery cannot instantiate pure-Swift classes

NSClassFromString("SherpaONNXTTS.SherpaONNXTTSService") will return nil for a Swift class that doesn’t inherit from NSObject or have @objc. Even if the class is found, you return nil instead of constructing the service, so you’ll always fall back to System TTS.

Options:

  • Adopt a small registry: the module calls RunAnywhereSDK.registerTTSFactory { SherpaONNXTTSService(sdk: .shared) } at load; this extension pulls from the registry.
  • Or make SherpaONNXTTSService @objc(SherpaONNXTTSService) and subclass NSObject, then reflect and instantiate, and call await service.initialize().

Example registry sketch:

+public typealias TTSFactory = () async -> (any TextToSpeechService)
+private var ttsFactories: [String: TTSFactory] = [:]
+public static func registerTTSFactory(name: String, factory: @escaping TTSFactory) { ttsFactories[name] = factory }
+public static func resolveTTSFactory(name: String) -> TTSFactory? { ttsFactories[name] }

Then:

- return createSherpaONNXTTS()
+ if let f = RunAnywhereSDK.resolveTTSFactory(name: "SherpaONNXTTS") { return await f() }
+ return nil

Also applies to: 100-116, 123-139

Comment on lines +135 to +156
// Fallback to system TTS
print("[VoiceModuleFactory] Using System TTS")
return SystemTextToSpeechService()
}

/// Create TTS service based on configuration
public static func createTTSService(from config: VoiceTTSConfig) async -> any TextToSpeechService {
let sdk = RunAnywhereSDK.shared

switch config.provider {
case .system:
return SystemTextToSpeechService()

case .sherpaONNX:
if sdk.isSherpaONNXTTSAvailable {
// Module is available but needs proper instantiation
// For now, fallback to system until module provides factory
print("[VoiceModuleFactory] Sherpa-ONNX module detected, awaiting factory implementation")
}
// Fallback to system TTS
return SystemTextToSpeechService()


💡 Verification agent

❓ Verification inconclusive

Factory returns System TTS even when module is present

You detect the Sherpa module but still return SystemTextToSpeechService(), which hides the new feature.

Wire the actual instantiation (via registry or reflection as above) and initialize the service before returning it.


Implement Sherpa-ONNX TTS instantiation in createTTSService
In RunAnywhereSDK+VoiceModules.swift’s createTTSService(from:), the .sherpaONNX case always falls back to SystemTextToSpeechService()—even when sdk.isSherpaONNXTTSAvailable is true. Wire up the actual Sherpa-ONNX implementation (via the registry/reflection mechanism) and return it instead of the system fallback.

Comment on lines +220 to +246
func synthesizeStream(text: String, rate: Float, pitch: Float, volume: Float) -> AsyncThrowingStream<Data, Error> {
AsyncThrowingStream { continuation in
Task {
do {
// For streaming, we need to chunk the text
let sentences = splitIntoSentences(text)

for sentence in sentences {
guard let audioChunk = bridge?.synthesizeText(
sentence,
speakerId: getCurrentSpeakerId(),
speed: rate
) else {
throw SherpaONNXError.synthesisFailure("Stream synthesis failed")
}

continuation.yield(applyVolume(to: audioChunk, volume: volume))
}

continuation.finish()
} catch {
continuation.finish(throwing: error)
}
}
}
}


🛠️ Refactor suggestion

Serialize access to the native TTS handle.

The plan calls out non-thread-safety, yet the stream method can call into the bridge concurrently. Wrap all native calls in a serial queue or an actor.

Apply a serial queue:

 final class SherpaONNXWrapper {
-    private var bridge: SherpaONNXBridge?
+    private var bridge: SherpaONNXBridge?
+    private let ttsQueue = DispatchQueue(label: "com.runanywhere.sherpa.tts")

@@
-        guard let audioData = bridge.synthesizeText(
+        let audioData = try await withCheckedThrowingContinuation { cont in
+            ttsQueue.async {
+                let data = bridge.synthesizeText(
                     text,
                     speakerId: speakerId,
                     speed: rate
-        ) else {
+                )
+                guard let data else { cont.resume(throwing: SherpaONNXError.synthesisFailure("Failed to generate audio")); return }
+                cont.resume(returning: data)
+            }
+        }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md around lines 220
to 246, the streaming method calls into the non-thread-safe native `bridge` from
potentially concurrent contexts; serialize all access to `bridge` by dispatching
synthesizeText calls onto a dedicated serial queue or by routing them through an
actor, await the serialized call result before yielding to the continuation, and
capture any thrown errors to finish the continuation with that error; ensure
continuation.yield/finish are invoked from the Task context after the serialized
bridge call completes and propagate errors from the bridge back to the caller.

Comment on lines +351 to +385
Create `build_sherpa_onnx.sh`:
```bash
#!/bin/bash

# Build Sherpa-ONNX XCFrameworks for iOS

set -e

SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
PROJECT_ROOT="$SCRIPT_DIR/../.."
EXTERNAL_DIR="$PROJECT_ROOT/EXTERNAL"
MODULE_DIR="$PROJECT_ROOT/sdk/runanywhere-swift/Modules/SherpaONNXTTS"

echo "🔨 Building Sherpa-ONNX XCFrameworks..."

# Clone if not exists
if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
echo "📥 Cloning sherpa-onnx..."
git clone https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
fi

# Build
cd "$EXTERNAL_DIR/sherpa-onnx"
echo "🏗️ Building for iOS..."
./build-ios.sh

# Copy frameworks
echo "📦 Copying XCFrameworks..."
mkdir -p "$MODULE_DIR/XCFrameworks"
cp -r build-ios/sherpa-onnx.xcframework "$MODULE_DIR/XCFrameworks/"
cp -r build-ios/onnxruntime.xcframework "$MODULE_DIR/XCFrameworks/"

echo "✅ Build complete!"
echo "📍 Frameworks located at: $MODULE_DIR/XCFrameworks/"
```

🛠️ Refactor suggestion

Pin the upstream ref and harden the build script.

Use a fixed tag/commit and stricter shell options to avoid accidental upgrades and silent failures.

-#!/bin/bash
-set -e
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
@@
-if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
+REF="${SHERPA_ONNX_REF:-vX.Y.Z}" # set a default or pass in env
+if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
     echo "📥 Cloning sherpa-onnx..."
-    git clone https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
+    git clone --depth 1 --branch "$REF" https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
 else
-    echo "📥 Pulling latest changes..."
-    cd "$EXTERNAL_DIR/sherpa-onnx"
-    git pull
+    cd "$EXTERNAL_DIR/sherpa-onnx"
+    git fetch --tags --depth 1 origin "$REF"
+    git checkout -f "$REF"
 fi

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md around lines 351
to 385, the new build_sherpa_onnx.sh script lacks a pinned upstream ref and is
brittle; update the script to use strict shell options (set -euo pipefail) and
pin the repository to a specific tag or commit (use git clone --branch <TAG> or
clone then git checkout <COMMIT>), verify the clone succeeded and the expected
build script (build-ios.sh) exists and is executable before running it, fail
with clear error messages if checkout/clone or build steps fail, and add checks
that the expected XCFramework output paths exist before copying (exit nonzero
with a helpful message if not).

@coderabbitai coderabbitai bot left a comment

Review continued from previous batch...

Comment on lines +52 to +68
public func registerAndDownloadModel(
name: String,
url: URL,
framework: LLMFramework,
additionalFiles: [URL] = []
) async throws -> URL {
// Create model info with additional files
let model = sdk.addModelFromURL(
name: name,
url: url,
framework: framework,
estimatedSize: nil
)

// Download main model and additional files
return try await downloadModelWithProgress(model.id)
}

💡 Verification agent

🧩 Analysis chain

additionalFiles is unused. Implement or drop from API.

Current docs promise behavior that isn’t implemented.

Option A (remove now, reintroduce when ready):

-    ///   - additionalFiles: Additional files to download
+    ///   - additionalFiles: (Removed) Additional files to download
     /// - Returns: Local path to downloaded model
     public func registerAndDownloadModel(
         name: String,
         url: URL,
-        framework: LLMFramework,
-        additionalFiles: [URL] = []
+        framework: LLMFramework
     ) async throws -> URL {
         // Create model info with additional files
         let model = sdk.addModelFromURL(
             name: name,
             url: url,
             framework: framework,
             estimatedSize: nil
         )
 
-        // Download main model and additional files
+        // Download main model
         return try await downloadModelWithProgress(model.id)
     }

Option B (implement): loop and register extra assets via the SDK’s API (tell me the method names and I’ll wire it up).


🏁 Script executed:

#!/bin/bash
# Search for definition of addModelFromURL to inspect its signature
rg -nP --type swift 'func addModelFromURL' -C3

# Search for any usage of the `additionalFiles` parameter across the codebase
rg -n --type swift 'additionalFiles' -C3

Length of output: 3907


Remove unused additionalFiles parameter
The additionalFiles argument is never consumed; drop it from the method signature and documentation.

-    ///   - additionalFiles: Additional files to download
+    ///   - additionalFiles: (removed)
     /// - Returns: Local path to downloaded model
     public func registerAndDownloadModel(
         name: String,
         url: URL,
-        framework: LLMFramework,
-        additionalFiles: [URL] = []
+        framework: LLMFramework
     ) async throws -> URL {
-        // Download main model and additional files
+        // Download main model
         return try await downloadModelWithProgress(model.id)
     }
📝 Committable suggestion


Suggested change
public func registerAndDownloadModel(
name: String,
url: URL,
framework: LLMFramework,
additionalFiles: [URL] = []
) async throws -> URL {
// Create model info with additional files
let model = sdk.addModelFromURL(
name: name,
url: url,
framework: framework,
estimatedSize: nil
)
// Download main model and additional files
return try await downloadModelWithProgress(model.id)
}
/// - additionalFiles: (removed)
/// - Returns: Local path to downloaded model
public func registerAndDownloadModel(
name: String,
url: URL,
framework: LLMFramework
) async throws -> URL {
// Create model info with additional files
let model = sdk.addModelFromURL(
name: name,
url: url,
framework: framework,
estimatedSize: nil
)
// Download main model
return try await downloadModelWithProgress(model.id)
}
🤖 Prompt for AI Agents
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift
lines 52-68: the additionalFiles parameter is unused and should be removed from
the method signature and any public documentation; update the function signature
to remove additionalFiles, remove any references to it in the implementation (no
other code changes needed inside since it was unused), update all call sites and
tests to call registerAndDownloadModel(name:url:framework:) without the extra
argument, and update API docs/comments to reflect the new signature.
