[iOS-SDK] TTS integration + ONNX runtime integration to run TTS models #43
base: main
Conversation
- Introduced new extensions for LLM and voice modules in RunAnywhereSDK, enhancing modularity and service creation.
- Implemented LLMModuleFactory and VoiceModuleFactory for streamlined service instantiation based on available modules.
- Added protocols for LLMService and SpeechToTextService to standardize module interactions.
- Created comprehensive configuration structures for LLM and voice modules, improving flexibility and usability.
- Established a ModuleIntegrationHelper for downloading models with progress tracking and managing module lifecycles.
- Documented module development guidelines to assist future integrations and ensure consistency across modules.

- Introduced the SherpaONNXTTS module, including core components such as SherpaONNXTTSService, SherpaONNXConfiguration, SherpaONNXModelManager, SherpaONNXDownloadStrategy, and SherpaONNXWrapper.
- Implemented a robust model registration and download strategy for managing TTS models and their dependencies.
- Established a comprehensive configuration structure for the TTS engine, allowing for flexible model management and synthesis options.
- Enhanced VoiceCapabilityService to support dynamic loading of the SherpaONNXTTS service based on configuration.
- Documented module development guidelines and integration patterns for future reference and consistency.

- Introduced `build_frameworks.sh` to automate the cloning and building of Sherpa-ONNX XCFrameworks.
- Added `Package.resolved` to manage dependencies for the SherpaONNXTTS module.
- Updated `Package.swift` to include binary targets for the newly built XCFrameworks.
- Created a comprehensive `README.md` for module setup, features, and integration instructions.
- Implemented a module map for C++ interop and added an Objective-C++ bridge header and implementation for seamless integration with the Sherpa-ONNX C API.
- Cleaned up the project structure and ensured adherence to SOLID principles for maintainability and scalability.

…TTS module
- Created a comprehensive `NEXT_STEPS.md` file outlining completed tasks and immediate next steps for the SherpaONNXTTS module.
- Updated `Package.swift` to include public headers and C++ settings for better integration with the Objective-C++ bridge.
- Introduced `SherpaONNXBridge.mm` for the Objective-C++ implementation, facilitating seamless interaction with the Sherpa-ONNX C API.
- Added unit tests in `SherpaONNXTTSTests.swift` to validate service initialization, configuration, model types, and error handling.

…odule
- Created `BUILD_DOCUMENTATION.md` detailing the end-to-end process for building and integrating the Sherpa-ONNX TTS module with the RunAnywhere Swift SDK.
- Updated `Package.swift` to support newer platform versions and include the `SherpaONNXBridge` target for improved integration.
- Introduced `SherpaONNXBridge.h` and `SherpaONNXBridge.mm` for Objective-C++ bridging to the Sherpa-ONNX C API.
- Enhanced `SherpaONNXWrapper.swift` to utilize the new bridge, improving TTS functionality and performance.
- Added XCFrameworks for `onnxruntime` and `sherpa-onnx`, ensuring multi-platform support and optimized builds.

…rocessing capabilities
- Introduced the LLMSwift module for LLM integration, including adapter and service implementations.
- Added the WhisperKitTranscription module for speech-to-text functionality, featuring a custom download strategy and transcription service.
- Updated project configuration to include new dependencies and removed obsolete references to WhisperKit.
- Enhanced documentation for both modules, detailing installation, usage, and error handling.
- Cleaned up unused code and ensured adherence to SOLID principles for maintainability and scalability.

… code clarity
- Added LLM.swift as a dependency from GitHub to enhance LLM integration.
- Removed legacy text-to-voice handling from WhisperKitAdapter and WhisperKitService, simplifying the codebase to focus solely on speech-to-text functionality.
- Cleaned up comments in WhisperKitService to reflect the current implementation without legacy references.
- Updated project documentation to reflect changes in module structure and dependencies.

- Updated VoiceAssistantView to display the current TTS model from the view model instead of a static label.
- Enhanced VoiceAssistantViewModel by adding a new property for the TTS model.
- Improved SherpaONNXWrapper to use consistent property names for voice identifiers and streamlined audio playback management in SherpaONNXTTSService.
- Refactored SherpaONNXDownloadStrategy to simplify the download process and improve error handling.

- Introduced new TTS models for SherpaONNX, including Kitten TTS Nano and VITS English US, with detailed metadata and download information.
- Updated FrameworkRecommender to include performance ratings for SherpaONNX, enhancing model selection capabilities.
- Modified LLMFramework to support SherpaONNX as a new framework type, ensuring proper handling for text-to-voice functionality.
@coderabbitai review
✅ Actions performed
Review triggered.
Walkthrough
Migrates the iOS example from remote packages to local modules, introduces Sherpa-ONNX TTS and WhisperKit transcription modules, and expands the SDK with module infrastructure, TTS provider selection, and new enums/configs. Adds the LLMSwift package, an extensive Sherpa-ONNX bridge and service, download strategies, tests, scripts, and documentation. Minor UI and model list updates.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
actor User
participant App as RunAnywhereAI App
participant VCS as VoiceCapabilityService
participant SDK as RunAnywhereSDK
participant SysTTS as SystemTextToSpeechService
participant Sherpa as SherpaONNXTTSService
Note over App,VCS: TTS selection based on VoiceTTSConfig
User->>App: Start voice session
App->>VCS: findTTSService(for: VoiceTTSConfig)
alt provider = sherpaONNX
VCS->>SDK: isModuleAvailable("SherpaONNXTTS.SherpaONNXTTSService")
alt available
VCS->>Sherpa: init()
VCS-->>App: SherpaONNXTTSService
else not available/fails
VCS->>SysTTS: init()
VCS-->>App: System TTS (fallback)
end
else provider = system or nil
VCS->>SysTTS: init()
VCS-->>App: System TTS
end
App->>+Sherpa: synthesize(text, options)
Sherpa-->>-App: audio Data / Stream
    App-->>User: Playback
```
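To make the selection flow concrete, here is a minimal Swift sketch of the logic the diagram above describes. It is illustrative only: the `TTSService` protocol name, the throwing initializer, and the exact SDK call shapes are assumptions, not the SDK's actual API.

```swift
// Sketch of the TTS selection/fallback flow (names approximate the diagram, not the real API).
func makeTTSService(for config: VoiceTTSConfig?) -> TTSService {
    if config?.provider == .sherpaONNX,
       RunAnywhereSDK.shared.isModuleAvailable("SherpaONNXTTS.SherpaONNXTTSService"),
       let sherpa = try? SherpaONNXTTSService() {
        return sherpa   // module is linked and initialized successfully
    }
    // System provider, nil config, a missing module, or a failed init all fall back here.
    return SystemTextToSpeechService()
}
```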

```mermaid
sequenceDiagram
autonumber
participant Service as SherpaONNXTTSService
participant SDK as RunAnywhereSDK
participant ModelMgr as SherpaONNXModelManager
participant DL as DownloadManager
participant Wrapper as SherpaONNXWrapper
participant Bridge as SherpaONNXBridge
Note over Service,SDK: Initialization and model setup
Service->>SDK: registerModuleDownloadStrategy(SherpaONNXDownloadStrategy)
Service->>ModelMgr: registerModels()
ModelMgr->>SDK: registerModuleModels(models)
Service->>SDK: getModelLocalPath(for: modelId)
alt not downloaded
Service->>DL: download(modelId) with progress
DL-->>Service: completion
Service->>SDK: getModelLocalPath(for: modelId)
end
Service->>Wrapper: init(configuration)
Wrapper->>Bridge: initWithModelPath(..., modelType, ...)
Bridge-->>Wrapper: ready
    Wrapper-->>Service: ready
```
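And a corresponding sketch of the initialization path in the second diagram; method names and signatures are approximations of the SDK surface, shown only to orient readers.

```swift
// Sketch: register, resolve, download if needed, then bring up the native wrapper.
func prepareSherpaTTS(modelId: String) async throws -> SherpaONNXWrapper {
    let sdk = RunAnywhereSDK.shared
    sdk.registerModuleDownloadStrategy(SherpaONNXDownloadStrategy())
    SherpaONNXModelManager().registerModels()   // forwards ModelInfo entries via registerModuleModels

    var path = await sdk.getModelLocalPath(for: modelId)
    if path == nil {
        // Hypothetical download call; the real API reports progress to the caller.
        try await sdk.downloadModel(modelId)
        path = await sdk.getModelLocalPath(for: modelId)
    }
    guard let modelPath = path else { throw VoiceError.modelNotFound(modelId) }

    // The wrapper drives the Objective-C++ bridge (initWithModelPath:modelType:...).
    return try SherpaONNXWrapper(configuration: SherpaONNXConfiguration(modelPath: modelPath))
}
```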
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 41
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (10)
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitService.swift (4)
17-22: Streaming thresholds use bytes as if they were samples (0.5s/0.1s are off by 4×).
minAudioLength and contextOverlap are compared to Data.byteCount. Fix by using bytes-per-sample.
Apply this diff:
```diff
-    private var audioAccumulator = Data()
-    private let minAudioLength = 8000 // 500ms at 16kHz
-    private let contextOverlap = 1600 // 100ms overlap for context
+    private var audioAccumulator = Data()
+    private let sampleRate = 16_000
+    private let bytesPerSample = MemoryLayout<Float>.size // adjust if using Int16 pipeline
+    private var minAudioBytes: Int { (sampleRate / 2) * bytesPerSample } // 500ms
+    private var contextOverlapBytes: Int { (sampleRate / 10) * bytesPerSample } // 100ms
```

```diff
-        // Process when we have enough audio (500ms)
-        if audioBuffer.count >= minAudioLength {
+        // Process when we have enough audio (500ms)
+        if audioBuffer.count >= minAudioBytes {
```

```diff
-        // Keep last 100ms for context continuity
-        audioBuffer = Data(audioBuffer.suffix(contextOverlap))
+        // Keep last 100ms for context continuity
+        audioBuffer = Data(audioBuffer.suffix(contextOverlapBytes))
```

Also applies to: 339-341, 382-384
318-331: Streaming early-return bug: after initializing, the function returns and never processes audio.
The return inside the guard's else exits the Task. Restructure to proceed after lazy init.
Apply this diff:
```diff
-        // Ensure WhisperKit is loaded
-        guard let whisperKit = self.whisperKit else {
-            if self.isInitialized {
-                // Already initialized, but whisperKit is nil
-                throw VoiceError.serviceNotInitialized
-            } else {
-                // Not initialized, try to initialize with default model
-                try await self.initialize(modelPath: nil)
-                guard self.whisperKit != nil else {
-                    throw VoiceError.serviceNotInitialized
-                }
-            }
-            return
-        }
+        // Ensure WhisperKit is loaded (lazily initialize, then continue processing)
+        if self.whisperKit == nil {
+            if self.isInitialized {
+                throw VoiceError.serviceNotInitialized
+            }
+            try await self.initialize(modelPath: nil)
+        }
+        guard let whisperKit = self.whisperKit else {
+            throw VoiceError.serviceNotInitialized
+        }
```
342-345: Same raw Data→Float32 assumption in streaming path.
Repeat of the format issue above; convert Int16 to Float or use a definitive format.
Apply this diff in both places:
```diff
-        let floatArray = audioBuffer.withUnsafeBytes { buffer in
-            Array(buffer.bindMemory(to: Float.self))
-        }
+        let floatArray: [Float]
+        if audioBuffer.count % MemoryLayout<Float>.size == 0 {
+            floatArray = audioBuffer.withUnsafeBytes { buf in
+                Array(buf.bindMemory(to: Float.self))
+            }
+        } else {
+            let i16 = audioBuffer.withUnsafeBytes { buf in
+                Array(buf.bindMemory(to: Int16.self))
+            }
+            floatArray = i16.map { Float($0) / 32768.0 }
+        }
```

Also applies to: 390-393
156-169: Honor caller options for task and language; don't force English transcription.
For non-English or translation, current behavior is incorrect.
Apply this diff:
```diff
-        let decodingOptions = DecodingOptions(
-            task: .transcribe,
-            language: "en", // Force English to avoid language detection issues
+        let decodingOptions = DecodingOptions(
+            task: (options.task == .translate ? .translate : .transcribe),
+            language: options.language.rawValue,
             temperature: 0.0,
             temperatureFallbackCount: 1,
             sampleLength: 224,
             usePrefillPrompt: false,
-            detectLanguage: false, // Force English instead of auto-detect
+            detectLanguage: false,
             skipSpecialTokens: true,
             withoutTimestamps: true,
             compressionRatioThreshold: 2.4,
             logProbThreshold: -1.0,
             noSpeechThreshold: noSpeechThresh
         )
```

If you support "auto" language in options, set language to nil and detectLanguage = true accordingly.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitDownloadStrategy.swift (1)
118-126: Set request timeouts and use request-based download to avoid hangs.
URLSession.shared.download(from:) uses default timeouts. Use URLRequest with a timeout.
Apply this diff in both places:
```diff
-        let (localURL, response) = try await URLSession.shared.download(from: fileURL)
+        var req = URLRequest(url: fileURL)
+        req.timeoutInterval = 60
+        let (localURL, response) = try await URLSession.shared.download(for: req)
```

Also applies to: 178-185
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantViewModel.swift (2)
82-90: Observer token is dropped immediately; model changes won’t be observed
`addObserver(forName:...)` returns a token you must retain. As written, the token is discarded, so no notifications will ever fire.
Fix with Combine (you already import it) to auto-manage lifetimes:
```diff
@@
-        // Listen for model changes
-        NotificationCenter.default.addObserver(
-            forName: Notification.Name("ModelLoaded"),
-            object: nil,
-            queue: .main
-        ) { [weak self] notification in
-            Task { @MainActor in
-                self?.updateModelInfo()
-            }
-        }
+        // Listen for model changes
+        NotificationCenter.default
+            .publisher(for: Notification.Name("ModelLoaded"))
+            .receive(on: RunLoop.main)
+            .sink { [weak self] _ in
+                self?.updateModelInfo()
+            }
+            .store(in: &cancellables)
```

And add storage:
```diff
@@ class VoiceAssistantViewModel: ObservableObject {
@@     private let audioCapture = AudioCapture()
+    private var cancellables = Set<AnyCancellable>()
```
318-324: Stop capture and tear down on pipeline errors to avoid resource leaks
On `.pipelineError`, audio capture continues and the task/pipeline aren't torn down.

```diff
 case .pipelineError(let error):
     errorMessage = error.localizedDescription
     sessionState = .error(error.localizedDescription)
     isProcessing = false
     isListening = false
     logger.error("Pipeline error: \(error)")
+    // Ensure resources are released on failure
+    audioCapture.stopContinuousCapture()
+    pipelineTask?.cancel()
+    pipelineTask = nil
+    voicePipeline = nil
```
sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftService.swift (1)
239-291: Streaming token limit increments twice per token
tokenCount is incremented twice, halving the effective maxTokens. Increment once.
```diff
-    var tokenCount = 0
+    var tokenCount = 0
@@
-    for await token in response {
-        tokenCount += 1
+    for await token in response {
+        tokenCount += 1
@@
-        // Check token limit (approximate - actual tokenization may differ)
-        tokenCount += 1
-        if tokenCount >= maxTokens {
+        // Check token limit (approximate - actual tokenization may differ)
+        if tokenCount >= maxTokens {
             break
         }
```
examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj (2)
147-163: Drop LLM from packageProductDependencies (LLMSwift replaces it).
Avoid linking both. If LLM is still needed, remove LLMSwift instead.
```diff
                 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */,
-                543028452E442716008361DD /* LLM */,
                 548CA0762E56D0DC0061CCF5 /* FluidAudioDiarization */,
                 5479377D2E57DF7600CB9251 /* LLMSwift */,
                 54760D382E57E06100A03191 /* WhisperKitTranscription */,
                 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */,
```
68-89: Remove explicit LLM linking from the Frameworks build phase
LLMSwift's Swift-PM package already brings in LLM.swift transitively; keeping both risks duplicate symbols. In examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj, delete the `543028462E442716008361DD /* LLM in Frameworks */` entry from the PBXFrameworksBuildPhase file list.
🧹 Nitpick comments (94)
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Package.resolved (3)
1-77: Consider dropping this nested Package.resolved (use only a single top-level lockfile).
Keeping a lockfile per submodule often causes churn and conflicting pins across Xcode/SwiftPM versions. Prefer a single root Package.resolved or ignore lockfiles in nested modules.
If you decide to remove it from this module:
```diff
-{
-  "pins" : [
-    ...
-  ],
-  "version" : 2
-}
```
4-74: Prefer constraining versions in Package.swift over relying on Package.resolved.For library packages, clients ignore your lockfile. Encode your semver policy explicitly in Package.swift (e.g., .upToNextMinor for bugfix-only) so consumers resolve within intended bounds.
4-74: Remove unused dependencies: SherpaONNXTTS doesn't import Alamofire, DeviceKit, Files, GRDB.swift, Pulse, swift-asn1, swift-crypto, or ZIPFoundation—remove them from Package.swift.
.gitignore (1)
52-55: Resolve LFS vs ignore conflict for XCFrameworks; also remove redundant EXTERNAL entry and keep a placeholder.Currently, XCFrameworks are ignored here while the module’s .gitattributes attempts to store them via Git LFS—only one policy should exist. If the intent is “do not commit frameworks; build locally,” keep this ignore and add a placeholder exception; also drop the redundant EXTERNAL/sherpa-onnx entry because EXTERNAL/ is already ignored later.
Apply:
```diff
-# SherpaONNX TTS - Large binary frameworks (use setup_frameworks.sh to build)
-sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/
-EXTERNAL/sherpa-onnx/
+# SherpaONNX TTS - Large binary frameworks (use setup_frameworks.sh to build)
+sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/
+!sdk/runanywhere-swift/Modules/SherpaONNXTTS/XCFrameworks/.gitkeep
```
sdk/runanywhere-swift/Modules/SherpaONNXTTS/.gitattributes (1)
1-4: Align LFS patterns with repo policy (ignored vs tracked binaries).If XCFrameworks remain ignored, this LFS config is moot and confusing. If you plan to track binaries, narrow the scope to explicit frameworks to avoid sweeping other artifacts.
Option A (recommended if binaries stay ignored): remove this file entirely.
Option B (track specific frameworks via LFS):
```diff
-*.xcframework filter=lfs diff=lfs merge=lfs -text
-*.a filter=lfs diff=lfs merge=lfs -text
-XCFrameworks/** filter=lfs diff=lfs merge=lfs -text
+# Track only the shipped XCFrameworks
+XCFrameworks/SherpaONNXFramework.xcframework/** filter=lfs diff=lfs merge=lfs -text
+XCFrameworks/ONNXRuntimeFramework.xcframework/** filter=lfs diff=lfs merge=lfs -text
```
.github/pull_request_template.md (2)
12-13: Capitalize product/language names; minor wording polish.Use “MacBook” and “Swift” for consistency.
```diff
-- [ ] Tested on Macbook if swift changes
-- [ ] Tested on Tablet/iPad if swift changes
+- [ ] Tested on MacBook (if Swift changes)
+- [ ] Tested on iPad/Tablet (if Swift changes)
```
28-29: Prompt for TTS evidence (audio) alongside UI screenshots.Given TTS, ask submitters for short audio samples and device details.
```diff
-## Screenshots - Attach all the relevant UI changes screenshots for iOS/Android and MacOS/Tablet/large screen sizes
--
+## Screenshots & Media
+- Attach relevant UI screenshots for iOS/Android and macOS/iPad (large screens).
+- For TTS changes, attach short audio samples (or links) and note device model, iOS version, sample rate, and latency.
```
sdk/runanywhere-swift/Sources/RunAnywhere/Core/Models/Framework/FrameworkModality.swift (1)
45-46: Modality mapping for SherpaONNX is correct and matches intent.
Primary and supported modalities are set to textToVoice; `isVoiceFramework` already covers both voice directions.
Optional: co-locate `.sherpaONNX` with other voice frameworks in the same switch group for readability.

```diff
-        case .whisperKit, .openAIWhisper:
-            return .voiceToText
-        case .sherpaONNX:
-            return .textToVoice
+        case .whisperKit, .openAIWhisper:
+            return .voiceToText
+        case .sherpaONNX:
+            return .textToVoice
```

Also applies to: 68-69
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitService.swift (2)
191-196: Remove per-call model enumeration; it’s expensive and noisy.Fetching available models on every transcription adds latency and log spam. Gate behind DEBUG and/or log once at init.
Apply this diff:
```diff
-        do {
-            let availableModels = try await WhisperKit.fetchAvailableModels()
-            logger.info(" Available models: \(availableModels)")
-        } catch {
-            logger.info(" Could not fetch available models: \(error)")
-        }
+        #if DEBUG
+        do {
+            let availableModels = try await WhisperKit.fetchAvailableModels()
+            logger.debug("Available models: \(availableModels)")
+        } catch {
+            logger.debug("Could not fetch available models: \(error)")
+        }
+        #endif
```
128-136: Padding with random noise makes outputs nondeterministic. Consider deterministic dither or zeros.Random noise complicates testing and reproducibility.
Replace with a fixed very low-amplitude dither (e.g., a repeating sequence) or zeros. I can provide a deterministic generator if desired.
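If a deterministic generator is wanted, a minimal sketch (the amplitude and pattern are placeholders, not the module's current values):

```swift
// Pad with a fixed, very low-amplitude repeating dither so padded audio
// (and therefore transcription output) is reproducible across runs.
func deterministicDitherPad(sampleCount: Int, amplitude: Float = 1e-4) -> [Float] {
    let pattern: [Float] = [amplitude, -amplitude, amplitude / 2, -amplitude / 2]
    return (0..<sampleCount).map { pattern[$0 % pattern.count] }
}
```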
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitAdapter.swift (1)
35-57: Adapter caching flow looks good; small resilience suggestion.
Looks solid. Consider marking cleanupStaleCache as @MainActor (or move cache state into an actor) to avoid races if adapters are used across threads.
sdk/runanywhere-swift/Modules/LLMSwift/Package.resolved (2)
76-82: Pre-release swift-syntax pin may require a newer toolchain than 5.9.Pinned to 602.0.0-prerelease-2025-08-11; with tools 5.9 this could fail. Confirm your CI/Xcode version, or pin LLM.swift to a revision that resolves to a Swift 5.9-compatible swift-syntax.
Run CI with Xcode showing swiftc -version, or update the dependency pin accordingly. I can help pick a compatible rev.
1-95: Consider not committing Package.resolved for library-style modules.Package.resolved is best for apps; libraries should allow clients to resolve. Keeping it may force consumers onto your exact graph.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/WhisperKitDownloadStrategy.swift (3)
57-70: Base URL derivation: handle non-/resolve/main/ URLs more robustly.If the provided URL points directly to a file blob or a different branch/tag, current logic falls back to a fixed repo. Consider parsing owner/repo/path and preserving branch/tag if present.
100-105: Creating analytics/weights subdirs unconditionally.Not harmful, but only weights/ exists in your file lists; consider creating subdirs lazily per needed file path.
216-233: mapToHuggingFacePath(): unconditional dropLast() can mis-map IDs without a hash suffix.If modelId has no trailing hash, dropLast removes a real token. Only drop when a suffix matches a known hash pattern.
I can push a regex-based variant if helpful.
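For reference, one possible shape of that regex-based variant (a sketch; the hash length and helper name are assumptions, not the module's code):

```swift
import Foundation

// Only strip the trailing component when it looks like a hex hash suffix,
// e.g. "whisper-base-a1b2c3d4" -> "whisper-base"; "whisper-base" is left untouched.
func stripHashSuffix(from modelId: String) -> String {
    let parts = modelId.split(separator: "-").map(String.init)
    guard let last = parts.last,
          last.range(of: "^[0-9a-fA-F]{6,}$", options: .regularExpression) != nil else {
        return modelId
    }
    return parts.dropLast().joined(separator: "-")
}
```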
sdk/runanywhere-swift/Modules/SherpaONNXTTS/TEAM_WORKFLOW.md (3)
160-169: Cache key should include the sherpa-onnx ref for deterministic CI.Hashing only setup_frameworks.sh risks stale caches when upstream changes. Include a pinned tag/commit (or env) in the key.
Add to the example:
```diff
 env:
   SHERPA_ONNX_REF: vX.Y.Z # or a commit SHA
 - uses: actions/cache@v3
   with:
     path: EXTERNAL/sherpa-onnx
-    key: sherpa-onnx-${{ hashFiles('**/setup_frameworks.sh') }}
+    key: sherpa-onnx-${{ env.SHERPA_ONNX_REF }}-${{ hashFiles('**/setup_frameworks.sh') }}
```
106-120: Strengthen the Git LFS guidance for existing binaries.If binaries were ever committed without LFS, devs will need migration to avoid bloating history.
Augment with:
```bash
# For repos that previously committed binaries:
git lfs migrate import --include="*.xcframework,*.a"
```
211-215: Make “pin to specific sherpa-onnx commit/tag” actionable.Add an explicit example of how to set and propagate the ref used by scripts and CI to avoid accidental upgrades.
Suggested addition:
```bash
# In setup/build scripts
: "${SHERPA_ONNX_REF:=vX.Y.Z}"
git fetch --tags
git checkout "$SHERPA_ONNX_REF"
```
thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md (5)
41-45: Remove or use the sampleRate initializer param.Sherpa-ONNX exposes sample rate from the engine; passing it in here is misleading unless it configures resampling. Either wire it to config or drop it.
98-111: Verify model-type fields against the actual C API (likely mismatches).
"kitten" looks invalid; common TTS configs are vits/kokoro/etc. Field names like `config.model.kitten.*` may not exist.
168-176: Import the correct module name in Swift.
import SherpaONNXFrameworkmay be incorrect if the module map exportsSherpaONNXBridge. Import should match the module.modulemap “module” name.
333-346: Avoid Data→Array→Data copies for volume; use Accelerate.For large buffers this double copy is costly; vDSP scales in-place efficiently.
Example:
import Accelerate private func applyVolume(to audioData: Data, volume: Float) -> Data { guard volume != 1.0 else { return audioData } var out = Data(count: audioData.count) audioData.withUnsafeBytes { inBuf in out.withUnsafeMutableBytes { outBuf in let n = audioData.count / MemoryLayout<Float>.size vDSP_vsmul(inBuf.bindMemory(to: Float.self).baseAddress!, 1, [volume], outBuf.bindMemory(to: Float.self).baseAddress!, 1, vDSP_Length(n)) } } return out }
428-434: module.modulemap: consider ‘explicit’ module and header placement.Mark the module explicit and ensure the header path matches the packaged layout to avoid ambiguous imports when combined with other ObjC++ modules.
-module SherpaONNXBridge { +explicit module SherpaONNXBridge { header "SherpaONNXBridge.h" export * }sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftTemplateResolver.swift (1)
15-21: Minor: avoid repeated string scanning.Cache lowercased filename once (already done) and consider a lookup table or ordered rules to simplify maintenance.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/README.md (2)
45-51: Avoid static line-number references in docs.“Garbled output detection (lines 435-477)” will drift. Describe the behavior, not the line range, or link to a symbol.
89-91: Confirm and pin dependency/version constraints.Verify the minimum OS versions and “WhisperKit 0.10.2+” are accurate for this PR branch; consider pinning an exact tag in Package.swift examples.
sdk/runanywhere-swift/Modules/SherpaONNXTTS/build_frameworks.sh (1)
49-57: Pre-flight dependency checks improve UX.Fail early if required tools are missing (git, cmake, xcodebuild).
command -v cmake >/dev/null || { echo -e "${RED}❌ cmake not found${NC}"; exit 1; } xcodebuild -version >/dev/null 2>&1 || { echo -e "${RED}❌ Xcode CLTs not found${NC}"; exit 1; }sdk/runanywhere-swift/Modules/WhisperKitTranscription/Package.resolved (1)
1-121: Remove leaf-module Package.resolved files
Detected Package.resolved in module folders under sdk/runanywhere-swift/Modules (SherpaONNXTTS, FluidAudioDiarization, LLMSwift, WhisperKitTranscription). Retain only the root sdk/runanywhere-swift/Package.resolved to reduce churn and avoid lockfile conflicts.sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Models/SherpaONNXModelManager.swift (3)
25-100: Cache model definitions and pin URLs by revision.
- Avoid rebuilding arrays repeatedly; keep a cached list.
- “resolve/main” is mutable; prefer immutable, revision-pinned URLs for reproducible downloads.
- private func createModelDefinitions() -> [ModelInfo] { - return [ + private lazy var modelsCache: [ModelInfo] = { + [ // Kitten ... ModelInfo( - downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/model.onnx"), + downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/<REV>/model.onnx"), ... alternativeDownloadURLs: [ - URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/voices.json"), + URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/<REV>/voices.json"), ... ].compactMap { $0 } ), // Repeat for other models... - ] - } + ] + }()Also: if these files are required assets (not alternates), align naming with your DownloadStrategy (e.g., “additionalFiles”) to avoid misinterpretation.
102-105: Don’t regenerate to search; use the cache or registry.Rebuilding the array for each lookup is wasteful. Use modelsCache.
- func getModel(by id: String) -> ModelInfo? { - return createModelDefinitions().first { $0.id == id } - } + func getModel(by id: String) -> ModelInfo? { + return modelsCache.first { $0.id == id } + }
107-114: Implement a basic device-aware selector.Small heuristic beats a hardcoded ID and prevents oversized models on constrained devices.
- func selectOptimalModel() -> String { - // TODO: Implement device capability detection - // Consider available memory, CPU performance, etc. - // For now, return the smallest model - return "sherpa-kitten-nano-v0.1" - } + func selectOptimalModel() -> String { + let mem = ProcessInfo.processInfo.physicalMemory + if mem >= 3_000_000_000 { return "sherpa-kokoro-en-v0.19" } + return "sherpa-kitten-nano-v0.1" + }sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXBridge/SherpaONNXBridge.h (1)
59-61: Preferinvalidateand make it idempotent.Name conveys lifecycle intent better than
destroy; ensure multiple calls are safe.-- (void)destroy; +- (void)invalidate;Update implementation accordingly.
thoughts/shared/plans/sherpa_onnx_tts_complete_plan.md (1)
434-443: Turn performance targets into CI checks.Add simple benchmarks or smoke tests to fail PRs when RTF/memory regress beyond thresholds.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleCore.swift (3)
63-74: De-dupe model registration and avoid silent failures.Guard duplicate IDs or let RegistryService upsert; also prefer structured logging over print.
- guard let registry = serviceContainer.modelRegistry as? RegistryService else { - print("[RunAnywhereSDK] Failed to register module models: Registry service not available") + guard let registry = serviceContainer.modelRegistry as? RegistryService else { + // TODO: Inject logger; avoid print in public API. return } - for model in models { - registry.registerModel(model) - } + let unique = Dictionary(grouping: models, by: { $0.id }).compactMap { $0.value.first } + unique.forEach { registry.registerModel($0) }
76-85: Async not needed here.Method is synchronous; consider dropping async to avoid misleading callers.
- public func getModelLocalPath(for modelId: String) async -> URL? { + public func getModelLocalPath(for modelId: String) -> URL? { guard let model = serviceContainer.modelRegistry.getModel(by: modelId) else { return nil } return model.localPath }
37-45: Avoid duplicating cache-clearing logic.Delegate to the file manager’s clearModuleCache to keep one source of truth.
- public func clearModuleCache(moduleId: String) throws { - let baseFolder = serviceContainer.fileManager.getBaseFolder() - if let cacheFolder = try? baseFolder.subfolder(named: "Cache"), - let moduleFolder = try? cacheFolder.subfolder(named: moduleId) { - try moduleFolder.delete() - } - } + public func clearModuleCache(moduleId: String) throws { + try serviceContainer.fileManager.clearModuleCache(moduleId) + }sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXBridge/SherpaONNXBridge.mm (4)
24-26: Remove unused headers
<vector>and<string>aren’t used.-#include <vector> -#include <string>
156-163: Clamp speed and harden speaker validationAvoid out-of-range values reaching the C API; clamp speed and guard zero speakers.
// Validate speaker ID - if (speakerId < 0 || speakerId >= _numSpeakers) { + if (_numSpeakers <= 0 || speakerId < 0 || speakerId >= _numSpeakers) { NSLog(@"[SherpaONNXBridge] Invalid speaker ID: %ld (max: %d)", (long)speakerId, _numSpeakers - 1); speakerId = 0; // Default to first speaker } + // Clamp speed to a sane range [0.25, 4.0] + float clampedSpeed = fmaxf(0.25f, fminf(speed, 4.0f)); + // Generate audio const SherpaOnnxGeneratedAudio *audio = SherpaOnnxOfflineTtsGenerate( tts, [text UTF8String], (int32_t)speakerId, - speed + clampedSpeed );Also applies to: 165-171
181-183: Use size_t for byte count (overflow-safe)Avoid implicit signed-to-unsigned conversion and potential overflow on large buffers.
- NSData *audioData = [NSData dataWithBytes:audio->samples - length:audio->n * sizeof(float)]; + size_t byteCount = (size_t)audio->n * sizeof(float); + NSData *audioData = [NSData dataWithBytes:audio->samples length:byteCount];
276-283: Reset cached properties on destroyMinor hygiene: reset
_sampleRate/_numSpeakersafter destroyingtts.if (tts) { SherpaOnnxDestroyOfflineTts(tts); tts = nullptr; } + _sampleRate = 0; + _numSpeakers = 0;examples/ios/RunAnywhereAI/RunAnywhereAI/App/RunAnywhereAIApp.swift (1)
10-11: Conditionally import optional modulesPrevents build issues when these modules aren’t present in some configurations (e.g., CI, non-target platforms).
-import LLMSwift -import WhisperKitTranscription +#if canImport(LLMSwift) +import LLMSwift +#endif +#if canImport(WhisperKitTranscription) +import WhisperKitTranscription +#endifexamples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved (1)
70-74: LLM.swift pins are consistent across the repo
Both the example and SDK Package.resolved files reference https://github.com/eastriverlee/LLM.swift at revision 4c4e909ac4758c628c9cd263a0c25b6edff5526d.
Optional: pin LLM.swift to a semantic version tag in your Package.swift manifest to prevent drift.sdk/runanywhere-swift/Modules/WhisperKitTranscription/Package.swift (1)
6-11: Platform matrix OK; consider documenting why iOS 16+/macOS 13+ are requiredMatches WhisperKit’s requirements. Add a brief comment to prevent regressions.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantView.swift (1)
66-66: Use a fallback label when TTS model name is empty.Mirror the LLM badge behavior to avoid showing a blank value before the view model is ready.
-ModelBadge(icon: "speaker.wave.2", label: "TTS", value: viewModel.ttsModel, color: .purple) +ModelBadge(icon: "speaker.wave.2", label: "TTS", value: viewModel.ttsModel.isEmpty ? "Loading..." : viewModel.ttsModel, color: .purple)Also applies to: 268-268
sdk/runanywhere-swift/Sources/RunAnywhere/Capabilities/Compatibility/Services/FrameworkRecommender.swift (1)
161-163: Sherpa-ONNX scoring hooks added—LGTM.Scores are consistent with adjacent framework ranges. Consider a minor bonus in calculateFormatScore for (.sherpaONNX, .onnx) to reflect native format preference, if you find selection too neutral across ONNX-capable frameworks.
Also applies to: 195-197, 228-230, 261-263
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/module.modulemap (1)
1-4: Consider declaring language requirements for the bridge.If the bridge header pulls ObjC/C++ (likely given .mm usage), declaring requirements reduces miscompilation risks across toolchains.
module SherpaONNXTTSBridge { + requires objc, cplusplus header "Internal/Bridge/SherpaONNXBridge.h" export * }If the header is pure C with extern "C" guards, this change is optional; otherwise it helps ensure correct compilation modes.
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Voice/VoiceAssistantViewModel.swift (3)
22-22: Keep ttsModel in sync with actual TTS configThe UI-facing
ttsModelis hardcoded and never updated when you build the pipeline. Set it based on the selected provider/voice to avoid drift.Apply after you construct config (see next comment’s diff) or set it alongside selected voice/model:
- @Published var ttsModel: String = "SherpaONNX" + @Published var ttsModel: String = "SherpaONNX"And later (after config creation):
self.ttsModel = "SherpaONNX • expr-voice-2-f"
142-154: Avoid hardcoding TTS model/voice; expose configuration and clamp rateHardcoded IDs will break on devices without those assets and make A/B testing hard. Surface these as inputs or settings, and clamp rate to valid provider bounds to prevent undefined behavior.
Suggested refactor in-place:
- let config = ModularPipelineConfig( + let selectedModelId = "sherpa-kitten-nano-v0.1" // TODO: load from settings/user selection + let selectedVoice = "expr-voice-2-f" // TODO: load from settings/user selection + let selectedRate: Float = max(0.5, min(2.0, 1.0)) // clamp [0.5, 2.0] (verify provider bounds) + let config = ModularPipelineConfig( components: [.vad, .stt, .llm, .tts], vad: VADConfig(), stt: VoiceSTTConfig(modelId: whisperModelName), llm: VoiceLLMConfig(modelId: "default", systemPrompt: "You are a helpful voice assistant. Keep responses concise and conversational."), - tts: VoiceTTSConfig.sherpaONNX( - modelId: "sherpa-kitten-nano-v0.1", - voice: "expr-voice-2-f", - rate: 1.0 - ) + tts: VoiceTTSConfig.sherpaONNX( + modelId: selectedModelId, + voice: selectedVoice, + rate: selectedRate + ) ) + self.ttsModel = "SherpaONNX • \(selectedVoice)"Please verify the acceptable rate range for Sherpa-ONNX on iOS and whether model/voice IDs match the download strategy. If desired, I can wire this to a settings store.
83-83: Prefer a typed notification nameAvoid stringly-typed
"ModelLoaded". Defineextension Notification.Name { static let modelLoaded = Notification.Name("ModelLoaded") }and use.modelLoadedfor safety and discoverability.sdk/runanywhere-swift/Modules/SherpaONNXTTS/NEXT_STEPS.md (2)
133-138: Reality-check performance targets per device classTargets like “TTFT <100ms” and “RTF >10x” may vary widely by model/device. Consider stating them as goals and adding a simple benchmarking harness (os_signpost + metrics) to validate.
127-130: Avoid committing large frameworks; prefer CI artifact cachingBeyond Git LFS, consider excluding
XCFrameworks/from VCS and producing them via CI with cached artifacts to keep the repo lean.sdk/runanywhere-swift/Modules/LLMSwift/README.md (3)
41-51: Include adapter registration context and orderMention this should be called early (e.g., app launch) before any model loads to avoid fallback adapters.
95-115: Document cancellation and backpressure behaviorAdd a note on whether generation calls are cancelable, thread-safe, and how many concurrent generations the service supports.
165-171: Clarify defaults are examples, not guaranteesContext length, history limit, timeout, and memory estimation may vary by model/hardware. Rephrase as “defaults (configurable)” and link to the knobs.
sdk/runanywhere-swift/Modules/WhisperKitTranscription/Sources/WhisperKitTranscription/VoiceError.swift (2)
3-10: Consider Sendable/I18N and richer contextIf
VoiceErrorcrosses concurrency boundaries, adoptSendable(or document it doesn’t). Also consider localizable strings and attaching context (e.g., sample rate/channels) tounsupportedAudioFormat.Example:
-public enum VoiceError: LocalizedError { +public enum VoiceError: LocalizedError { case serviceNotInitialized case modelNotFound(String) case transcriptionFailed(Error) case insufficientMemory - case unsupportedAudioFormat + case unsupportedAudioFormat(expectedHz: Int, expectedChannels: Int)
11-24: Expose failureReason/recoverySuggestion for user guidanceAdd
failureReason/recoverySuggestionto improve UX messages (e.g., suggest closing background apps on low memory).sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftError.swift (2)
3-9: Preserve underlying cause for generation failuresCarrying only a String drops root-cause details. Include an optional underlying Error to aid debugging and telemetry. If concurrency requires it later, consider documenting Sendable constraints.
-public enum LLMSwiftError: LocalizedError { +public enum LLMSwiftError: LocalizedError { case modelLoadFailed case initializationFailed - case generationFailed(String) + case generationFailed(String, underlying: Error? = nil) case templateResolutionFailed(String)
10-21: Include underlying error in description (when available)Small improvement to surface details in logs while keeping messages user-friendly.
- case .generationFailed(let reason): - return "Generation failed: \(reason)" + case .generationFailed(let reason, let underlying): + let detail = underlying.map { " (\($0.localizedDescription))" } ?? "" + return "Generation failed: \(reason)\(detail)"sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md (4)
35-47: Add language to fenced code block (markdownlint MD040).Specify a language for the "Directory Layout" fence to satisfy linters and improve rendering.
-``` +```text Modules/YourModule/ ├── Package.swift # SPM package definition ...--- `100-111`: **Avoid nil URL in example code.** URL(string:) can return nil; tighten the example to a guaranteed URL to reduce copy/paste footguns. ```diff - downloadURL: URL(string: "https://example.com/model.bin"), + downloadURL: URL(string: "https://example.com/model.bin")!, // safe in docs, or show: + // guard let url = URL(string: "https://example.com/model.bin") else { return }
151-153: Avoid top-level registration calls in library targets.Top-level code (registerModuleDownloadStrategy) in SPM libraries can execute at import time and is discouraged. Show it inside init() or initialize().
-// Register strategy -sdk.registerModuleDownloadStrategy(YourDownloadStrategy()) +// Register strategy during service setup (e.g., in init or initialize) +sdk.registerModuleDownloadStrategy(YourDownloadStrategy())
251-273: Replace emphasis-as-heading (MD036) with proper headings.Conform to markdownlint and improve skimmability.
-**Option A: Async Init (FluidAudioDiarization style)** +#### Option A: Async Init (FluidAudioDiarization style) ... -**Option B: Two-Phase (SherpaONNXTTS style)** +#### Option B: Two-Phase (SherpaONNXTTS style)sdk/runanywhere-swift/Modules/SherpaONNXTTS/README.md (3)
162-170: Add language to the architecture tree code fence (MD040).Specify a language (text) for better rendering and to satisfy linters.
-``` +```text SherpaONNXTTS/ ├── Sources/ ...--- `49-72`: **Clarify audio format returned by synthesize.** Document PCM format (e.g., 16‑bit little‑endian, mono, sample rate) or provide an AudioBuffer/AVAudioPCMBuffer return type to reduce integration ambiguity. Would you like a snippet showing returning AVAudioPCMBuffer and an example AVAudioEngine player? --- `84-97`: **Streaming usage: show cancellation/backpressure handling.** Add an example of cancelling the stream and noting per-chunk size to guide implementers integrating with audio queues. </blockquote></details> <details> <summary>sdk/runanywhere-swift/Sources/RunAnywhere/Capabilities/Voice/Services/VoiceCapabilityService.swift (1)</summary><blockquote> `156-166`: **Log provider and modelId for better diagnosis.** Include provider/modelId in the debug log to trace selection decisions. ```diff - logger.debug("Finding TTS service") + logger.debug("Finding TTS service (provider=\(ttsConfig?.provider.rawValue ?? "nil"), modelId=\(ttsConfig?.modelId ?? "nil"))")sdk/runanywhere-swift/Sources/RunAnywhere/Public/Models/Voice/VoiceTTSConfig.swift (1)
33-47: Consider input validation helpers.Optional: add static clamps for rate/pitch/volume (e.g., 0.5–2.0, 0–1) to keep configs sane across providers.
sdk/runanywhere-swift/Modules/SherpaONNXTTS/setup_frameworks.sh (4)
40-48: Remove $? check; rely on pipefail.With pipefail, a failing curl|tar will exit non‑zero. Simplify and handle errors uniformly.
- curl -L "$DOWNLOAD_URL" | tar -xz -C "$XCFRAMEWORKS_DIR" - - if [ $? -eq 0 ]; then - echo "✅ Successfully downloaded pre-built frameworks!" - exit 0 - else - echo "❌ Download failed. Falling back to local build..." - fi + if curl -L "$DOWNLOAD_URL" | tar -xz -C "$XCFRAMEWORKS_DIR"; then + echo "✅ Successfully downloaded pre-built frameworks!" + exit 0 + fi + echo "❌ Download failed. Falling back to local build..."
75-77: Branch name safety.Upstream default may be main; use the default remote branch to avoid failures on repos that renamed master→main.
- git pull origin master + git pull --ff-only origin "$(git remote show origin | awk '/HEAD branch/ {print $NF}')"
91-93: Avoid hard-coding onnxruntime path/version.The sherpa-onnx build layout/version can change. Prefer a find-based copy with validation.
-cp -R "build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework" "$XCFRAMEWORKS_DIR/" +ONNXRT_SRC="$(fd -t d -a '^onnxruntime\.xcframework$' build-ios | head -n1)" +cp -R "$ONNXRT_SRC" "$XCFRAMEWORKS_DIR/"
95-110: Add checksum print to aid cache debugging.Printing shas helps teams verify identical artifacts.
echo "✅ Framework setup completed successfully!" # Show framework sizes echo "📊 Framework sizes:" du -sh "$XCFRAMEWORKS_DIR"/* + echo "🔐 Checksums:" + (cd "$XCFRAMEWORKS_DIR" && shasum -a 256 -b sherpa-onnx.xcframework/Info.plist onnxruntime.xcframework/Info.plist) + echo "" echo "🎉 SherpaONNX TTS is ready to use!"sdk/runanywhere-swift/Modules/SherpaONNXTTS/Tests/SherpaONNXTTSTests/SherpaONNXTTSTests.swift (3)
55-70: Avoid hard-coding model IDs in testsTie optimal-model assertions to registered content rather than a literal string to reduce brittleness across future catalog changes.
- let kittenModel = manager.getModel(by: "sherpa-kitten-nano-v0.1") + let kittenModel = manager.getModel(by: "sherpa-kitten-nano-v0.1") XCTAssertNotNil(kittenModel) - XCTAssertEqual(kittenModel?.id, "sherpa-kitten-nano-v0.1") + XCTAssertEqual(kittenModel?.id, "sherpa-kitten-nano-v0.1") @@ - let optimalModel = manager.selectOptimalModel() - XCTAssertEqual(optimalModel, "sherpa-kitten-nano-v0.1") + let optimalModel = manager.selectOptimalModel() + XCTAssertEqual(optimalModel, kittenModel?.id)
27-35: Cover all cases via CaseIterable to catch future enum additionsIterate SherpaONNXModelType.allCases to ensure new cases get tested automatically.
- let modelTypes: [SherpaONNXModelType] = [.kitten, .kokoro, .vits, .matcha, .piper] + let modelTypes = SherpaONNXModelType.allCases
74-82: Skip async init when frameworks are absentProactively skip rather than implicitly succeed without initialize(); makes intent explicit.
func testServiceInitializationAsync() async throws { - // This test would require XCFrameworks to be built - // For now, we just test that the service can be created + // This test would require XCFrameworks to be built + try XCTSkipIf(true, "Sherpa-ONNX XCFrameworks not available in CI yet") let service = SherpaONNXTTSService() XCTAssertNotNil(service) // Initialization would fail without frameworks // So we don't call initialize() in this test }sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+LLMModules.swift (1)
80-103: Heuristic selection should be case-insensitive and consider file extensionsModel IDs may vary in case; checking by extension is more robust.
- if modelId.contains("mlx") && sdk.isMLXAvailable { + if modelId.lowercased().contains("mlx") && sdk.isMLXAvailable { return await sdk.createModuleLLMService(.mlx) } - if modelId.contains("gguf") && sdk.isLLMSwiftAvailable { + if modelId.lowercased().contains("gguf") && sdk.isLLMSwiftAvailable { return await sdk.createModuleLLMService(.llmSwift) }sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Models/SherpaONNXDownloadStrategy.swift (1)
89-96: Robustness: verify required files in addition to a markerA simple marker can become stale; optionally validate expected filenames exist.
sdk/runanywhere-swift/Modules/LLMSwift/Sources/LLMSwift/LLMSwiftService.swift (5)
57-66: Duplicate readiness guardSecond guard duplicates the nil-check right above; remove it for clarity.
- // Validate model readiness with a simple test prompt - logger.info("🧪 Validating model readiness with test prompt") - guard let llm = self.llm else { - throw FrameworkError( - framework: .llamaCpp, - underlying: LLMSwiftError.modelLoadFailed, - context: "Failed to initialize LLM.swift with model at \(modelPath)" - ) - } + // Validate model readiness with a simple test prompt + logger.info("🧪 Validating model readiness with test prompt (skipped)")
129-137: Don’t log full prompts in production logsReduce risk of PII leakage; log a bounded prefix.
- logger.info("📝 Full prompt being sent to LLM:") - logger.info("---START PROMPT---") - logger.info("\(fullPrompt)") - logger.info("---END PROMPT---") + logger.info("📝 Prompt (prefix 512 chars): \(fullPrompt.prefix(512))")
333-337: Implement generation options or document placeholdersapplyGenerationOptions currently does nothing; either wire supported parameters or mark unsupported explicitly.
318-322: Context memory estimate: tie to actual context lengthIf max context differs from 2048, adjust estimate accordingly.
1-5: Preferimport osoverimport os.logwithLoggerMinor consistency nit.
-import os.log +import ossdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Public/SherpaONNXConfiguration.swift (1)
65-79: Consider using ByteCount for clarity of memory sizesestimatedMemoryUsage is an Int; consider documenting units or using a typealias for bytes.
sdk/runanywhere-swift/Modules/SherpaONNXTTS/BUILD_DOCUMENTATION.md (2)
86-92: Avoid hardcoding ONNX Runtime versioned paths in manual copy instructionsThe docs pin
onnxruntime.xcframeworkunder.../ios-onnxruntime/1.17.1/..., while later steps use a wildcard. This is brittle and will break on version bumps.Suggested edit:
- cp -R /path/to/sherpa-onnx/build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework XCFrameworks/ + cp -R /path/to/sherpa-onnx/build-ios/ios-onnxruntime/*/onnxruntime.xcframework XCFrameworks/
25-26: Clarify actual platform support vs. bridge availabilityYou advertise multi-platform (iOS, macOS, tvOS, watchOS), but the bridge and binary frameworks are gated only for iOS/macOS in code. Call this out explicitly to prevent confusion for tvOS/watchOS integrators.
Would you like me to PR a short “Platform Support” section noting that tvOS/watchOS builds are currently not supported by the Sherpa bridge and binaries?
Also applies to: 123-128
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Internal/Bridge/SherpaONNXWrapper.swift (2)
100-151: Streaming implementation chunks raw bytes with float-size assumptionsChunking uses
MemoryLayout<Float>.sizeand a fixed 16kHz without confirming the actual sample format/rate returned by the bridge. This risks producing malformed chunks and drift.
- Derive chunk size from
sampleRate()and the bridge’s actual sample format (e.g., f32 vs s16), or expose metadata from the bridge.- If the bridge returns raw PCM, document the format and consider emitting WAV-framed chunks for consumers that expect containerized audio.
Also applies to: 131-144
66-98: Pitch and volume arguments ignoredYou accept
pitchandvolumebut do not pass them to the bridge. If unsupported, document this and consider applying gain/pitch-shift client-side or dropping the parameters from this API.sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+VoiceModules.swift (1)
37-39: Prefer unified logging over printUse the SDK’s logging mechanism (os.Logger or a shared logger) instead of
Also applies to: 75-77, 128-138, 165-177
sdk/runanywhere-swift/Modules/SherpaONNXTTS/Sources/SherpaONNXTTS/Public/SherpaONNXTTSService.swift (2)
124-151: Initialize path: swallow errors and race on voice setYou set
currentVoiceby queuingsetVoicein a fire-and-forgetTaskand ignore errors. If this fails, the service state is inconsistent.Either make
currentVoicesetter async or document that it’s best-effort and log any failure explicitly.
6-9: Optional: Make service discoverable via Objective‑C if you keep reflectionIf you choose reflection over a registry, mark the class
@objc(SherpaONNXTTSService)and inherit fromNSObjectto allowNSClassFromStringdiscovery, and expose aninit()or static factory the loader can call.I can generate a small factory shim if you want to keep runtime discovery without direct imports.
Also applies to: 376-381, 413-425
examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj (2)
245-253: Confirm no duplication between root package and per-module packages.You reference "../../../sdk/runanywhere-swift" and the individual module packages. Ensure the root package does not also expose LLMSwift/WhisperKitTranscription/SherpaONNXTTS to avoid duplicate targets.
705-716: Stabilize local package paths.Relative paths are fragile if the project is moved. Prefer anchoring paths to $(SRCROOT) via workspace-level SPM or consolidating under a single aggregator Package.
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift (2)
182-187: Return a stable, deterministic ordering.Makes UI/debug output predictable.
- public func getAllModules() -> [ModuleConfiguration] { - queue.sync { - Array(registeredModules.values) - } - } + public func getAllModules() -> [ModuleConfiguration] { + queue.sync { + registeredModules.values.sorted { $0.moduleId < $1.moduleId } + } + }
70-77: Mark update check as TODO or surface “not implemented.”Stub returning false can hide real update availability.
```
/* Begin PBXBuildFile section */
		541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
		543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
		54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
		54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
		5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
		547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };
```
🛠️ Refactor suggestion
Delete PBXBuildFile for LLM to complete the migration.
Leaving the PBXBuildFile will keep stale references.
```diff
-		543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile; productRef = 543028452E442716008361DD /* LLM */; };
```

📝 Committable suggestion
```
/* Begin PBXBuildFile section */
		541E22722E3BE21300EBF8FA /* RunAnywhereSDK in Frameworks */ = {isa = PBXBuildFile; productRef = 541E22712E3BE21300EBF8FA /* RunAnywhereSDK */; };
		54509A5A2E57FB2E00E24F06 /* SherpaONNXTTS in Frameworks */ = {isa = PBXBuildFile; productRef = 54509A592E57FB2E00E24F06 /* SherpaONNXTTS */; };
		54760D392E57E06100A03191 /* WhisperKitTranscription in Frameworks */ = {isa = PBXBuildFile; productRef = 54760D382E57E06100A03191 /* WhisperKitTranscription */; };
		5479377E2E57DF7600CB9251 /* LLMSwift in Frameworks */ = {isa = PBXBuildFile; productRef = 5479377D2E57DF7600CB9251 /* LLMSwift */; };
		547F70B12E35CE270061BAED /* Transformers in Frameworks */ = {isa = PBXBuildFile; productRef = 547F70B02E35CE270061BAED /* Transformers */; };
```
🤖 Prompt for AI Agents
In examples/ios/RunAnywhereAI/RunAnywhereAI.xcodeproj/project.pbxproj around
lines 9 to 15, the PBXBuildFile entry for "LLM in Frameworks" remains and must
be removed to avoid stale references; delete the entire line
"543028462E442716008361DD /* LLM in Frameworks */ = {isa = PBXBuildFile;
productRef = 543028452E442716008361DD /* LLM */; };" and ensure any commas or
separators in the surrounding PBXBuildFile section remain syntactically correct
(adjust trailing/leading commas or line endings) so the project.pbxproj stays
valid.
```swift
// Sherpa Kitten TTS Nano - Smallest, fastest TTS model (fixed URL)
ModelInfo(
    id: "sherpa-kitten-nano-v0.1",
    name: "Kitten TTS Nano (Sherpa-ONNX)",
    format: .onnx,
    downloadURL: URL(string: "https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/kitten_tts_nano_v0_1.onnx"),
    estimatedMemory: 50_000_000, // 50MB
    contextLength: 0, // Not applicable for TTS models
    downloadSize: 25_000_000, // ~25MB
    compatibleFrameworks: [.sherpaONNX],
    preferredFramework: .sherpaONNX,
    metadata: ModelInfoMetadata(
        tags: ["tts", "sherpa-onnx", "lightweight"],
        description: "Lightweight neural TTS with 8 expressive voices"
    ),
    supportsThinking: false
),
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
ONNX TTS entries look correct; verify URLs and registry uniqueness.
Both models are well-specified. Please verify:
- URLs resolve and sizes roughly match downloadSize.
- Model IDs are unique across predefinedModels.
- Sherpa-ONNX path requires no auxiliary assets (e.g., speaker metadata) for these two models.
Run this script to check reachability/size and duplicate IDs:
Also applies to: 188-205
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Validate the two ONNX URLs resolve and report a Content-Length
for url in \
"https://huggingface.co/KittenML/kitten-tts-nano-0.1/resolve/main/kitten_tts_nano_v0_1.onnx" \
"https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/lessac/medium/en_US-lessac-medium.onnx"
do
echo "Checking: $url"
code=$(curl -sIL -o /dev/null -w '%{http_code}' "$url")
size=$(curl -sIL "$url" | rg -i '^content-length:' | awk '{print $2}' | tr -d '\r')
echo "HTTP $code, Content-Length: ${size:-unknown}"
done
# 2) Detect duplicate model IDs in ModelListViewModel.swift
file="examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift"
echo "Scanning for duplicate IDs in $file"
rg -nP '\bid:\s*"([^"]+)"' "$file" | sed -E 's/.*id:\s*"([^"]+)".*/\1/' | sort | uniq -d | awk 'NF{print "Duplicate ID: "$0}' || trueLength of output: 2295
🏁 Script executed:
#!/bin/bash
sed -n '185,220p' examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swiftLength of output: 1645
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Kitten siblings non-onnx:"
curl -s https://huggingface.co/api/models/KittenML/kitten-tts-nano-0.1 | jq -r '.siblings[].rfilename' | grep -v '\.onnx$' || echo "None"
echo "Piper siblings in en/en_US/lessac/medium non-onnx:"
curl -s https://huggingface.co/api/models/rhasspy/piper-voices | jq -r '.siblings[].rfilename' | grep '^en/en_US/lessac/medium/' | grep -v '\.onnx$' || echo "None"Length of output: 678
Include auxiliary assets for ONNX TTS models
URLs resolve (Kitten ~23.8 MB; Piper ~63.2 MB) and model IDs are unique. Both repos expose non-ONNX files—Kitten: voices.npz; Piper: en_US-lessac-medium.onnx.json—which may be required at runtime. Confirm their necessity and extend your model definitions to download or document these assets.
🤖 Prompt for AI Agents
In
examples/ios/RunAnywhereAI/RunAnywhereAI/Features/Models/ModelListViewModel.swift
around lines 170 to 187, the ONNX TTS model entry for "sherpa-kitten-nano-v0.1"
omits auxiliary assets (e.g., voices.npz) that the repo exposes and may be
required at runtime; update the ModelInfo for Kitten (and any Piper entries) to
either include these auxiliary asset URLs in a new metadata.assets or
downloadURLs array and mark them as required, or add explicit documentation
fields noting which auxiliary files are needed and where to fetch them; ensure
the model IDs remain unique, add estimated sizes for the assets, and update any
download/installer logic to fetch and validate these auxiliary files alongside
the .onnx file.
| // 2. Implement service protocol | ||
| public class MyModuleService: YourServiceProtocol { | ||
| private let sdk = RunAnywhereSDK.shared | ||
|
|
||
| public init() { | ||
| // Register models, strategies, etc. | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix protocol/type naming inconsistency in samples.
Quick Start shows MyModuleService: YourServiceProtocol, while Step-by-Step uses YourModuleService: YourProtocol. Pick one and use it consistently to avoid confusion.
Apply:
- public class YourModuleService: YourProtocol {
+ public class YourModuleService: YourServiceProtocol {Also applies to: 85-91
🤖 Prompt for AI Agents
In sdk/runanywhere-swift/docs/MODULE_DEVELOPMENT_GUIDE.md around lines 18-25
(and also applicable to lines 85-91), the sample class and protocol/type names
are inconsistent (MyModuleService: YourServiceProtocol vs YourModuleService:
YourProtocol); pick one naming convention and make both samples identical.
Replace occurrences so both snippets use the same service and protocol names
(for example MyModuleService implementing YourServiceProtocol) and update any
related references in those line ranges to match exactly.
| // LLM.swift dependency - using latest from GitHub main branch | ||
| .package(url: "https://github.com/eastriverlee/LLM.swift", branch: "main"), | ||
| // Reference to main SDK for protocols | ||
| .package(path: "../../"), | ||
| ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid tracking a moving target; pin LLM.swift to a tag or revision.
Depending on branch "main" risks breakages; you already pinned a revision in Package.resolved. Pin here too.
Apply this diff (using the resolved commit):
- .package(url: "https://github.com/eastriverlee/LLM.swift", branch: "main"),
+ .package(url: "https://github.com/eastriverlee/LLM.swift", revision: "4c4e909ac4758c628c9cd263a0c25b6edff5526d"),🤖 Prompt for AI Agents
In sdk/runanywhere-swift/Modules/LLMSwift/Package.swift around lines 19 to 23,
the LLM.swift dependency is pinned to branch "main" which is a moving target;
replace the branch specifier with the exact revision (commit hash) recorded in
Package.resolved. Edit the .package(...) entry to use
.revision("<<COMMIT_HASH_FROM_Package.resolved>>") (or the appropriate tag
string if Package.resolved shows a tag) instead of branch: "main", then run
swift package resolve to verify and commit the updated Package.swift.
| public class LLMSwiftService: LLMService { | ||
| private var llm: LLM? | ||
| private var modelPath: String? | ||
| private var _modelInfo: LoadedModelInfo? | ||
| // Removed context property - no longer using Context type | ||
| private let hardwareConfig: HardwareConfiguration? | ||
| private let logger = Logger(subsystem: "com.runanywhere.RunAnywhereAI", category: "LLMSwiftService") | ||
| private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expose via Obj‑C and provide a public initializer for reflection-based creation
To work with NSClassFromString, LLMSwiftService must be an NSObject subclass with a public init.
-public class LLMSwiftService: LLMService {
+@objc(LLMSwiftService)
+public class LLMSwiftService: NSObject, LLMService {
@@
- init(hardwareConfig: HardwareConfiguration? = nil) {
- self.hardwareConfig = hardwareConfig
- }
+ public override init() {
+ self.hardwareConfig = nil
+ super.init()
+ }
+
+ public init(hardwareConfig: HardwareConfiguration? = nil) {
+ self.hardwareConfig = hardwareConfig
+ super.init()
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public class LLMSwiftService: LLMService { | |
| private var llm: LLM? | |
| private var modelPath: String? | |
| private var _modelInfo: LoadedModelInfo? | |
| // Removed context property - no longer using Context type | |
| private let hardwareConfig: HardwareConfiguration? | |
| private let logger = Logger(subsystem: "com.runanywhere.RunAnywhereAI", category: "LLMSwiftService") | |
| private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService") | |
| @objc(LLMSwiftService) | |
| public class LLMSwiftService: NSObject, LLMService { | |
| private var llm: LLM? | |
| private var modelPath: String? | |
| private var _modelInfo: LoadedModelInfo? | |
| private let hardwareConfig: HardwareConfiguration? | |
| private let logger = Logger(subsystem: "com.runanywhere.llmswift", category: "LLMSwiftService") | |
| public override init() { | |
| self.hardwareConfig = nil | |
| super.init() | |
| } | |
| public init(hardwareConfig: HardwareConfiguration? = nil) { | |
| self.hardwareConfig = hardwareConfig | |
| super.init() | |
| } | |
| // … rest of implementation … | |
| } |
| /// Register a module | ||
| public func register(_ configuration: ModuleConfiguration) { | ||
| queue.async(flags: .barrier) { | ||
| self.registeredModules[configuration.moduleId] = configuration | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Make registration synchronous to guarantee visibility upon return.
async(flags: .barrier) returns before the write completes, causing racy reads immediately after register.
- public func register(_ configuration: ModuleConfiguration) {
- queue.async(flags: .barrier) {
- self.registeredModules[configuration.moduleId] = configuration
- }
- }
+ public func register(_ configuration: ModuleConfiguration) {
+ queue.sync(flags: .barrier) {
+ self.registeredModules[configuration.moduleId] = configuration
+ }
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Register a module | |
| public func register(_ configuration: ModuleConfiguration) { | |
| queue.async(flags: .barrier) { | |
| self.registeredModules[configuration.moduleId] = configuration | |
| } | |
| } | |
| /// Register a module | |
| public func register(_ configuration: ModuleConfiguration) { | |
| queue.sync(flags: .barrier) { | |
| self.registeredModules[configuration.moduleId] = configuration | |
| } | |
| } |
🤖 Prompt for AI Agents
In
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift
around lines 175 to 180, the register(_:) method currently uses
queue.async(flags: .barrier) which returns before the write completes and can
cause racy reads; change it to perform the write synchronously (use
queue.sync(flags: .barrier)) so the registration is visible to callers on
return, and guard against potential deadlock by ensuring register is never
called from the same serial/target queue (or use a re-entrant-safe mechanism)
before switching to sync.
| /// Create Sherpa-ONNX TTS if module is available | ||
| private func createSherpaONNXTTS() -> (any TextToSpeechService)? { | ||
| // Check if SherpaONNXTTS module is available | ||
| let className = "SherpaONNXTTS.SherpaONNXTTSService" | ||
|
|
||
| guard NSClassFromString(className) != nil else { | ||
| print("[RunAnywhereSDK] SherpaONNXTTS module not found. Add it to your app dependencies.") | ||
| return nil | ||
| } | ||
|
|
||
| // Module should register itself and provide factory when imported | ||
| // For now, return nil - actual instantiation will be handled by the module | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Dynamic module discovery cannot instantiate pure-Swift classes
NSClassFromString("SherpaONNXTTS.SherpaONNXTTSService") will return nil for a Swift class that doesn’t inherit from NSObject or have @objc. Even if the class is found, you return nil instead of constructing the service, so you’ll always fall back to System TTS.
Options:
- Adopt a small registry: the module calls
RunAnywhereSDK.registerTTSFactory { SherpaONNXTTSService(sdk: .shared) }at load; this extension pulls from the registry. - Or make
SherpaONNXTTSService@objc(SherpaONNXTTSService)and subclassNSObject, then reflect and instantiate, and callawait service.initialize().
Example registry sketch:
+public typealias TTSFactory = () async -> (any TextToSpeechService)
+private var ttsFactories: [String: TTSFactory] = [:]
+public static func registerTTSFactory(name: String, factory: @escaping TTSFactory) { ttsFactories[name] = factory }
+public static func resolveTTSFactory(name: String) -> TTSFactory? { ttsFactories[name] }Then:
- return createSherpaONNXTTS()
+ if let f = RunAnywhereSDK.resolveTTSFactory(name: "SherpaONNXTTS") { return await f() }
+ return nilAlso applies to: 100-116, 123-139
| // Fallback to system TTS | ||
| print("[VoiceModuleFactory] Using System TTS") | ||
| return SystemTextToSpeechService() | ||
| } | ||
|
|
||
| /// Create TTS service based on configuration | ||
| public static func createTTSService(from config: VoiceTTSConfig) async -> any TextToSpeechService { | ||
| let sdk = RunAnywhereSDK.shared | ||
|
|
||
| switch config.provider { | ||
| case .system: | ||
| return SystemTextToSpeechService() | ||
|
|
||
| case .sherpaONNX: | ||
| if sdk.isSherpaONNXTTSAvailable { | ||
| // Module is available but needs proper instantiation | ||
| // For now, fallback to system until module provides factory | ||
| print("[VoiceModuleFactory] Sherpa-ONNX module detected, awaiting factory implementation") | ||
| } | ||
| // Fallback to system TTS | ||
| return SystemTextToSpeechService() | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
❓ Verification inconclusive
Factory returns System TTS even when module is present
You detect the Sherpa module then still return SystemTextToSpeechService(). This hides the new feature.
Wire the actual instantiation (via registry or reflection as above) and initialize the service before returning it.
Implement Sherpa-ONNX TTS instantiation in createTTSService
In RunAnywhereSDK+VoiceModules.swift’s createTTSService(from:), the .sherpaONNX case always falls back to SystemTextToSpeechService()—even when sdk.isSherpaONNXTTSAvailable is true. Wire up the actual Sherpa-ONNX implementation (via the registry/reflection mechanism) and return it instead of the system fallback.
| func synthesizeStream(text: String, rate: Float, pitch: Float, volume: Float) -> AsyncThrowingStream<Data, Error> { | ||
| AsyncThrowingStream { continuation in | ||
| Task { | ||
| do { | ||
| // For streaming, we need to chunk the text | ||
| let sentences = splitIntoSentences(text) | ||
|
|
||
| for sentence in sentences { | ||
| guard let audioChunk = bridge?.synthesizeText( | ||
| sentence, | ||
| speakerId: getCurrentSpeakerId(), | ||
| speed: rate | ||
| ) else { | ||
| throw SherpaONNXError.synthesisFailure("Stream synthesis failed") | ||
| } | ||
|
|
||
| continuation.yield(applyVolume(to: audioChunk, volume: volume)) | ||
| } | ||
|
|
||
| continuation.finish() | ||
| } catch { | ||
| continuation.finish(throwing: error) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Serialize access to the native TTS handle.
The plan calls out non-thread-safety, but the stream method can concurrently call into bridge. Wrap all native calls on a serial queue or an actor.
Apply a serial queue:
final class SherpaONNXWrapper {
- private var bridge: SherpaONNXBridge?
+ private var bridge: SherpaONNXBridge?
+ private let ttsQueue = DispatchQueue(label: "com.runanywhere.sherpa.tts")
@@
- guard let audioData = bridge.synthesizeText(
+ let audioData = try await withCheckedThrowingContinuation { cont in
+ ttsQueue.async {
+ let data = bridge.synthesizeText(
text,
speakerId: speakerId,
speed: rate
- ) else {
+ )
+ guard let data else { cont.resume(throwing: SherpaONNXError.synthesisFailure("Failed to generate audio")); return }
+ cont.resume(returning: data)
+ }
+ }Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md around lines 220
to 246, the streaming method calls into the non-thread-safe native `bridge` from
potentially concurrent contexts; serialize all access to `bridge` by dispatching
synthesizeText calls onto a dedicated serial queue or by routing them through an
actor, await the serialized call result before yielding to the continuation, and
capture any thrown errors to finish the continuation with that error; ensure
continuation.yield/finish are invoked from the Task context after the serialized
bridge call completes and propagate errors from the bridge back to the caller.
| Create `build_sherpa_onnx.sh`: | ||
| ```bash | ||
| #!/bin/bash | ||
|
|
||
| # Build Sherpa-ONNX XCFrameworks for iOS | ||
|
|
||
| set -e | ||
|
|
||
| SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" | ||
| PROJECT_ROOT="$SCRIPT_DIR/../.." | ||
| EXTERNAL_DIR="$PROJECT_ROOT/EXTERNAL" | ||
| MODULE_DIR="$PROJECT_ROOT/sdk/runanywhere-swift/Modules/SherpaONNXTTS" | ||
|
|
||
| echo "🔨 Building Sherpa-ONNX XCFrameworks..." | ||
|
|
||
| # Clone if not exists | ||
| if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then | ||
| echo "📥 Cloning sherpa-onnx..." | ||
| git clone https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx" | ||
| fi | ||
|
|
||
| # Build | ||
| cd "$EXTERNAL_DIR/sherpa-onnx" | ||
| echo "🏗️ Building for iOS..." | ||
| ./build-ios.sh | ||
|
|
||
| # Copy frameworks | ||
| echo "📦 Copying XCFrameworks..." | ||
| mkdir -p "$MODULE_DIR/XCFrameworks" | ||
| cp -r build-ios/sherpa-onnx.xcframework "$MODULE_DIR/XCFrameworks/" | ||
| cp -r build-ios/onnxruntime.xcframework "$MODULE_DIR/XCFrameworks/" | ||
|
|
||
| echo "✅ Build complete!" | ||
| echo "📍 Frameworks located at: $MODULE_DIR/XCFrameworks/" | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Pin the upstream ref and harden the build script.
Use a fixed tag/commit and stricter shell options to avoid accidental upgrades and silent failures.
-#!/bin/bash
-set -e
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
@@
-if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
+REF="${SHERPA_ONNX_REF:-vX.Y.Z}" # set a default or pass in env
+if [ ! -d "$EXTERNAL_DIR/sherpa-onnx" ]; then
echo "📥 Cloning sherpa-onnx..."
- git clone https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
+ git clone --depth 1 --branch "$REF" https://github.com/k2-fsa/sherpa-onnx.git "$EXTERNAL_DIR/sherpa-onnx"
else
- echo "📥 Pulling latest changes..."
- cd "$EXTERNAL_DIR/sherpa-onnx"
- git pull
+ cd "$EXTERNAL_DIR/sherpa-onnx"
+ git fetch --tags --depth 1 origin "$REF"
+ git checkout -f "$REF"
fiCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In thoughts/shared/plans/sherpa_onnx_wrapper_implementation.md around lines 351
to 385, the new build_sherpa_onnx.sh script lacks a pinned upstream ref and is
brittle; update the script to use strict shell options (set -euo pipefail) and
pin the repository to a specific tag or commit (use git clone --branch <TAG> or
clone then git checkout <COMMIT>), verify the clone succeeded and the expected
build script (build-ios.sh) exists and is executable before running it, fail
with clear error messages if checkout/clone or build steps fail, and add checks
that the expected XCFramework output paths exist before copying (exit nonzero
with a helpful message if not).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review continued from previous batch...
| public func registerAndDownloadModel( | ||
| name: String, | ||
| url: URL, | ||
| framework: LLMFramework, | ||
| additionalFiles: [URL] = [] | ||
| ) async throws -> URL { | ||
| // Create model info with additional files | ||
| let model = sdk.addModelFromURL( | ||
| name: name, | ||
| url: url, | ||
| framework: framework, | ||
| estimatedSize: nil | ||
| ) | ||
|
|
||
| // Download main model and additional files | ||
| return try await downloadModelWithProgress(model.id) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
additionalFiles is unused. Implement or drop from API.
Current docs promise behavior that isn’t implemented.
Option A (remove now, reintroduce when ready):
- /// - additionalFiles: Additional files to download
+ /// - additionalFiles: (Removed) Additional files to download
/// - Returns: Local path to downloaded model
public func registerAndDownloadModel(
name: String,
url: URL,
- framework: LLMFramework,
- additionalFiles: [URL] = []
+ framework: LLMFramework
) async throws -> URL {
// Create model info with additional files
let model = sdk.addModelFromURL(
name: name,
url: url,
framework: framework,
estimatedSize: nil
)
- // Download main model and additional files
+ // Download main model
return try await downloadModelWithProgress(model.id)
}Option B (implement): loop and register extra assets via the SDK’s API (tell me the method names and I’ll wire it up).
🏁 Script executed:
#!/bin/bash
# Search for definition of addModelFromURL to inspect its signature
rg -nP --type swift 'func addModelFromURL' -C3
# Search for any usage of the `additionalFiles` parameter across the codebase
rg -n --type swift 'additionalFiles' -C3Length of output: 3907
Remove unused additionalFiles parameter
The additionalFiles argument is never consumed—drop it from the method signature and documentation.
- /// - additionalFiles: Additional files to download
+ /// - additionalFiles: (removed)
/// - Returns: Local path to downloaded model
public func registerAndDownloadModel(
name: String,
url: URL,
- framework: LLMFramework,
- additionalFiles: [URL] = []
+ framework: LLMFramework
) async throws -> URL {
- // Download main model and additional files
+ // Download main model
return try await downloadModelWithProgress(model.id)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public func registerAndDownloadModel( | |
| name: String, | |
| url: URL, | |
| framework: LLMFramework, | |
| additionalFiles: [URL] = [] | |
| ) async throws -> URL { | |
| // Create model info with additional files | |
| let model = sdk.addModelFromURL( | |
| name: name, | |
| url: url, | |
| framework: framework, | |
| estimatedSize: nil | |
| ) | |
| // Download main model and additional files | |
| return try await downloadModelWithProgress(model.id) | |
| } | |
| /// - additionalFiles: (removed) | |
| /// - Returns: Local path to downloaded model | |
| public func registerAndDownloadModel( | |
| name: String, | |
| url: URL, | |
| framework: LLMFramework | |
| ) async throws -> URL { | |
| // Create model info with additional files | |
| let model = sdk.addModelFromURL( | |
| name: name, | |
| url: url, | |
| framework: framework, | |
| estimatedSize: nil | |
| ) | |
| // Download main model | |
| return try await downloadModelWithProgress(model.id) | |
| } |
🤖 Prompt for AI Agents
sdk/runanywhere-swift/Sources/RunAnywhere/Public/Extensions/ModuleSupport/RunAnywhereSDK+ModuleUtilities.swift
lines 52-68: the additionalFiles parameter is unused and should be removed from
the method signature and any public documentation; update the function signature
to remove additionalFiles, remove any references to it in the implementation (no
other code changes needed inside since it was unused), update all call sites and
tests to call registerAndDownloadModel(name:url:framework:) without the extra
argument, and update API docs/comments to reflect the new signature.
Description
This pull request introduces Text-to-Speech (TTS) capabilities to the iOS SDK. It achieves this by integrating the ONNX Runtime, allowing the SDK to execute various TTS models. This integration offers the following key features:
Type of Change
Testing
Labels
Please add the appropriate label(s):
iOS SDK- Changes to iOS/Swift SDKAndroid SDK- Changes to Android/Kotlin SDKiOS Sample- Changes to iOS example appAndroid Sample- Changes to Android example appChecklist
Summary by CodeRabbit
New Features
Documentation
Chores