perf: lazy-import torch and tiktoken in embedding_compute / chat #323
Open
raoabinav wants to merge 1 commit into
embedding_compute.py:14-16 and chat.py:13 import torch / tiktoken at module top, which means `import leann` pulls ~1 GB of torch state even for callers that only do MCP search over a prebuilt index, BM25-only queries, or other paths that never touch the embedding pipeline.

Moved torch into the two functions that actually use it (compute_embeddings_sentence_transformers, HFLLM.ask). The lazy imports in HFLLM.__init__ and compute_embeddings_ollama were already function-local, so they're unchanged. Moved tiktoken into truncate_to_token_limit.

`import leann` drops from ~6700ms to ~128ms locally; torch and tiktoken stay out of sys.modules until first real use.

I'm assuming the eager imports were just convenience and not load-bearing in any way I'm missing (e.g. catching ImportError up-front for a clearer error message). Happy to revisit if there's a reason they need to be loaded early.

I didn't find an existing issue for this; happy to open one if you'd prefer that path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
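For reviewers who want the pattern in isolation, here is a minimal sketch of a function-local import. It is not the code from this PR: `to_hsv_lazy`, `require`, and `is_loaded` are hypothetical names, and the stdlib module `colorsys` stands in for a heavy dependency such as torch or tiktoken so the snippet runs anywhere. The `require` helper shows one possible way to keep a clear error message when the dependency is missing, which is the main thing an eager import would otherwise provide.

```python
import importlib
import sys


def is_loaded(module_name: str) -> bool:
    """Return True if the module has already been imported in this process."""
    return module_name in sys.modules


def require(module_name: str, feature: str):
    """Import a module on demand, raising a descriptive error if it is missing.

    Hypothetical helper: preserves a readable error message even though the
    import no longer happens at module top.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for {feature} but is not installed"
        ) from exc


def to_hsv_lazy(rgb):
    """Convert an (r, g, b) tuple to HSV, importing the dependency on first use."""
    # `colorsys` is a stdlib stand-in for a heavy dependency such as torch.
    # The import cost is paid on the first call only; later calls reuse the
    # cached entry in sys.modules.
    colorsys = require("colorsys", "HSV conversion")
    return colorsys.rgb_to_hsv(*rgb)


if __name__ == "__main__":
    print("loaded before call:", is_loaded("colorsys"))
    print("result:", to_hsv_lazy((1.0, 0.0, 0.0)))
    print("loaded after call:", is_loaded("colorsys"))
```

The same `sys.modules` check can be pointed at `"torch"` or `"tiktoken"` after `import leann` to confirm they stay unloaded, and CPython's `python -X importtime -c "import leann"` gives a per-module breakdown of where import time goes.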
`embedding_compute.py` and `chat.py` import torch / tiktoken at module top, so `import leann` pulls torch up front even for callers that just do MCP search or BM25 lookups. Moved both into the functions that actually use them so they're lazy-loaded.