feat: implement GPU-backed FAISS support and dynamic tokenization scaling #209
Open: wgergely wants to merge 14 commits into yichuan-w:main from wgergely:feature/gpu-backend-enhancements
…ding computation
- Force use_server=False to prevent ZMQ connection issues
- Add explicit logger for better debugging
- Improve code structure and comments

Implements a standalone embedding server for the FAISS backend to prevent ZMQ deadlocks that occur when mixing direct embedding computation (build) and server-based computation (search).
- Adds faiss_embedding_server.py: Specialized server reusing leann-core logic.
- Updates __init__.py: Exports and registers the new server module.

Adds:
- gitignore-parser: For robust .gitignore handling in the CLI.
- einops: Required for nomic-embed-text-v1.5 custom implementation.

- api.py: Explicitly separate server-mode (search) vs direct-mode (build) to ensure stability.
- embedding_compute.py: Add parallel tokenization, adaptive batch sizing, and support for nomic-embed-text-v1.5.
- tests: Add token truncation tests.

- Add gitignore-parser integration for correct file exclusion.
- Add suppress_cpp_output context manager to silence noisy FAISS/HNSW backend logs.
- Add code-optimized SentenceSplitter configuration.

- metadata_filter.py: Implements comprehensive filtering (comparison, membership, string, boolean) for search results.
- tests: Add test suite for metadata filtering logic.
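For readers unfamiliar with this style of filtering, the four operator families named in the metadata_filter.py commit (comparison, membership, string, boolean) can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the operator names (`in`, `not_in`, `contains`, `starts_with`, `ends_with`, `is_true`, `is_false`), the filter dictionary shape, and the functions `matches` and `filter_results` are all assumptions made for the example.

```python
import operator
from typing import Any, Callable, Dict, List

# Hypothetical operator table; the names are illustrative, not taken from the PR.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    # comparison
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    # membership
    "in": lambda value, options: value in options,
    "not_in": lambda value, options: value not in options,
    # string
    "contains": lambda value, needle: needle in value,
    "starts_with": lambda value, prefix: value.startswith(prefix),
    "ends_with": lambda value, suffix: value.endswith(suffix),
    # boolean (the expected value is ignored)
    "is_true": lambda value, _: value is True,
    "is_false": lambda value, _: value is False,
}


def matches(metadata: Dict[str, Any], filters: Dict[str, Dict[str, Any]]) -> bool:
    """Return True only if every condition on every field holds."""
    for field, conditions in filters.items():
        if field not in metadata:
            return False
        value = metadata[field]
        for op_name, expected in conditions.items():
            op = OPERATORS.get(op_name)
            if op is None:
                raise ValueError(f"unknown filter operator: {op_name}")
            try:
                if not op(value, expected):
                    return False
            except (TypeError, AttributeError):
                # Incomparable types (e.g. str < int) count as a non-match.
                return False
    return True


def filter_results(results: List[Dict[str, Any]],
                   filters: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the results whose 'metadata' dict satisfies the filters."""
    return [r for r in results if matches(r.get("metadata", {}), filters)]
```

A call such as `filter_results(results, {"lines": {">=": 100}, "lang": {"in": ["python", "go"]}})` would then keep only results whose metadata satisfies both conditions. Treating type errors as a non-match rather than raising is one possible design choice; a stricter implementation might surface them instead.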
Comment from the author: Updated the PR with the latest stabilization fixes, metadata enrichment, and MCP protocol v2025 updates.
PR Update Draft - Leann GPU Backend & Metadata Enrichment
This update builds upon the initial FAISS GPU support by adding robust metadata extraction and stabilizing the environment for production use.
What's New?
1. Metadata-Rich Indexing (Context Headers)
We’ve added a `CodeAnalyzer` that uses `tree-sitter` to extract global context from files. Every code chunk now includes a "Context Header" prepended to its text.
2. FAISS Stability (ZMQ Fixes)
To prevent ZMQ deadlocks observed in high-concurrency scenarios, we've implemented an in-process embedding strategy for the FAISS backend. Search operations now compute query embeddings within the same process by default.
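The in-process default can be pictured as a small dispatch step in front of the embedding call. The sketch below is an assumption-laden illustration, not the PR's code: only the `use_server` flag name comes from the commit messages, while `embed_query`, `local_embedder`, `server_embedder`, and `toy_embedder` are hypothetical names, and the toy embedder stands in for a real model.

```python
from typing import Callable, List, Optional, Sequence

# An embedder maps a batch of texts to one vector per text.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def embed_query(query: str,
                local_embedder: Embedder,
                server_embedder: Optional[Embedder] = None,
                use_server: bool = False) -> List[float]:
    """Embed a single query, in-process unless a server is explicitly requested."""
    if use_server:
        if server_embedder is None:
            raise ValueError("use_server=True requires a server_embedder")
        # Hypothetical path: this is where a ZMQ round trip would happen.
        return server_embedder([query])[0]
    # Default path: same process, no socket involved.
    return local_embedder([query])[0]


def toy_embedder(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in for a real model: [character count, space count] per text."""
    return [[float(len(t)), float(t.count(" "))] for t in texts]
```

With this shape, `embed_query("hello world", toy_embedder)` never touches a socket, and the server path has to be opted into explicitly.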
3. MCP Protocol v2025 Upgrade
Standardized the codebase to support the latest MCP protocol version (`2025-11-25`).
4. Better Environment Control
Standardized `LEANN_HOME` and `LEANN_DOCS` handling across CLI and Server modules. The system now strictly respects these environment variables if provided.
- … `trust_remote_code=True` to support `nomic-embed-text-v1.5` out of the box.
- … `tree-sitter` (0.23+) and `gitignore-parser` to core requirements.
- … `ProcessPoolExecutor` for true CPU parallelism.
Verification
Full test suite passed, including new integration tests for the FAISS ZMQ server and metadata analyzer.
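As an illustration of the environment handling described under "Better Environment Control", a resolver that strictly prefers `LEANN_HOME` and `LEANN_DOCS` when they are set might look like the sketch below. The fallback locations (`~/.leann` and a `docs` subdirectory) and the function names `resolve_dir`, `leann_home`, and `leann_docs` are assumptions for the example; the PR does not state its defaults here.

```python
import os
from pathlib import Path


def resolve_dir(env_var: str, default: Path) -> Path:
    """Return the directory named by env_var if set and non-empty, else default."""
    raw = os.environ.get(env_var, "").strip()
    if raw:
        # An explicitly provided value always wins over the default.
        return Path(raw).expanduser().resolve()
    return default.expanduser().resolve()


def leann_home() -> Path:
    # Assumed fallback; the real default may differ.
    return resolve_dir("LEANN_HOME", Path.home() / ".leann")


def leann_docs() -> Path:
    # Assumed fallback: a "docs" directory under the resolved home.
    return resolve_dir("LEANN_DOCS", leann_home() / "docs")
```

Routing both the CLI and the server through one helper of this kind is a simple way to keep the two from disagreeing about where data lives.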