Initial implementation of SVS Runtime package #208
base: main
    find_package(svs REQUIRED)
    target_link_libraries(${TARGET_NAME} PRIVATE
        svs::svs
        svs::svs_compile_options
I notice when svs::svs_compile_options is not included, performance decreases. I tried it because we do not link it in https://github.com/RedisAI/VectorSimilarity/blob/main/src/VecSim/CMakeLists.txt#L47-L52 - should we?
I will try it and check VecSim performance.
    elseif(TARGET svs_static_library)
        # Links to SVS static library built as part of the main SVS build
        target_link_libraries(${TARGET_NAME} PRIVATE
            svs_devel
What is svs_devel? Is there a reason linking here would be different than lines 131-135?
ahuber21 left a comment
Some minor comments. Let's discuss how to tackle todos.
     * limitations under the License.
     */

    #include <svs/runtime/dynamic_vamana_index.h>
I suggest sorting all headers in this order:
- "Local" headers (includes with quotes, same directory)
- SVS runtime headers
- SVS core headers
- deps headers
- std library, other core headers
Also, in svs/include we use mainly quote includes. IMO we should stay consistent here. Or are there reasons against?
    for (; found < k; ++found) {
        curr_distances[found] = -1;
        curr_labels[found] = -1;
Is padding with -1 specific to Faiss? Would it make sense to accept a parameter to pick a different value for padding? Maybe even skip this entirely?
Good question.
The search() method should somehow 'return' the actual number of results - for example, if k > index.size().
Possible solutions:
1. Pad result buffers with special values: label = max_size_t, distance = infinity
2. One more argument: a pointer to a vector of size_t where to store the numbers
3. Use ResultsAllocator
PS. I would vote for option 3: it will make the search signatures consistent.
@ahuber21, can you please respond?
Thank you.
Sorry for not replying. Good point about consistency. In its current form, is ResultsAllocator already capable of initializing to arbitrary values?
In other words, can we have a special allocator for Faiss that initializes with inf distance and -1 index?
Would the initialization happen on allocation, or would it only backfill the range [found, k-1]? Given that search is typically an expensive operation, I wouldn't expect a big impact from initializing everything first.
    MetricType metric,
    StorageKind storage_kind,
    const VamanaIndex::BuildParams& params,
    const VamanaIndex::SearchParams& default_search_params = {10, 10, 0, 0}
I think we should not set prefetching defaults in the runtime and instead let libsvs set the defaults (currently using estimate_prefetch_parameters).
- According to the request, VamanaIndex::SearchParams contains prefetch parameters.
- Prefetching values =0 are interpreted as "use default" in make_search_parameters().
@mihaic, can you please explain in detail which behavior is expected here:
1. Should we let the user disable prefetching?
2. Should we let MutableVamanaIndex select all search parameters, including the window size, itself?
3. BuildParameters: which values should be the defaults?
Thank you.
Behavior 2 is expected here and, as discussed, for any other parameters that are not required by libsvs. I am restating the behavior to hopefully clarify: Any optional libsvs parameter should have a default value in libsvs_runtime that means "let libsvs choose". libsvs_runtime should still allow all possible values for libsvs parameters (e.g., disabling prefetching).
Edit: I realized I misinterpreted the numbering in your question. Hopefully my answer still covers everything.
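One possible way to express "let libsvs choose" while still allowing every explicit value, including 0 to disable prefetching, is to keep "unset" separate from the value range. The sketch below uses `std::optional` for that. All type and function names are hypothetical, and `library_defaults` merely stands in for whatever libsvs would pick on its own; whether `std::optional` is acceptable in the runtime's public interface is an open assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical types for illustration only; these are not the SVS runtime
// structs. An empty optional means "let the library choose".
struct SearchParamsSketch {
    std::optional<std::size_t> prefetch_lookahead;
    std::optional<std::size_t> prefetch_step;
};

struct ResolvedPrefetch {
    std::size_t lookahead;
    std::size_t step;
};

// Takes the caller's explicit value where one was given and falls back to
// the library-chosen default otherwise. An explicit 0 is preserved, so
// prefetching can still be disabled.
inline ResolvedPrefetch resolve_prefetch(
    const SearchParamsSketch& requested, const ResolvedPrefetch& library_defaults
) {
    return ResolvedPrefetch{
        requested.prefetch_lookahead.value_or(library_defaults.lookahead),
        requested.prefetch_step.value_or(library_defaults.step)
    };
}
```

With the current "0 means use default" convention, a caller cannot request a lookahead of 0 explicitly. A separate "unset" state avoids that ambiguity, at the cost of a slightly heavier parameter struct.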
    };

    struct SearchResultsStorage {
        std::span<size_t> labels;
This is no longer aligned with https://github.com/ahuber21/faiss/blob/svs-io/faiss/svs/IndexSVSFaissUtils.h#L109, which leads to the build failure in CI. I guess we should revise that line in FAISS? Or revert this back to int64_t?
5cb9050 to f2680de
    )

    # Build tests if requested
    if(SVS_BUILD_RUNTIME_TESTS)
@ahuber21, is it possible to reuse SVS_BUILD_TESTS here?
Good point. I'll try to consolidate.
It's not really possible in the current setup because the runtime lib is built separately from our main repo.
0f09f59 to 6d74a20
No description provided.