Contributor

@normen662 normen662 commented Oct 22, 2025

This PR implements the HNSW paper using the recently introduced linear package together with RaBitQ.

@normen662 normen662 force-pushed the hnsw-on-linear branch 4 times, most recently from 4cfef2a to 3a04055 Compare October 24, 2025 16:49
@normen662 normen662 requested a review from alecgrieser October 28, 2025 19:17
@normen662 normen662 added the enhancement New feature or request label Oct 28, 2025
@normen662 normen662 force-pushed the hnsw-on-linear branch 3 times, most recently from 8fcb3a6 to fc7994f Compare October 29, 2025 13:15
Collaborator

@alecgrieser alecgrieser left a comment

This has obviously taken a bit of time, but this is part one of the review. It covers:

  1. The HNSW class and core algorithm
  2. The Node and NodeKind classes

Still to be reviewed:

  1. The StorageAdapter and implementations
  2. Change sets
  3. Any of the changes to the linear and RaBitQ packages
  4. All tests

As is hopefully clear from the review, much of it consists of requests for clarification. Some of these should probably turn into comments.

I also think that it would be good to take another look at the Teamscale findings. Most of those are pretty minor, but it would be good to conform to them a bit more. I'm less concerned about things like method length, nesting, or number of parameters (especially for private methods), but they are still worth a second look.

Overall, I think the approach makes sense, though. Nice!

@normen662 normen662 force-pushed the hnsw-on-linear branch 2 times, most recently from 267e633 to 1963a0a Compare October 30, 2025 15:41
Collaborator

@alecgrieser alecgrieser left a comment

Okay, this adds more to the review, focusing in particular on the storage serialization/deserialization. I still have to cover:

  1. The changes to the other packages
  2. Tests
  3. Looking at updates since the last review

@normen662 normen662 force-pushed the hnsw-on-linear branch 6 times, most recently from 0a243ee to bedfd30 Compare November 3, 2025 13:24
@ScottDugas
Collaborator

@normen662 @alecgrieser @MMcM
Teamscale is currently not reporting back to GitHub:
https://fdb.teamscale.io/activity/merge-requests/foundationdb-fdb-record-layer/FoundationDB%2Ffdb-record-layer%2F3691

Looking at the coverage report from the actions, I think the test gaps in Teamscale are incorrect, but I would trust the findings, at least mostly.
You can see the summaries for changed files, which are pretty helpful for the new ones: https://github.com/FoundationDB/fdb-record-layer/actions/runs/19036192615

Collaborator

@alecgrieser alecgrieser left a comment

Overall, I think I like what's being done with Transform. I did leave one comment about a usage pattern that is a bit surprising, though understandable. I looked at the tests, and they seem like a good set of basic high-level tests. I'm not sure off the top of my head what improvements I'd like to see, but it does seem like we should stress it a bit more. It may also be fine to take the current version as is and devise more interesting testing strategies afterwards.

@normen662 normen662 force-pushed the hnsw-on-linear branch 3 times, most recently from d6bc44c to e83340d Compare November 4, 2025 19:26
Collaborator

@MMcM MMcM left a comment

A few more dates.

Collaborator

@alecgrieser alecgrieser left a comment

Mostly LGTM. The only serious item is the question raised about the StorageTransform change that went in.

Collaborator

@alecgrieser alecgrieser left a comment

This LGTM. There are failing tests, which I think all come from using hasToString instead of doesNotHaveToString in a few places in the latest update. If that's all that's wrong, and correcting it results in a passing PRB, I think this is good to merge.

@normen662 normen662 merged commit 69c8839 into FoundationDB:main Nov 5, 2025
8 checks passed

4 participants