Skip to content

Conversation

viclafargue
Copy link
Contributor

Closes #5717

@viclafargue viclafargue requested a review from a team as a code owner October 2, 2025 15:12
@viclafargue viclafargue requested a review from csadorf October 2, 2025 15:12
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Oct 2, 2025
Copy link
Contributor

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the expectation that the MRE provided in #5717 just fails with ValueError now?

@viclafargue
Copy link
Contributor Author

Is the expectation that the MRE provided in #5717 just fails with ValueError now?

I just updated the PR so that if the number of neighbors provided in the precomputed KNN graph is higher than the number of requested neighbors, the graph is trimmed instead of raising an exception.

@csadorf
Copy link
Contributor

csadorf commented Oct 6, 2025

Is the expectation that the MRE provided in #5717 just fails with ValueError now?

I just updated the PR so that if the number of neighbors provided in the precomputed KNN graph is higher than the number of requested neighbors, the graph is trimmed instead of raising an exception.

How does umap-learn handle this case?

@viclafargue
Copy link
Contributor Author

How does umap-learn handle this case?

In the case of more neighbors than requested it trims the pre-computed KNN graph. In case there are less than required it displays a discrete warning, drops the pre-computed graph and proceed with the computation of a new KNN graph from scratch. But, I feel like raising an exception here is a better choice to alert the user in case the KNN graph has an insufficient number of nearest neighbors instead of almost silently computing a new graph. What do you think?

@viclafargue viclafargue added bug Something isn't working non-breaking Non-breaking change labels Oct 6, 2025
@csadorf
Copy link
Contributor

csadorf commented Oct 6, 2025

How does umap-learn handle this case?

In the case of more neighbors than requested it trims the pre-computed KNN graph. In case there are less than required it displays a discrete warning, drops the pre-computed graph and proceed with the computation of a new KNN graph from scratch. But, I feel like raising an exception here is a better choice to alert the user in case the KNN graph has an insufficient number of nearest neighbors instead of almost silently computing a new graph. What do you think?

Yes, absolutely. I agree that matching the behavior w.r.t. to pruning makes sense. We should however fail with an exception instead of computing the KNN graph on an insufficient number of neighbors. We can implement a 100% match of the behavior in cuml.accel only.

@viclafargue viclafargue requested a review from csadorf October 14, 2025 16:44
Copy link
Contributor

@csadorf csadorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment, but overall LGTM! Thanks a lot!

@viclafargue
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 6e7849e into rapidsai:branch-25.12 Oct 17, 2025
101 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Cython / Python Cython or Python issue non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Using precomputed KNN with more neighbors than n_neighbors drops trustworthiness score for UMAP

2 participants