add utility for memory-efficient maximum pairwise distance computation with GPU support #838
base: branch-25.04
Conversation
Compute the maximal pairwise distance without storing all distances. Provides a fallback to a CPU implementation if the optional cuVS/pylibraft dependency is not available.
Thanks @grlee77 for this update! I left some minor comments on this.
if _distance_on_cpu:
    warnings.warn(
        "cuVS >= 25.02 or pylibraft < 24.12 must be installed to use "
        "GPU-accelerated pairwaise distance computations. Falling back "
"GPU-accelerated pairwaise distance computations. Falling back " | |
"GPU-accelerated pairwise distance computations. Falling back " |
    Internally, calls to cdist will be made with subsets of coords where
    the subset size is (coords_per_block, ndim).
compute_argmax : bool, optional
    If True, the value of the coordate indices corresponding to the maxima
If True, the value of the coordate indices corresponding to the maxima
If True, the value of the coordinate indices corresponding to the maxima
    requirement. The memory used at runtime will be proportional to
    ``coords_per_block**2``.

    A block size of >= 2000 is recommended to overhead poor GPU resource usage
A block size of >= 2000 is recommended to overhead poor GPU resource usage
A block size of >= 2000 is recommended to avoid poor GPU resource usage
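As a rough illustration of why that block size is comfortable (assuming the per-block distance matrix is stored as float32; the numbers are an estimate, not taken from the PR):

```python
coords_per_block = 2000
bytes_per_value = 4  # float32
block_matrix_bytes = coords_per_block**2 * bytes_per_value
print(f"{block_matrix_bytes / 1e6:.0f} MB")  # ~16 MB per (2000, 2000) distance block
```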
    )
    current_output = temp
else:
    # omit out= for the last block as size may be
This comment sentence doesn't seem to be complete.
coords : np.ndarray (num_points, ndim)
    The coordinates to process.
The code converts input coordinates to float32 by default, which could lead to precision loss. I think it would be good to note that this `coords` would be converted to the `float32` data type.
num_coords, _ = coords.shape
Input validation might be needed here.
num_coords, _ = coords.shape
if not isinstance(coords, (np.ndarray, cp.ndarray)):
    raise TypeError("coords must be a numpy or cupy array")
if coords.ndim != 2:
    raise ValueError(
        f"coords must be a 2-dimensional array, got shape {coords.shape}"
    )
num_coords, _ = coords.shape
"to SciPy-based CPU implementation." | ||
) | ||
xp = np | ||
coords = cp.asnumpy(coords) |
We don't need to use numpy here?
coords = cp.asnumpy(coords)
coords = np.asnumpy(coords)
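For reference, `cupy.asnumpy` is CuPy's documented helper for copying a device array to host memory (NumPy itself does not provide an `asnumpy` function). A minimal sketch of a fallback conversion that tolerates either array type, assuming CuPy may be absent; this is illustrative only, not the PR's code:

```python
import numpy as np

try:
    import cupy as cp
except ImportError:  # CuPy is an optional dependency
    cp = None


def _to_host(coords):
    """Return ``coords`` as a NumPy array, copying from the GPU if needed."""
    if cp is not None and isinstance(coords, cp.ndarray):
        # cupy.asnumpy copies the device array to host memory
        return cp.asnumpy(coords)
    return np.asarray(coords)
```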
Parameters
----------
coords : np.ndarray (num_points, ndim)
coords : np.ndarray (num_points, ndim)
coords : numpy.ndarray or cupy.ndarray of shape (num_points, ndim)
The maximum Feret diameter computation computes pairwise distances between all points, which can lead to out-of-memory errors if the number of points on the object boundary is large (the memory used is quadratic in the number of points).
This MR implements a block-wise version that retains efficiency but has much lower memory requirements. It also handles checking for optional GPU acceleration via CuPy's optional cuVS/pylibraft dependencies, falling back to the CPU if those are not available.
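A minimal sketch of the block-wise idea described above, using SciPy's `cdist` for the per-block distances; the function name, signature, and defaults here are illustrative and not taken from this PR:

```python
import numpy as np
from scipy.spatial.distance import cdist


def max_pairwise_distance_blockwise(coords, coords_per_block=2000):
    """Maximum pairwise distance without forming the full distance matrix.

    Peak memory is proportional to ``coords_per_block**2`` rather than
    ``num_points**2``.
    """
    coords = np.asarray(coords, dtype=np.float32)
    num_coords = coords.shape[0]
    max_dist = 0.0
    for i in range(0, num_coords, coords_per_block):
        block_i = coords[i:i + coords_per_block]
        # distances are symmetric, so only block pairs with j >= i are needed
        for j in range(i, num_coords, coords_per_block):
            block_j = coords[j:j + coords_per_block]
            max_dist = max(max_dist, cdist(block_i, block_j).max())
    return float(max_dist)
```

The actual utility in this PR additionally supports GPU-accelerated distances via the optional cuVS/pylibraft dependencies and a `compute_argmax` option for returning the indices of the maximizing pair, as described in the docstring excerpts above.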