[CUB] Replace several direct uses of __clz (#6099)
* Replace `__clz` in `warp_scan_shfl.cuh`
* Replace `__clz` in `block_radix_rank.cuh`
* Replace `__clz` in `warp_reduce_shfl.cuh`
* Replace `__clz` in `warp_reduce_smem.cuh`
* Replace thrust's `clz` with `cuda::std::countl_zero`
* Fully qualify with `::cuda`
* Fixup types or copy paste mistakes
* Address review comments, `countr_zero` instead of `countl_zero(brev())`
* Use `__bit_log2` for warp ballot index
* Use `__bit_log2` for block leader in ComputeRanksItem
* Ensure that we `static_cast` to `int` in `__clz` in case we deal with ARM
* Rename variable to not conflict with builtin
* Use `__bit_log2`
* Fix incorrect transformation
* Drop internal `clz` function in favor of `countl_zero`
* Drop unneeded include
* Fix return type of `__ballot_sync` to unsigned
* Fix typo
* Be super safe about unsigned integers
* Fix argument type in radix_rank
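
For readers checking that these replacements preserve behavior, here is a minimal host-side sketch of the two identities the list relies on. The helpers `clz32`, `brev32`, `ctz32`, and `log2_floor32` are hypothetical stand-ins written for illustration; they are not the CUB or libcu++ implementations. The sketch assumes that `__bit_log2` returns the index of the highest set bit, as its name suggests.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the device intrinsic __clz:
// counts leading zero bits of a 32-bit value, returning 32 for x == 0.
constexpr int clz32(std::uint32_t x)
{
  int n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
  {
    ++n;
  }
  return n;
}

// Hypothetical stand-in for the device intrinsic __brev: reverses the bit order.
constexpr std::uint32_t brev32(std::uint32_t x)
{
  std::uint32_t r = 0;
  for (int i = 0; i < 32; ++i)
  {
    r = (r << 1) | (x & 1u);
    x >>= 1;
  }
  return r;
}

// Counts trailing zero bits, i.e. the quantity countr_zero computes (32 for x == 0).
constexpr int ctz32(std::uint32_t x)
{
  int n = 0;
  for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1)
  {
    ++n;
  }
  return n;
}

// Index of the highest set bit for x != 0 (assumed meaning of __bit_log2).
constexpr int log2_floor32(std::uint32_t x)
{
  return 31 - clz32(x);
}

// Identity 1: leading zeros of the reversed word equal trailing zeros of the word.
static_assert(clz32(brev32(0x8u)) == ctz32(0x8u), "clz(brev(x)) == ctz(x)");
// Identity 2: the highest set bit index is 31 minus the leading zero count.
static_assert(log2_floor32(0x00010000u) == 16, "31 - clz(x) == floor(log2(x))");
```

Both identities hold for every 32-bit input, including zero for the first one, which is why a trailing-zero count can replace a leading-zero count of the reversed word without changing results. Returning `int` from the helpers mirrors the signedness concern raised in the list: mixing a signed count with unsigned ballot masks is where conversion mistakes tend to appear.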
---------
Co-authored-by: Michael Schellenberger Costa <[email protected]>