Open
Labels
bug: Something isn't working right.
Description
Is this a duplicate?
- I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct
Type of Bug
Performance
Component
CUB
Describe the bug
Users reported that cub::DeviceSegmentedSort::SortKeys could run up to 20x faster on SM90 for their workload. The issue stems from the fact that the max policy for segmented sort is the SM86 tuning, so SM90 ends up using it. If we apply the SM80 tuning to SM90 instead, performance is back to normal.
using MaxPolicy = Policy860;
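To make the fallback concrete, here is a minimal sketch of the selection rule, assuming the dispatch picks the newest tuning whose architecture does not exceed the target. This is a toy model, not CUB's actual dispatch code: `select_tuning` is a hypothetical helper, and the tuning list contains only the two architectures named in this issue.

```cpp
#include <cassert>
#include <initializer_list>

// Toy model of tuning selection (not CUB's actual code): return the newest
// tuned architecture that does not exceed the target, or -1 if none qualifies.
// `tuned_archs` is assumed to be sorted in ascending order.
constexpr int select_tuning(std::initializer_list<int> tuned_archs, int target_arch)
{
  int selected = -1;
  for (int arch : tuned_archs)
  {
    if (arch <= target_arch)
    {
      selected = arch;
    }
  }
  return selected;
}

// With SM86 as the newest tuning, SM90 is served by the SM86 tuning.
static_assert(select_tuning({800, 860}, 900) == 860, "SM90 falls back to SM86");
// SM80 itself still gets its own tuning.
static_assert(select_tuning({800, 860}, 800) == 800, "SM80 keeps its tuning");
```

Under this model, any architecture newer than SM86 inherits the SM86 tuning until a dedicated entry is added.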
How to Reproduce
Run the segmented sort benchmark.
Expected behavior
We should provide SM90, SM100, and SM120 tunings. As a safe workaround, we can start by copying the SM80 policy for SM90 and SM100, and the SM86 policy for SM120.
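The workaround could look roughly like the sketch below. It is a simplified, self-contained model of a chained policy, not the real CUB implementation: the `ChainedPolicy` template here is a toy, the `TuningSM80` and `TuningSM86` tags stand in for the actual tuning parameters, and `Policy900`, `Policy1000`, and `Policy1200` are hypothetical names for the proposed additions.

```cpp
#include <cassert>
#include <type_traits>

// Toy chained policy: use PolicyT when the target architecture is at least
// MinArch, otherwise defer to the previous policy in the chain.
template <int MinArch, typename PolicyT, typename PrevPolicyT>
struct ChainedPolicy
{
  template <int TargetArch>
  using ActivePolicy =
    std::conditional_t<(TargetArch >= MinArch),
                       PolicyT,
                       typename PrevPolicyT::template ActivePolicy<TargetArch>>;
};

// End of the chain: a policy that names itself as predecessor is always used.
template <int MinArch, typename PolicyT>
struct ChainedPolicy<MinArch, PolicyT, PolicyT>
{
  template <int>
  using ActivePolicy = PolicyT;
};

// Placeholder tags; real policies carry block sizes, items per thread, etc.
struct TuningSM80 { static constexpr int tuned_for = 800; };
struct TuningSM86 { static constexpr int tuned_for = 860; };

struct Policy800 : TuningSM80, ChainedPolicy<800, Policy800, Policy800> {};
struct Policy860 : TuningSM86, ChainedPolicy<860, Policy860, Policy800> {};

// Proposed workaround (hypothetical names): reuse the SM80 tuning for SM90
// and SM100, and the SM86 tuning for SM120.
struct Policy900 : TuningSM80, ChainedPolicy<900, Policy900, Policy860> {};
struct Policy1000 : TuningSM80, ChainedPolicy<1000, Policy1000, Policy900> {};
struct Policy1200 : TuningSM86, ChainedPolicy<1200, Policy1200, Policy1000> {};

using MaxPolicy = Policy1200;

static_assert(MaxPolicy::ActivePolicy<900>::tuned_for == 800, "SM90 uses SM80 tuning");
static_assert(MaxPolicy::ActivePolicy<1200>::tuned_for == 860, "SM120 uses SM86 tuning");
```

Copying existing tunings this way only restores the earlier behavior; dedicated SM90, SM100, and SM120 tunings would still need to be benchmarked separately.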
Reproduction link
No response
Operating System
No response
nvidia-smi output
No response
NVCC version
No response
Status
In Progress