Skip to content

Conversation

@Aminsed
Copy link
Contributor

@Aminsed Aminsed commented Oct 13, 2025

Use uint32_t for bitmasks in block_radix_rank.cuh.

  • Bitmasks should be unsigned types
  • Required for transition to type-checked cuda::std functions

Fixes #6106

@Aminsed Aminsed requested a review from a team as a code owner October 13, 2025 00:03
@Aminsed Aminsed requested a review from NaderAlAwar October 13, 2025 00:03
@github-project-automation github-project-automation bot moved this to Todo in CCCL Oct 13, 2025
@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Oct 13, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Oct 13, 2025
Copy link
Contributor

@miscco miscco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, We can now drop some of the internal casts

@github-project-automation github-project-automation bot moved this from In Review to In Progress in CCCL Oct 13, 2025
@miscco
Copy link
Contributor

miscco commented Oct 13, 2025

/ok to test 4887eb0

@github-actions

This comment has been minimized.

@bernhardmgruber
Copy link
Contributor

We should probably review the generated SASS once this PR is ready.

@github-project-automation github-project-automation bot moved this from In Progress to In Review in CCCL Oct 13, 2025
@miscco
Copy link
Contributor

miscco commented Oct 13, 2025

/ok to test fb741b1

@github-actions
Copy link
Contributor

🥳 CI Workflow Results

🟩 Finished in 3h 24m: Pass: 100%/75 | Total: 2d 18h | Max: 3h 22m | Hits: 85%/73707

See results here.

@Aminsed Aminsed force-pushed the bugfix/mask_uint32_types branch from fefe4a3 to ec8926d Compare October 26, 2025 19:07
@Aminsed Aminsed force-pushed the bugfix/mask_uint32_types branch from 0071054 to 9c228a8 Compare October 26, 2025 19:32
@Aminsed Aminsed requested a review from fbusato October 26, 2025 19:34
Comment on lines +722 to +730
using ::cuda::std::uint32_t;
uint32_t warp_id = linear_tid >> LOG_WARP_THREADS;
uint32_t lane_mask_lt = ::cuda::ptx::get_sreg_lanemask_lt();

_CCCL_PRAGMA_UNROLL_FULL()
for (int ITEM = 0; ITEM < KEYS_PER_THREAD; ++ITEM)
{
// My digit
::cuda::std::uint32_t digit = digit_extractor.Digit(keys[ITEM]);
uint32_t digit = digit_extractor.Digit(keys[ITEM]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only one occurrence, please use the fully qualified one there

::cuda::std::uint32_t bin_mask = *p_match_mask;
int leader = ::cuda::std::__bit_log2(bin_mask);
int warp_offset = 0;
int popc = __popc(bin_mask & ::cuda::ptx::get_sreg_lanemask_le());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use ::cuda::std::popcount

@github-project-automation github-project-automation bot moved this from In Review to In Progress in CCCL Oct 27, 2025
int lane_mask = 1 << lane;
int* warp_offsets = &s.warp_offsets[warp][0];
int* match_masks = &s.match_masks[warp][0];
::cuda::std::uint32_t lane_mask = 1 << lane;
Copy link
Contributor

@fbusato fbusato Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better alternative is

Suggested change
::cuda::std::uint32_t lane_mask = 1 << lane;
auto lane_mask = 1u << lane;

signed shift is UB before C++20

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

[BUG]: Wrong mask type in cub/cub/block/block_radix_rank.cuh

4 participants