Skip to content

Conversation

@zacharyvincze
Copy link
Contributor

Details

  • Removes branching where possible to the casting helper functions seen in casting.hpp. Aims to reduce divergence on GPU kernel implementations.
  • Includes fixes to some float -> integer saturation casts, especially for 32/64-bit integer cases that are not represented exactly as 32-bit floats.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the core casting helpers to reduce branching (especially for GPU code paths) and adjusts saturation behavior for some float→integer conversions, alongside adding a small test and extending supported type traits.

Changes:

  • Refactors ScalarSaturateCast / ScalarRangeCast logic in casting.hpp to use more branchless/min-max based clamping and special-case small integer widths.
  • Extends type traits support to include long/ulong vectorized types.
  • Adds a new C++ test covering basic SaturateCast behavior and a few limit/vector cases.
  • Adjusts the GPU block dimensions for the Composite operator kernel launch.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
include/core/detail/casting.hpp Refactors saturate/range cast implementations to reduce branching and adjust clamping/rounding logic.
include/core/detail/type_traits.hpp Adds long / ulong to the type-traits macro set.
tests/roccv/cpp/src/tests/core/detail/test_saturate_cast.cpp Introduces a basic unit test for SaturateCast, including a couple of vectorized casts.
src/op_composite.cpp Changes GPU kernel launch block dimensions for the composite operator.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +41 to +54
if constexpr (sizeof(T) <= 2) {
// 8/16 bit integer cases. These can be represented exactly in floating point.
#ifdef __HIP_DEVICE_COMPILE__
return static_cast<T>(rintf(fminf(fmaxf(v, minVal), maxVal)));
#else
return static_cast<T>(std::round(std::clamp(v, minVal, maxVal)));
#endif
} else {
// 32/64 bit integer cases.
#ifdef __HIP_DEVICE_COMPILE__
U rounded = rintf(v);
#else
U rounded = std::round(v);
#endif
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The device-side float->integer paths use float-specific intrinsics (rintf, fminf/fmaxf, __saturatef, __float2int_rn) even though U is any floating-point type. If U is double, this will downcast to float and can change rounding/saturation behavior. Consider either constraining these branches to U == float (static_assert / if constexpr) or adding double-correct implementations (rint, fmin/fmax, __double2int_rn, etc.).

Copilot uses AI. Check for mistakes.
@codecov-commenter
Copy link

codecov-commenter commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 54.54545% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
include/core/detail/casting.hpp 54.55% 12 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #117      +/-   ##
===========================================
- Coverage    73.88%   73.86%   -0.02%     
===========================================
  Files           74       74              
  Lines         2864     2877      +13     
  Branches       615      610       -5     
===========================================
+ Hits          2116     2125       +9     
- Misses         330      331       +1     
- Partials       418      421       +3     
Files with missing lines Coverage Δ
include/core/detail/type_traits.hpp 87.50% <ø> (ø)
include/core/detail/casting.hpp 78.26% <54.55%> (+2.31%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci:precheckin enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants