[Optimization]: Reduce branching when possible in casting.hpp #117

zacharyvincze · 2026-02-06T20:13:00Z

Details

Removes branching where possible from the casting helper functions in casting.hpp, with the aim of reducing divergence in GPU kernel implementations.
Includes fixes to some float -> integer saturation casts, especially for 32/64-bit integer targets whose limits cannot be represented exactly as 32-bit floats.
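To illustrate the representability issue mentioned above, here is a minimal host-side sketch. It is not the code from this PR: `SaturateFloatToInt32` is a hypothetical name, the NaN-to-zero mapping is an assumption, and the sketch keeps explicit branches for readability. The point is only that `INT32_MAX` rounds up to 2^31 when converted to `float`, so clamping against it in floating point and then casting can overflow.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not the PR's implementation): saturating float -> int32_t.
inline std::int32_t SaturateFloatToInt32(float v) {
    // INT32_MAX (2^31 - 1) has no exact float representation; it rounds up to 2^31.
    // Comparing against 2^31 directly avoids casting an out-of-range value.
    constexpr float kUpper = 2147483648.0f;   // 2^31, exactly representable
    constexpr float kLower = -2147483648.0f;  // INT32_MIN, exactly representable

    if (std::isnan(v)) return 0;  // assumption: NaN maps to 0
    if (v >= kUpper) return std::numeric_limits<std::int32_t>::max();
    if (v <= kLower) return std::numeric_limits<std::int32_t>::min();

    // Every float strictly between the bounds rounds to a value that fits in int32_t.
    return static_cast<std::int32_t>(std::nearbyint(v));
}
```

A value such as `3e9f` saturates to `INT32_MAX` here instead of being cast out of range.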

Copilot

Pull request overview

This PR updates the core casting helpers to reduce branching (especially for GPU code paths) and adjusts saturation behavior for some float→integer conversions, alongside adding a small test and extending supported type traits.

Changes:

Refactors ScalarSaturateCast / ScalarRangeCast logic in casting.hpp to use more branchless/min-max based clamping and special-case small integer widths.
Extends type traits support to include long/ulong vectorized types.
Adds a new C++ test covering basic SaturateCast behavior and a few limit/vector cases.
Adjusts the GPU block dimensions for the Composite operator kernel launch.
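The min-max clamping for small integer widths described in the list above can be sketched on the host as follows. This mirrors the shape of the host path in the reviewed diff (`std::round` over `std::clamp`), but `SaturateSmall` is a hypothetical name and NaN handling is deliberately left out, so treat it as an illustration under those assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: saturating float -> 8/16-bit integer without range branches.
// The limits of these types are exactly representable as float, so clamping in
// floating point and then casting stays in range. NaN inputs are not handled.
template <typename T>
T SaturateSmall(float v) {
    static_assert(std::is_integral_v<T> && sizeof(T) <= 2,
                  "only 8/16-bit integer targets are covered by this sketch");
    constexpr float lo = static_cast<float>(std::numeric_limits<T>::min());
    constexpr float hi = static_cast<float>(std::numeric_limits<T>::max());
    // Clamp first, then round: the clamped value is already inside [lo, hi].
    return static_cast<T>(std::round(std::clamp(v, lo, hi)));
}
```

For example, `SaturateSmall<std::uint8_t>(300.0f)` clamps to 255 and `SaturateSmall<std::int16_t>(-40000.0f)` clamps to -32768.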

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `include/core/detail/casting.hpp` | Refactors saturate/range cast implementations to reduce branching and adjust clamping/rounding logic. |
| `include/core/detail/type_traits.hpp` | Adds `long` / `ulong` to the type-traits macro set. |
| `tests/roccv/cpp/src/tests/core/detail/test_saturate_cast.cpp` | Introduces a basic unit test for `SaturateCast`, including a couple of vectorized casts. |
| `src/op_composite.cpp` | Changes GPU kernel launch block dimensions for the composite operator. |


include/core/detail/casting.hpp

Copilot · 2026-02-06T20:19:14Z

include/core/detail/casting.hpp

+        if constexpr (sizeof(T) <= 2) {
+            // 8/16 bit integer cases. These can be represented exactly in floating point.
+#ifdef __HIP_DEVICE_COMPILE__
+            return static_cast<T>(rintf(fminf(fmaxf(v, minVal), maxVal)));
+#else
+            return static_cast<T>(std::round(std::clamp(v, minVal, maxVal)));
+#endif
+        } else {
+            // 32/64 bit integer cases.
+#ifdef __HIP_DEVICE_COMPILE__
+            U rounded = rintf(v);
+#else
+            U rounded = std::round(v);
+#endif


The device-side float->integer paths use float-specific intrinsics (`rintf`, `fminf`/`fmaxf`, `__saturatef`, `__float2int_rn`) even though `U` can be any floating-point type. If `U` is `double`, the argument is narrowed to `float` first, which can change rounding and saturation behavior. Consider either constraining these branches to `U == float` (`static_assert` / `if constexpr`) or adding double-correct implementations (`rint`, `fmin`/`fmax`, `__double2int_rn`, etc.).
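The narrowing concern can be reproduced on the host with standard library calls only. The sketch below is an assumption-laden illustration, not the PR's code: `RoundViaFloat` and `RoundKeepingWidth` are hypothetical names, and `std::rint` stands in for the device intrinsics (`rintf` for `float`, `rint` for `double`).

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Mimics passing a double to a float-only routine: the value is narrowed first.
inline double RoundViaFloat(double v) {
    return static_cast<double>(std::rint(static_cast<float>(v)));
}

// Hypothetical width-preserving variant: select the routine from U at compile time.
template <typename U>
U RoundKeepingWidth(U v) {
    static_assert(std::is_same_v<U, float> || std::is_same_v<U, double>,
                  "this sketch only covers float and double");
    if constexpr (std::is_same_v<U, float>) {
        return std::rint(v);  // float overload; a device build would call rintf
    } else {
        return std::rint(v);  // double overload; a device build would call rint
    }
}
```

With the input `16777217.0` (2^24 + 1), `RoundViaFloat` yields 16777216 because the value is not representable as `float`, while `RoundKeepingWidth<double>` leaves it unchanged.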

tests/roccv/cpp/src/tests/core/detail/test_saturate_cast.cpp

src/op_composite.cpp

codecov-commenter · 2026-02-06T20:45:49Z

Codecov Report

❌ Patch coverage is 54.54545% with 15 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
include/core/detail/casting.hpp	54.55%	12 Missing and 3 partials ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #117      +/-   ##
===========================================
- Coverage    73.88%   73.86%   -0.02%     
===========================================
  Files           74       74              
  Lines         2864     2877      +13     
  Branches       615      610       -5     
===========================================
+ Hits          2116     2125       +9     
- Misses         330      331       +1     
- Partials       418      421       +3

| Files with missing lines | Coverage Δ |
| --- | --- |
| include/core/detail/type_traits.hpp | `87.50% <ø> (ø)` |
| include/core/detail/casting.hpp | `78.26% <54.55%> (+2.31%)` ⬆️ |

... and 1 file with indirect coverage changes


zacharyvincze added 3 commits January 30, 2026 10:31

Avoid branching in casting implementations

4232bcd

Add more tests for Saturate cast

77cabc7

Fix issues with float -> integer saturate casts

d887102

zacharyvincze requested review from Copilot, jeffqjiangNew and paveltc February 6, 2026 20:13

zacharyvincze self-assigned this Feb 6, 2026

zacharyvincze added enhancement New feature or request ci:precheckin labels Feb 6, 2026

Copilot started reviewing on behalf of zacharyvincze February 6, 2026 20:13

Copilot AI reviewed Feb 6, 2026


Undo changes to composite

146a1f9

zacharyvincze added 2 commits February 6, 2026 16:18

Review fixes

e9e9f0b

Add another test case for RangeCast

13a78be

2 participants
