Enable architecture selection for `DPCTL_TARGET_CUDA` #2096

vlad-perevezentsev · 2025-06-05T13:14:29Z

This PR proposes changing the DPCTL_TARGET_CUDA CMake option from a boolean to a string, allowing users to specify a CUDA architecture (e.g. sm_80). If the option is enabled without an explicit architecture (e.g. ON), it defaults to sm_50.

$ python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=<cuda_arch>"
# or
$ python scripts/build_locally.py --verbose --cmake-opts="-DDPCTL_TARGET_CUDA=ON"

The specified architecture is used to construct a SYCL alias target (e.g. nvidia_gpu_sm_80), which is passed via the -fsycl-targets option, following the oneAPI for NVIDIA GPUs guide.

Additionally, this PR removes the handling of the DPCTL_TARGET_CUDA environment variable.
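
The alias construction described above can be sketched in CMake roughly as follows. This is a minimal illustration, not the code from this PR: the variable names _cuda_arch and _sycl_targets are hypothetical, and treating boolean-like values such as ON as a request for the default sm_50 is an assumption drawn from the description. How the alias is combined with other targets is not shown. The script can be tried in script mode, e.g. cmake -DDPCTL_TARGET_CUDA=sm_80 -P <file>.

```cmake
# Sketch only: map the DPCTL_TARGET_CUDA value to a SYCL target alias.
# _cuda_arch and _sycl_targets are hypothetical names used for illustration.
cmake_minimum_required(VERSION 3.15)

set(_cuda_arch "")
if("${DPCTL_TARGET_CUDA}" MATCHES "^sm_[0-9]+[a-z]?$")
    # An explicit architecture such as sm_80 is used as given.
    set(_cuda_arch "${DPCTL_TARGET_CUDA}")
elseif("${DPCTL_TARGET_CUDA}" MATCHES "^(ON|TRUE|YES|Y|1)$")
    # Assumption: a boolean-like value selects the default architecture.
    set(_cuda_arch "sm_50")
elseif(DPCTL_TARGET_CUDA)
    # Any other truthy value is rejected; OFF, empty, or unset disables CUDA.
    message(FATAL_ERROR
        "Unsupported DPCTL_TARGET_CUDA value: ${DPCTL_TARGET_CUDA}")
endif()

if(_cuda_arch)
    # Build the alias (e.g. nvidia_gpu_sm_80) and show the resulting option.
    set(_sycl_targets "nvidia_gpu_${_cuda_arch}")
    message(STATUS "-fsycl-targets=${_sycl_targets}")
endif()
```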

Have you provided a meaningful PR description?
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
Have you checked performance impact of proposed changes?
Have you added documentation for your changes, if necessary?
Have you added your changes to the changelog?
If this PR is a work in progress, are you opening the PR as a draft?

github-actions · 2025-06-05T13:51:03Z

View rendered docs @ https://intelpython.github.io/dpctl/pulls/2096/index.html

github-actions · 2025-06-05T13:59:26Z

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_8 ran successfully.
Passed: 1115
Failed: 6
Skipped: 119

coveralls · 2025-06-05T14:00:53Z

coverage: 84.972% (-0.01%) from 84.984%
when pulling feee948 on update_cuda_build
into 556a5c6 on master.

github-actions · 2025-06-05T19:25:09Z

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_9 ran successfully.
Passed: 1114
Failed: 7
Skipped: 119

github-actions · 2025-06-05T19:34:00Z

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_10 ran successfully.
Passed: 1114
Failed: 7
Skipped: 119

antonwolfy · 2025-06-05T21:34:17Z

CMakeLists.txt

-   else()
-      if (DEFINED ENV{DPCTL_TARGET_CUDA})
-          set(_dpctl_sycl_targets "nvptx64-nvidia-cuda,spir64-unknown-unknown")
+   if (NOT "x${DPCTL_TARGET_CUDA}" STREQUAL "x")


Is it fair to validate DPCTL_TARGET_CUDA only when DPCTL_SYCL_TARGETS is empty?

That is how we were doing it before, but it looks like the current logical flow will add HIP targets even when DPCTL_SYCL_TARGETS is set, while CUDA targets are not added.

So that should probably be changed: either make DPCTL_SYCL_TARGETS exclusive of both, or check DPCTL_TARGET_CUDA as well.

In my understanding, when users pass DPCTL_SYCL_TARGETS, they are responsible for the correctness of the flags.

The check if (NOT "x${DPCTL_TARGET_HIP}" STREQUAL "x") in the case where DPCTL_SYCL_TARGETS is set was added in order to pass the correct compile and link options.

if(_dpctl_amd_targets)
    list(APPEND _dpctl_sycl_target_compile_options
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=${_dpctl_amd_targets})
    list(APPEND _dpctl_sycl_target_link_options
        -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=${_dpctl_amd_targets})
endif()

I am already working on a PR that will rework the AMD build logic to use aliases, which will remove the if(_dpctl_amd_targets) branch.
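
One of the options raised in this thread, treating a user-supplied DPCTL_SYCL_TARGETS as exclusive of both DPCTL_TARGET_CUDA and DPCTL_TARGET_HIP, might look roughly like the sketch below. It illustrates the idea only and is not the code in this PR: _targets is a hypothetical variable, nvidia_gpu_sm_50 is a placeholder for whatever alias the real build derives, and the spir64-unknown-unknown and amdgcn-amd-amdhsa entries are taken from the snippets quoted above.

```cmake
# Sketch only: an explicit DPCTL_SYCL_TARGETS takes precedence over
# DPCTL_TARGET_CUDA and DPCTL_TARGET_HIP. _targets is a hypothetical name.
cmake_minimum_required(VERSION 3.15)

if(NOT "x${DPCTL_SYCL_TARGETS}" STREQUAL "x")
    # The user supplied the full list and is responsible for its correctness.
    set(_targets "${DPCTL_SYCL_TARGETS}")
else()
    set(_targets "spir64-unknown-unknown")
    if(NOT "x${DPCTL_TARGET_CUDA}" STREQUAL "x")
        # Placeholder alias; a real build would derive it from the option value.
        string(APPEND _targets ",nvidia_gpu_sm_50")
    endif()
    if(NOT "x${DPCTL_TARGET_HIP}" STREQUAL "x")
        string(APPEND _targets ",amdgcn-amd-amdhsa")
    endif()
endif()

message(STATUS "-fsycl-targets=${_targets}")
```

With this shape, neither option affects the target list once DPCTL_SYCL_TARGETS is given, which matches the view that the user is then responsible for the flags.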

antonwolfy · 2025-06-05T21:45:55Z

docs/doc_sources/beginners_guides/installation.rst

+For reference, compute architecture strings like ``sm_80`` are based on
+CUDA Compute Capability. A complete mapping between NVIDIA GPU models and their
+respective ``sm_XX`` values can be found in the official
+`CUDA GPU Compute Capability <https://developer.nvidia.com/cuda-gpus>`_.


The mapping is not clear from the reference doc.

Yes, it seems they aren't necessarily related either (see here and below it).

A CUDA developer notes that sm_XX refers to machine code for a specific GPU hardware architecture. Since each Compute Capability version corresponds to a particular architecture (e.g. CC 8.0 -> Ampere A100), it is reasonable to say that sm_80 corresponds to CC 8.0.

I changed the text a bit
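
The correspondence discussed here (sm_80 and CC 8.0) can be illustrated with a small CMake script. This is a sketch under the assumption that the usual naming convention holds, where the last digit of sm_XX is the minor version and the digits before it are the major version; _arch and _compute_capability are hypothetical names and nothing like this is part of the PR.

```cmake
# Sketch only: derive a Compute Capability string from an sm_XX name.
cmake_minimum_required(VERSION 3.15)

set(_arch "sm_80")  # example input taken from the discussion
if(_arch MATCHES "^sm_([0-9]+)([0-9])[a-z]?$")
    # Last digit is the minor version; the digits before it are the major.
    set(_compute_capability "${CMAKE_MATCH_1}.${CMAKE_MATCH_2}")
    message(STATUS
        "${_arch} corresponds to Compute Capability ${_compute_capability}")
else()
    message(WARNING "Unrecognized architecture string: ${_arch}")
endif()
```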

github-actions · 2025-06-06T15:12:21Z

Array API standard conformance tests for dpctl=0.21.0dev0=py310h93fe807_17 ran successfully.
Passed: 1113
Failed: 8
Skipped: 119

vlad-perevezentsev added 5 commits June 4, 2025 09:11

Clean up DPCTL_TARGET_CUDA handling

bae5f3d

Ver 1: Add sm_* offload arch support to DPCTL_TARGET_CUDA

de83f29

Ver 2: Use nvidia_gpu_sm_* alias in -fsycl-targets for CUDA

9751a71

Update CUDA build docs

4a3ecf8

Merge master into update_cuda_build

58710ed

vlad-perevezentsev requested review from ndgrigorian, antonwolfy and vtavana as code owners June 5, 2025 13:14

vlad-perevezentsev self-assigned this Jun 5, 2025

vlad-perevezentsev linked an issue Jun 5, 2025 that may be closed by this pull request

Add CUDA architecture to CMake option when building for NVidia devices #2029

Open

ndgrigorian reviewed Jun 5, 2025

CMakeLists.txt (outdated, resolved)

vlad-perevezentsev added 2 commits June 5, 2025 11:35

Improve robustness of DPCTL_TARGET_CUDA handling

80f7bc7

Update DPCTL_TARGET_CUDA option description

d395910

antonwolfy reviewed Jun 5, 2025

vlad-perevezentsev added 3 commits June 6, 2025 04:57

Apply remarks

758e00f

Merge master into update_cuda_build

155ca52

Use string(CONCAT) for multi-line DPCTL_TARGET_CUDA description

feee948
