
Conversation

@AlbertThie

Add a separate pipeline to use GPUs in a ROCm environment. This pipeline avoids gpu-numba, which has not been updated for ROCm in quite some time. The loading and unloading is a bit hacky, but I haven't noticed much performance degradation.

Passed all pytest tests.

Tested on a node with 8 AMD Instinct MI250X GPUs.

Please note I did not have access to a multi-GPU NVIDIA environment, so some additional testing may be warranted there.
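
For reference, a minimal sketch of the kind of explicit load/compute/unload pattern described above, assuming PyTorch tensors; the function name and the placeholder kernel are hypothetical, not code from this PR:

```python
import torch

def run_on_gpu(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical load/compute/unload step. On a ROCm build of
    # PyTorch, torch.cuda.* is backed by HIP, so the same
    # .cuda()/.cpu() calls drive AMD GPUs.
    x_dev = x.cuda()      # host -> device copy ("loading")
    y_dev = x_dev * 2.0   # stand-in for the real kernel
    return y_dev.cpu()    # device -> host copy ("unloading")
```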

@zubatyuk
Contributor

Thanks for the contribution.

A couple of issues:

- I notice .cpu() and .cuda() transfers for the tensors. Are these host-device transfers really required?
- You should use the current device index instead of the first one (see the sketch below).

I still need to confirm that the PR does not break CUDA and CPU compatibility.
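
For illustration, a minimal sketch of targeting the currently selected device rather than hard-coding device 0, assuming PyTorch; the helper name is hypothetical:

```python
import torch

def to_current_device(x: torch.Tensor) -> torch.Tensor:
    # Target the currently selected GPU rather than device 0.
    dev = torch.device("cuda", torch.cuda.current_device())
    # .to() is a no-op when x already lives on dev, which avoids
    # an unnecessary host-device round trip via .cpu()/.cuda().
    return x.to(dev)
```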
