fix: force to use fgpu-rdc to compile #236

Open
zhenhuang12 wants to merge 2 commits into main from fix/zhuang12/rocm7.2

Conversation

@zhenhuang12
Contributor

@zhenhuang12 zhenhuang12 commented Feb 11, 2026

According to JIRA AIMA-212, DeepEP intranode-combine times out on the ROCm 7.2 Docker image but works well on ROCm 7.1.

I found a workaround for the timeout: forcibly add -fgpu-rdc to the compile flags and disable -amdgpu-function-calls=false.

Copilot AI review requested due to automatic review settings February 11, 2026 11:42
Copilot AI left a comment

Pull request overview

This PR addresses a timeout issue in DeepEP intranode-combine on ROCm 7.2 by forcing the use of the -fgpu-rdc (Relocatable Device Code) compiler flag across all compilation stages.

Changes:

  • Added -fgpu-rdc flag to CXX compilation flags
  • Added -fgpu-rdc and --hip-link flags to linker arguments
  • Added -fgpu-rdc flag to HIP compilation flags
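As a rough illustration of the three changes listed above, the flag lists in setup.py might end up looking like the sketch below. This is not the actual file: the way `arch` is computed and the leading `-mllvm` entries in `hip_flags` are assumptions, and the real lists may contain further entries that the diff does not show.

```python
import platform

# Assumption: the PR does not show how `arch` is set; the host machine
# name (for example "x86_64") is used here as a stand-in.
arch = platform.machine()

# Host C++ flags as they appear in the diff, with -fgpu-rdc added.
cxx_flags = [
    "-O3",
    "-fvisibility=hidden",
    "-fgpu-rdc",
]

# HIP device flags. Only the tail of this list is visible in the diff;
# the first "-mllvm" is assumed, since LLVM options are passed that way.
hip_flags = [
    "-mllvm",
    "-amdgpu-early-inline-all=true",
    "-mllvm",
    "-amdgpu-function-calls=false",
    "-fgpu-rdc",
]

# Linker arguments: relocatable device code also needs a device link
# step, which is why --hip-link accompanies -fgpu-rdc here.
extra_link_args = [
    "-Wl,-rpath,/opt/rocm/lib",
    f"-L/usr/lib/{arch}-linux-gnu",
    "-fgpu-rdc",
    "--hip-link",
]
```

Note that this sketch mirrors the diff, which still lists -amdgpu-function-calls=false next to the new flag.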

setup.py Outdated
"-Wl,-rpath,/opt/rocm/lib",
f"-L/usr/lib/{arch}-linux-gnu",
]
extra_link_args = ["-Wl,-rpath,/opt/rocm/lib", f"-L/usr/lib/{arch}-linux-gnu", "-fgpu-rdc", "--hip-link"]
Copilot AI Feb 11, 2026

The extra_link_args list is formatted on a single line, making it harder to read and maintain compared to the original multi-line format. Consider restoring the multi-line format for consistency and readability.

Suggested change
extra_link_args = ["-Wl,-rpath,/opt/rocm/lib", f"-L/usr/lib/{arch}-linux-gnu", "-fgpu-rdc", "--hip-link"]
extra_link_args = [
    "-Wl,-rpath,/opt/rocm/lib",
    f"-L/usr/lib/{arch}-linux-gnu",
    "-fgpu-rdc",
    "--hip-link",
]

setup.py Outdated
cxx_flags = [
"-O3",
"-fvisibility=hidden",
"-fgpu-rdc",
Copilot AI Feb 11, 2026

The -fgpu-rdc flag is duplicated across cxx_flags and hip_flags. Consider extracting this to a shared constant or variable to ensure consistency and make future updates easier to manage.
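One possible shape for that refactor is sketched below. The names `RDC_FLAG` and `with_flag` are hypothetical and do not appear in the PR, and the base flag lists are shortened to the entries visible in the diff.

```python
# Hypothetical shared constant, so the flag is spelled in one place only.
RDC_FLAG = "-fgpu-rdc"


def with_flag(flags, flag=RDC_FLAG):
    """Return a copy of `flags` with `flag` appended unless already present."""
    flags = list(flags)
    if flag not in flags:
        flags.append(flag)
    return flags


# Shortened base lists; the real setup.py lists have more entries.
cxx_flags = with_flag(["-O3", "-fvisibility=hidden"])
hip_flags = with_flag(["-mllvm", "-amdgpu-function-calls=false"])
```

Because the helper skips a flag that is already present, applying it twice to the same list does not duplicate the entry.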

setup.py Outdated
"-amdgpu-early-inline-all=true",
"-mllvm",
"-amdgpu-function-calls=false",
"-fgpu-rdc",
Copilot AI Feb 11, 2026

The -fgpu-rdc flag is duplicated across cxx_flags and hip_flags. Consider extracting this to a shared constant or variable to ensure consistency and make future updates easier to manage.

@xiaobochen-amd
Collaborator

I remember this compile option caused issues before.
