-
-
Notifications
You must be signed in to change notification settings - Fork 14.1k
automate offload, part 2 - clang calls #149202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
5992ea5 to
0588363
Compare
0588363 to
99348e0
Compare
This comment has been minimized.
This comment has been minimized.
99348e0 to
b117ff4
Compare
This comment has been minimized.
This comment has been minimized.
b117ff4 to
3a9d453
Compare
89a17be to
ebbe811
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
3632e16 to
9b4ad11
Compare
This comment has been minimized.
This comment has been minimized.
9b4ad11 to
b117ff4
Compare
This comment has been minimized.
This comment has been minimized.
| let host_out_c = path_to_c_string(device_pathbuf.as_path()); | ||
|
|
||
| // 2) Finalize host: lib.bc + host.out -> host.offload.o (host TM created in C++) | ||
| let llmod2 = llvm::LLVMCloneModule(module.module_llvm.llmod()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling here is that embedding a device artifact into a host module would break incremental compilation, so I create a copy. It's likely not great for compile times, but better correct than fast. Also at least amd is currently using build-std, so compile times are a lost cause anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could do the loading in a query in the future, then it is tracked in incremental just like include_bytes could be
|
|
||
| Expected<std::unique_ptr<Module>> | ||
| loadHostModuleFromBitcode(LLVMContext &Ctx, StringRef LibBCPath) { | ||
| auto MBOrErr = MemoryBuffer::getFile(LibBCPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have MemoryBuffer, but ModuleBuffer exposed in our RustWrappers. However, they don't seem to be the same. I had some issues exposing MemoryBuffers, especially a constructor for them in Rust, so I left 25 lines of C++.
| let variant = match key { | ||
| "Host" => { | ||
| if let Some(p) = arg { | ||
| Offload::Host(p.to_string()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could already do the argument verification here - E.g. is this an absolute path, does it point to a file called host.out, and does the full path with file exist. However, we don't have the error contexts here, so I left them for rustc_cg_llvm, even if that's a bit late.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would have to happen later. We may want to do it earlier than the backend at some point. For now this is fine
| "clang-21" "-cc1as" "-triple" "x86_64-unknown-linux-gnu" "-filetype" "obj" "-main-file-name" "lib.rs" "-target-cpu" "x86-64" "-mrelocation-model" "pic" "-o" "host.o" "host.s" | ||
| "clang-linker-wrapper" "--should-extract=gfx90a" "--device-compiler=amdgcn-amd-amdhsa=-g" "--device-compiler=amdgcn-amd-amdhsa=-save-temps=cwd" "--device-linker=amdgcn-amd-amdhsa=-lompdevice" "--host-triple=x86_64-unknown-linux-gnu" "--save-temps" "--linker-path=/ABSOlUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/lld/bin/ld.lld" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-pie" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "bare" "/lib/../lib64/Scrt1.o" "/lib/../lib64/crti.o" "/ABSOLUTE_PATH_TO/crtbeginS.o" "-L/ABSOLUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/x86_64-unknown-linux-gnu" "-L/ABSOLUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/llvm/lib/clang/21/lib/x86_64-unknown-linux-gnu" "-L/lib/../lib64" "-L/usr/lib64" "-L/lib" "-L/usr/lib" "host.o" "-lstdc++" "-lm" "-lomp" "-lomptarget" "-L/ABSOLUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/llvm/lib" "-lgcc_s" "-lgcc" "-lpthread" "-lc" "-lgcc_s" "-lgcc" "/ABSOLUTE_PATH_TO/crtendS.o" "/lib/../lib64/crtn.o" | ||
| "clang-linker-wrapper" "--should-extract=gfx90a" "--device-compiler=amdgcn-amd-amdhsa=-g" "--device-compiler=amdgcn-amd-amdhsa=-save-temps=cwd" "--device-linker=amdgcn-amd-amdhsa=-lompdevice" "--host-triple=x86_64-unknown-linux-gnu" "--save-temps" "--linker-path=/ABSOlUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/lld/bin/ld.lld" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-pie" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "bare" "/lib/../lib64/Scrt1.o" "/lib/../lib64/crti.o" "/ABSOLUTE_PATH_TO/crtbeginS.o" "-L/ABSOLUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib/x86_64-unknown-linux-gnu" "-L/ABSOLUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/llvm/lib/clang/21/lib/x86_64-unknown-linux-gnu" "-L/lib/../lib64" "-L/usr/lib64" "-L/lib" "-L/usr/lib" "target/<GPU_DIR>/release/host.o" "-lstdc++" "-lm" "-lomp" "-lomptarget" "-L/ABSOLUTE_PATH_TO/rust/build/x86_64-unknown-linux-gnu/llvm/lib" "-lgcc_s" "-lgcc" "-lpthread" "-lc" "-lgcc_s" "-lgcc" "/ABSOLUTE_PATH_TO/crtendS.o" "/lib/../lib64/crtn.o" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't wait to get rid of this one too.
| { | ||
| let device_pathbuf = PathBuf::from(device_path); | ||
| if device_pathbuf.is_relative() { | ||
| panic!("Absolute path is needed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently creating proper errors for all panic/assertions.
|
r? @oli-obk I got the c++ code down to 25 lines! Also Rustc does the new writing to file, as you suggested in your last PR. |
|
|
|
The rustc-dev-guide subtree was changed. If this PR only touches the dev guide consider submitting a PR directly to rust-lang/rustc-dev-guide otherwise thank you for updating the dev guide with your changes. |
970f1b1 to
e83fef0
Compare
This comment has been minimized.
This comment has been minimized.
e83fef0 to
e01b74f
Compare
This comment has been minimized.
This comment has been minimized.
| let host_out_c = path_to_c_string(device_pathbuf.as_path()); | ||
|
|
||
| // 2) Finalize host: lib.bc + host.out -> host.offload.o (host TM created in C++) | ||
| let llmod2 = llvm::LLVMCloneModule(module.module_llvm.llmod()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could do the loading in a query in the future, then it is tracked in incremental just like include_bytes could be
c02f6ff to
604668f
Compare
This comment has been minimized.
This comment has been minimized.
604668f to
05e647a
Compare
|
Requested reviewer is already assigned to this pull request. Please choose another assignee. |
This automates steps 2+3 (the clang invocations) of the rust offload usage pipeline.
Now all that remains is a clang-linker-invocation after this step.
r? oli-obk