Commit 4c6d66c

Merge pull request #2447 from rust-lang/offload-docs
initial instructions for gpu offload
2 parents 33eaf36 + 4233695 commit 4c6d66c

File tree

3 files changed

+82
-0
lines changed

src/SUMMARY.md

Lines changed: 2 additions & 0 deletions
@@ -101,6 +101,8 @@
101101
- [The `rustdoc` test suite](./rustdoc-internals/rustdoc-test-suite.md)
102102
- [The `rustdoc-gui` test suite](./rustdoc-internals/rustdoc-gui-test-suite.md)
103103
- [The `rustdoc-json` test suite](./rustdoc-internals/rustdoc-json-test-suite.md)
104+
- [GPU offload internals](./offload/internals.md)
105+
- [Installation](./offload/installation.md)
104106
- [Autodiff internals](./autodiff/internals.md)
105107
- [Installation](./autodiff/installation.md)
106108
- [How to debug](./autodiff/debugging.md)

src/offload/installation.md

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
# Installation
2+
3+
In the future, `std::offload` should become available in nightly builds for users. For now, everyone still needs to build rustc from source.
4+
5+
## Build instructions
6+
7+
First you need to clone and configure the Rust repository:
8+
```bash
9+
git clone --depth=1 git@github.com:rust-lang/rust.git
10+
cd rust
11+
./configure --enable-llvm-link-shared --release-channel=nightly --enable-llvm-assertions --enable-offload --enable-enzyme --enable-clang --enable-lld --enable-option-checking --enable-ninja --disable-docs
12+
```
13+
14+
Afterwards you can build rustc using:
15+
```bash
16+
./x.py build --stage 1 library
17+
```
18+
19+
Afterwards, `rustup toolchain link` will allow you to use it through cargo:
20+
```
21+
rustup toolchain link offload build/host/stage1
22+
rustup toolchain install nightly # enables -Z unstable-options
23+
```
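
The `build/<triple>/...` paths used in the Usage section depend on your host triple. One way to look it up is to parse the `host:` line that `rustc -vV` prints. This is only a sketch: the helper name `host_triple` is hypothetical, and the example call assumes the toolchain was linked under the name `offload` as shown above.

```bash
# Hypothetical helper: extract the host triple from `rustc -vV` output on stdin.
host_triple() {
  sed -n 's/^host: //p'
}

# Example use, once the `offload` toolchain has been linked:
#   rustc +offload -vV | host_triple
```

The helper only reads from stdin, so it works with any rustc binary whose verbose version output you pipe into it.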
24+
25+
26+
27+
## Build instructions for LLVM itself
28+
```bash
29+
git clone --depth=1 git@github.com:llvm/llvm-project.git
30+
cd llvm-project
31+
mkdir build
32+
cd build
33+
cmake -G Ninja ../llvm -DLLVM_TARGETS_TO_BUILD="host,AMDGPU,NVPTX" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_ENABLE_RUNTIMES="offload,openmp" -DLLVM_ENABLE_PLUGINS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=.
34+
ninja
35+
ninja install
36+
```
37+
This gives you a working LLVM build.
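
As a quick sanity check, `llvm-config --targets-built` lists the backends compiled into the build, so you can confirm that both GPU targets are present. The following is a sketch under assumptions: the function name `has_gpu_targets` is made up for illustration, and the example call assumes `llvm-config` ended up in `bin/` of the build directory.

```bash
# Hypothetical helper: succeed only if both GPU backends appear in the
# space-separated target list passed as arguments.
has_gpu_targets() {
  case " $* " in *" AMDGPU "*) ;; *) return 1 ;; esac
  case " $* " in *" NVPTX "*) ;; *) return 1 ;; esac
}

# Example use from the LLVM build directory:
#   has_gpu_targets $(./bin/llvm-config --targets-built) && echo "GPU targets present"
```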
38+
39+
40+
## Testing
41+
Run:
42+
```
43+
./x.py test --stage 1 tests/codegen/gpu_offload
44+
```
45+
46+
## Usage
47+
It is important to use a clang compiler built on the same LLVM as rustc. Just calling clang without the full path will likely use your system clang, which will probably be incompatible.
48+
```
49+
/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc --edition=2024 --crate-type cdylib src/main.rs --emit=llvm-ir -O -C lto=fat -Cpanic=abort -Zoffload=Enable
50+
/absolute/path/to/rust/build/x86_64-unknown-linux-gnu/llvm/bin/clang++ -fopenmp --offload-arch=native -g -O3 main.ll -o main -save-temps
51+
LIBOMPTARGET_INFO=-1 ./main
52+
```
53+
The first step will generate a `main.ll` file, which has enough instructions to cause the offload runtime to move data to and from a GPU.
54+
The second step will use clang as the compilation driver to compile our IR file down to a working binary. Only a very small Rust subset will work out of the box here, unless
55+
you use features like build-std, which are not covered by this guide. Look at the codegen test to get a feeling for how to write a working example.
56+
In the last step you can run your binary. If all went well, you will see a data transfer being reported:
57+
```
58+
omptarget device 0 info: Entering OpenMP data region with being_mapper at unknown:0:0 with 1 arguments:
59+
omptarget device 0 info: tofrom(unknown)[1024]
60+
omptarget device 0 info: Creating new map entry with HstPtrBase=0x00007fffffff9540, HstPtrBegin=0x00007fffffff9540, TgtAllocBegin=0x0000155547200000, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=1, HoldRefCount=0, Name=unknown
61+
omptarget device 0 info: Copying data from host to device, HstPtr=0x00007fffffff9540, TgtPtr=0x0000155547200000, Size=1024, Name=unknown
62+
omptarget device 0 info: OpenMP Host-Device pointer mappings after block at unknown:0:0:
63+
omptarget device 0 info: Host Ptr Target Ptr Size (B) DynRefCount HoldRefCount Declaration
64+
omptarget device 0 info: 0x00007fffffff9540 0x0000155547200000 1024 1 0 unknown at unknown:0:0
65+
// some other output
66+
omptarget device 0 info: Exiting OpenMP data region with end_mapper at unknown:0:0 with 1 arguments:
67+
omptarget device 0 info: tofrom(unknown)[1024]
68+
omptarget device 0 info: Mapping exists with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, DynRefCount=0 (decremented, delayed deletion), HoldRefCount=0
69+
omptarget device 0 info: Copying data from device to host, TgtPtr=0x0000155547200000, HstPtr=0x00007fffffff9540, Size=1024, Name=unknown
70+
omptarget device 0 info: Removing map entry with HstPtrBegin=0x00007fffffff9540, TgtPtrBegin=0x0000155547200000, Size=1024, Name=unknown
71+
```
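
The three usage steps can also be collected into a small script, so that rustc and clang++ are always taken from the same build tree. This is a sketch rather than an official workflow: the function names and the `rust_root` and `triple` parameters are illustrative, while the compiler flags are copied from the commands above.

```bash
# Hypothetical helpers: derive both compiler paths from one Rust checkout,
# so rustc and clang++ come from the same build tree and share one LLVM.
stage1_rustc() { printf '%s/build/%s/stage1/bin/rustc\n' "$1" "$2"; }
offload_clangxx() { printf '%s/build/%s/llvm/bin/clang++\n' "$1" "$2"; }

# Run the three steps for src/main.rs, stopping at the first failing step.
# Arguments: absolute path to the Rust checkout, host triple.
offload_build_and_run() {
  local rust_root=$1 triple=$2
  # Step 1: emit main.ll with offload enabled.
  "$(stage1_rustc "$rust_root" "$triple")" --edition=2024 --crate-type cdylib src/main.rs \
    --emit=llvm-ir -O -C lto=fat -Cpanic=abort -Zoffload=Enable || return 1
  # Step 2: let clang++ drive the rest of the compilation.
  "$(offload_clangxx "$rust_root" "$triple")" -fopenmp --offload-arch=native -g -O3 \
    main.ll -o main -save-temps || return 1
  # Step 3: run the binary and report data transfers.
  LIBOMPTARGET_INFO=-1 ./main
}
```

For example, `offload_build_and_run /absolute/path/to/rust x86_64-unknown-linux-gnu` would run all three steps from the directory that contains `src/main.rs`.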

src/offload/internals.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# std::offload
2+
3+
This module is under active development. Once upstream, it should allow Rust developers to run Rust code on GPUs.
4+
We aim to develop a `rusty` GPU programming interface, which is safe, convenient and sufficiently fast by default.
5+
This includes automatic data movement to and from the GPU, in an efficient way. We will (later)
6+
also offer more advanced, possibly unsafe, interfaces which allow a higher degree of control.
7+
8+
The implementation is based on LLVM's "offload" project, which is already used by OpenMP to run Fortran or C++ code on GPUs.
9+
While the project is under development, users will need to call other compilers like clang to finish the compilation process.
