Bug description
When implementing SAXPY (a "hello world" example for parallel algorithms), an unexpected segfault occurs in the parallelized closure when compiling with default settings (optimizations ON). Turning optimizations OFF with -O0 fixes the segfault and gives the correct result. The sequential implementation does not segfault.
Steps to reproduce
Include relevant code snippet or link to code that did not work as expected.
from algorithm import parallelize

fn main():
    var num_items: Int = 99999999;
    var a: Int32 = 2;
    var X = List[Int32](capacity = num_items);
    var Y = List[Int32](capacity = num_items);
    X.resize(num_items, 3);
    Y.resize(num_items, 4);

    @parameter
    fn inner_saxpy(i: Int) -> None:
        Y[i] += a * X[i]; # segfault

    parallelize[inner_saxpy](num_items);

    var s: Int32 = 0;
    for i in range(num_items): # can also be done in parallel (left out for demonstration purposes)
        s += Y[i];
    print(s); # expected output: 999999990
Include anything else that might help us debug the issue.
Using a MAX Tensor instead of List shows the same behavior. Using an unsafe pointer does not segfault but gives a wrong result (the result is correct with -O0).
Declaring X and Y as List[Int32, True] shows the same behavior. Specifying the number of workers in parallelize also shows the same behavior.
The sequential implementation (not shown here) uses a normal for-loop and works as intended with optimizations ON.
When run in the Mojo playground, the output is empty, but it should be 999999990.
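For comparison, the sequential variant mentioned above can be sketched as follows. This is a minimal reconstruction, not the exact code that was run: it assumes the same data setup as the parallel reproducer and only replaces the parallelize call with a plain for-loop. Each element of Y ends up as 4 + 2 * 3 = 10, so the sum should be 999999990.

```mojo
# Sequential SAXPY sketch: same data as the parallel reproducer,
# but the update runs in a plain for-loop instead of parallelize.
fn main():
    var num_items: Int = 99999999
    var a: Int32 = 2
    var X = List[Int32](capacity=num_items)
    var Y = List[Int32](capacity=num_items)
    X.resize(num_items, 3)
    Y.resize(num_items, 4)

    # Y = a * X + Y, one element at a time.
    for i in range(num_items):
        Y[i] += a * X[i]

    # Sum the result so there is a single value to check.
    var s: Int32 = 0
    for i in range(num_items):
        s += Y[i]
    print(s)
```

Since this differs from the failing program only in how the update loop is driven, comparing the two outputs isolates the parallelized closure as the point of failure.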
System information
- What OS did you install Mojo on? --- Ubuntu 22.04.5 LTS
- Provide version information for Mojo by pasting the output of `mojo -v` --- 24.5.0 (e8aacb95)
- Provide Magic CLI version by pasting the output of `magic -V` or `magic --version` --- 0.4.0 (based on pixi 0.33.0)
Hardware:
Intel 13700K (8P/8E/24T) CPU
NVIDIA 4070 GPU
Kingston 7200 MT/s CL38 32 GB RAM
Gigabyte Z790 motherboard