[BUG] optimization causes segfault in parallelized code #3769

Open
Drunte opened this issue Nov 13, 2024 · 1 comment
Labels
bug (Something isn't working), mojo-repo (Tag all issues with this label)

Drunte commented Nov 13, 2024

Bug description

Code included below.

While implementing SAXPY (a "hello world" example for parallel algorithms), an unexpected segfault occurs in the parallelized closure when compiling with default settings (optimizations ON). Turning optimizations OFF with -O0 avoids the segfault and gives the correct result. The sequential implementation does not segfault.

Steps to reproduce

  • Include relevant code snippet or link to code that did not work as expected.

      from algorithm import parallelize

      fn main():
          var num_items: Int = 99999999;

          var a: Int32 = 2;

          var X = List[Int32](capacity = num_items);
          var Y = List[Int32](capacity = num_items);

          X.resize(num_items, 3);
          Y.resize(num_items, 4);

          @parameter
          fn inner_saxpy(i: Int) -> None:
              Y[i] += a * X[i]; # segfault happens here with optimizations ON

          parallelize[inner_saxpy](num_items);

          var s: Int32 = 0;
          for i in range(num_items): # could also be done in parallel (left out for demonstration purposes)
              s += Y[i];

          print(s); # should print 999999990
    
  • Include anything else that might help us debug the issue.

Using the max Tensor instead of List shows the same behavior. Using an unsafe pointer does not segfault but gives the wrong result (the result is correct with -O0).
Declaring X and Y as List[Int32, True] shows the same behavior, as does specifying the number of workers in parallelize.
The sequential implementation (not shown here) uses a normal for loop and works as intended with optimizations ON.
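
For comparison, here is a minimal sketch of what the sequential baseline might look like. The actual sequential code was not posted, so this is a reconstruction under the assumption that it reuses the same setup as the reproduction and only swaps the `parallelize` call for a plain loop. It uses no calls beyond those already in the snippet above.

```mojo
fn main():
    var num_items: Int = 99999999

    var a: Int32 = 2

    var X = List[Int32](capacity=num_items)
    var Y = List[Int32](capacity=num_items)

    X.resize(num_items, 3)
    Y.resize(num_items, 4)

    # Assumed sequential form: the same update as inner_saxpy,
    # applied in an ordinary for loop instead of through parallelize.
    for i in range(num_items):
        Y[i] += a * X[i]

    # Sum the results the same way as in the parallel reproduction.
    var s: Int32 = 0
    for i in range(num_items):
        s += Y[i]

    print(s)
```

If this matches the real baseline, the two programs differ only in how the update is dispatched, which would point at the closure passed to `parallelize` rather than at the arithmetic.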

When run in the Mojo playground, the output is empty, but it should be 999999990.
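
As a check on that expected value, the arithmetic can be worked out from the values in the reproduction ($N = 99999999$, $a = 2$, every $X_i = 3$, every $Y_i = 4$). The total also stays below the largest signed 32-bit value, so Int32 overflow should not be a factor here.

$$
Y_i \leftarrow Y_i + a X_i = 4 + 2 \cdot 3 = 10, \qquad
s = \sum_{i=0}^{N-1} Y_i = 10N = 10 \cdot 99\,999\,999 = 999\,999\,990 < 2^{31} - 1 = 2\,147\,483\,647
$$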

System information

- What OS did you install Mojo on? --- Ubuntu 22.04.5 LTS
- Provide version information for Mojo by pasting the output of `mojo -v` --- 24.5.0 (e8aacb95)
- Provide Magic CLI version by pasting the output of `magic -V` or `magic --version` --- 0.4.0 (based on pixi 0.33.0)

Hardware:
    Intel 13700K (8P/8E/24T) CPU
    NVIDIA 4070 GPU
    Kingston 7200 MT/s CL38 32 GB RAM
    Gigabyte Z790 motherboard

Drunte commented Nov 13, 2024

I struggled to fix the indentation in the code snippet.
