julia_110{,-bin}: 1.10.4 -> 1.10.8 #361201

Merged
merged 5 commits into from
Jan 27, 2025

Conversation

NickCao
Member

@NickCao NickCao commented Dec 2, 2024

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@NickCao
Member Author

NickCao commented Dec 2, 2024

@ofborg build julia_110 julia_110-bin

@ofborg ofborg bot requested a review from thomasjm December 3, 2024 09:12
@ofborg ofborg bot added the "11.by: package-maintainer" label (This PR was created by the maintainer of the package it changes) Dec 3, 2024
@thomasjm
Contributor

thomasjm commented Dec 3, 2024

Hmm, I wanted to run the julia.withPackages tests but I ran into a problem in the check phase of julia-bin-1.10.7:

       > The global RNG seed was 0x93690267f30d697b4000d0323f78fd24.
       >
       > Error in testset LinearAlgebra/blas:
       > Test Failed at /nix/store/rglc4g306vagny8pdn7581kdxpi409aw-julia-bin-1.10.7/share/julia/stdlib/v1.10/LinearAlgebra/test/blas.jl:712
       >   Expression: BLAS.axpy!(α, a, copy(b)) ≈ α * a + b
       >    Evaluated: ComplexF64[1.9330187934128453 - 8.970730564994865im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im, -0.6963896877405945 + 0.8611170231579576im] ≈ ComplexF64[-0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im, -0.4334488396252506 - 0.12206773565732487im]
       >
       > ERROR: LoadError: Test run finished with errors
       > in expression starting at /nix/store/rglc4g306vagny8pdn7581kdxpi409aw-julia-bin-1.10.7/share/julia/test/runtests.jl:95
       For full logs, run 'nix log /nix/store/4p2p3zckhzl05k32rr1pqih72br5qyji-julia-bin-1.10.7.drv'.

@NickCao
Member Author

NickCao commented Dec 3, 2024

Likely another OpenBLAS bug. I used to hit these on specific CPU models without AVX: OpenBLAS took a broken code path, which caused incorrect floating-point calculations.
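
For readers unfamiliar with the failing assertion: AXPY is the BLAS operation y ← αx + y, and the test compares the BLAS result against the same expression computed with plain arithmetic. The following is a minimal pure-Python sketch of that kind of check, not Julia's actual test code. The function names, the sample values, and the `rtol` tolerance are all assumptions chosen for illustration.

```python
import math


def axpy(alpha, x, y):
    """Reference AXPY: return alpha * x + y elementwise."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def norm2(v):
    """Euclidean norm of a list of (possibly complex) numbers."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))


def approx_equal(u, v, rtol=1.5e-8):
    """Norm-based approximate comparison; rtol is an assumed tolerance."""
    diff = [ui - vi for ui, vi in zip(u, v)]
    return norm2(diff) <= rtol * max(norm2(u), norm2(v))


# Illustrative values only; these are not the values from the log above.
alpha = 2 + 1j
a = [1 + 1j] * 3
b = [0.5 - 0.5j] * 3

expected = axpy(alpha, a, b)
# Simulate a faulty kernel that only gets the first element right.
corrupted = [expected[0]] + [0j] * 2

print(approx_equal(expected, expected))   # True
print(approx_equal(corrupted, expected))  # False
```

A kernel that returns wrong values for part of the vector fails this comparison by a wide margin, which is consistent with a miscomputation rather than a rounding difference.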

@thomasjm
Contributor

thomasjm commented Dec 3, 2024

Gotcha. The julia-1.10.7 build took a bit longer but it also just crashed on the same test.

This is a 13th Gen Intel(R) Core(TM) i9-13900K.

@NickCao
Member Author

NickCao commented Dec 3, 2024

The upstream issue: OpenMathLib/OpenBLAS#4176

@thomasjm
Contributor

thomasjm commented Dec 3, 2024

The thing is, my CPU flags include avx, so this makes me wonder if this is connected to another issue I saw recently:

I found that on my i9-13900K machine with NixOS, Julia was detecting the CPU microarchitecture incorrectly as "goldmont" when it should be Raptor Lake. This seems to be related to the LLVM used to compile Julia. So maybe there is some CPU detection problem going on. You can check this by running versioninfo() and looking at the LLVM: row, or looking at Sys.CPU_NAME. In fact, here are the versions I see currently:

attr            LLVM
julia_110       LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
julia_110-bin   LLVM: libLLVM-15.0.7 (ORCJIT, goldmont)
julia_111       LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)
julia_111-bin   LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)

Could the microarchitecture be getting baked in by whatever Hydra machine compiled LLVM, and then when my machine downloads that, it confuses the CPU version checks on my machine?

Julia's Sys.CPU_NAME field is ultimately derived from here:

https://github.com/JuliaLang/julia/blob/2590e675885b97579a7531c343a546f6f5bbcbe5/src/processor.cpp#L974-L977
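
The "my CPU flags include avx" observation can be double-checked independently of Julia and LLVM. The sketch below assumes a Linux x86 host, where /proc/cpuinfo contains a `flags` line listing the CPU features. The helper names and the sample text are hypothetical and used only for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def read_host_flags(path="/proc/cpuinfo"):
    """Read the flags of the current machine (Linux x86 only)."""
    with open(path, encoding="utf-8") as f:
        return cpu_flags(f.read())


# Hypothetical excerpt in /proc/cpuinfo format, not taken from a real machine.
sample = "processor\t: 0\nmodel name\t: Example CPU\nflags\t\t: fpu sse2 avx avx2\n"

flags = cpu_flags(sample)
print("avx" in flags, "avx512f" in flags)  # True False
```

On a real machine, `"avx" in read_host_flags()` reports what the kernel sees, which can then be compared with the microarchitecture name that Julia reports in Sys.CPU_NAME.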

@NickCao
Member Author

NickCao commented Dec 3, 2024

On my machine:

attr            LLVM
julia_110       LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
julia_110-bin   LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
julia_111       LLVM: libLLVM-16.0.6 (ORCJIT, znver4)
julia_111-bin   LLVM: libLLVM-16.0.6 (ORCJIT, znver4)

So that's not the case.

@thomasjm
Contributor

thomasjm commented Dec 3, 2024

Hmm, but you might have built LLVM yourself rather than getting it from cache. Does your Sys.CPU_NAME show the same results? Your system can't be both znver3 and znver4...

@NickCao
Member Author

NickCao commented Dec 4, 2024

> Hmm, but you might have built LLVM yourself rather than getting it from cache.

Nah, these are freshly fetched from the cache.

> Does your Sys.CPU_NAME show the same results? Your system can't be both znver3 and znver4...

Same. I guess the older LLVM versions just don't recognize znver4 yet, so they fall back to znver3.

@thomasjm
Contributor

thomasjm commented Dec 4, 2024

Makes sense, I can confirm that's the case by running nix run .#clang_15 -- --print-supported-cpus.

However, doing the same with .#clang_16 shows support for raptorlake. My machine is raptorlake, and yet Sys.CPU_NAME is showing alderlake. A bit weird, but I'm satisfied with the explanation for why I'm hitting this AVX bug on LLVM 15.x versions 👍

@MisileLab
Contributor

Julia 1.10.8 has been released: https://github.com/JuliaLang/julia/blob/v1.10.8/NEWS.md

@NickCao NickCao changed the title julia_110{,-bin}: 1.10.4 -> 1.10.7 julia_110{,-bin}: 1.10.4 -> 1.10.8 Jan 24, 2025
@NickCao
Member Author

NickCao commented Jan 24, 2025

The zlib-related bug is still present in 1.10.8; we have to live with it.

@thomasjm
Contributor

I didn't know about this zlib issue. I can build 1.10.8 with your branch but I can't start it on my NixOS machine since it fails to find a system libz.so.1. How are we to live with this?

The old 1.10.4 on master works for me on NixOS.

@thomasjm
Contributor

Could we patchelf libunwind to use the Nix-provided zlib?

@NickCao
Member Author

NickCao commented Jan 25, 2025

Let's attempt another solution: reverting the offending commit. I don't know why changes to the LLVM build system would affect the build of libunwind, but it is what it is...

@thomasjm
Contributor

Okay, that makes things much better:

  • julia_110: 0 failures on top 500 packages (100% success)
  • julia_110-bin: 0 failures on top 500 packages (100% success)

Contributor

@thomasjm thomasjm left a comment

LGTM. Hopefully upstream will fix this zlib issue soon?

@NickCao
Member Author

NickCao commented Jan 27, 2025

> LGTM. Hopefully upstream will fix this zlib issue soon?

I think it's half a nixpkgs issue. I haven't had time to dig deeper; see JuliaLang/julia#55617 for prior discussions.

@NickCao NickCao merged commit 794cca2 into NixOS:master Jan 27, 2025
4 of 6 checks passed
@NickCao NickCao deleted the julia_110 branch January 27, 2025 21:37
3 participants