Free GPU memory eagerly #689

Open
jpsamaroo wants to merge 21 commits into master from jps/lu-ldiv-free

jpsamaroo commented Feb 27, 2026

We now use unsafe_free! to eagerly free Datadeps and DArray GPU allocations when safe to do so.

This PR also:

  • Fixes LU for GPUs
  • Fixes Cholesky for DVector inputs
  • Adds + and - methods for DArrays
  • Fixes a scheduler bug around error reporting in large task graphs
  • Adapts RefValue to a new GPURef, required for LU and other algorithms

Todo:

  • Add unsafe_free! to stencils
  • Evaluate if QR/CAQR can use unsafe_free!
  • Add CPU/GPU tests for Cholesky with DVector inputs
  • Add GPU tests for Cholesky and LU
  • Add GPU tests for + and -
  • Add BLAS alternatives for GPU backends which don't have BLAS wired up
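
As a rough illustration of the eager-free idea in the description, here is a minimal CPU-only sketch. Everything in it is an assumption made for illustration: `MockBuffer`, `mock_unsafe_free!`, `isfreed`, and `with_temp_buffer` are hypothetical names, not part of Dagger or any GPU package, and the sketch does not show how this PR decides that a free is safe. It only shows the general pattern of releasing a temporary allocation as soon as its last user is done instead of waiting for the garbage collector.

```julia
# Hypothetical stand-in for a device buffer; real code would hold a GPU array.
mutable struct MockBuffer
    data::Union{Vector{Float64},Nothing}
end

# Hypothetical analogue of an eager free: drop the storage right away
# instead of waiting for the garbage collector to notice it is unused.
function mock_unsafe_free!(buf::MockBuffer)
    buf.data = nothing
    return nothing
end

isfreed(buf::MockBuffer) = buf.data === nothing

# Run `f` on a temporary buffer and free the buffer eagerly afterwards,
# even if `f` throws. The caller must not keep using the buffer after this.
function with_temp_buffer(f, n::Integer)
    buf = MockBuffer(zeros(n))
    try
        return f(buf)
    finally
        mock_unsafe_free!(buf)
    end
end

# Keep a reference only to observe that the buffer was released.
leaked = Ref{Any}(nothing)
total = with_temp_buffer(4) do buf
    leaked[] = buf
    buf.data .= 1.0
    sum(buf.data)
end
```

The "unsafe" part is that any reference that outlives the free (like `leaked[]` here) points at released storage, which is why such a free can only be applied when no later task still reads the allocation.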

jpsamaroo and others added 12 commits March 7, 2026 11:34
…c, GPU-native chkfinite

- is_cross_hermitian/is_cross_symmetric: wrap ≈ in GPUArraysCore.allowscalar
  so ROCArray/CuArray chunk comparison does not trigger scalar indexing in norm
- chkfinite: use GPU-native all(isfinite, A) for AbstractGPUArray chunks
  (mapreduce on device); keep LAPACK.chkfinite for CPU chunks
- chkfinite!: set finite[] = true on success so chkfinite(A::DArray) returns correctly

Made-with: Cursor

Pivot value reads and ipiv write use scalar getindex/setindex; wrap body
in GPUArraysCore.allowscalar() so LU with RowMaximum works on ROCArray/CuArray chunks.

Made-with: Cursor

- ROCExt/CUDAExt: move(GPUProc, GPUProc, ::ROCArray/CuArray) so datadeps
  move_rewrap (unwrapped array) performs DtoD copy instead of identity
- aliasing: tochunk GPU source with first(processors(memory_space(x)))
- lu.jl: findmax pivot metrics + host 1×1 pivot read; geru! panel update
- linalg.jl: chunk symmetry via norm bound (replaces allowscalar ≈)
- IntelExt: minor; test/gpu.jl Intel + ROCm coverage

Made-with: Cursor
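
The commit messages above mention replacing a scalar-indexing comparison with a norm bound for chunk symmetry, and using `all(isfinite, A)` for the finiteness check. The following is a minimal sketch of what such whole-array checks could look like, run here on plain CPU matrices. The helper names `cross_symmetric_sketch` and `chunk_isfinite_sketch` and the default tolerance are assumptions for illustration, not the code in this PR.

```julia
using LinearAlgebra

# Hypothetical helper: is chunk `A` equal to the transpose of chunk `B` up to a
# relative tolerance? Uses only whole-array operations (subtraction and norm),
# so no element-by-element indexing is required.
function cross_symmetric_sketch(A::AbstractMatrix, B::AbstractMatrix;
                                rtol::Real = sqrt(eps(float(real(eltype(A))))))
    size(A) == reverse(size(B)) || return false
    Bt = transpose(B)
    return norm(A - Bt) <= rtol * max(norm(A), norm(Bt))
end

# Hypothetical helper: finiteness check expressed as a single reduction.
chunk_isfinite_sketch(A::AbstractArray) = all(isfinite, A)

A = [1.0 2.0; 3.0 4.0]
B = [1.0 3.0; 2.0 4.0]   # transpose of A
C = [1.0 3.0; 2.0 5.0]   # differs from transpose(A) in one entry

sym_ok  = cross_symmetric_sketch(A, B)
sym_bad = cross_symmetric_sketch(A, C)
fin_ok  = chunk_isfinite_sketch(A)
fin_bad = chunk_isfinite_sketch([1.0, NaN])
```

The bound has the same form as the default `isapprox` test for arrays, `norm(x - y) <= rtol * max(norm(x), norm(y))`; writing it out explicitly makes it clear that only array-level operations are involved.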