Kronecker product between Diagonal and AbstractGPUArray #585

Open

ytdHuang opened this issue Jan 19, 2025 · 8 comments
@ytdHuang

The Kronecker product between Diagonal and CUDA.CUSPARSE.AbstractCuSparseMatrix works:

using LinearAlgebra
using SparseArrays
using CUDA
CUDA.allowscalar(false)

Ms = cu(sprand(ComplexF64, 4, 4, 0.5))
Id = I(2)
kron(Id, Ms)

Is it possible to also support the Kronecker product between a Diagonal and a (dense) AbstractGPUArray? Currently it falls back to scalar indexing:

Md = cu(rand(ComplexF64, 4, 4))
Id = I(2)
kron(Id, Md)

I'm posting this here because a fix should also help other GPU backends (Metal, etc.).

@maleadt
Member

maleadt commented Jan 27, 2025

cc @albertomercurio who provided the initial implementation in #575

@albertomercurio
Contributor

The kron product between I(2) and cu(sprand(ComplexF64, 4, 4, 0.5)) is performed by this function.

However, when I wrote that function, I thought the Diagonal type was just the identity, whereas it is more general: it stores the vector of diagonal entries. This is why the current implementation works for the sparse case above but would give wrong results for a general Diagonal matrix like Diagonal(rand(10)).
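
For illustration, here is a minimal CPU sketch of the semantics in question (variable names are hypothetical): kron with a general Diagonal scales each diagonal block by the corresponding diagonal entry, so an identity-assuming implementation is only correct when every entry equals one.

using LinearAlgebra

d = rand(3)
A = rand(2, 2)

# Correct semantics: kron(Diagonal(d), A) is block-diagonal with
# blocks d[i] * A, not plain copies of A.
K = kron(Diagonal(d), A)
@assert K[1:2, 1:2] ≈ d[1] * A
@assert K[3:4, 3:4] ≈ d[2] * A

# An implementation that assumes an all-ones diagonal effectively
# computes kron(I(3), A), which agrees only when d == ones(3).
@assert kron(Diagonal(ones(3)), A) == kron(I(3), A)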

This means the current implementation for the sparse case should be constrained to Diagonal{T, <:CuVector} rather than simply Diagonal, since the latter also includes Diagonal types backed by a CPU Vector. The current implementation assumed a diagonal filled with ones. I will make a PR to CUDA.jl to only support GPU Diagonal matrices.

Speaking of @ytdHuang's question, I think it would still be useful to define methods for a generic identity matrix that are independent of the array type. For this, FillArrays.jl would be the solution: we would only need to implement the method for Diagonal{T, <:AbstractFill{T, 1}}, which is exactly the implementation already done for sparse matrices. If this can be done, then the extension to dense matrices is straightforward and can be done directly in GPUArrays.jl, as sketched below.
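
To make this concrete, here is a minimal sketch of the proposed dispatch (fillkron is a hypothetical stand-in, not the actual CUDA.jl/GPUArrays.jl method):

using LinearAlgebra
using FillArrays

# Dispatch on a Diagonal backed by a lazy fill vector: the constant
# diagonal value is known on the host, so no scalar indexing into
# device memory is needed. A may be any matrix type, including a
# dense GPU array, since only view-level broadcasting is used.
function fillkron(D::Diagonal{<:Any, <:FillArrays.AbstractFill},
                  A::AbstractMatrix)
    λ = FillArrays.getindex_value(D.diag)  # constant diagonal entry
    n = size(D, 1)
    m, p = size(A)
    K = similar(A, promote_type(eltype(D), eltype(A)), n * m, n * p)
    fill!(K, 0)
    for i in 1:n  # write the n diagonal blocks λ * A
        K[(i-1)*m+1:i*m, (i-1)*p+1:i*p] .= λ .* A
    end
    return K
end

# Eye(2) from FillArrays is Diagonal(Ones(2)), so it matches:
fillkron(Eye(2), rand(ComplexF64, 4, 4))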

If you agree on this implementation, I can make these PRs, but I don't know whether to use FillArrays.jl as a direct dependency or as an extension.

@ytdHuang
Author

ytdHuang commented Jan 31, 2025

But I think applying the Kronecker product between a CPU Diagonal and a GPU sparse matrix is also very useful...
Can we keep it?

@albertomercurio
Contributor

albertomercurio commented Jan 31, 2025

If the diagonal is constant, yes, but doing it for a generic Diagonal is not correct. That's why I propose using FillArrays.jl.

One could first convert the diagonal vector to a CUDA one, as shown below, but I don't think that is a good idea: linear algebra operations should always be between compatible types.
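
For concreteness, the explicit conversion being discussed would look roughly like this (assuming the sparse kron method ends up constrained to Diagonal{T, <:CuVector}, as proposed above):

using LinearAlgebra
using SparseArrays
using CUDA

d = rand(ComplexF64, 2)
Ms = cu(sprand(ComplexF64, 4, 4, 0.5))

# Move the diagonal to the device first, so both operands live in
# GPU memory and the GPU-constrained sparse method applies.
kron(Diagonal(CuArray(d)), Ms)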

I don't know what @maleadt thinks about it.

@maleadt
Member

maleadt commented Feb 3, 2025

We generally don't support array operations on mixed data sources, as the performance characteristics may be very surprising (having to allocate or upload behind the scenes).

@albertomercurio
Contributor

Yes indeed. But what do you recommend? Can I write a FillArrays.jl extension? Or should I just add FillArrays as a direct dependency of both CUDA and KernelAbstractions?

@maleadt
Member

maleadt commented Feb 3, 2025

I think an extension would be a better fit here -- you're suggesting adding a specific implementation as a method that dispatches on AbstractFill, rather than using FillArrays.jl functionality to implement a method that doesn't depend on types from that package, right?

@albertomercurio
Contributor

Yes. We would just add a method for the kron function between a GPU sparse matrix and a Diagonal wrapping a fill array, as in the sketch below.
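
For reference, the kind of call such a method would enable (Eye comes from FillArrays and is Diagonal(Ones(n)), i.e. a Diagonal{T, <:AbstractFill}; whether kron dispatches to the new method depends on the eventual PR):

using LinearAlgebra
using SparseArrays
using CUDA
using FillArrays

Ms = cu(sprand(ComplexF64, 4, 4, 0.5))

# With the proposed extension method, this would run without
# scalar indexing, independent of the GPU backend.
kron(Eye(2), Ms)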
