Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the custom definition of == on 2 rasters? #887

Open
tiemvanderdeure opened this issue Feb 5, 2025 · 8 comments · May be fixed by #888
Open

Remove the custom definition of == on 2 rasters? #887

tiemvanderdeure opened this issue Feb 5, 2025 · 8 comments · May be fixed by #888

Comments

@tiemvanderdeure
Copy link
Collaborator

Right now == on two Rasters has a custom definition, so that

da1 = rand(X(1:10))
da2 = rand(Y(1:10))

da1 == da2 # returns false
Raster(da1) == Raster(da2) # errors with DimensionMismatch

This is because of this definition

Rasters.jl/src/array.jl

Lines 69 to 71 in 177d89f

function Base.:(==)(A::AbstractRaster{T,N}, B::AbstractRaster{T,N}) where {T,N}
size(A) == size(B) && all(A .== B)
end

Which hasn't been touched for 4 years - should we just remove it on the next breaking release?

@rafaqz
Copy link
Owner

rafaqz commented Feb 5, 2025

Yes that looks pretty bad.

But what happens if we remove it? I think part of the reason is to make sure it uses DiskArrays for the comparison.

It could call == on the parent object and let DiskArrays handle it? Idk. Maybe it also needs that dim check comparison now that it errors. But we could use the Bool version that just returns false

@asinghvi17
Copy link
Collaborator

asinghvi17 commented Feb 5, 2025

We already have

Base.:(==)(A1::AbstractDimArray, A2::AbstractDimArray) =
    parent(A1) == parent(A2) && dims(A1) == dims(A2)

in DD, which this falls back to. We should probably reorder this to

Base.:(==)(A1::AbstractDimArray, A2::AbstractDimArray) =
    dims(A1) == dims(A2) && parent(A1) == parent(A2)

to save a bit of time, but in general would that be sufficient? Not sure if we've implemented == on things like FileArray in any case. Maybe this should be moved to DD directly...or we make the DD version broadcast on == too?

Especially if DD can emit HDF5 like arrays directly, this becomes important, since the use of DiskArrays will no longer be bound to Rasters.

@rafaqz
Copy link
Owner

rafaqz commented Feb 5, 2025

Yeah, that's probably fine as a fallback.

Ok to delete the Rasters one.

FileArray is a disk array so will do whatever DiskArrays does

@tiemvanderdeure tiemvanderdeure linked a pull request Feb 6, 2025 that will close this issue
@tiemvanderdeure
Copy link
Collaborator Author

I don't know if there are other fields we want to check for equality? Like metadata? Though then we should handle it in DD

calling .== is potentially pretty bad on lazy rasters

@rafaqz
Copy link
Owner

rafaqz commented Feb 6, 2025

It's bad but less bad than it could be as it's a BitArray. But I think DiskArrays should handle it In the fallback to parent.

Metadata no. But maybe missingval? Otherwise the comparison could be wrong?

Idk what's best

@tiemvanderdeure
Copy link
Collaborator Author

As it turns out DiskArrays doesn't have a fallback so Raster("myraster.nc"; lazy = true) == Raster("myraster.nc"; lazy = true) reads in the whole raster twice even after the fallback.

Okay missingval is a good one. Probably Raster(zeros(X(1:10)); missingval = 0) == Raster(zeros(X(1:10))) should return false. I don't know how often that will happen out in the wild.

@rafaqz
Copy link
Owner

rafaqz commented Feb 6, 2025

It will always read the whole array twice, it's a bit of work to go through the broadcast and get identical arrays and link them somehow then put them back in the broadcast, so that doesn't happen.

Just using cache would get most of the way. We could flatten, check equality and cache accordingly then reinsert a single cached array in all spots

(Oh but in this case we could short circuit at equality check by just comparing the FileArray? but is it worth it...)

@tiemvanderdeure
Copy link
Collaborator Author

I don't think it's worth putting much time into, no :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants