Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ways to improve spherical cellarea #779

Open
1 of 3 tasks
asinghvi17 opened this issue Oct 5, 2024 · 20 comments
Open
1 of 3 tasks

Ways to improve spherical cellarea #779

asinghvi17 opened this issue Oct 5, 2024 · 20 comments

Comments

@asinghvi17
Copy link
Collaborator

asinghvi17 commented Oct 5, 2024

#768 already does a lot! But there are still a few low hanging fruit.

For reference,

# setup
using Rasters, Proj, Chairmarks #ArchGDAL, Chairmarks
using Rasters.Lookups
dX = 0.0006
dY = -0.0003
lon = X(Projected(166.0:dX:167.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-79.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
ras2 = Rasters.reproject(ras; crs = Rasters.EPSG(3857))

# benchmarks
julia> @be Rasters.cellarea(Planar(), ras2) seconds=3
Benchmark: 634 samples with 1 evaluation
 min    3.289 ms (18 allocs: 42.518 MiB)
 median 3.646 ms (18 allocs: 42.518 MiB)
 mean   4.695 ms (18 allocs: 42.518 MiB, 7.53% gc time)
 max    185.097 ms (18 allocs: 42.518 MiB, 96.16% gc time)

julia> @be Rasters.cellarea(Spherical(), ras2) seconds=30
Benchmark: 3 samples with 1 evaluation
        12.187 s (161175590 allocs: 6.004 GiB, 5.18% gc time)
        12.258 s (161175590 allocs: 6.004 GiB, 5.32% gc time)
        12.640 s (161175590 allocs: 6.004 GiB, 7.09% gc time)
  • Each point is transformed 4 times, except for corner and border points.
    • We could reduce this by pre-transforming all points and selecting from that "raster". But for large rasters this is too memory intensive.
    • Tiled iteration over the raster would allow us to cut the memory use here and also speed up computation.
  • Multithreading via tiled iteration would also significantly speed up the computation (currently single threaded)
  • Projections via Proj are at least an order of magnitude faster than those via GDAL (determined by GeometryOps benchmarks)
@rafaqz
Copy link
Owner

rafaqz commented Oct 5, 2024

With tiled batches we can just keep filling the same small ector of points that we pass to Proj to tranform. We can swap over resample at the same time. Its just historical that GDAL was already available as a hard dep so I used it. But its not anymore, and we should switch to Proj.

(also MWE or at least the size of ras2 would help your benchmark)

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 6, 2024

Added MWE code. General tiled iteration would also be nice to have, maybe from DD - this way you can iterate over the most optimal pattern for access as well, which could potentially play well with Dagger. I'm sure DD already does this but it would be nice to have it more exposed.

@rafaqz
Copy link
Owner

rafaqz commented Oct 6, 2024

Actually we just let DiskArrays handle that currently

@tiemvanderdeure
Copy link
Contributor

The reason I wrote it so that every point is transformed 4 times is that cell boundaries don't necessarily match. We can write a more optimized version but we need to keep something as stupid as the current implementation around for those cases.

@rafaqz
Copy link
Owner

rafaqz commented Oct 7, 2024

Ah right we need a version for Irregular and Regular

@tiemvanderdeure
Copy link
Contributor

Ah right we need a version for Irregular and Regular

Basically, though technically it could be possible to have overlap or gaps even if a dimension is Regular, right?

@rafaqz
Copy link
Owner

rafaqz commented Oct 7, 2024

No regular is by definition gapless, there is no way to represent gaps. Actually only Explicit lookups have gaps and not Irregular, Irregular cant represent gaps either

@tiemvanderdeure
Copy link
Contributor

Switching ArchGDAL out with Proj and using tuples instead of vectors already is a bit of an improvement:

julia> @be Rasters.cellarea(Spherical(), ras2) seconds=30
Benchmark: 4 samples with 1 evaluation
        8.519 s (49 allocs: 42.481 MiB)
        8.876 s (61 allocs: 42.481 MiB, 0.26% gc time)
        9.016 s (49 allocs: 42.481 MiB)
        9.396 s (56 allocs: 42.481 MiB, 0.20% gc time)

Profview shows that roughly 50% of the time is spent on the transformation and 50% on converting lonlat to cartesian coordinates.

@tiemvanderdeure
Copy link
Contributor

I also found out that proj can convert directly to cartesian coordinates, though it seems to be much slower than converting to lat-lon and then to cartesian in Julia.
https://proj.org/en/9.3/operations/conversions/cart.html

@asinghvi17
Copy link
Collaborator Author

Proj will likely be less accurate than Julia for that transformation, FYI.

@tiemvanderdeure
Copy link
Contributor

It was also slower, so the way it was before was probably fine.

I think switching from GDAL to Proj is easy enough. It's silly to transform every point 4 times but can't think of an elegant end generic solution to do the tiled iteration right now. I guess both of you have implemented that kind of thing a hundred times?

@tiemvanderdeure
Copy link
Contributor

Also, this is not the kind of function that would end up inside an inner loop. I'm not sure how necessary it is to spend a lot of effort on making it fast.

@rafaqz
Copy link
Owner

rafaqz commented Oct 7, 2024

Its probably already the fastest that exists by an order of magnitude, so I'm happy to call it off wherever. But generally switching from GDAL to Proj for these things is better for a bunch of reasons, not just speed.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 8, 2024

If we're breaking, then we may as well break everything, and switch both reproject and cellarea to Proj instead of ArchGDAL? Since none of them need gdalwarp. Maybe also mapped crs things.

@asinghvi17
Copy link
Collaborator Author

A generic tiled iterator would be a bit tough. I think you'd indicate an approximate size, and the iterator could partition into tiles that fit the chunk pattern most optimally. But that's a bigger change...I think even switching to Proj will give us enough speedup for my purposes.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 8, 2024

Ah crap, converting cellarea to Proj means we have to implement some form of GFT conversion in Proj as well...

Edit: Proj has both GFT conversions and an inbuilt geographic check. So we could just switch.

@asinghvi17
Copy link
Collaborator Author

https://github.com/meggart/DiskArrayEngine.jl/tree/main kind of implements tiled iteration, but it's also huge. I don't think it offers multithreading by default either, and it pulls in a crazy number of dependencies.

We could talk to Fabian about maybe splitting parts of it out into a DiskArrayEngineBase.jl package? But it's a bit complicated. Probably easier to write a lightweight wrapper on our own and maybe take some concepts / struct from DiskArrayEngine...

@lazarusA
Copy link
Collaborator

lazarusA commented Oct 8, 2024

I believe is threaded and distributed by default, see the runner.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 13, 2024

I just realized that Proj allows you to return data in radians (which is what it uses internally). We could potentially switch to that and avoid cosd and sind. Might speed up projection speed by a hair as well.

The switch is +units=r in a Proj-string, but if it's safer we can always use a const RADIAN_WGS84 = GFT.ESRIWellKnownText(...).

@rafaqz
Copy link
Owner

rafaqz commented Oct 13, 2024

Good idea, minimise conversions where possible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants