feat: Add regrid functionality based on match geometry #1597
eliascapriles-NOAA wants to merge 43 commits into
Conversation
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1597   +/-   ##
=======================================
+ Coverage   85.58%   85.61%   +0.03%
=======================================
  Files          79       78       -1
  Lines        6998     7169     +171
=======================================
+ Hits         5989     6138     +149
- Misses       1009     1031      +22
Hey @eliascapriles-NOAA: Thanks for the PR! Below are some comments based on a quick look:
Sounds good. I will get started on implementing the changes!
Hi Wu-Jung, sorry for the delay! The tests uncovered a couple of bugs that I wanted to fix before resubmitting the PR. I have added unit tests for my helper function and an integration test that uses Sv data for the regridding function. Let me know if any changes are needed!
Merged changes in main to regrid for PR
for more information, see https://pre-commit.ci
Thanks @eliascapriles-NOAA! I'm going to ask @LOCEANlloydizard to help review this since you had most of the discussions with him. @LOCEANlloydizard, could you please take a look? Feel free to ping me for discussions. Thanks!
Sounds good! Thanks @leewujung
LOCEANlloydizard left a comment
Hey @eliascapriles-NOAA, thanks for the PR!
Following our discussion, a small recap (you probably noted more!):
- the CI errors are now gone with the merge of the pandas < 3 pin; only the AssertionError still needs to be addressed
- remove the +20 padding at the end of the function
- double-check that pings are actually aligned before regridding (there’s already a helper in echopype that does this, see the getting_started notebook)
- update the notebook to call the function from your PR, so I can run it on my side too
- we can go from there for the other points!
One thing I wanted to flag more generally: right now the regridding is done by looping over channels in Python, and inside that loop running an apply_ufunc that vectorises over ping_time × range_sample.
This is more a design question than a bug: are we happy with looping over channels in Python and vectorising over pings with apply_ufunc, or should we try to make the whole regridding run as one channel-aware vectorised operation? I know you looked into it, and I could have a look as well! Maybe @leewujung would have recommendations for this?
Cheers!
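To make the two options in the design question concrete, here is a minimal NumPy sketch. It is not the PR's implementation: the function names, the overlap-weighted mean, and the assumption that every channel has the same number of source samples are all hypothetical choices made for illustration, and the values are assumed to be in the linear domain already.

```python
import numpy as np

def overlap_weights(src_edges, tgt_edges):
    """Weights of shape (n_tgt, n_src): share of each target bin covered by each source bin."""
    lo = np.maximum(src_edges[None, :-1], tgt_edges[:-1, None])
    hi = np.minimum(src_edges[None, 1:], tgt_edges[1:, None])
    overlap = np.clip(hi - lo, 0.0, None)
    total = overlap.sum(axis=1, keepdims=True)
    # Rows sum to 1 wherever the target bin is covered by at least one source bin.
    return np.divide(overlap, total, out=np.zeros_like(overlap), where=total > 0)

def regrid_loop(linear, src_edges, tgt_edges):
    """Option 1: Python loop over channels, vectorised over pings inside the loop."""
    out = []
    for ch in range(linear.shape[0]):
        w = overlap_weights(src_edges[ch], tgt_edges)  # (n_tgt, n_src)
        out.append(linear[ch] @ w.T)                   # (ping, n_tgt)
    return np.stack(out)

def regrid_vectorised(linear, src_edges, tgt_edges):
    """Option 2: one channel-aware contraction over (channel, ping, range)."""
    w = np.stack([overlap_weights(e, tgt_edges) for e in src_edges])  # (ch, n_tgt, n_src)
    return np.einsum("cts,cps->cpt", w, linear)

# Synthetic data: 2 channels, 3 pings, 10 source samples per channel (hypothetical sizes).
rng = np.random.default_rng(0)
linear = rng.uniform(1e-9, 1e-5, size=(2, 3, 10))
src_edges = np.stack([np.linspace(0, 10, 11), np.linspace(0, 20, 11)])  # 1 m and 2 m bins
tgt_edges = np.linspace(0, 10, 6)                                        # 2 m target bins

out_loop = regrid_loop(linear, src_edges, tgt_edges)
out_vec = regrid_vectorised(linear, src_edges, tgt_edges)
```

Both functions give the same result in this sketch; the trade-off is that the single contraction needs all per-channel weights stacked into one array, which only works directly when the channels share a source sample count.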
This error, "TypeError: only integer scalar arrays can be converted to a scalar index", is due to the pandas update to version 3.0. Could you merge the latest version of the main echopype repo (pandas is pinned there)?
Hey @eliascapriles-NOAA @LOCEANlloydizard : Not sure if I am really making a good suggestion, but since |
Hi @leewujung, Lloyd brought this up yesterday. The reason I currently have the channels in a loop is that my apply_ufunc parallelizes the function across ping_time, as specified by the Echoview algorithm. However, I will try to rework my function to parallelize across the channel dimension as well.
Hey @eliascapriles-NOAA, thanks for the modifications! I’ve left a few comments above, and also added some broader thoughts below to check first =>
we now return a new xarray object, which is nice, but we might want to align it more with other regridding functions like compute_MVBS?
- propagate key variables:
- add provenance:
- add variable-level attrs on Sv:
does not seem necessary here since nothing is mutated afterwards? (but could be missing the obvious!)
It's the final adjustments! Thank you very much for this!
- Add warning when resampling angle variables
- Extend log-domain handling to Sp and TS
- Add tests for Sp/TS resampling and input validation
- Add angle warning test
LOCEANlloydizard left a comment
Hello @eliascapriles-NOAA, @leewujung, I've made a few final small adjustments! There will probably still be some edge cases to handle over time, but I think we're ready to merge 🙂
A few of the final additions:
- support for Sp and TS through the same log -> linear -> log resampling pathway used for Sv
- warning when resampling angle variables
- a few extra tests covering Sp/TS handling, angle warnings, and target-geometry validation
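As a small worked example of why the log -> linear -> log pathway matters, the sketch below averages two dB values in the linear domain and compares the result with a naive average of the dB values. The helper names are hypothetical and this is not the code from the PR; it only illustrates the arithmetic.

```python
import math

def db_to_linear(value_db):
    """Convert a dB quantity (e.g. Sv, Sp, TS) to the linear domain."""
    return 10 ** (value_db / 10)

def linear_to_db(value):
    """Convert a linear-domain quantity back to dB."""
    return 10 * math.log10(value)

def mean_db_via_linear(values_db):
    """Average dB values by converting to linear, averaging, and converting back."""
    linear = [db_to_linear(v) for v in values_db]
    return linear_to_db(sum(linear) / len(linear))

samples_db = [-60.0, -80.0]
via_linear = mean_db_via_linear(samples_db)       # about -62.97 dB
naive_db = sum(samples_db) / len(samples_db)      # -70.0 dB
```

The linear-domain mean is dominated by the stronger sample, so it sits about 7 dB above the naive dB average in this case.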
I think it would be nice to eventually add a regression test against Echoview if we can process the same file and match all channels to the 200 kHz geometry, similar to what is shown in the notebook: echostack-org/echopype-examples#101
@eliascapriles-NOAA I can help with this, but I'd need an output from Echoview for this file to compare against. Let me know if you have time to generate it! Cheers!
OK, thank you @eliascapriles-NOAA for the file! I’ve added the Echoview regression test using the GitHub asset added in #1679. The comparison between Echoview and echopype shows very close output grids, with Sv differences below 0.003 dB outside the near-transducer region. We also now raise warnings if users try to resample TS, Sp, or angle variables, since these cases should be interpreted with caution. The notebook has also been updated.
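The kind of comparison described above could be sketched roughly as follows. This is not the regression test from the PR: the function name, the 5 m near-transducer cutoff, and the arrays are hypothetical, and the data are synthetic stand-ins rather than real Echoview or echopype output. Only the 0.003 dB tolerance comes from the comment.

```python
import numpy as np

def max_abs_diff_db(sv_a, sv_b, range_m, min_range_m):
    """Largest absolute dB difference at ranges >= min_range_m, ignoring NaNs."""
    keep = range_m >= min_range_m
    diff = np.abs(sv_a[..., keep] - sv_b[..., keep])
    return float(np.nanmax(diff))

# Synthetic stand-ins (NOT real data): 5 pings on a shared 0-100 m range grid.
rng = np.random.default_rng(0)
range_m = np.linspace(0.0, 100.0, 201)
reference = rng.uniform(-90.0, -50.0, size=(5, range_m.size))
candidate = reference + rng.uniform(-0.002, 0.002, size=reference.shape)
candidate[:, range_m < 5.0] += 1.0  # deliberately large mismatch near the transducer

# Hypothetical 5 m cutoff for the near-transducer region.
worst = max_abs_diff_db(candidate, reference, range_m, min_range_m=5.0)
assert worst < 0.003
```

Masking by range before taking the maximum keeps the near-transducer samples from dominating the comparison.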
Implements the regrid_all_channel function, which allows users to match the sampling rates of the channels in a ds_Sv dataset to those of a specific channel