Various Cube merge/concatenate issues #5375
Comments
For anyone outside the Met Office who is interested, I also have the
Great issue, thanks for articulating those problems. To add another one to the list: when trying to concatenate ensemble runs, the realization coordinate often overlaps, e.g. different MOGREPS-G runs all have the control member numbered 0. I appreciate that sometimes you'd want this information to be preserved, but most of the time I don't really care which member is which!
Taming Design Notes
Similarity of proposed "operations"
All of the proposed "operations" follow a pattern -
Proposed new API form
We could write all these as additional iris.util routines, in the style of equalise_attributes/unify_time_units
Possible additional functions
We can think of some extra possibly useful operations, too:
Grouping of input cubes ?
Optionally, the top-level "equalise_cubes" function could identify the "groups" of cubes (which would naturally be chosen by merge/concat),
Embedding in extended "combine_cubes" operation
We also anticipate that "equalise_cubes" can become an additional step in an extended "combine_cubes" operation
Proposed Development Plan
=Tamed ?
Notes on process
for taming meeting:
Background
Following the offline Dragon Taming session and #4446 and #3234, I thought it best to capture what users want from improvements to merge/concatenate and what the pain points are, so we can establish which, if any, can be fixed.
So the user story is as follows: I've loaded some data as Iris cubes and put them in my cubelist. I want to join them to perform some kind of analysis, producing plots or datasets rather than cubes to share onwards, so destroying or ignoring metadata is fine. I've run equalise_attributes and unify_time_units. Yet Iris refuses to join my cubes together, even though I know the data makes sense together. Why?
Exploration of Problem
Here are a few examples I've found from talking to users and from Yammer and the AVD knowledgebase.
I think you can put these in three categories - things that are bugs to fix, things that a "force" keyword or another util function could fix, and things that should be errors.
Dim coords having the same values but different dtypes (see #5372, where this is already being addressed for time, but I think a broader check may be useful)
Remove all cell methods from the cubes before merge/concat
Remove all auxiliary coordinates from the cubes before merge/concat
Remove all derived coordinates from the cubes before merge/concat
Remove scalar coordinates before concatenate (not merge)
Remove bounds from dimensional coordinates and only compare points (maybe guess_bounds afterwards to re-establish them?)
Guess an order for dimensional coordinates in cases where dim coords are exactly equal (maybe, might leave this one as an error)
Cubes having overlapping times.
Cubes having different units (or names, for either coords or the cube itself)
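As a rough illustration of the manual "remove metadata" steps in the list above, here is a minimal sketch. The function name strip_for_concatenate and its keyword arguments are hypothetical, not part of Iris. It only relies on attributes and methods that Iris cubes and coords already provide (cell_methods, aux_factories, remove_aux_factory, aux_coords, remove_coord, dim_coords, bounds). It is lossy by design and aimed at concatenate rather than merge, since merge needs the scalar coordinates to build its new dimension.

```python
def strip_for_concatenate(cube, drop_aux_coords=True, drop_bounds=True):
    """Discard, in place, metadata that commonly blocks concatenation.

    Hypothetical helper: `cube` is expected to behave like an
    iris.cube.Cube. The removed metadata is not recoverable afterwards.
    """
    # Cell methods are compared between cubes, so drop them all.
    cube.cell_methods = ()

    # Derived coordinates are produced by aux factories. Remove these
    # first so no factory is left referring to a removed coordinate.
    for factory in list(cube.aux_factories):
        cube.remove_aux_factory(factory)

    if drop_aux_coords:
        # aux_coords also includes the scalar coordinates.
        for coord in list(cube.aux_coords):
            cube.remove_coord(coord)

    if drop_bounds:
        # Leave only the points on the dimension coordinates.
        for coord in cube.dim_coords:
            coord.bounds = None

    return cube
```

A typical use would be to call this on a copy of each cube in the cubelist, then attempt concatenate_cube() on the result.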
Proposed solution
In this process, the hardest-to-diagnose errors were often the ones relating to dim coords not matching. Adding a function to Iris similar to @rcomer's coord_diffs (http://fcm9/projects/utils/browser/hadru-python/trunk/iris_wrappers.py?marks=19-52#L19), to allow easier or automatic comparison of coords when there is an error, would also improve the user experience.
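To make the idea concrete, here is a minimal sketch of what such a comparison could report. It is not @rcomer's implementation (that code is not reproduced here), and the name and output format are assumptions. It is duck-typed: it reads standard_name, long_name, var_name, units, points and bounds, which Iris coords provide, and it would realise lazy points or bounds when it reads them.

```python
def _to_list(values):
    # NumPy arrays offer tolist(); fall back to list() for plain sequences.
    return values.tolist() if hasattr(values, "tolist") else list(values)


def coord_diffs(coord_a, coord_b):
    """Return human-readable differences between two coord-like objects.

    Hypothetical helper, sketched for illustration only.
    """
    diffs = []

    # Naming and units metadata.
    for attr in ("standard_name", "long_name", "var_name", "units"):
        val_a = getattr(coord_a, attr, None)
        val_b = getattr(coord_b, attr, None)
        if val_a != val_b:
            diffs.append(f"{attr}: {val_a!r} != {val_b!r}")

    # Points and bounds: presence, then dtype, then length, then values.
    for attr in ("points", "bounds"):
        arr_a = getattr(coord_a, attr, None)
        arr_b = getattr(coord_b, attr, None)
        if arr_a is None or arr_b is None:
            if arr_a is not arr_b:
                diffs.append(f"{attr}: present on only one coord")
            continue

        dtype_a = getattr(arr_a, "dtype", None)
        dtype_b = getattr(arr_b, "dtype", None)
        if dtype_a != dtype_b:
            diffs.append(f"{attr} dtype: {dtype_a} != {dtype_b}")

        list_a, list_b = _to_list(arr_a), _to_list(arr_b)
        if len(list_a) != len(list_b):
            diffs.append(f"{attr} length: {len(list_a)} != {len(list_b)}")
        elif list_a != list_b:
            first = next(
                i for i, (x, y) in enumerate(zip(list_a, list_b)) if x != y
            )
            diffs.append(
                f"{attr} values first differ at index {first}: "
                f"{list_a[first]!r} != {list_b[first]!r}"
            )

    return diffs
```

With real cubes this could be called as coord_diffs(cube_a.coord("time"), cube_b.coord("time")) after a failed concatenate, printing each returned line.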
So, I propose two things. First, a coord_comparison function, as a coord method, that gives the user more detail when a concatenate or merge fails because dim coords do not match.
Second, a Force keyword for concatenate/merge cube that applies some or all of the fixes listed above, so that users can have Iris automatically do the steps they are already doing manually to their cubes. It should also automatically call equalise_attributes and unify_time_units.
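Until something like that exists, the behaviour can be approximated with a wrapper. The sketch below is only an assumption about how a "force" option might behave: the name force_concatenate and the relax argument are hypothetical. The calls it makes (iris.util.equalise_attributes, iris.util.unify_time_units and CubeList.concatenate_cube) are existing Iris API.

```python
def force_concatenate(cubes, relax=None):
    """Concatenate cubes after equalising attributes and time units.

    Hypothetical wrapper. `relax` is an optional callable applied to each
    cube copy, for example one that strips cell methods or bounds.
    """
    # Imported inside the function so this sketch can be loaded without
    # Iris installed; in real code these belong at module level.
    from iris.cube import CubeList
    from iris.util import equalise_attributes, unify_time_units

    # Work on copies so the caller's cubes keep their metadata.
    prepared = CubeList([cube.copy() for cube in cubes])

    equalise_attributes(prepared)  # removes attributes that differ
    unify_time_units(prepared)     # puts time coords on a common epoch

    if relax is not None:
        for cube in prepared:
            relax(cube)

    # Raises iris.exceptions.ConcatenateError if the cubes still do not fit.
    return prepared.concatenate_cube()
```

Working on copies keeps the destructive steps away from the user's original cubes, which seems the safer default for an option whose whole purpose is to throw metadata away.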
Thoughts?