-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-44555: [C++][Compute] Allow casting struct to bigger nullable struct #44587
base: main
Are you sure you want to change the base?
GH-44555: [C++][Compute] Allow casting struct to bigger nullable struct #44587
Conversation
|
I'm not confident that I've taken the right approach here but I think its ready for review. |
Ok, it looks like there is at least one more tests I need to update. I can probably sort that tomorrow. |
@@ -2328,7 +2328,7 @@ DatasetAndBatches MakeNestedDataset() { | |||
field("b", boolean()), | |||
field("c", struct_({ | |||
field("d", int64()), | |||
field("e", float64()), | |||
field("e", int64()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect this was not supposed to be different from in the physical schema. This data is used to test if a virtual column will be materialised so I guess it shouldn't be testing int64 -> float64 casting at the same time.
Sorry for the direct ping but @lidavidm please could you review when you get a chance. I assume you are the right person to review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable to me, thanks. CC @bkietz for some more eyes?
// is definitely no in_field with matching name. | ||
|
||
// -2 is a sentinel value indicating fill with null. | ||
fields_to_select[out_field_index++] = -2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we add something like constexpr int kFillNullSentinel = -2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done but I wasn't really confident on the best place to define the constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be placed inside the method definition itself!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I moved it inside. As you can probably tell I am a C++ noob.
Co-authored-by: David Li <[email protected]>
Rationale for this change
Sometimes its useful to add a column full of nulls in a cast. Currently this is supported in top level columns but not inside structs. Example where this is important: delta-io/delta-rs#1610
What changes are included in this PR?
Add support for filling in columns full of null for nullable struct fields.
I've gone for a fairly minimal change that achieves what I needed but I wonder if there should be a more significant change so that this casting is done by field name and ignore the field order.
Are these changes tested?
Yes. The expected behaviour in several existing tests has been altered and a couple of new tests have been added.
I also rolled out a custom build with this change internally because it suddenly became a critical problem.
Are there any user-facing changes?
Yes. There are scenarios that previously failed with
struct fields don't match or are in the wrong order
but now succeed after filling innull
s.