Allow logical expressions to express a cast to an extension type #18136
if data_type.metadata().is_empty() {
    write!(f, "CAST({expr} AS {})", data_type.data_type())
} else {
    write!(
        f,
        "CAST({expr} AS {}<{:?}>)",
        data_type.data_type(),
        data_type.metadata()
    )
}
I added a utility function for this in #17986 (how to print a type represented by a DataType and possibly metadata in a user-facing error or explain plan). The idea is to keep existing plan explain strings the same unless somebody actually puts metadata here.
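To illustrate the display rule in the fragment above, here is a minimal, self-contained sketch. `TargetField` and `CastDisplay` are hypothetical stand-ins (not DataFusion or arrow-rs types), and a `BTreeMap` is assumed for the metadata only so that the printed order is deterministic. It is not the utility function from #17986.

```rust
use std::collections::BTreeMap;
use std::fmt;

/// Hypothetical stand-in for a cast target: a type name plus key/value metadata.
struct TargetField {
    data_type: String,
    metadata: BTreeMap<String, String>,
}

/// Hypothetical wrapper that renders a cast for an explain string.
struct CastDisplay<'a> {
    expr: &'a str,
    target: &'a TargetField,
}

impl fmt::Display for CastDisplay<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.target.metadata.is_empty() {
            // No metadata: the output matches the pre-existing plan string.
            write!(f, "CAST({} AS {})", self.expr, self.target.data_type)
        } else {
            // Metadata present: append it so extension casts are distinguishable.
            write!(
                f,
                "CAST({} AS {}<{:?}>)",
                self.expr, self.target.data_type, self.target.metadata
            )
        }
    }
}
```

With an empty metadata map this prints `CAST(a AS Int32)`, so existing explain output would stay unchanged; only casts that carry metadata get the extra suffix.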
f.as_ref()
    .clone()
    .with_data_type(data_type.data_type().clone())
    .with_metadata(f.metadata().clone())
// TODO: should nullability be overridden here or derived from the
// input expression?
I am not sure if this type of cast should be able to express nullability or not.
data_type.clone(),
// TODO: this drops extension metadata associated with the cast
data_type.data_type().clone(),
I don't actually need physical expressions to be able to cast things. My vague plan is to use a logical plan transformation, or perhaps an optimizer rule, to replace casts to extension types with a ScalarUDF call. This should possibly error if there is mismatched metadata between the input and destination (i.e., a physical cast would only ever represent a storage cast, which is usually OK).
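A rough sketch of what such a rewrite could look like, using a toy expression tree rather than DataFusion's `Expr`. The `Target` and `Expr` types, the `rewrite_extension_casts` function, and the `cast_to_<name>` function naming are all hypothetical; the only real detail is the `ARROW:extension:name` metadata key that Arrow uses to tag extension types.

```rust
use std::collections::BTreeMap;

/// Arrow's metadata key for the extension type name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

/// Hypothetical cast target: storage type name plus field metadata.
#[derive(Debug, Clone, PartialEq)]
struct Target {
    data_type: String,
    metadata: BTreeMap<String, String>,
}

/// Toy logical expression tree (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Cast { expr: Box<Expr>, target: Target },
    ScalarFunction { name: String, args: Vec<Expr> },
}

/// Replace casts whose target carries an extension name with a function call;
/// plain storage casts are left as they are.
fn rewrite_extension_casts(expr: Expr) -> Expr {
    match expr {
        Expr::Cast { expr, target } => {
            let inner = rewrite_extension_casts(*expr);
            let ext_name = target.metadata.get(EXTENSION_NAME_KEY).cloned();
            match ext_name {
                // Hypothetical naming scheme for the replacement function.
                Some(ext) => Expr::ScalarFunction {
                    name: format!("cast_to_{ext}"),
                    args: vec![inner],
                },
                None => Expr::Cast { expr: Box::new(inner), target },
            }
        }
        Expr::ScalarFunction { name, args } => Expr::ScalarFunction {
            name,
            args: args.into_iter().map(rewrite_extension_casts).collect(),
        },
        other => other,
    }
}
```

Under this scheme the physical planner would only ever see casts between storage types, which matches the idea that a physical cast represents a storage cast.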
Which issue does this PR close?
I am sorry that I missed the previous PR implementing this ( #18120 ) and I'm also happy to review that one instead of updating this!
Rationale for this change
Other systems that interact with the logical plan (e.g., SQL, Substrait) can express types that are not strictly within the arrow DataType enum.
What changes are included in this PR?
For the Cast and TryCast structs, the destination data type was changed from a DataType to a FieldRef.
Are these changes tested?
Work in progress!
Are there any user-facing changes?
Yes, any code using Cast { .. } to create an expression would need to use Cast::new() instead (or pass on the field metadata if it has any).
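A simplified sketch of the shape of this change, using stand-in types instead of the real arrow and DataFusion ones. The `DataType`, `Field`, and `Cast` definitions below are toys, the expression is reduced to a string, and `new_from_field` is a hypothetical constructor name; the actual signatures in the PR may differ.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Toy subset of a data type enum.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    FixedSizeBinary(i32),
}

/// Toy field: a data type plus nullability and key/value metadata.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: HashMap<String, String>,
}

type FieldRef = Arc<Field>;

/// The cast destination is a field reference rather than a bare data type.
#[derive(Debug, Clone, PartialEq)]
struct Cast {
    expr: String,
    data_type: FieldRef,
}

impl Cast {
    /// Callers with only a data type get a field with no metadata.
    fn new(expr: impl Into<String>, data_type: DataType) -> Self {
        let field = Field {
            name: String::new(),
            data_type,
            nullable: true,
            metadata: HashMap::new(),
        };
        Cast { expr: expr.into(), data_type: Arc::new(field) }
    }

    /// Hypothetical constructor for callers that already hold field metadata.
    fn new_from_field(expr: impl Into<String>, field: FieldRef) -> Self {
        Cast { expr: expr.into(), data_type: field }
    }
}
```

Going through a constructor means call sites that only know a data type do not have to build a field by hand, while callers that do have extension metadata can pass it through unchanged.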