paleolimbot

Which issue does this PR close?

I'm sorry I missed the previous PR implementing this (#18120); I'm also happy to review that one instead of updating this one!

Rationale for this change

Other systems that interact with the logical plan (e.g., SQL, Substrait) can express types that are not strictly within the Arrow DataType enum.

What changes are included in this PR?

For the Cast and TryCast structs, the destination data type was changed from a DataType to a FieldRef.

Are these changes tested?

Work in progress!

Are there any user-facing changes?

Yes. Any code that uses the Cast { .. } struct literal to create an expression would need to use Cast::new() instead (or pass along field metadata if it has any).
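To make the struct change and the call-site migration concrete, here is a minimal sketch that uses stand-in types instead of the real arrow and DataFusion ones. The `DataType`, `Field`, and `Expr` definitions are deliberately reduced, and the field name `field` and the constructor `new_from_field` are assumptions for illustration, not necessarily what this PR uses.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-ins for arrow's DataType, Field, and FieldRef so the sketch compiles
// without external crates; the real types have many more variants and fields.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Utf8,
    FixedSizeBinary(i32),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
    pub metadata: HashMap<String, String>,
}

pub type FieldRef = Arc<Field>;

// Simplified expression tree (hypothetical; the real Expr enum is much larger).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Cast(Cast),
}

// The cast target is a FieldRef, so it can carry extension metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct Cast {
    pub expr: Box<Expr>,
    pub field: FieldRef,
}

impl Cast {
    // Assumed convenience constructor: a bare DataType becomes a field with
    // empty metadata, which keeps metadata-free call sites short.
    pub fn new(expr: Box<Expr>, data_type: DataType) -> Self {
        let field = Field {
            name: String::new(),
            data_type,
            nullable: true,
            metadata: HashMap::new(),
        };
        Self { expr, field: Arc::new(field) }
    }

    // Assumed constructor for callers that already hold field metadata.
    pub fn new_from_field(expr: Box<Expr>, field: FieldRef) -> Self {
        Self { expr, field }
    }
}
```

With this shape, a call site that previously built the struct literal with a `DataType` would call `Cast::new(expr, DataType::Int32)`, and only callers that know about extension metadata need the field-based constructor.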

github-actions bot added the sql, logical-expr, physical-expr, core, substrait, proto, and functions labels on Oct 17, 2025
Comment on lines +3511 to +3520
if data_type.metadata().is_empty() {
    write!(f, "CAST({expr} AS {})", data_type.data_type())
} else {
    write!(
        f,
        "CAST({expr} AS {}<{:?}>)",
        data_type.data_type(),
        data_type.metadata()
    )
}

I added a utility function for this in #17986 (how to print a type represented by a DataType and possibly metadata in a user-facing error or explain plan). The idea is to keep existing plan explain strings the same unless somebody actually puts metadata here.
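The display rule in the snippet above can be sketched as a standalone function. This is not the utility from #17986; `format_cast` is a hypothetical helper, the type is passed as an already rendered name, and a `BTreeMap` stands in for the field's metadata map so that the printed key order is deterministic.

```rust
use std::collections::BTreeMap;

// Hypothetical helper: renders a cast for explain output. Metadata is shown
// only when present, so metadata-free plans keep their existing strings.
pub fn format_cast(
    expr: &str,
    type_name: &str,
    metadata: &BTreeMap<String, String>,
) -> String {
    if metadata.is_empty() {
        // No metadata: same text as before the change.
        format!("CAST({} AS {})", expr, type_name)
    } else {
        // Metadata present: append its Debug form in angle brackets.
        format!("CAST({} AS {}<{:?}>)", expr, type_name, metadata)
    }
}
```

A cast to a plain `Int32` renders as `CAST(a AS Int32)`, while a target carrying an `ARROW:extension:name` entry gets the metadata appended after the storage type.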

Comment on lines +598 to +603
f.as_ref()
    .clone()
    .with_data_type(data_type.data_type().clone())
    .with_metadata(f.metadata().clone())
    // TODO: should nullability be overridden here or derived from the
    // input expression?

I am not sure if this type of cast should be able to express nullability or not.
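The open question can be stated as a choice between two policies. The sketch below is only an illustration of that choice, with a hypothetical `FieldInfo` stand-in and a hypothetical `NullabilityPolicy` enum; it does not claim which option the PR should take.

```rust
// Hypothetical stand-in carrying only the parts of a field relevant here.
#[derive(Debug, Clone, PartialEq)]
pub struct FieldInfo {
    pub type_name: String,
    pub nullable: bool,
}

// The two candidate policies raised by the TODO.
#[derive(Debug, Clone, Copy)]
pub enum NullabilityPolicy {
    // Keep the nullability of the expression being cast.
    FromInput,
    // Take the nullability declared on the cast's target field.
    FromTarget,
}

// Computes the output field of a cast under the chosen policy. The output
// type always comes from the target; only the nullable flag differs.
pub fn cast_output(
    input: &FieldInfo,
    target: &FieldInfo,
    policy: NullabilityPolicy,
) -> FieldInfo {
    let nullable = match policy {
        NullabilityPolicy::FromInput => input.nullable,
        NullabilityPolicy::FromTarget => target.nullable,
    };
    FieldInfo { type_name: target.type_name.clone(), nullable }
}
```

Under `FromInput`, the target field's nullable flag is ignored entirely, which matters because a `FieldRef` always carries one even when the caller never meant to express it.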

Comment on lines -294 to +295
- data_type.clone(),
+ // TODO: this drops extension metadata associated with the cast
+ data_type.data_type().clone(),

I don't actually need physical expressions to be able to cast things. My vague plan is to use a logical plan transformation, or perhaps an optimizer rule, to replace casts to extension types with a ScalarUDF call. The physical cast should possibly error if the metadata of the input and the destination do not match (i.e., a physical cast would only ever represent a storage cast, which is usually OK).
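A rough sketch of that plan, using a toy expression enum instead of DataFusion's `Expr` and optimizer traits. Everything here is hypothetical: the `cast_to_<extension name>` naming for the UDF, the `check_storage_cast` helper, and the reduced `Expr` shape. Only the `ARROW:extension:name` metadata key is taken from the Arrow extension type convention.

```rust
use std::collections::BTreeMap;

pub type Metadata = BTreeMap<String, String>;

// Toy expression tree (hypothetical; not DataFusion's Expr).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Cast { expr: Box<Expr>, type_name: String, metadata: Metadata },
    ScalarFunction { name: String, args: Vec<Expr> },
}

// Logical rewrite: a cast whose target carries metadata becomes a call to a
// (hypothetical) UDF; plain storage casts are left in place.
pub fn rewrite_extension_casts(expr: Expr) -> Expr {
    match expr {
        Expr::Cast { expr, type_name, metadata } => {
            let inner = rewrite_extension_casts(*expr);
            if metadata.is_empty() {
                Expr::Cast { expr: Box::new(inner), type_name, metadata }
            } else {
                let ext = metadata
                    .get("ARROW:extension:name")
                    .cloned()
                    .unwrap_or_else(|| "extension".to_string());
                Expr::ScalarFunction {
                    name: format!("cast_to_{}", ext),
                    args: vec![inner],
                }
            }
        }
        Expr::ScalarFunction { name, args } => Expr::ScalarFunction {
            name,
            args: args.into_iter().map(rewrite_extension_casts).collect(),
        },
        other => other,
    }
}

// Physical planning guard: a cast that survives the rewrite may only change
// storage, so differing metadata between input and target is an error.
pub fn check_storage_cast(input: &Metadata, target: &Metadata) -> Result<(), String> {
    if input == target {
        Ok(())
    } else {
        Err(format!(
            "cast would change field metadata from {:?} to {:?}",
            input, target
        ))
    }
}
```

Keeping the extension-aware logic in a logical rewrite means the physical cast never has to interpret metadata; it only has to refuse casts that would silently change it.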

Development

Successfully merging this pull request may close these issues.

LogicalPlan Casts can't express a cast to an extension type
