fix: enable full decimal to decimal support #1385
base: main
Conversation
Use a regex to match Arrow's invalid argument error, since the message text varies across versions.
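The idea can be sketched in isolation (a minimal std-only sketch: the PR matches the Arrow error text with a regex, and the substring patterns below are taken from the test diff later in this page; the example message text is an assumption, not copied from Arrow):

```rust
// Hypothetical sketch: classify Arrow's decimal-overflow error by its message
// text, because the exact wording differs across Arrow/Spark versions.
// A hand-rolled substring check stands in here for the regex used in the PR.
fn is_decimal_overflow_message(msg: &str) -> bool {
    msg.contains("cannot be represented as") || msg.contains("too large to store")
}

fn main() {
    // Example message is an assumption for illustration only.
    let msg = "Invalid argument error: 12345678 cannot be represented as Decimal128(5, 2)";
    assert!(is_decimal_overflow_message(msg));
    assert!(!is_decimal_overflow_message("Arrow error: divide by zero"));
}
```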
@@ -872,6 +872,13 @@ fn cast_array(
    let array = array_with_timezone(array, cast_options.timezone.clone(), Some(to_type))?;
    let from_type = array.data_type().clone();

    let native_cast_options: CastOptions = CastOptions {
        safe: !matches!(cast_options.eval_mode, EvalMode::Ansi), // take safe mode from cast_options passed
        format_options: FormatOptions::new()
I think one can use a default value defined for FormatOptions here.
In the default CAST_OPTIONS, which this native_cast_options replaces, these two fields were set to:
static TIMESTAMP_FORMAT: Option<&str> = Some("%Y-%m-%d %H:%M:%S%.f");
timestamp_format: TIMESTAMP_FORMAT,
timestamp_tz_format: TIMESTAMP_FORMAT,
If we change it to the default, the FormatOptions::default() implementation (I checked) sets these to:
timestamp_format: None,
timestamp_tz_format: None,
Hence I kept it as defined inside comet's default CAST_OPTIONS.
Fair enough. (The format options are used only to make the cast of timestamp to string compatible with Spark, and are not needed anywhere else.) But I guess it is a good idea to be consistent everywhere.
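The eval-mode to safe-mode mapping from the diff above can be sketched on its own (EvalMode here is a local stand-in for comet's enum, not the real type; the mapping itself is copied from the diff):

```rust
// Local stand-in for comet's EvalMode; the real enum lives in the comet crate.
enum EvalMode {
    Legacy,
    Ansi,
    Try,
}

// In ANSI mode the cast must raise an error on overflow, so Arrow's "safe"
// cast mode (which silently replaces invalid values with nulls) is disabled.
// In Legacy and Try modes, safe casting (null on overflow) is kept.
fn cast_is_safe(eval_mode: &EvalMode) -> bool {
    !matches!(eval_mode, EvalMode::Ansi)
}

fn main() {
    assert!(cast_is_safe(&EvalMode::Legacy));
    assert!(!cast_is_safe(&EvalMode::Ansi));
    assert!(cast_is_safe(&EvalMode::Try));
}
```

This is why the PR derives `safe` from `cast_options.eval_mode` rather than hard-coding it: the same cast kernel then serves both Spark's ANSI (fail-fast) and legacy (null-on-overflow) semantics.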
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##              main     #1385       +/-  ##
=============================================
- Coverage     56.12%    39.32%   -16.81%
- Complexity      976      2085     +1109
=============================================
  Files           119       265      +146
  Lines         11743     61128    +49385
  Branches       2251     12960    +10709
=============================================
+ Hits           6591     24036    +17445
- Misses         4012     32587    +28575
- Partials       1140      4505     +3365
Mostly looks good, thank you @himadripal. Just minor comments.
|-|-|-|
| boolean | byte | |
| boolean | short | |
|-|---------|-|
nit: Just checking whether this change is due to the changes in the producing method.
It should be automatically created by make release.
// for comet decimal conversion throws ArrowError(string) from arrow - across spark versions the messages don't match.
if (sparkMessage.contains("cannot be represented as")) {
  assert(
    sparkException.getMessage
      .replace(".WITH_SUGGESTION] ", "]")
      .startsWith(cometMessage))
} else if (CometSparkSessionExtensions.isSpark34Plus) {
  // for Spark 3.4 we expect to reproduce the error message exactly
  assert(cometMessage == sparkMessage)
  cometMessage.contains("cannot be represented as") || cometMessage.contains(
    "too large to store"))
} else {
There are message modifications below per Spark version.
Would you mind updating them instead of creating another if branch?
Completes #375
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
How are these changes tested?