Support group by span over time based column with Span UDF #3421
Signed-off-by: Songkan Tang <[email protected]>
…ngine Signed-off-by: Songkan Tang <[email protected]>
date.atStartOfDay().atZone(ZoneOffset.UTC).toInstant().toEpochMilli(), interval);
return SqlFunctions.timestampToDate(dateEpochValue);
case SqlTypeName.TIME:
if (dateTimeUnit.getId() > 4) {
Could you explain why this limitation is needed?
Added an inline comment. The TIME type usually holds a time of day with no year, month, or day component, such as the value '17:59:59.99'.
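To illustrate the limitation discussed above, here is a minimal sketch of flooring a time-of-day value to a span bucket. It is not the PR's SpanFunction: the class TimeSpanSketch and the method floor are hypothetical names, and the cutoff at hours is an assumption about what the dateTimeUnit.getId() > 4 check is meant to express.

```java
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper for illustration only; not the PR's SpanFunction. */
final class TimeSpanSketch {
    private TimeSpanSketch() {}

    /** Floors a time of day to the start of its bucket of the given width. */
    static LocalTime floor(LocalTime time, long interval, ChronoUnit unit) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // Assumption: a TIME value has no date part, so units larger than
        // an hour (half-day, day, month, ...) are rejected.
        if (unit.compareTo(ChronoUnit.HOURS) > 0) {
            throw new IllegalArgumentException("unit too large for a TIME value: " + unit);
        }
        long widthNanos = Math.multiplyExact(unit.getDuration().toNanos(), interval);
        long nanoOfDay = time.toNanoOfDay();
        // Drop the remainder so the result is the start of the bucket.
        return LocalTime.ofNanoOfDay(nanoOfDay - nanoOfDay % widthNanos);
    }
}
```

With this sketch, flooring 17:59:59.99 to a 15-minute span yields 17:45, while asking for a 1-day span throws IllegalArgumentException.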
* day, 1 month, 1 hour
* </ol>
*/
public class SpanFunction implements UserDefinedFunction {
Maybe leave a TODO here for a future refactoring. Your previous idea of implementing a self-defined implementor seems to be a better approach.
It looks like it could be implemented by replacing ScalarFunction (which we currently use) with ImplementableFunction and moving the logic for handling the different field types to the implementor. That way, this span function would be simpler, handling only the timestamp type.
Then we could have separate functions like dateToTs, tsToDate, and span(ForTs), which would give us more flexibility to reuse or combine them.
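The decomposition suggested above can be sketched with plain java.time, without any Calcite types. The function names dateToTs, tsToDate, and spanForTs follow the comment; the class SpanCompositionSketch, the signatures, and the use of a fixed-width millisecond interval are assumptions made for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

/** Hypothetical sketch of composing small conversion functions around one span function. */
final class SpanCompositionSketch {
    private SpanCompositionSketch() {}

    /** Converts a date to epoch milliseconds at the start of that day in UTC. */
    static long dateToTs(LocalDate date) {
        return date.atStartOfDay().atZone(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    /** Converts epoch milliseconds back to the UTC calendar date. */
    static LocalDate tsToDate(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).toLocalDate();
    }

    /** The only span logic: floor a timestamp to a fixed-width bucket. */
    static long spanForTs(long epochMillis, long widthMillis) {
        // floorDiv keeps buckets aligned for timestamps before the epoch too.
        return Math.floorDiv(epochMillis, widthMillis) * widthMillis;
    }

    /** Span over a DATE column expressed purely as a composition. */
    static LocalDate spanForDate(LocalDate date, long widthMillis) {
        return tsToDate(spanForTs(dateToTs(date), widthMillis));
    }
}
```

In this shape, only spanForTs contains bucketing logic, and each field type needs just a pair of converters. Calendar-based units such as months do not have a fixed width and would need separate handling, which this sketch leaves out.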
Added TODO
Signed-off-by: Songkan Tang <[email protected]>
@@ -100,6 +101,9 @@ static SqlOperator translate(String op) {
return SqlLibraryOperators.DATE_ADD_SPARK;
case "DATE_ADD":
return SqlLibraryOperators.DATEADD;
// UDF Functions
case "SPAN":
return TransferUserDefinedFunction(SpanFunction.class, "SPAN", ReturnTypes.ARG0); |
ReturnTypes.ARG0 or ReturnTypes.ARG0_NULLABLE? What if the timestamp is null in span(timestamp, ...)?
I changed it to ReturnTypes.ARG0_NULLABLE, but the linq4j-generated code then transforms a null into a default long or int value such as 0L or 0. Still looking into it.
Resolved the group-by nulls issue, which was caused by SQL type creation conflicts in RexNodeBuilder.
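The symptom described in this thread, a null input surfacing as 0L, can be illustrated in plain Java. This is only a sketch of why a primitive return type cannot carry a null group key; it is not the linq4j-generated code, and NullableSpanSketch and both method names are hypothetical.

```java
/** Hypothetical illustration of primitive versus boxed results for a null input. */
final class NullableSpanSketch {
    private NullableSpanSketch() {}

    /** Primitive result: a missing input has to be mapped to some long, here 0L. */
    static long spanPrimitive(Long epochMillis, long widthMillis) {
        long value = (epochMillis == null) ? 0L : epochMillis;
        return Math.floorDiv(value, widthMillis) * widthMillis;
    }

    /** Boxed result: null in, null out, so a null bucket stays distinguishable. */
    static Long spanNullable(Long epochMillis, long widthMillis) {
        if (epochMillis == null) {
            return null;
        }
        return Math.floorDiv(epochMillis, widthMillis) * widthMillis;
    }
}
```

With the primitive variant, rows with a null timestamp would be grouped into the same bucket as rows whose timestamp floors to 0, which is why a nullable return type matters for group by.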
public void testAvgByTimeSpanAndFields() {
JSONObject actual =
executeQuery(
String.format(
"source=%s | stats avg(balance) by span(birthdate, 1 day) as age_balance",
"source=%s | stats avg(balance) by span(birthdate, 1 month) as age_balance", |
We need some ITs for spans such as 15 minutes.
Added a 15-minute IT.
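As a rough picture of what a 15-minute span test exercises, the sketch below groups rows into 15-minute buckets and averages a balance per bucket, similar in spirit to stats avg(balance) by span(birthdate, 15 minutes). It is not the integration test from this PR; SpanGroupingSketch, Row, and avgBalanceBySpan are hypothetical names, and timestamps are assumed to be epoch milliseconds.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical sketch of averaging a value per fixed-width time bucket. */
final class SpanGroupingSketch {
    private SpanGroupingSketch() {}

    /** One input row: a timestamp in epoch milliseconds and a balance. */
    static final class Row {
        final long epochMillis;
        final double balance;

        Row(long epochMillis, double balance) {
            this.epochMillis = epochMillis;
            this.balance = balance;
        }
    }

    /** Returns the average balance keyed by bucket start, ordered by bucket. */
    static Map<Long, Double> avgBalanceBySpan(List<Row> rows, long widthMillis) {
        return rows.stream().collect(Collectors.groupingBy(
            // Bucket key: the timestamp floored to a multiple of the width.
            r -> Math.floorDiv(r.epochMillis, widthMillis) * widthMillis,
            TreeMap::new,
            Collectors.averagingDouble(r -> r.balance)));
    }
}
```

For a width of 900000 ms (15 minutes), rows at minute 5 and minute 10 share the bucket starting at 0, while a row at minute 20 falls into the bucket starting at 900000.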
Signed-off-by: Songkan Tang <[email protected]>
Flaky test
Let me trigger a re-run attempt. cc @yuancu
Signed-off-by: Songkan Tang <[email protected]>
Added the corresponding change and more ITs to cover span over different kinds of date formats.
Looks like we have some flaky tests, this time there is one in
Merged commit b99130a into opensearch-project:feature/calcite-engine
The failure was due to a flaky test; merged.
Description
Support group by span over time based column with Span UDF
Related Issues
Resolves #3354
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.