Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Calcite Engine] Support In expression #3429

Conversation

qianheng-aws
Copy link
Contributor

@qianheng-aws qianheng-aws commented Mar 14, 2025

Description

Support In expression, also add data anonymizer for it.

./gradlew :integ-test:integTest --tests '*Calcite*IT succeed locally.

../../../ppl/src/test/java/org/opensearch/sql/ppl/calcite/* UT succeed locally.

Related Issues

Resolves #3420

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -139,6 +140,16 @@ public RexNode visitNot(Not node, CalcitePlanContext context) {
return context.relBuilder.not(expr);
}

@Override
public RexNode visitIn(In node, CalcitePlanContext context) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calcite will optimize In expression to equal comparison if there is only 1 element in value list. We don't need to do such optimization ourself here like v2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do such optimization ourself here like v2.

Do u mean we push down IN to OpenSearch as terms query, so Calcite optimized is not needed?

@Ignore("https://github.com/opensearch-project/sql/issues/3428")
@Override
public void testIsNotNullFunction() throws IOException {}
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this ;

Signed-off-by: Heng Qian <[email protected]>
@LantaoJin LantaoJin added the calcite calcite migration releated label Mar 14, 2025
@@ -154,6 +154,11 @@ public void testNotExpression() {
assertEquals("source=t | where not a = ***", anonymize("source=t | where not a=1 "));
}

@Test
public void testInExpression() {
assertEquals("source=t | where a in (***)", anonymize("source=t | where a in (1, 2, 3) "));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test for

where a not in (1, 2, 3)

Copy link
Contributor Author

@qianheng-aws qianheng-aws Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"not in" is not supported in our g4, which mean v2 engine don't support it either. Need to add that implementation for v2 engine as well.

JSONObject result =
executeQuery(
String.format(
"source=%s | where firstname in ('Amber') | fields firstname", TEST_INDEX_ACCOUNT));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a test for

where age in (36, 36.0, '36')

Signed-off-by: Heng Qian <[email protected]>
…ine' into feature/calcite-engine-in

# Conflicts:
#	ppl/src/main/antlr/OpenSearchPPLParser.g4
#	ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
public class CalciteWhereCommandIT extends WhereCommandIT {
@Override
public void init() throws IOException {
enableCalcite();
disallowCalciteFallback();
super.init();
}

@Ignore("https://github.com/opensearch-project/sql/issues/3428")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3428 is LIKE function, why releated to testIsNotNull?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test case contains a like function as well.

"source=%s | where isnotnull(age) and like(firstname, 'Ambe_') | fields firstname"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WhereCommandIT is an existed ppl IT which is extended by CalciteWhereCommandIT

@@ -139,6 +140,16 @@ public RexNode visitNot(Not node, CalcitePlanContext context) {
return context.relBuilder.not(expr);
}

@Override
public RexNode visitIn(In node, CalcitePlanContext context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do such optimization ourself here like v2.

Do u mean we push down IN to OpenSearch as terms query, so Calcite optimized is not needed?

@Override
protected String getIncompatibleTypeErrMsg() {
return "In expression types are incompatible: fields type BIGINT, values type [INTEGER,"
+ " INTEGER, CHAR(4)]";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHAR(4) is not a valid OpenSearch PPL datatype.

Copy link
Contributor Author

@qianheng-aws qianheng-aws Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. No, I mean v2 has optimization to transform column in (a) to column = a when converting. Since calcite will do such optimization in its optimization phase, we just need to convert it to the former expression directly without too much consideration.

@@ -9,12 +9,25 @@
import org.junit.Ignore;
import org.opensearch.sql.ppl.WhereCommandIT;

@Ignore("Not all boolean functions are supported in Calcite now")
public class CalciteWhereCommandIT extends WhereCommandIT {
@Override
public void init() throws IOException {
enableCalcite();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we disableCalcite also? otherwise it may impact other tests?

Copy link
Contributor Author

@qianheng-aws qianheng-aws Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have exclude 'org/opensearch/sql/calcite/remote/**' in build.gradle, it won't really run in CI. The parent class WhereCommandIT is the suite for test cases without calcite enabled.

Signed-off-by: Heng Qian <[email protected]>
@LantaoJin LantaoJin merged commit 29c7625 into opensearch-project:feature/calcite-engine Mar 18, 2025
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
calcite calcite migration releated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants