Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Calcite Engine] Support In expression #3429

Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@
import org.opensearch.sql.ast.expression.Compare;
import org.opensearch.sql.ast.expression.EqualTo;
import org.opensearch.sql.ast.expression.Function;
import org.opensearch.sql.ast.expression.In;
import org.opensearch.sql.ast.expression.Let;
import org.opensearch.sql.ast.expression.Literal;
import org.opensearch.sql.ast.expression.Not;
Expand Down Expand Up @@ -143,6 +144,27 @@ public RexNode visitNot(Not node, CalcitePlanContext context) {
return context.relBuilder.not(expr);
}

@Override
public RexNode visitIn(In node, CalcitePlanContext context) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calcite will optimize In expression to equal comparison if there is only 1 element in value list. We don't need to do such optimization ourself here like v2.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to do such optimization ourself here like v2.

Do u mean we push down IN to OpenSearch as terms query, so Calcite optimized is not needed?

final RexNode field = analyze(node.getField(), context);
final List<RexNode> valueList =
node.getValueList().stream().map(value -> analyze(value, context)).toList();
final List<RelDataType> dataTypes =
new java.util.ArrayList<>(valueList.stream().map(RexNode::getType).toList());
dataTypes.add(field.getType());
RelDataType commonType = context.rexBuilder.getTypeFactory().leastRestrictive(dataTypes);
if (commonType != null) {
List<RexNode> newValueList =
valueList.stream().map(value -> context.rexBuilder.makeCast(commonType, value)).toList();
return context.rexBuilder.makeIn(field, newValueList);
} else {
throw new SemanticCheckException(
StringUtils.format(
"In expression types are incompatible: fields type %s, values type %s",
dataTypes.getLast(), dataTypes.subList(0, dataTypes.size() - 1)));
}
}

@Override
public RexNode visitCompare(Compare node, CalcitePlanContext context) {
SqlOperator op = BuiltinFunctionUtils.translate(node.getOperator());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,25 @@
import org.junit.Ignore;
import org.opensearch.sql.ppl.WhereCommandIT;

@Ignore("Not all boolean functions are supported in Calcite now")
public class CalciteWhereCommandIT extends WhereCommandIT {
@Override
public void init() throws IOException {
enableCalcite();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we disableCalcite also? otherwise it may impact other tests?

Copy link
Contributor Author

@qianheng-aws qianheng-aws Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have exclude 'org/opensearch/sql/calcite/remote/**' in build.gradle, it won't really run in CI. The parent class WhereCommandIT is the suite for test cases without calcite enabled.

disallowCalciteFallback();
super.init();
}

@Ignore("https://github.com/opensearch-project/sql/issues/3428")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3428 is LIKE function, why releated to testIsNotNull?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test case contains a like function as well.

"source=%s | where isnotnull(age) and like(firstname, 'Ambe_') | fields firstname"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WhereCommandIT is an existed ppl IT which is extended by CalciteWhereCommandIT

@Override
public void testIsNotNullFunction() throws IOException {}

@Ignore("https://github.com/opensearch-project/sql/issues/3333")
@Override
public void testWhereWithMetadataFields() throws IOException {}

@Override
protected String getIncompatibleTypeErrMsg() {
return "In expression types are incompatible: fields type BIGINT, values type [INTEGER,"
+ " INTEGER, CHAR(4)]";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHAR(4) is not a valid OpenSearch PPL datatype.

Copy link
Contributor Author

@qianheng-aws qianheng-aws Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. No, I mean v2 has optimization to transform column in (a) to column = a when converting. Since calcite will do such optimization in its optimization phase, we just need to convert it to the former expression directly without too much consideration.

}
}
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,14 @@

package org.opensearch.sql.ppl;

import static org.hamcrest.CoreMatchers.containsString;
import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_ACCOUNT;
import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_BANK_WITH_NULL_VALUES;
import static org.opensearch.sql.util.MatcherUtils.rows;
import static org.opensearch.sql.util.MatcherUtils.verifyDataRows;

import java.io.IOException;
import org.hamcrest.MatcherAssert;
import org.json.JSONObject;
import org.junit.jupiter.api.Test;

Expand Down Expand Up @@ -106,4 +108,73 @@ public void testWhereWithMetadataFields() throws IOException {
String.format("source=%s | where _id='1' | fields firstname", TEST_INDEX_ACCOUNT));
verifyDataRows(result, rows("Amber"));
}

@Test
public void testWhereWithIn() throws IOException {
JSONObject result =
executeQuery(
String.format(
"source=%s | where firstname in ('Amber') | fields firstname", TEST_INDEX_ACCOUNT));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a test for

where age in (36, 36.0, '36')

verifyDataRows(result, rows("Amber"));

result =
executeQuery(
String.format(
"source=%s | where firstname in ('Amber', 'Dale') | fields firstname",
TEST_INDEX_ACCOUNT));
verifyDataRows(result, rows("Amber"), rows("Dale"));

result =
executeQuery(
String.format(
"source=%s | where balance in (4180, 5686.0) | fields balance",
TEST_INDEX_ACCOUNT));
verifyDataRows(result, rows(4180), rows(5686));
}

@Test
public void testWhereWithNotIn() throws IOException {
JSONObject result =
executeQuery(
String.format(
"source=%s | where account_number < 4 | where firstname not in ('Amber', 'Levine')"
+ " | fields firstname",
TEST_INDEX_ACCOUNT));
verifyDataRows(result, rows("Roberta"), rows("Bradshaw"));

result =
executeQuery(
String.format(
"source=%s | where account_number < 4 | where not firstname in ('Amber', 'Levine')"
+ " | fields firstname",
TEST_INDEX_ACCOUNT));
verifyDataRows(result, rows("Roberta"), rows("Bradshaw"));

result =
executeQuery(
String.format(
"source=%s | where not firstname not in ('Amber', 'Dale') | fields firstname",
TEST_INDEX_ACCOUNT));
verifyDataRows(result, rows("Amber"), rows("Dale"));
}

@Test
public void testInWithIncompatibleType() {
Exception e =
assertThrows(
Exception.class,
() -> {
executeQuery(
String.format(
"source=%s | where balance in (4180, 5686, '6077') | fields firstname",
TEST_INDEX_ACCOUNT));
});
MatcherAssert.assertThat(e.getMessage(), containsString(getIncompatibleTypeErrMsg()));
}

protected String getIncompatibleTypeErrMsg() {
return "function expected"
+ " {[BYTE,BYTE],[SHORT,SHORT],[INTEGER,INTEGER],[LONG,LONG],[FLOAT,FLOAT],[DOUBLE,DOUBLE],[STRING,STRING],[BOOLEAN,BOOLEAN],[DATE,DATE],[TIME,TIME],[TIMESTAMP,TIMESTAMP],[INTERVAL,INTERVAL],[IP,IP],[STRUCT,STRUCT],[ARRAY,ARRAY]},"
+ " but got [LONG,STRING]";
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -138,11 +138,13 @@ public UnresolvedExpression visitCompareExpr(CompareExprContext ctx) {

@Override
public UnresolvedExpression visitInExpr(InExprContext ctx) {
return new In(
visit(ctx.valueExpression()),
ctx.valueList().literalValue().stream()
.map(this::visitLiteralValue)
.collect(Collectors.toList()));
UnresolvedExpression expr =
new In(
visit(ctx.valueExpression()),
ctx.valueList().literalValue().stream()
.map(this::visitLiteralValue)
.collect(Collectors.toList()));
return ctx.NOT() != null ? new Not(expr) : expr;
}

/** Value Expression. */
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
import org.opensearch.sql.ast.expression.Compare;
import org.opensearch.sql.ast.expression.Field;
import org.opensearch.sql.ast.expression.Function;
import org.opensearch.sql.ast.expression.In;
import org.opensearch.sql.ast.expression.Interval;
import org.opensearch.sql.ast.expression.Let;
import org.opensearch.sql.ast.expression.Literal;
Expand Down Expand Up @@ -421,6 +422,12 @@ public String visitBetween(Between node, String context) {
return StringUtils.format("%s between %s and %s", value, left, right);
}

@Override
public String visitIn(In node, String context) {
String field = analyze(node.getField(), context);
return StringUtils.format("%s in (%s)", field, MASK_LITERAL);
}

@Override
public String visitField(Field node, String context) {
return node.getField().toString();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,34 @@ public void testFilterQueryWithOr2() {
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testFilterQueryWithIn() {
String ppl = "source=scott.products_temporal | where ID in ('1000', '2000')";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalFilter(condition=[SEARCH($0, Sarg['1000':VARCHAR(32),"
+ " '2000':VARCHAR(32)]:VARCHAR(32))])\n"
+ " LogicalTableScan(table=[[scott, products_temporal]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT *\nFROM `scott`.`products_temporal`\nWHERE `ID` IN ('1000', '2000')";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testFilterQueryWithIn2() {
String ppl = "source=EMP | where DEPTNO in (20, 30.0)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalFilter(condition=[SEARCH($7, Sarg[20.0E0:DOUBLE, 30.0E0:DOUBLE]:DOUBLE)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql = "SELECT *\nFROM `scott`.`EMP`\nWHERE `DEPTNO` IN (2.00E1, 3.00E1)";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testQueryWithFields() {
String ppl = "source=products_temporal | fields SUPPLIER, ID";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -154,6 +154,11 @@ public void testNotExpression() {
assertEquals("source=t | where not a = ***", anonymize("source=t | where not a=1 "));
}

@Test
public void testInExpression() {
assertEquals("source=t | where a in (***)", anonymize("source=t | where a in (1, 2, 3) "));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test for

where a not in (1, 2, 3)

Copy link
Contributor Author

@qianheng-aws qianheng-aws Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"not in" is not supported in our g4, which mean v2 engine don't support it either. Need to add that implementation for v2 engine as well.

}

@Test
public void testQualifiedName() {
assertEquals("source=t | fields + field0", anonymize("source=t | fields field0"));
Expand Down
Loading