Merge feature/calcite-engine to main #3448

Merged
merged 38 commits into opensearch-project:main on Mar 21, 2025

Conversation

penghuo
Collaborator

@penghuo penghuo commented Mar 19, 2025

Description

Merge feature branch to main

Related Issues

n/a

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

LantaoJin and others added 30 commits January 17, 2025 13:58
…er (opensearch-project#3249)

* First commit for Calcite integration

Signed-off-by: Lantao Jin <[email protected]>

* disable java security manager in IT

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
…ject#3258)

* [POC] Make Calcite execute successfully

Signed-off-by: Heng Qian <[email protected]>

* [POC] Change caching schema to simple schema and avoid registering table when visitRelation.

Signed-off-by: Heng Qian <[email protected]>

* spotlessApply

Signed-off-by: Heng Qian <[email protected]>

* address comments

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
* Make basic aggregation working (partial)

Signed-off-by: Lantao Jin <[email protected]>

* add a setting to enable calcite

Signed-off-by: Lantao Jin <[email protected]>

* add more UTs

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
opensearch-project#3327)

* Support Filter and Project pushdown

Signed-off-by: Heng Qian <[email protected]>

* Support Filter and Project pushdown v2

Signed-off-by: Heng Qian <[email protected]>

* Address comments

Signed-off-by: Heng Qian <[email protected]>

* Add original license for PredicateAnalyzer

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
* Build integration test framework

Signed-off-by: Lantao Jin <[email protected]>

* make local work

Signed-off-by: Lantao Jin <[email protected]>

* Fix the timestamp issue

Signed-off-by: Lantao Jin <[email protected]>

* address comments

Signed-off-by: Lantao Jin <[email protected]>

* fix java style and rename CalcitePPLTestCase back to CalcitePPLIntegTestCase

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
…oject#3355)

* Add more aggregation tests

Signed-off-by: Lantao Jin <[email protected]>

* delete irrelevant code

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
* Transform to calcite plan before executing

Signed-off-by: Heng Qian <[email protected]>

* Fix bug for single column row

Signed-off-by: Heng Qian <[email protected]>

* Add settings for calcite pushdown

Signed-off-by: Heng Qian <[email protected]>

* Lazily construct OpenSearchRequestBuilder and do push down

Signed-off-by: Heng Qian <[email protected]>

* Address comments and disable push down

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
* Fix PredicateAnalyzer for in and notIn

Signed-off-by: Heng Qian <[email protected]>

* Change text field to keyword since we don't support push down for that type

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
…3376)

* [BugFix] Fix text field push down

Signed-off-by: Heng Qian <[email protected]>

* Ignore CalciteSortCommandIT.testSortWithNullValue

Signed-off-by: Heng Qian <[email protected]>

* Refine code: only get keyword subfield for termQuery builder

Signed-off-by: Heng Qian <[email protected]>

* Refine code

Signed-off-by: Heng Qian <[email protected]>

* remove ignore tests in CalcitePPLInSubqueryIT

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
* add udf/udaf interface and take/sqrt function

Signed-off-by: xinyual <[email protected]>

* add UT

Signed-off-by: xinyual <[email protected]>

* add POW, Atan, Atan2 and corresponding UT

Signed-off-by: xinyual <[email protected]>

* apply spotless

Signed-off-by: xinyual <[email protected]>

* fix table for join it

Signed-off-by: xinyual <[email protected]>

* add java doc

Signed-off-by: xinyual <[email protected]>

* apply spotless

Signed-off-by: xinyual <[email protected]>

---------

Signed-off-by: xinyual <[email protected]>
…t#3392)

* Implement ppl scalar subquery command with Calcite

Signed-off-by: Lantao Jin <[email protected]>

* more general subquery checker

Signed-off-by: Lantao Jin <[email protected]>

* support correlated IN subquery

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
* Change push down to logical index scan

Signed-off-by: Heng Qian <[email protected]>

* Support Aggregate Push Down

Signed-off-by: Heng Qian <[email protected]>

* Rebase and resolve conflict

Signed-off-by: Heng Qian <[email protected]>

* Add TODO

Signed-off-by: Heng Qian <[email protected]>

* Address comments

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
* add string udfs

Signed-off-by: xinyual <[email protected]>

* add it to string

Signed-off-by: xinyual <[email protected]>

* add IT for string function

Signed-off-by: xinyual <[email protected]>

* remove change for local test

Signed-off-by: xinyual <[email protected]>

* revert change

Signed-off-by: xinyual <[email protected]>

---------

Signed-off-by: xinyual <[email protected]>
…nsearch-project#3405)

* Keep aggregation in Calcite consistent with current PPL behavior

Signed-off-by: Lantao Jin <[email protected]>

* remove unrelated code

Signed-off-by: Lantao Jin <[email protected]>

* revert some code

Signed-off-by: Lantao Jin <[email protected]>

* fix issue 3404

Signed-off-by: Lantao Jin <[email protected]>

* add more tests

Signed-off-by: Lantao Jin <[email protected]>

* address comments

Signed-off-by: Lantao Jin <[email protected]>

* add more tests

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
* Support multiple table and index pattern

Signed-off-by: Heng Qian <[email protected]>

* Fix UT

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
* add condition udfs

Signed-off-by: xinyual <[email protected]>

* add IT for conditions and register null table

Signed-off-by: xinyual <[email protected]>

* fix it

Signed-off-by: xinyual <[email protected]>

* update utils define

Signed-off-by: xinyual <[email protected]>

* add condition functions

Signed-off-by: xinyual <[email protected]>

* modify IT

Signed-off-by: xinyual <[email protected]>

* fix IT

Signed-off-by: xinyual <[email protected]>

* revert useless change and add comments

Signed-off-by: xinyual <[email protected]>

* reverse typo and apply spotless

Signed-off-by: xinyual <[email protected]>

---------

Signed-off-by: xinyual <[email protected]>
* Revert alias change, Fix IT

Signed-off-by: Peng Huo <[email protected]>

* Fix spotlessCheck

Signed-off-by: Peng Huo <[email protected]>

* Revert development test

Signed-off-by: Peng Huo <[email protected]>

* Fix PPL Test

Signed-off-by: Peng Huo <[email protected]>

* license header

Signed-off-by: Peng Huo <[email protected]>

* Ignore flaky test

Signed-off-by: Peng Huo <[email protected]>

---------

Signed-off-by: Peng Huo <[email protected]>
* revert result ordering of stats-by

Signed-off-by: Lantao Jin <[email protected]>

* Fix CRLF issue

Signed-off-by: Lantao Jin <[email protected]>

* only check spark sql

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
* Implement ppl lookup command with Calcite

Signed-off-by: Lantao Jin <[email protected]>

* step 2

Signed-off-by: Lantao Jin <[email protected]>

* add all lookup IT

Signed-off-by: Lantao Jin <[email protected]>

* Support lookup command

Signed-off-by: Heng Qian <[email protected]>

* Refactor Lookup

Signed-off-by: Heng Qian <[email protected]>

* Refine Code

Signed-off-by: Heng Qian <[email protected]>

* Fix UT

Signed-off-by: Heng Qian <[email protected]>

* Add anonymizer for lookup

Signed-off-by: Heng Qian <[email protected]>

* Refine code

Signed-off-by: Heng Qian <[email protected]>

* Fix UT

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Co-authored-by: Heng Qian <[email protected]>
Co-authored-by: Lantao Jin <[email protected]>
* Support ppl BETWEEN operation within Calcite

Signed-off-by: Lantao Jin <[email protected]>

* add more tests

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
* Correct the precedence for logical operators

Signed-off-by: Lantao Jin <[email protected]>

* fix flaky test

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
* Implement ppl dedup command with Calcite

Signed-off-by: Lantao Jin <[email protected]>

* remove union

Signed-off-by: Lantao Jin <[email protected]>

---------

Signed-off-by: Lantao Jin <[email protected]>
@penghuo penghuo added the calcite calcite migration releated label Mar 19, 2025
@penghuo penghuo changed the title Calcite engine merge v2 Merge feature/calcite-engine to main Mar 19, 2025
@penghuo penghuo marked this pull request as ready for review March 19, 2025 22:40
@LantaoJin
Member

The DCO check fails with an odd summary. For example:

Commit sha: a62d87d, Author: Lantao Jin, Committer: GitHub; The sign-off is missing.

However, commit a62d87d was introduced by https://github.com/opensearch-project/sql/pull/3371/checks, which is signed off.

xinyual and others added 2 commits March 20, 2025 10:55
* add math udfs

Signed-off-by: xinyual <[email protected]>

* add log argument

Signed-off-by: xinyual <[email protected]>

* Add math function unit tests
- Additionally implement user-defined ConvFunction

Signed-off-by: Yuanchun Shen <[email protected]>

* Add integration tests for Calcite math functions

Signed-off-by: Yuanchun Shen <[email protected]>

* Rename CalcitePPLMathFunctionsIT to CalcitePPLBuiltinFunctionIT

Signed-off-by: Yuanchun Shen <[email protected]>

* add license

Signed-off-by: xinyual <[email protected]>

* apply spot

Signed-off-by: xinyual <[email protected]>

* Update the implementation of CONV function to align with v2's behavior
- Rename UserDefineFunctionUtils to UserDefinedFunctionUtils

Signed-off-by: Yuanchun Shen <[email protected]>

* Improve code style:
- enforce uniform parameter number check
- comment on differences from calcite's implementation if necessary

Signed-off-by: Yuanchun Shen <[email protected]>

* Simplify Calcite PPL math function unit tests

Signed-off-by: Yuanchun Shen <[email protected]>

* Alter MOD and SQRT UDF to conform to documented behaviors
- return null with invalid (zero, negative) arguments
- return wider type for mod

Signed-off-by: Yuanchun Shen <[email protected]>

* Complicate math integration tests
- edge cases for UDF
- combine operations or clauses

Signed-off-by: Yuanchun Shen <[email protected]>

* Handle NULL return in ASIN, ACOS, SQRT and POW by converting returned Double.NaN and Float.NaN to null

Signed-off-by: Yuanchun Shen <[email protected]>

* Apply spotless on math UDFs and their tests

Signed-off-by: Yuanchun Shen <[email protected]>

* Remove unnecessary Double cast in SQRT UDF

Signed-off-by: Yuanchun Shen <[email protected]>

* Convert returned Double.NaN and Float.NaN from math UDFs to LITERAL_NULL

Signed-off-by: Yuanchun Shen <[email protected]>

* Correct math UDF integration tests
- remove comparison between string and integers
- correct thrown error types

Signed-off-by: Yuanchun Shen <[email protected]>

* Update MOD UDF
- add alias % to MOD
- return negative when the dividend is negative

Signed-off-by: Yuanchun Shen <[email protected]>

* Modify substring ITs

Signed-off-by: Yuanchun Shen <[email protected]>

* apply spot

Signed-off-by: xinyual <[email protected]>

* Replace containsMessage with verifyErrorMessageContains in math ITs

Signed-off-by: Yuanchun Shen <[email protected]>

* Correct MOD return types
- additionally enrich math ITs with fields calculations

Signed-off-by: Yuanchun Shen <[email protected]>

* fix UT

Signed-off-by: xinyual <[email protected]>

---------

Signed-off-by: xinyual <[email protected]>
Signed-off-by: Yuanchun Shen <[email protected]>
Co-authored-by: xinyual <[email protected]>
Co-authored-by: Yuanchun Shen <[email protected]>
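
The commit messages above describe three behaviors for the math UDFs: NaN results are converted to null, MOD and SQRT return null for invalid arguments, and MOD returns a negative result when the dividend is negative. A minimal sketch of those semantics in plain Java follows. The class and method names (`MathUdfSketch`, `nanToNull`, and so on) are hypothetical and are not taken from the project; this is not the Calcite UDF code merged in this PR, only an illustration of the behavior the messages describe.

```java
/** Illustrative only: hypothetical names, not the project's UDF classes. */
public final class MathUdfSketch {
    private MathUdfSketch() {}

    /** Maps NaN to null so callers see a NULL instead of NaN. */
    public static Double nanToNull(double value) {
        return Double.isNaN(value) ? null : value;
    }

    /** Math.sqrt yields NaN for negative input, which becomes null here. */
    public static Double sqrt(Double x) {
        if (x == null) {
            return null;
        }
        return nanToNull(Math.sqrt(x));
    }

    /** Math.asin yields NaN when |x| > 1, which becomes null here. */
    public static Double asin(Double x) {
        if (x == null) {
            return null;
        }
        return nanToNull(Math.asin(x));
    }

    /**
     * Returns null for a zero divisor. Java's % operator already gives
     * the result the sign of the dividend, so -7 % 3 is -1.
     */
    public static Long mod(Long dividend, Long divisor) {
        if (dividend == null || divisor == null || divisor == 0L) {
            return null;
        }
        return dividend % divisor;
    }

    public static void main(String[] args) {
        System.out.println(sqrt(-4.0));   // negative argument
        System.out.println(asin(2.0));    // out-of-domain argument
        System.out.println(mod(7L, 0L));  // zero divisor
        System.out.println(mod(-7L, 3L)); // negative dividend
    }
}
```

Returning boxed `Double` and `Long` values is a choice made for this sketch so that null can stand in for a SQL NULL; the merged code may represent nulls differently.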
* Fix flaky tests: testSubstring, testPosition, testLike

Signed-off-by: Yuanchun Shen <[email protected]>

* Keep generated code from spotless check

Signed-off-by: Yuanchun Shen <[email protected]>

---------

Signed-off-by: Yuanchun Shen <[email protected]>
LantaoJin
LantaoJin previously approved these changes Mar 20, 2025
Member

@LantaoJin LantaoJin left a comment

LGTM. No squash merging, please.

@penghuo penghuo merged commit 32fc251 into opensearch-project:main Mar 21, 2025
20 of 22 checks passed
Labels
calcite calcite migration releated
7 participants