Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Sum multiple null values should return null instead of 0 #3408

Open
LantaoJin opened this issue Mar 11, 2025 · 4 comments
Open

[BUG] Sum multiple null values should return null instead of 0 #3408

LantaoJin opened this issue Mar 11, 2025 · 4 comments
Labels
breaking bug Something isn't working calcite calcite migration releated

Comments

@LantaoJin
Copy link
Member

LantaoJin commented Mar 11, 2025

What is the bug?
In most databases. Below query will return null

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name VARCHAR(50),
    department VARCHAR(50),
    salary INT
);

INSERT INTO employees VALUES
(1, 'John', 'IT', 60000),
(2, 'Alice', 'IT', 55000),
(3, 'Bob', 'IT', null),
(4, 'Sarah', 'HR', 52000),
(5, 'Mike', 'HR', 48000),
(6, 'Lisa', 'Sales', null),
(7, 'Tom', 'Sales', null);

select sum(salary) from employees where department = 'Sales';

Return

+-------------+
| sum(salary) |
+-------------+
|        NULL |
+-------------+

But in PPL v2, it return 0. For example

  @Test
  public void testSumNull() {
    String response =
        execute(
            String.format(
                "source=%s | where department = 'Sales' | stats sum(salary)",
                TEST_INDEX_BANK_WITH_NULL_VALUES));
    assertEquals(
        ""
            + "{\n"
            + "  \"schema\": [\n"
            + "    {\n"
            + "      \"name\": \"sum(salary)\",\n"
            + "      \"type\": \"long\"\n"
            + "    }\n"
            + "  ],\n"
            + "  \"datarows\": [\n"
            + "    [\n"
            + "      0\n"
            + "    ]\n"
            + "  ],\n"
            + "  \"total\": 1,\n"
            + "  \"size\": 1\n"
            + "}",
        response);
  }

Current the v3 (Calcite) will return null as well. Should we change this behaviour align with other databases?

@LantaoJin LantaoJin added bug Something isn't working untriaged calcite calcite migration releated and removed untriaged labels Mar 11, 2025
@LantaoJin
Copy link
Member Author

LantaoJin commented Mar 11, 2025

If to align with other database, then will introduce a breaking change. cc @dai-chen @penghuo

@LantaoJin LantaoJin changed the title [BUG] Sum multiple null value should return 0 instead of null [BUG] Sum multiple null values should return null instead of 0 Mar 11, 2025
@LantaoJin
Copy link
Member Author

The root cause is the behavior of return 0 is from OpenSearch (agg pushdown applied). When we disable agg pushdown, the test case above will return null.

So I think it should be a bug of OpenSearch.

@LantaoJin
Copy link
Member Author

cc @gaobinlong

@gaobinlong
Copy link

It should be a bug, created an issue in OpenSearch core.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking bug Something isn't working calcite calcite migration releated
Projects
None yet
Development

No branches or pull requests

2 participants