Description
What is the bug?
This is a bug for three related issues around booleans in WHERE clauses. I suspect without proof that they all have a similar root cause. All examples here use the ecommerce dataset as an example index, but should be reproducible with any index.
How can one reproduce the bug?
WHERE FALSE
matches all records.
Query:
POST _plugins/_sql/_explain
{
"query": "SELECT COUNT(*) FROM opensearch_dashboards_sample_data_ecommerce WHERE FALSE"
}
Result:
{
"schema": [
{
"name": "COUNT(*)",
"type": "integer"
}
],
"datarows": [
[
4675 // should be 0
]
],
"total": 1,
"size": 1,
"status": 200
}
WHERE NOT(x)
causes an error for any constant boolean.
Query:
POST _plugins/_sql
{
"query": "SELECT * FROM opensearch_dashboards_sample_data_ecommerce WHERE NOT(FALSE)"
}
Result:
{
"error": {
"reason": "Invalid SQL query",
"details": "inner bool query clause cannot be null",
"type": "IllegalArgumentException"
},
"status": 400
}
The same happens with WHERE NOT(TRUE)
, WHERE NOT(NOT(FALSE))
, WHERE NOT(FALSE OR TRUE)
, and similar.
- Internal druid exceptions raised when building expressions with
NOT(NULL)
.
Query:
POST _plugins/_sql
{
"query": "SELECT * FROM opensearch_dashboards_sample_data_ecommerce WHERE TRUE = NOT(NULL)"
}
Result:
{
"error": {
"reason": "Invalid SQL query",
"details": "err find condition class com.alibaba.druid.sql.ast.expr.SQLBooleanExpr",
"type": "SqlParseException"
},
"status": 400
}
NOT(NULL)
is equivalent to NULL
, so this whole expression should be equivalent to WHERE NULL
and return no records1.
What is the expected behavior?
These expressions should be correctly evaluated and applied. They're somewhat weird examples for human-written queries, but automatic query builders may produce queries like this, especially for WHERE FALSE
.
What is your host/environment?
- SQL: 0e61d20
Do you have any screenshots?
N/A
Do you have any additional context?
Found by the WIP distributed-testing suite. See: #3220
Footnotes
-
It's worth noting that SQL really uses ternary logic: NULL lives among the typical boolean values and the 3 values generate their own truth tables. This is (for better or for worse) pretty fundamental to SQL's operation and is the principle that TLP is built on. As such, in boolean handling, we should really treat NULL as a bona fide boolean. ↩