Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Mismatch in documented and actual function argument types #3494

Open
Tracked by #3487
yuancu opened this issue Mar 31, 2025 · 1 comment
Open
Tracked by #3487

[BUG] Mismatch in documented and actual function argument types #3494

yuancu opened this issue Mar 31, 2025 · 1 comment
Labels
breaking bug Something isn't working

Comments

@yuancu
Copy link
Contributor

yuancu commented Mar 31, 2025

What is the bug?

I found there are mismatches between declared and actual accepting argument types in datetime functions.

The documentation of datetime functions: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.rst

Accepting TIME, but not declared to do so

  • DAY
  • DAY_OF_WEEK
  • DAYOFWEEK
  • DAY_OF_YEAR
  • DAYOFYEAR
  • DAYNAME
  • DAYOFMONTH
  • DAY_OF_MONTH
  • MONTHNAME
  • QUARTER
  • TO_DAYS
  • YEAR
  • DATE
  • MONTH
  • MONTH_OF_YEAR
  • LAST_DAY
  • WEEK_DAY
  • DAYOFWEEK
  • UNIX_TIMESTAMP

NOT Accepting TIMESTAMP, but declared to do so:

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Send a PPL request to local cluster

    curl --location 'http://localhost:9200/_plugins/_ppl' \
    --header 'Content-Type: application/json' \
    --data '{
      "query" : "source=[ANY_INDEX]| head 1 | eval d = unix_timestamp(time('\''12:00:00'\'')) | fields d"
    }
    '
    
  2. It returns a value, assuming that the date is today. However, TIME is not supported as an argument for unix_timestamp. This behavior is undocumented.

    {
        "schema": [
            {
                "name": "d",
                "type": "double"
            }
        ],
        "datarows": [
            [
                1.7434224E9
            ]
        ],
        "total": 1,
        "size": 1
    }

Moreover, when I pass a time from the index, it reports an error, but with unexpected error message:

  1. Send a ppl query: source=dates | head 1 | eval d = unix_timestamp(time) | fields d, where time is of type TIME.

  2. It reports an error like below:

    {
      "error": {
        "reason": "Invalid Query",
        "details": "invalid to get doubleValue from value of type TIME",
        "type": "ExpressionEvaluationException"
      },
      "status": 400
    }
    

What is the expected behavior?
To report an error on the incompatible argument, resembling the following:

{
  "error": {
    "reason": "Invalid Query",
    "details": "unix_timestamp function expected {[], [DOUBLE],[DATE],[TIMESTAMP]}, but get [TIME]",
    "type": "ExpressionEvaluationException"
  },
  "status": 400
}

What is your host/environment?

  • OS: MacOS
  • Version: 2.19.1
  • Plugins: opensearch-sql 2.19.1.0 & opensearch-job-scheduler 2.19.1.0
@yuancu yuancu added bug Something isn't working untriaged labels Mar 31, 2025
@yuancu yuancu changed the title [BUG] Mismatch in declared and actual function argument types [BUG] Mismatch in documented and actual function argument types Mar 31, 2025
@LantaoJin
Copy link
Member

LantaoJin commented Mar 31, 2025

unix_timestamp(time('\''12:00:00'\'')) triggers implicit conversion, converting input parameters from time type to timestamp type. I propose removing this implicit conversion in 3.0.0-beta for 4 reasons:

  1. The implicit conversion from time to timestamp type can easily lead to semantic errors. Most databases do not allow such implicit conversion. For example, in PostgreSQL.
select unix_timestamp(time '12:00:00' );

will fail with

LINE 1: select unix_timestamp(time '12:00:00' );
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

MySQL is an exception.

  1. Spark does not support the time data type.
  2. time is not declared as an input parameter in doc, https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/datetime.rst#utc-timestamp
  3. Support implicit conversion (time -> timestamp) for those functions in Calcite imports additional efforts.

@penghuo @dai-chen , any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants