
[FEATURE] Align Spark PPL Data Type with OpenSearch PPL Data Type #1057

Open
penghuo opened this issue Feb 17, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@penghuo
Collaborator

penghuo commented Feb 17, 2025

Is your feature request related to a problem?

  • Create a table in Spark:
CREATE TABLE numeric_types_table (
  tinyint_col    TINYINT, 
  smallint_col   SMALLINT, 
  int_col        INT,       
  bigint_col     BIGINT,     
  float_col      FLOAT,       
  double_col     DOUBLE,      
  decimal_col    DECIMAL(10,2)
)
>>> describe numeric_types_table;
tinyint_col             tinyint
smallint_col            smallint
int_col                 int
bigint_col              bigint
float_col               float
double_col              double
decimal_col             decimal(10,2)
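The mismatch can be sketched as a name-level mapping. This is a hypothetical illustration (not the project's actual API): OpenSearch PPL surfaces types such as `byte`/`short`/`integer`/`long`, while Spark's `DESCRIBE` reports `tinyint`/`smallint`/`int`/`bigint`, and `decimal` has no direct OpenSearch counterpart.

```python
# Hypothetical mapping from Spark SQL numeric type names to the
# OpenSearch PPL type names they would need to align with.
SPARK_TO_OPENSEARCH_PPL = {
    "tinyint": "byte",
    "smallint": "short",
    "int": "integer",
    "bigint": "long",
    "float": "float",
    "double": "double",
    # decimal has no direct OpenSearch field type; alignment would
    # require an explicit mapping decision (shown here as an assumption).
    "decimal(10,2)": "double",
}

def align_type(spark_type: str) -> str:
    """Return the OpenSearch PPL name for a Spark type, falling back to the input."""
    return SPARK_TO_OPENSEARCH_PPL.get(spark_type, spark_type)
```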

What solution would you like?
[RFC] Unified PPL Data Type

@penghuo penghuo added the enhancement (New feature or request) and untriaged labels Feb 17, 2025
@penghuo penghuo removed the untriaged label Feb 17, 2025
@penghuo penghuo changed the title [FEATURE] Spark PPL does not aligned with OpenSearch PPL Data Type [FEATURE] Align Spark PPL Data Type with OpenSearch PPL Data Type Feb 17, 2025
@LantaoJin
Copy link
Member

LantaoJin commented Feb 18, 2025

My first thought was that the type system should be engine-related rather than language-related, because we can't predict which engines PPL will be used with in the future. That would mean matching all PPL types against every execution engine if we want to do alignment work. As we know, execution engines are concrete and designed for specific scenarios, while language definitions are usually more generalized. However, looking at it from another perspective, it is necessary to define types for a language. Even ANSI SQL has its predefined types (although these types generally need to be implemented or mapped in the various SQL execution engines).

But aligning every engine's data types with the PPL data types is challenging work. The current OpenSearch PPL data types, IMO, look more like OpenSearch's own data types. Fortunately, Spark supports relatively few types; they are almost a subset of OpenSearch's types. For example, we are adding more OpenSearch types to Spark: #1044
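The ANSI SQL analogy above can be sketched as a small engine-agnostic layer: the language defines abstract PPL types, and each engine registers its own concrete mapping. All names here (`PPLType`, `SPARK_MAPPING`, `render`) are illustrative assumptions, not the project's actual design.

```python
from enum import Enum

class PPLType(Enum):
    """Abstract, engine-independent PPL type names (illustrative)."""
    BYTE = "byte"
    SHORT = "short"
    INTEGER = "integer"
    LONG = "long"
    FLOAT = "float"
    DOUBLE = "double"

# Each execution engine would supply its own concrete-type mapping,
# analogous to how SQL engines implement ANSI SQL's predefined types.
SPARK_MAPPING = {
    PPLType.BYTE: "tinyint",
    PPLType.SHORT: "smallint",
    PPLType.INTEGER: "int",
    PPLType.LONG: "bigint",
    PPLType.FLOAT: "float",
    PPLType.DOUBLE: "double",
}

def render(ppl_type: PPLType, engine_mapping: dict) -> str:
    # Fall back to the abstract PPL name when an engine defines no concrete type.
    return engine_mapping.get(ppl_type, ppl_type.value)
```

Under this sketch, adding a new engine only requires a new mapping dict rather than changes to the language-level type definitions.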
