Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unified OpenSearch PPL Data Type #3345

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
68 changes: 68 additions & 0 deletions core/src/main/java/org/opensearch/sql/lang/LangSpec.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.lang;

import static org.opensearch.sql.lang.LangSpec.LangType.PPL;

import org.opensearch.sql.data.type.ExprType;

/**
* Represents a language specification for query processing.
*
* <p>This interface defines basic methods for language-specific behaviors, such as determining the
* language type and mapping expression types to type names. Two language specifications are
* provided: SQL and PPL.
*/
public interface LangSpec {
/** Enumerates the supported language types. */
enum LangType {
/** SQL language specification. */
SQL,
/** PPL (Piped Processing Language) language specification. */
PPL
}

/** The default SQL language specification instance. */
LangSpec SQL_SPEC = new LangSpec() {};

/**
* Returns a language specification instance based on the provided language name.
*
* @param language the name of the language, case-insensitive.
* @return the PPL language specification if the language is PPL (ignoring case); otherwise, the
* SQL language specification.
*/
static LangSpec fromLanguage(String language) {
if (PPL.name().equalsIgnoreCase(language)) {
return PPLLangSpec.PPL_SPEC;
} else {
return SQL_SPEC;
}
}

/**
* Returns the language type of this specification.
*
* <p>By default, the language is considered SQL.
*
* @return the language type, SQL by default.
*/
default LangType language() {
return LangType.SQL;
}

/**
* Returns the type name for the given expression type.
*
* <p>This default implementation returns the result of {@code exprType.typeName()}.
*
* @param exprType the expression type.
* @return the type name of the expression.
*/
default String typeName(ExprType exprType) {
return exprType.typeName();
}
}
50 changes: 50 additions & 0 deletions core/src/main/java/org/opensearch/sql/lang/PPLLangSpec.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.lang;

import java.util.HashMap;
import java.util.Map;
import org.opensearch.sql.data.type.ExprCoreType;
import org.opensearch.sql.data.type.ExprType;

/**
* PPL language specification implementation.
*
* <p>This class provides a singleton implementation of {@link LangSpec} for PPL. It defines a
* custom mapping from expression types to PPL type names.
*/
public class PPLLangSpec implements LangSpec {

public static final PPLLangSpec PPL_SPEC = new PPLLangSpec();

private static Map<ExprType, String> exprTypeToPPLType = new HashMap<>();

static {
exprTypeToPPLType.put(ExprCoreType.BYTE, "tinyint");
exprTypeToPPLType.put(ExprCoreType.SHORT, "smallint");
exprTypeToPPLType.put(ExprCoreType.INTEGER, "int");
exprTypeToPPLType.put(ExprCoreType.LONG, "bigint");
}
Comment on lines +25 to +30
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a breaking change, why don't we directly change the old type to the new type, instead of introducing LangSpec? Isn't this unnecessarily adding complexity?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. The core type was not upgraded because I intended for the data type changes to affect only PPL and not SQL. This PR focuses on unifying PPL data types, while SQL data types can be addressed in a separate issue, as changes there would impact JDBC, ODBC, and CLI.

Ideally, the query engine should use well-defined data types, with LangSpec serving as the protocol for translating these engine types to language-specific types. Once the Calcite implementation is complete, CalciteDataType will translate to ExprDataType, and LangSpec will translate from ExprDataType to the PPL response data type.


private PPLLangSpec() {}

@Override
public LangType language() {
return LangType.PPL;
}

/**
* Returns the corresponding PPL type name for the given expression type. If the expression type
* is not mapped, it returns the default type name.
*
* @param exprType the expression type.
* @return the PPL type name associated with the expression type, or the default type name.
*/
@Override
public String typeName(ExprType exprType) {
return exprTypeToPPLType.getOrDefault(exprType, exprType.typeName());
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.experimental.UtilityClass;
import org.opensearch.sql.lang.LangSpec;

/** System Index Utils. Todo. Find the better name for this class. */
@UtilityClass
Expand Down Expand Up @@ -39,7 +40,47 @@ public static Boolean isSystemIndex(String indexName) {
* @return system mapping table.
*/
public static String mappingTable(String indexName) {
return String.join(".", indexName, SYS_MAPPINGS_SUFFIX);
return mappingTable(indexName, LangSpec.SQL_SPEC);
}

public static String mappingTable(String indexName, LangSpec langSpec) {

return String.join(".", indexName, encodeLangSpec(langSpec));
}

/**
* Encodes the language specification into a system mappings suffix.
*
* <p>The returned suffix is composed of the language name (e.g., "SQL" or "PPL") concatenated
* with an underscore and the system mappings suffix constant. For example:
* "SQL_MAPPINGS_ODFE_SYS_TABLE".
*
* @param spec the language specification.
* @return the encoded system mappings suffix.
*/
public static String encodeLangSpec(LangSpec spec) {
return spec.language().name() + "_" + SYS_MAPPINGS_SUFFIX;
}

/**
* Extracts the language specification from a given system mappings suffix.
*
* <p>This method expects the suffix to start with the language name followed by an underscore.
* For example, given "SQL_MAPPINGS_ODFE_SYS_TABLE", it extracts "SQL" and returns the
* corresponding language specification via {@link LangSpec#fromLanguage(String)}. If the expected
* format is not met, the default SQL specification is returned.
*
* @param systemMappingsSuffix the system mappings suffix.
* @return the language specification extracted from the suffix, or {@link LangSpec#SQL_SPEC} if
* the format is invalid.
*/
public static LangSpec extractLangSpec(String systemMappingsSuffix) {
int underscoreIndex = systemMappingsSuffix.indexOf('_');
if (underscoreIndex <= 0) {
return LangSpec.SQL_SPEC;
}
String langName = systemMappingsSuffix.substring(0, underscoreIndex);
return LangSpec.fromLanguage(langName);
}

/**
Expand All @@ -52,10 +93,10 @@ public static SystemTable systemTable(String indexName) {
String suffix = indexName.substring(lastDot + 1);
String tableName = indexName.substring(0, lastDot).replace("%", "*");

if (suffix.equalsIgnoreCase(SYS_META_SUFFIX)) {
if (suffix.endsWith(SYS_META_SUFFIX)) {
return new SystemInfoTable(tableName);
} else if (suffix.equalsIgnoreCase(SYS_MAPPINGS_SUFFIX)) {
return new MetaInfoTable(tableName);
} else if (suffix.endsWith(SYS_MAPPINGS_SUFFIX)) {
return new MetaInfoTable(tableName, extractLangSpec(suffix));
} else {
throw new IllegalStateException("Invalid system index name: " + indexName);
}
Expand All @@ -66,6 +107,10 @@ public interface SystemTable {

String getTableName();

default LangSpec getLangSpec() {
return LangSpec.SQL_SPEC;
}

default boolean isSystemInfoTable() {
return false;
}
Expand Down Expand Up @@ -93,6 +138,7 @@ public boolean isSystemInfoTable() {
public static class MetaInfoTable implements SystemTable {

private final String tableName;
private final LangSpec langSpec;

public boolean isMetaInfoTable() {
return true;
Expand Down
26 changes: 26 additions & 0 deletions core/src/test/java/org/opensearch/sql/lang/LangSpecTest.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.lang;

import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;
import org.opensearch.sql.data.type.ExprCoreType;

class LangSpecTest {
@Test
public void testFromLanguageSQL() {
LangSpec spec = LangSpec.fromLanguage("SQL");
assertEquals(LangSpec.LangType.SQL, spec.language(), "Expected language type to be SQL");
assertSame(LangSpec.SQL_SPEC, spec, "Expected SQL_SPEC instance for SQL language.");
}

@Test
public void testSQLSpecDefaultTypeName() {
String result = LangSpec.SQL_SPEC.typeName(ExprCoreType.BYTE);
assertEquals("BYTE", result, "SQL_SPEC should return the expression type's default type name.");
}
}
43 changes: 43 additions & 0 deletions core/src/test/java/org/opensearch/sql/lang/PPLLangSpecTest.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.lang;

import static org.junit.jupiter.api.Assertions.*;
import static org.opensearch.sql.lang.PPLLangSpec.PPL_SPEC;

import org.junit.jupiter.api.Test;
import org.opensearch.sql.data.type.ExprCoreType;

class PPLLangSpecTest {
/** Tests that the language type of the PPL specification is PPL. */
@Test
public void testPPLSpecLanguage() {
PPLLangSpec spec = PPL_SPEC;
assertEquals(LangSpec.LangType.PPL, spec.language(), "Expected language to be PPL.");
}

/**
* Tests that the PPL specification returns the correct type name mapping for known expression
* types. Assumes that ExprCoreType constants are available.
*/
@Test
public void testPPLSpecTypeNameMapping() {
PPLLangSpec spec = PPL_SPEC;
assertEquals("tinyint", spec.typeName(ExprCoreType.BYTE), "BYTE should map to tinyint.");
assertEquals("smallint", spec.typeName(ExprCoreType.SHORT), "SHORT should map to smallint.");
assertEquals("int", spec.typeName(ExprCoreType.INTEGER), "INTEGER should map to int.");
assertEquals("bigint", spec.typeName(ExprCoreType.LONG), "LONG should map to bigint.");
}

/**
* Tests that an unmapped expression type returns its default type name in the PPL specification.
*/
@Test
public void testPPLSpecDefaultTypeName() {
String result = PPL_SPEC.typeName(ExprCoreType.STRING);
assertEquals("STRING", result, "Unmapped expression type should return its default type name.");
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,8 @@
import static org.opensearch.sql.utils.SystemIndexUtils.systemTable;

import org.junit.jupiter.api.Test;
import org.opensearch.sql.lang.LangSpec;
import org.opensearch.sql.lang.PPLLangSpec;

class SystemIndexUtilsTest {

Expand All @@ -25,7 +27,7 @@ void test_system_index() {

@Test
void test_compose_mapping_table() {
assertEquals("employee.MAPPINGS_ODFE_SYS_TABLE", mappingTable("employee"));
assertEquals("employee.SQL_MAPPINGS_ODFE_SYS_TABLE", mappingTable("employee"));
}

@Test
Expand Down Expand Up @@ -61,4 +63,50 @@ void throw_exception_for_invalid_index() {
assertThrows(IllegalStateException.class, () -> systemTable("employee._ODFE_SYS_TABLE"));
assertEquals("Invalid system index name: employee._ODFE_SYS_TABLE", exception.getMessage());
}

/** Tests that encoding the SQL specification produces the expected suffix. */
@Test
public void testEncodeLangSpecSQL() {
String expected = "SQL_MAPPINGS_ODFE_SYS_TABLE";
String encoded = SystemIndexUtils.encodeLangSpec(LangSpec.SQL_SPEC);
assertEquals(expected, encoded, "Encoded SQL lang spec should match expected suffix.");
}

/** Tests that encoding the PPL specification produces the expected suffix. */
@Test
public void testEncodeLangSpecPPL() {
String expected = "PPL_MAPPINGS_ODFE_SYS_TABLE";
String encoded = SystemIndexUtils.encodeLangSpec(PPLLangSpec.PPL_SPEC);
assertEquals(expected, encoded, "Encoded PPL lang spec should match expected suffix.");
}

/** Tests that extracting the language specification from a valid SQL suffix returns SQL_SPEC. */
@Test
public void testExtractLangSpecValidSQL() {
LangSpec spec = SystemIndexUtils.extractLangSpec("SQL_MAPPINGS_ODFE_SYS_TABLE");
assertEquals(LangSpec.SQL_SPEC, spec, "Extracting from SQL suffix should return SQL_SPEC.");
}

/** Tests that extracting the language specification from a valid PPL suffix returns PPL_SPEC. */
@Test
public void testExtractLangSpecValidPPL() {
LangSpec spec = SystemIndexUtils.extractLangSpec("PPL_MAPPINGS_ODFE_SYS_TABLE");
assertEquals(PPLLangSpec.PPL_SPEC, spec, "Extracting from PPL suffix should return PPL_SPEC.");
}

/** Tests that an improperly formatted suffix defaults to SQL_SPEC. */
@Test
public void testExtractLangSpecInvalidFormat() {
assertEquals(
LangSpec.SQL_SPEC,
SystemIndexUtils.extractLangSpec("FORMAT"),
"Invalid format should default to SQL_SPEC.");
}

@Test
public void testGetLangSpec() {
assertEquals(
LangSpec.SQL_SPEC,
new SystemIndexUtils.SystemInfoTable("employee.MAPPINGS_ODFE_SYS_TABLE").getLangSpec());
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -32,10 +32,10 @@ public void test_numeric_data_types() throws IOException {
JSONObject result = executeQuery(String.format("source=%s", TEST_INDEX_DATATYPE_NUMERIC));
verifySchema(
result,
schema("long_number", "long"),
schema("integer_number", "integer"),
schema("short_number", "short"),
schema("byte_number", "byte"),
schema("long_number", "bigint"),
schema("integer_number", "int"),
schema("short_number", "smallint"),
schema("byte_number", "tinyint"),
schema("double_number", "double"),
schema("float_number", "float"),
schema("half_float_number", "float"),
Expand Down Expand Up @@ -73,10 +73,10 @@ public void test_long_integer_data_type() throws IOException {
TEST_INDEX_DATATYPE_NUMERIC));
verifySchema(
result,
schema("int1", "integer"),
schema("int2", "integer"),
schema("long1", "long"),
schema("long2", "long"));
schema("int1", "int"),
schema("int2", "int"),
schema("long1", "bigint"),
schema("long2", "bigint"));
}

@Test
Expand All @@ -86,6 +86,6 @@ public void test_alias_data_type() throws IOException {
String.format(
"source=%s | where alias_col > 1 " + "| fields original_col, alias_col ",
TEST_INDEX_ALIAS));
verifySchema(result, schema("original_col", "integer"), schema("alias_col", "integer"));
verifySchema(result, schema("original_col", "int"), schema("alias_col", "int"));
}
}
Loading
Loading