
JDBC adapter query Postgres numeric field error: Cannot get simple type for type DECIMAL #2297

Open
RealDeanZhao opened this issue Nov 1, 2024 · 2 comments
Labels
Type: bug Something isn't working

Comments


RealDeanZhao commented Nov 1, 2024

What happened?

JDBC adapter query Postgres numeric field error: Cannot get simple type for type DECIMAL

Stack Trace

Invalid Input Error: arrow_scan: get_next failed(): java.lang.RuntimeException: Error occurred while getting next schema root.
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:190)
at org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader.loadNextBatch(JdbcArrowReader.java:87)
at org.apache.arrow.c.ArrayStreamExporter$ExportedArrayStreamPrivateData.getNext(ArrayStreamExporter.java:66)
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:112)
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:163)
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:183)
... 2 more
Caused by: java.lang.UnsupportedOperationException: Cannot get simple type for type DECIMAL
at org.apache.arrow.vector.types.Types$MinorType.getType(Types.java:815)
at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:49)
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:98)
... 4 more
How can we reproduce the bug?

A numeric field declared without precision and scale causes the error:

create table xxx (
  numeric_a numeric
)

I also tried debugging the code and found that the Postgres JDBC driver's getBigDecimal returns a BigDecimal with precision 1. This causes the actual error: "BigDecimal precision cannot be greater than that in the Arrow vector"

// org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer.NullableDecimalConsumer.consume
public void consume(ResultSet resultSet) throws SQLException {
    // value's scale is 0 and precision is 1, which triggers the error below
    BigDecimal value = resultSet.getBigDecimal(this.columnIndexInResultSet);
    if (!resultSet.wasNull()) {
        this.set(value);
    }
    ++this.currentIndex;
}

// org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale (decompiled)
public static boolean checkPrecisionAndScale(BigDecimal value, int vectorPrecision, int vectorScale) {
    if (value.scale() != vectorScale) {
        throw new UnsupportedOperationException(
            "BigDecimal scale must equal that in the Arrow vector: "
                + value.scale() + " != " + vectorScale);
    } else if (value.precision() > vectorPrecision) {
        // value precision is 1 and vector precision is 0
        throw new UnsupportedOperationException(
            "BigDecimal precision cannot be greater than that in the Arrow vector: "
                + value.precision() + " > " + vectorPrecision);
    } else {
        return true;
    }
}
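The mismatch is easy to reproduce with plain BigDecimal, without any database involved: the vector is built from JDBC metadata that reports precision 0 for an unconstrained NUMERIC column, while any concrete value has precision of at least 1. A minimal stdlib-only sketch (the check is re-implemented here for illustration and is not the Arrow code itself):

```java
import java.math.BigDecimal;

public class PrecisionMismatch {
    // Simplified re-implementation of DecimalUtility.checkPrecisionAndScale,
    // for illustration only.
    static boolean checkPrecisionAndScale(BigDecimal value, int vectorPrecision, int vectorScale) {
        if (value.scale() != vectorScale) {
            throw new UnsupportedOperationException("scale mismatch");
        }
        if (value.precision() > vectorPrecision) {
            throw new UnsupportedOperationException("precision mismatch");
        }
        return true;
    }

    public static void main(String[] args) {
        // What getBigDecimal returns for a value stored in an unconstrained NUMERIC.
        BigDecimal value = new BigDecimal("7");
        System.out.println(value.precision() + " " + value.scale()); // 1 0
        try {
            // Vector created from JDBC metadata: precision 0, scale 0.
            checkPrecisionAndScale(value, 0, 0);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage()); // rejected: precision mismatch
        }
    }
}
```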

Environment/Setup

No response

@RealDeanZhao RealDeanZhao added the Type: bug Something isn't working label Nov 1, 2024
@lidavidm
Member

lidavidm commented Nov 5, 2024

Hmm, Postgres NUMERIC fields without a fixed precision/scale can't actually be supported by Arrow because those are variable/unlimited precision and Arrow assumes a fixed precision per field.

For BigQuery, we need to read the type correctly.

Note that we have been considering a JNI bridge to use the native ADBC drivers for both these databases. That should be faster than the JDBC driver and should handle these cases better as the drivers have had more individual attention for each database's quirks (vs for JDBC which just tries to generically adapt the results from JDBC).
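The variable-precision point can be seen with plain BigDecimal: an unconstrained Postgres NUMERIC column stores each value with its own precision and scale, so two rows of the same column can disagree on both, while an Arrow decimal field fixes a single (precision, scale) pair for every row. A stdlib-only sketch:

```java
import java.math.BigDecimal;

public class UnconstrainedNumeric {
    public static void main(String[] args) {
        // Two values that can sit side by side in the same unconstrained NUMERIC column.
        BigDecimal a = new BigDecimal("1.5");
        BigDecimal b = new BigDecimal("12345.678901");
        System.out.println(a.precision() + "," + a.scale()); // 2,1
        System.out.println(b.precision() + "," + b.scale()); // 11,6
        // Arrow's Decimal(p, s) fixes p and s per field, so the column's
        // metadata implies no single (p, s) pair that fits both rows.
    }
}
```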

@RealDeanZhao
Author

> Hmm, Postgres NUMERIC fields without a fixed precision/scale can't actually be supported by Arrow because those are variable/unlimited precision and Arrow assumes a fixed precision per field.
>
> For BigQuery, we need to read the type correctly.
>
> Note that we have been considering a JNI bridge to use the native ADBC drivers for both these databases. That should be faster than the JDBC driver and should handle these cases better as the drivers have had more individual attention for each database's quirks (vs for JDBC which just tries to generically adapt the results from JDBC).

https://arrow.apache.org/cookbook/java/jdbc.html#id5

Is it possible to use a custom JdbcToArrowConfig to avoid this issue? It seems that JdbcArrowReader uses a default config.

JdbcArrowReader(BufferAllocator allocator, ResultSet resultSet, @Nullable Schema overrideSchema)
        throws AdbcException {
    super(allocator);
    JdbcToArrowConfig config = makeJdbcConfig(allocator);
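Whatever config mechanism is used, a converter for an unconstrained NUMERIC column would have to pick some fixed (precision, scale) and coerce every value to it, since Arrow needs one pair per field. The coercion itself can be sketched with the stdlib alone; the target Decimal(38, 9) below is an arbitrary choice for illustration, not anything arrow-jdbc prescribes:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CoerceNumeric {
    // Coerce a value to a fixed (precision, scale) target, as a custom
    // type converter for unconstrained NUMERIC columns would need to do.
    static BigDecimal coerce(BigDecimal value, int targetPrecision, int targetScale) {
        BigDecimal rescaled = value.setScale(targetScale, RoundingMode.HALF_UP);
        if (rescaled.precision() > targetPrecision) {
            throw new ArithmeticException(
                "value does not fit Decimal(" + targetPrecision + ", " + targetScale + ")");
        }
        return rescaled;
    }

    public static void main(String[] args) {
        BigDecimal v = coerce(new BigDecimal("7"), 38, 9);
        System.out.println(v);         // 7.000000000
        System.out.println(v.scale()); // 9
    }
}
```

Note that rescaling with a rounding mode is lossy for values whose actual scale exceeds the target, which is one more reason the native driver route may handle this better.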
