
JDBC adapter query Postgres numeric field error: Cannot get simple type for type DECIMAL #2297

Open
RealDeanZhao opened this issue Nov 1, 2024 · 2 comments
Labels
Type: bug Something isn't working

Comments


RealDeanZhao commented Nov 1, 2024

What happened?

JDBC adapter query Postgres numeric field error: Cannot get simple type for type DECIMAL

Stack Trace

Invalid Input Error: arrow_scan: get_next failed(): java.lang.RuntimeException: Error occurred while getting next schema root.
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:190)
at org.apache.arrow.adbc.driver.jdbc.JdbcArrowReader.loadNextBatch(JdbcArrowReader.java:87)
at org.apache.arrow.c.ArrayStreamExporter$ExportedArrayStreamPrivateData.getNext(ArrayStreamExporter.java:66)
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:112)
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:163)
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:183)
... 2 more
Caused by: java.lang.UnsupportedOperationException: Cannot get simple type for type DECIMAL
at org.apache.arrow.vector.types.Types$MinorType.getType(Types.java:815)
at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:49)
at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:98)
... 4 more
How can we reproduce the bug?

A numeric field declared without precision and scale causes the error:

create table xxx (
  numeric_a numeric
)

I also tried debugging the code and found that the Postgres JDBC driver's getBigDecimal returns a BigDecimal with precision 1. This causes the actual error: "BigDecimal precision cannot be greater than that in the Arrow vector"

// org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer.NullableDecimalConsumer.consume
public void consume(ResultSet resultSet) throws SQLException {
    // value's scale is 0 and precision is 1, which triggers the error below
    BigDecimal value = resultSet.getBigDecimal(this.columnIndexInResultSet);
    if (!resultSet.wasNull()) {
        this.set(value);
    }
    ++this.currentIndex;
}

// org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale (decompiled)
public static boolean checkPrecisionAndScale(BigDecimal value, int vectorPrecision, int vectorScale) {
    if (value.scale() != vectorScale) {
        throw new UnsupportedOperationException(
            "BigDecimal scale must equal that in the Arrow vector: "
                + value.scale() + " != " + vectorScale);
    } else if (value.precision() > vectorPrecision) {
        // value precision is 1 and vector precision is 0
        throw new UnsupportedOperationException(
            "BigDecimal precision cannot be greater than that in the Arrow vector: "
                + value.precision() + " > " + vectorPrecision);
    } else {
        return true;
    }
}
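The mismatch is easy to reproduce with plain BigDecimal, without any database involved: the vector is built from JDBC metadata that reports precision 0 for an unconstrained NUMERIC column, while any concrete value has precision of at least 1. A minimal stdlib-only sketch (the check is re-implemented here for illustration and is not the Arrow code itself):

```java
import java.math.BigDecimal;

public class PrecisionMismatch {
    // Simplified re-implementation of DecimalUtility.checkPrecisionAndScale,
    // for illustration only.
    static boolean checkPrecisionAndScale(BigDecimal value, int vectorPrecision, int vectorScale) {
        if (value.scale() != vectorScale) {
            throw new UnsupportedOperationException("scale mismatch");
        }
        if (value.precision() > vectorPrecision) {
            throw new UnsupportedOperationException("precision mismatch");
        }
        return true;
    }

    public static void main(String[] args) {
        // What getBigDecimal returns for a value stored in an unconstrained NUMERIC.
        BigDecimal value = new BigDecimal("7");
        System.out.println(value.precision() + " " + value.scale()); // 1 0
        try {
            // Vector created from JDBC metadata: precision 0, scale 0.
            checkPrecisionAndScale(value, 0, 0);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage()); // rejected: precision mismatch
        }
    }
}
```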

Environment/Setup

No response

@RealDeanZhao RealDeanZhao added the Type: bug Something isn't working label Nov 1, 2024
@lidavidm
Member

lidavidm commented Nov 5, 2024

Hmm, Postgres NUMERIC fields without a fixed precision/scale can't actually be supported by Arrow because those are variable/unlimited precision and Arrow assumes a fixed precision per field.

For BigQuery, we need to read the type correctly.

Note that we have been considering a JNI bridge to use the native ADBC drivers for both these databases. That should be faster than the JDBC driver and should handle these cases better as the drivers have had more individual attention for each database's quirks (vs for JDBC which just tries to generically adapt the results from JDBC).
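The variable-precision point can be seen with plain BigDecimal: an unconstrained Postgres NUMERIC column stores each value with its own precision and scale, so two rows of the same column can disagree on both, while an Arrow decimal field fixes a single (precision, scale) pair for every row. A stdlib-only sketch:

```java
import java.math.BigDecimal;

public class UnconstrainedNumeric {
    public static void main(String[] args) {
        // Two values that can sit side by side in the same unconstrained NUMERIC column.
        BigDecimal a = new BigDecimal("1.5");
        BigDecimal b = new BigDecimal("12345.678901");
        System.out.println(a.precision() + "," + a.scale()); // 2,1
        System.out.println(b.precision() + "," + b.scale()); // 11,6
        // Arrow's Decimal(p, s) fixes p and s per field, so the column's
        // metadata implies no single (p, s) pair that fits both rows.
    }
}
```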

@RealDeanZhao
Author

> Hmm, Postgres NUMERIC fields without a fixed precision/scale can't actually be supported by Arrow because those are variable/unlimited precision and Arrow assumes a fixed precision per field.
>
> For BigQuery, we need to read the type correctly.
>
> Note that we have been considering a JNI bridge to use the native ADBC drivers for both these databases. That should be faster than the JDBC driver and should handle these cases better as the drivers have had more individual attention for each database's quirks (vs for JDBC which just tries to generically adapt the results from JDBC).

https://arrow.apache.org/cookbook/java/jdbc.html#id5

Is it possible to use a custom JdbcToArrowConfig to avoid this issue? It seems that JdbcArrowReader uses a default config.

JdbcArrowReader(BufferAllocator allocator, ResultSet resultSet, @Nullable Schema overrideSchema)
        throws AdbcException {
    super(allocator);
    JdbcToArrowConfig config = makeJdbcConfig(allocator);
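Whatever config mechanism is used, a converter for an unconstrained NUMERIC column would have to pick some fixed (precision, scale) and coerce every value to it, since Arrow needs one pair per field. The coercion itself can be sketched with the stdlib alone; the target Decimal(38, 9) below is an arbitrary choice for illustration, not anything arrow-jdbc prescribes:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CoerceNumeric {
    // Coerce a value to a fixed (precision, scale) target, as a custom
    // type converter for unconstrained NUMERIC columns would need to do.
    static BigDecimal coerce(BigDecimal value, int targetPrecision, int targetScale) {
        BigDecimal rescaled = value.setScale(targetScale, RoundingMode.HALF_UP);
        if (rescaled.precision() > targetPrecision) {
            throw new ArithmeticException(
                "value does not fit Decimal(" + targetPrecision + ", " + targetScale + ")");
        }
        return rescaled;
    }

    public static void main(String[] args) {
        BigDecimal v = coerce(new BigDecimal("7"), 38, 9);
        System.out.println(v);         // 7.000000000
        System.out.println(v.scale()); // 9
    }
}
```

Note that rescaling with a rounding mode is lossy for values whose actual scale exceeds the target, which is one more reason the native driver route may handle this better.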
