Description
I've run into this issue in every environment I've tried (macOS Sonoma, MX Linux on kernel 5.14, Rocky Linux 8.10 on kernel 4.18), all using OpenJDK 21.0.7 and google-cloud-bigquery v2.52.0.
The best practices document (https://cloud.google.com/apis/docs/client-libraries-best-practices) recommends reusing the same client object across multiple requests. However, there appears to be a thread-safety issue involving JSON parsing when doing this. When a shared BigQuery client (obtained with BigQueryOptions.newBuilder().build().service()) is used from multiple threads, simultaneous calls can trigger the following exception:
com.google.cloud.bigquery.BigQuerySQLException: java.lang.NoSuchMethodError: "com.google.gson.stream.JsonWriter com.google.gson.stream.JsonWriter.value(float)"
at com.google.cloud.bigquery.ConnectionImpl.getExecuteSelectResponse(ConnectionImpl.java:264)
at com.google.cloud.bigquery.ConnectionImpl.executeSelect(ConnectionImpl.java:213)
...
This particular example happened while using a unique Connection object (obtained from createConnection() on the shared BigQuery client) and calling executeSelect(), but I've also seen it happen using the normal query() API call that returns a TableResult.
I attempted to work around this issue by having each thread obtain its own client by calling build().service() on a shared BigQueryOptions.Builder, with the build().service() call and the subsequent createConnection() call effectively single-threaded by way of a ReentrantReadWriteLock, just to be safe. This didn't have the effect I wanted, which makes me wonder whether this is actually a concurrency issue at all.
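For reference, the serialized-creation workaround described above can be sketched roughly as follows. This is a minimal illustration, not my exact code: SerializedFactory is a hypothetical helper name, and the type parameter stands in for whatever is being created (the BigQuery client or the Connection). Only the creation step is serialized; calls made on the created object afterwards still run concurrently.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical helper (not part of google-cloud-bigquery). */
final class SerializedFactory<T> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Supplier<T> creator;

    SerializedFactory(Supplier<T> creator) {
        this.creator = creator;
    }

    /** Creates a new instance; only one thread runs the creator at a time. */
    T create() {
        // The write lock is exclusive, so concurrent callers queue up here.
        lock.writeLock().lock();
        try {
            return creator.get();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Each worker thread would call create() once on a shared SerializedFactory and then use its own returned instance for queries.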
After some more tests, I've found that it only happens on non-cached queries after enough time (about an hour) has passed since the last requests were made, and only when several requests are made simultaneously. Wrapping the executeSelect() call in a retry loop with a random sleep of between 250 and 500 ms seems to do the trick, usually succeeding on the first retry.
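The retry workaround looks roughly like the sketch below. It is a generic version rather than my exact code: RetryWithJitter and the maxAttempts parameter are hypothetical names, and the helper retries on any Exception instead of checking specifically for BigQuerySQLException.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper (not part of google-cloud-bigquery). */
final class RetryWithJitter {
    private RetryWithJitter() {}

    /**
     * Runs the action up to maxAttempts times, sleeping a random
     * 250 to 500 ms after each failed attempt except the last.
     */
    static <T> T call(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry an interrupt; restore the flag and propagate.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Upper bound is exclusive, so 501 yields 250..500 ms.
                    Thread.sleep(ThreadLocalRandom.current().nextLong(250, 501));
                }
            }
        }
        // All attempts failed; rethrow the most recent failure.
        throw last;
    }
}
```

With a Connection named connection and a query string named sql (both placeholder names), the call would be RetryWithJitter.call(() -> connection.executeSelect(sql), 3).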