          feat(go/adbc/driver/bigquery): add BIGQUERY:type field metadata
          #3604
              Conversation
Motivation

The `Type` metadata key has two limitations which stem from BigQuery's API:

1. it says fields of type `ARRAY<T>` are just `T` with `Repeated=true`
2. it says `STRUCT<...>` fields are simply `RECORD`, and erases any information about the inner fields

These limitations can cause problems when trying to parse the `Type` key or when using it verbatim against the warehouse in a statement, e.g. a `CREATE TABLE` statement or an `AS T` cast.

Summary

This PR adds a new `BIGQUERY:type` key that formats the original SQL type string as specified by BigQuery. Most types remain unchanged as they come from `gobigquery`, and in those cases this key will contain the same value as `Type`. However, arrays and structs get transformed to match the richer type string.
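The array/struct rewriting described above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code; the `FieldSchema` struct and `typeString` helper here are hypothetical stand-ins mirroring the shape of the BigQuery field schema:

```go
package main

import (
	"fmt"
	"strings"
)

// FieldSchema is a hypothetical stand-in for the schema the BigQuery
// API returns: ARRAY<T> is flattened to T with Repeated=true, and
// STRUCT<...> is flattened to "RECORD" with the members in Fields.
type FieldSchema struct {
	Name     string
	Type     string // e.g. "INTEGER", "STRING", "RECORD"
	Repeated bool
	Fields   []*FieldSchema // populated when Type == "RECORD"
}

// typeString reconstructs the richer SQL type string:
// RECORD becomes STRUCT<...> and Repeated wraps the type in ARRAY<...>.
func typeString(s *FieldSchema) string {
	t := s.Type
	if t == "RECORD" {
		parts := make([]string, len(s.Fields))
		for i, f := range s.Fields {
			parts[i] = f.Name + " " + typeString(f)
		}
		t = "STRUCT<" + strings.Join(parts, ", ") + ">"
	}
	if s.Repeated {
		t = "ARRAY<" + t + ">"
	}
	return t
}

func main() {
	f := &FieldSchema{
		Name: "tags", Type: "RECORD", Repeated: true,
		Fields: []*FieldSchema{
			{Name: "key", Type: "STRING"},
			{Name: "values", Type: "INTEGER", Repeated: true},
		},
	}
	fmt.Println(typeString(f)) // ARRAY<STRUCT<key STRING, values ARRAY<INTEGER>>>
}
```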
Hmmm seeing some Python failures in CI, not sure how they're related?
```go
metadata["Repeated"] = strconv.FormatBool(schema.Repeated)
metadata["Required"] = strconv.FormatBool(schema.Required)
field.Nullable = !schema.Required
metadata["Type"] = string(schema.Type)
```
Do we want to keep this? Should we rename it to something like BIGQUERY:simple_type?
I was wondering the same thing. I thought of keeping it as-is to avoid breaking changes to end users. But if the project maintainers are fine with a breaking change I can do it!
I would personally prefer we namespace all the properties now that we want to introduce this convention. Possibly we can keep the existing properties under their current name with a deprecation notice.
@lidavidm I pushed a new commit that does that. I called it BIGQUERY:raw_type since it's the "raw" unprocessed thing coming directly from the API. I think this is a bit more descriptive than simple_type.
I am not sure if this should be split across two PRs though. IMO these should be two separate changelog entries: one for standardizing the keys under BIGQUERY:... and another for adding the new rich type key.
If that's the case I can merge the last commit first in a separate PR, then rebase this one on top of that.
@lidavidm I think the convention in the BigQuery driver is that all the fields in the JSON response are copied as-is into the Arrow field metadata.
That's not a law of physics, though. We can change things.
If you want to defer this to a separate PR, that's fine by me. But I think they should be consistent.
@serramatutu can you move the last commit to a separate PR? So we can merge this one with just the BIGQUERY:type addition.
I just removed the last commit from this branch. I have it on a separate branch and I can open a followup PR after this one to standardize all keys.
Force-pushed from c87220a to bb7440d
Merging because the failing checks are Meson+PG and CMake specific.
Testing

I ran a `CREATE TABLE AS` query against BigQuery. Here's the result for fields of different types:

1. Regular non-nested types are simply copied over from the value of `Type`
2. An array of integers becomes `ARRAY<INTEGER>`, while `Type` remains `INTEGER`
3. An array of structs becomes `ARRAY<STRUCT<...>>`
4. A struct of arrays' inner types are `ARRAY<...>`
5. A deeply nested struct also has the correct inner types
Related issues