Commit 60d069a
bq: add gcp_bigquery_write_api enterprise output
New enterprise-only output that streams data into BigQuery using the Storage Write API. Supports JSON and Protobuf message formats, default stream type, connection multiplexing via managed stream cache, and IAM/credential-based authentication.
1 parent a5aef23 commit 60d069a

7 files changed

Lines changed: 1286 additions & 3 deletions

Lines changed: 293 additions & 0 deletions
@@ -0,0 +1,293 @@
= gcp_bigquery_write_api
:type: output
:status: stable
:categories: ["GCP","Services"]

////
THIS FILE IS AUTOGENERATED!

To make changes, edit the corresponding source file under:

https://github.com/redpanda-data/connect/tree/main/internal/impl/<provider>.

And:

https://github.com/redpanda-data/connect/tree/main/cmd/tools/docs_gen/templates/plugin.adoc.tmpl
////

// © 2024 Redpanda Data Inc.

component_type_dropdown::[]

Streams data into BigQuery using the Storage Write API.

Introduced in version 4.87.0.

[tabs]
======
Common::
+
--

```yml
# Common config fields, showing default values
output:
  label: ""
  gcp_bigquery_write_api:
    project: ""
    dataset: "" # No default (required)
    table: "" # No default (required)
    message_format: json
    credentials_json: ""
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```

--
Advanced::
+
--

```yml
# All config fields, showing default values
output:
  label: ""
  gcp_bigquery_write_api:
    project: ""
    dataset: "" # No default (required)
    table: "" # No default (required)
    message_format: json
    credentials_json: ""
    endpoint:
      http: ""
      grpc: ""
    max_in_flight: 64
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: [] # No default (optional)
```

--
======

Writes messages to a BigQuery table using the Storage Write API, which provides higher throughput and lower latency than the legacy streaming API or load jobs.

Messages can be formatted as JSON (default) or raw Protobuf bytes. When using the JSON format, the component automatically fetches the table schema and converts each message to the corresponding proto representation.

WARNING: The proto3 JSON mapping encodes int64 and uint64 values as strings. JSON messages with 64-bit integer fields must therefore use string values (e.g. `"age": "30"`, not `"age": 30`), otherwise the write fails with an unmarshalling error.
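
As a workaround, such fields can be stringified with a mapping processor before they reach the output. A minimal sketch, assuming an int64 column named `age` (the field name is illustrative):

```yml
pipeline:
  processors:
    - mapping: |
        root = this
        # int64 columns must be sent as strings under the proto3 JSON mapping
        root.age = this.age.string()
```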

When batching is enabled, the table name is resolved from the first message in each batch; all messages in the same batch are written to that table.


== Fields

=== `project`

The GCP project ID. If empty, the project is auto-detected from the environment.

*Type*: `string`

*Default*: `""`

=== `dataset`

The BigQuery dataset ID.

*Type*: `string`

=== `table`

The BigQuery table ID. When batching, the table name is resolved from the first message in each batch.
This field supports xref:configuration:interpolation.adoc#bloblang-queries[interpolation functions].

*Type*: `string`
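
For example, the target table can be chosen per message from a metadata key. A sketch, assuming a `table` metadata key is set somewhere upstream (all names here are illustrative):

```yml
output:
  gcp_bigquery_write_api:
    project: my-project   # illustrative
    dataset: analytics    # illustrative
    # route each message (or batch) by a metadata key set upstream
    table: ${! meta("table") }
```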

=== `message_format`

The format of input messages. Use `json` to have the component convert JSON to proto automatically, or `protobuf` to supply raw proto-encoded bytes.

*Type*: `string`

*Default*: `"json"`

Options: `json`, `protobuf`.

=== `credentials_json`

An optional JSON string containing GCP credentials. If empty, credentials are loaded from the environment.

[CAUTION]
====
This field contains sensitive information that usually shouldn't be added to a config directly; read our xref:configuration:secrets.adoc[secrets page] for more info.
====

*Type*: `string`

*Default*: `""`
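
Rather than embedding the key inline, credentials are typically injected through an environment variable. A sketch (the `GCP_CREDENTIALS_JSON` variable name and the other values are illustrative):

```yml
output:
  gcp_bigquery_write_api:
    project: my-project   # illustrative
    dataset: analytics    # illustrative
    table: events         # illustrative
    # resolved from the environment at startup
    credentials_json: ${GCP_CREDENTIALS_JSON}
```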

=== `endpoint`

Optional endpoint overrides for the BigQuery and Storage Write API clients.

*Type*: `object`

=== `endpoint.http`

Override the BigQuery HTTP endpoint. Useful for local emulators.

*Type*: `string`

*Default*: `""`

=== `endpoint.grpc`

Override the BigQuery Storage gRPC endpoint. Useful for local emulators.

*Type*: `string`

*Default*: `""`
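
When testing against a local BigQuery emulator, both endpoints can point at it. A sketch (host, ports, and resource names are illustrative):

```yml
output:
  gcp_bigquery_write_api:
    project: test-project   # illustrative
    dataset: test_dataset   # illustrative
    table: test_table       # illustrative
    endpoint:
      http: http://localhost:9050   # illustrative emulator ports
      grpc: localhost:9060
```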

=== `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

*Type*: `int`

*Default*: `64`

=== `batching`

Allows you to configure a xref:configuration:batching.adoc[batching policy].

*Type*: `object`

```yml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

=== `batching.count`

A number of messages at which the batch should be flushed. If `0`, count-based batching is disabled.

*Type*: `int`

*Default*: `0`

=== `batching.byte_size`

An amount of bytes at which the batch should be flushed. If `0`, size-based batching is disabled.

*Type*: `int`

*Default*: `0`

=== `batching.period`

A period in which an incomplete batch should be flushed regardless of its size.

*Type*: `string`

*Default*: `""`

```yml
# Examples

period: 1s

period: 1m

period: 500ms
```

=== `batching.check`

A xref:guides:bloblang/about.adoc[Bloblang query] that should return a boolean value indicating whether a message should end a batch.

*Type*: `string`

*Default*: `""`

```yml
# Examples

check: this.type == "end_of_transaction"
```

=== `batching.processors`

A list of xref:components:processors/about.adoc[processors] to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.

*Type*: `array`

```yml
# Examples

processors:
  - archive:
      format: concatenate

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array
```
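
Putting the pieces together, a minimal end-to-end pipeline might look like the following sketch (the input and all resource names are illustrative, and the int64 column is sent as a string per the warning above):

```yml
input:
  generate:
    interval: 1s
    mapping: 'root = {"id": uuid_v4(), "count": "1"}'  # int64 column sent as string
output:
  gcp_bigquery_write_api:
    project: my-project   # illustrative
    dataset: analytics    # illustrative
    table: events         # illustrative
    batching:
      count: 100
      period: 5s
```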

go.mod

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ require (
 	buf.build/gen/go/redpandadata/common/protocolbuffers/go v1.36.11-20260323171043-6e06f84ad823.1
 	buf.build/gen/go/redpandadata/otel/protocolbuffers/go v1.36.11-20260323171043-3635d3966b23.1
 	cloud.google.com/go/aiplatform v1.121.0
-	cloud.google.com/go/bigquery v1.74.0
+	cloud.google.com/go/bigquery v1.75.0
 	cloud.google.com/go/pubsub v1.50.1
 	cloud.google.com/go/spanner v1.88.0
 	cloud.google.com/go/storage v1.61.3

go.sum

Lines changed: 2 additions & 2 deletions
@@ -66,8 +66,8 @@ cloud.google.com/go/bigquery v1.4.0/go.mod h1:S8dzgnTigyfTmLBfrtrhyYhwRxG72rYxvf
 cloud.google.com/go/bigquery v1.5.0/go.mod h1:snEHRnqQbz117VIFhE8bmtwIDY80NLUZUMb4Nv6dBIg=
 cloud.google.com/go/bigquery v1.7.0/go.mod h1://okPTzCYNXSlb24MZs83e2Do+h+VXtc4gLoIoXIAPc=
 cloud.google.com/go/bigquery v1.8.0/go.mod h1:J5hqkt3O0uAFnINi6JXValWIb1v0goeZM77hZzJN/fQ=
-cloud.google.com/go/bigquery v1.74.0 h1:Q6bAMv+eyvufOpIrfrYxhM46qq1D3ZQTdgUDQqKS+n8=
-cloud.google.com/go/bigquery v1.74.0/go.mod h1:iViO7Cx3A/cRKcHNRsHB3yqGAMInFBswrE9Pxazsc90=
+cloud.google.com/go/bigquery v1.75.0 h1:gI4AgIhXNZ8hxvPDOp4hLGUnpNBjoBor6POSLcrdWkY=
+cloud.google.com/go/bigquery v1.75.0/go.mod h1:zNCHWok+hfTgKCwNqT+V7GH/YmFFgZqjzljKCZBJTWc=
 cloud.google.com/go/compute v0.1.0/go.mod h1:GAesmwr110a34z04OlxYkATPBEfVhkymfTBXtfbBFow=
 cloud.google.com/go/compute v1.2.0/go.mod h1:xlogom/6gr8RJGBe7nT2eGsQYAFUbbv8dbC29qE3Xmw=
 cloud.google.com/go/compute v1.3.0/go.mod h1:cCZiE1NHEtai4wiufUhW8I8S1JKkAnhnQJWM7YD99wM=
