Internally, Bloblang numbers are JSON numbers. Connect passes them to the ClickHouse driver as strings by default. When using the ClickHouse driver, you nearly always need to explicitly cast integers to sized types using methods like int64().
We encountered a footgun related to this. One of our pipelines queries ClickHouse to see if a record exists like so:
SELECT id FROM tbl WHERE id = ?
Because we forgot to cast the id Bloblang value to an int64(), ClickHouse actually executed:
SELECT id FROM tbl WHERE id = '12345'
This type mismatch causes ClickHouse to scan the table instead of using the primary key.
The sql_select processor could support a validate_column_types field to validate incoming args_mapping values match the target data types.
sql_select already accepts a table name and column names. Connect could inspect the table when it starts, or when the processor is used for the first time, to determine the column data types. With validate_column_types: true, sql_select would validate incoming data against the detected data types.
The validate_column_types field could also control whether Connect inspects the target table in the first place.
Initially, this field would only affect the ClickHouse output.
Internally, Bloblang numbers are JSON numbers. Connect passes them to the ClickHouse driver as strings by default. When using the ClickHouse driver, you nearly always need to explicitly cast integers to sized types using methods like
int64().We encountered a footgun related to this. One of our pipelines queries ClickHouse to see if a record exists like so:
Because we forgot to cast the
idBloblang value to anint64(), ClickHouse actually executed:This type mismatch causes ClickHouse to scan the table instead of using the primary key.
The
sql_selectprocessor could support avalidate_column_typesfield to validate incomingargs_mappingvalues match the target data types.sql_selectalready accepts a table name and column names. Connect could inspect the table when it starts, or when the processor is used for the first time, to determine the column data types. Withvalidate_column_types: true,sql_selectwould validate incoming data against the detected data types.The
validate_column_typesfield could also control whether Connect inspects the target table in the first place.Initially, this field would only affect the ClickHouse output.