@fivetran-amrutabhimsenayachit
Collaborator

When STARTS_WITH is called with BYTES arguments that are not literals (e.g., the result of a CAST, a table column, or a function call), the transpiled DuckDB query fails because DuckDB's starts_with only accepts VARCHAR, not BLOB.

Before:

sqlglot %  bq --project_id fivetran-wild-west query --use_legacy_sql=false "SELECT STARTS_WITH(CAST('foo' AS BYTES), CAST('f' AS BYTES))"                            
+------+
| f0_  |
+------+
| true |
+------+
sqlglot % python3 -c "import sqlglot; print(sqlglot.transpile(\"SELECT STARTS_WITH(CAST('foo' AS BYTES), CAST('f' AS BYTES))\", read='bigquery', write='duckdb')[0])"
SELECT STARTS_WITH(CAST('foo' AS BLOB), CAST('f' AS BLOB))

sqlglot % duckdb -c "SELECT STARTS_WITH(CAST('foo' AS BLOB), CAST('f' AS BLOB))"                                                         
Binder Error:
No function matches the given name and argument types 'starts_with(BLOB, BLOB)'. You might need to add explicit type casts.
        Candidate functions:
        starts_with(VARCHAR, VARCHAR) -> BOOLEAN


LINE 1: SELECT STARTS_WITH(CAST('foo' AS BLOB), CAST('f' AS BLOB))

After:

sqlglot % bq --project_id fivetran-wild-west query --use_legacy_sql=false "SELECT STARTS_WITH(CAST('foo' AS BYTES), CAST('f' AS BYTES))"     
+------+
| f0_  |
+------+
| true |
+------+
sqlglot % python3 -c "import sqlglot; print(sqlglot.transpile(\"SELECT STARTS_WITH(CAST('foo' AS BYTES), CAST('f' AS BYTES))\", read='bigquery', write='duckdb')[0])"
SELECT STARTS_WITH(CAST(CAST('foo' AS BLOB) AS TEXT), CAST(CAST('f' AS BLOB) AS TEXT))

sqlglot % duckdb -c "SELECT STARTS_WITH(CAST(CAST('foo' AS BLOB) AS TEXT), CAST(CAST('f' AS BLOB) AS TEXT))"
┌───────────────────────────────────────────────────────────────────────┐
│ starts_with(CAST('foo'::BLOB AS VARCHAR), CAST('f'::BLOB AS VARCHAR)) │
│                                boolean                                │
├───────────────────────────────────────────────────────────────────────┤
│ true                                                                  │
└───────────────────────────────────────────────────────────────────────┘

Comment on lines +1201 to +1204
if expr.is_type(exp.DataType.Type.BINARY):
    expression.expression.replace(
        exp.cast(expression.expression, exp.DataType.Type.VARCHAR)
    )
Collaborator

This path is not tested; moreover, I don't think it's correct.

Comment on lines +1197 to +1199
# DuckDB's starts_with only accepts VARCHAR, not BLOB
if this.is_type(exp.DataType.Type.BINARY):
    expression.this.replace(exp.cast(expression.this, exp.DataType.Type.VARCHAR))
Collaborator

Can we do if not is_type(VARCHAR, UNKNOWN): ... instead? Check out my comment in Tori's PR.
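
For illustration only, the rule suggested above (cast unless the argument is already VARCHAR or its type is UNKNOWN) could be sketched roughly as below. This is a standalone model, not sqlglot code: `Arg`, `as_varchar`, and `starts_with_sql` are hypothetical names, and the type names are plain strings standing in for annotated types. How the real generator handles other text types is not covered here.

```python
from dataclasses import dataclass

# Hypothetical stand-in for an annotated SQL argument; not part of sqlglot.
@dataclass(frozen=True)
class Arg:
    sql: str
    type_name: str = "UNKNOWN"  # type is unknown unless annotated

# Types that are passed through unchanged under the suggested rule.
PASS_THROUGH = {"VARCHAR", "UNKNOWN"}

def as_varchar(arg: Arg) -> str:
    """Wrap the argument in a cast unless it is VARCHAR or of unknown type."""
    if arg.type_name in PASS_THROUGH:
        return arg.sql
    return f"CAST({arg.sql} AS TEXT)"

def starts_with_sql(this: Arg, prefix: Arg) -> str:
    """Render a DuckDB-style STARTS_WITH call with both arguments coerced."""
    return f"STARTS_WITH({as_varchar(this)}, {as_varchar(prefix)})"

if __name__ == "__main__":
    print(
        starts_with_sql(
            Arg("CAST('foo' AS BLOB)", "BINARY"),
            Arg("CAST('f' AS BLOB)", "BINARY"),
        )
    )
```

With both arguments typed as BINARY, the sketch produces the same SQL as the "After" output in the description; arguments that are VARCHAR or unannotated are left untouched.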

@fivetran-amrutabhimsenayachit force-pushed the RD-1050424-transpile-big-querys-starts-with-string-function-to-duck-db branch from 42cd33f to a99ba1f on November 5, 2025 15:34