[SEA] Fix CloudFetch path to wrap ARRAY and MAP columns as JSON strings (PECO-3016) #1440
Closed
eric-wang-1990 wants to merge 1 commit into
Conversation
…gs (PECO-3016)

On the CloudFetch path, the SEA manifest reports ARRAY/MAP/STRUCT columns with a STRING wire type. The driver's getObjectWithComplexTypeHandling() only checked requiredType (from the manifest), so it never triggered complex-type handling for those columns.

Fix: before the complex-type branch, derive an effectiveType from the Arrow schema metadata (ARROW_METADATA_KEY = "Spark:DataType:SqlName") embedded in the CloudFetch IPC file. When requiredType is not already a complex type but arrowMetadata identifies the column as ARRAY/MAP/STRUCT, use the Arrow metadata to set effectiveType appropriately. The rest of the method then uses effectiveType, so both the JSON-string path (isComplexDatatypeSupportEnabled=false) and the DatabricksArray/DatabricksMap path (isComplexDatatypeSupportEnabled=true) work correctly on CloudFetch.

Added unit tests covering:
- ARRAY column (arrowMetadata="ARRAY<INT>"), complex support disabled → JSON String
- ARRAY column (arrowMetadata="ARRAY<STRING>"), complex support enabled → DatabricksArray
- MAP column (arrowMetadata="MAP<STRING,INT>"), complex support disabled → JSON String
- ARRAY column via requiredType=ARRAY (existing path), complex support disabled → JSON String

Signed-off-by: Eric Wang <e.wang@databricks.com>
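The two output paths named in the commit message can be sketched as follows. This is a hedged illustration, not the driver's real code: the class name `ComplexValueDispatchSketch`, the `dispatch` method, and the `List` return value are hypothetical stand-ins; the real driver constructs DatabricksArray/DatabricksMap objects on the enabled path.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the two CloudFetch output paths described above.
// Names are illustrative; the real driver builds DatabricksArray/DatabricksMap
// objects rather than a plain List.
public class ComplexValueDispatchSketch {
    // rawJson is the Arrow UTF-8 string CloudFetch delivers for a complex column.
    static Object dispatch(String rawJson, boolean complexSupportEnabled) {
        if (!complexSupportEnabled) {
            // isComplexDatatypeSupportEnabled=false: hand back the value as a JSON string.
            return rawJson;
        }
        // isComplexDatatypeSupportEnabled=true: the real driver parses the value
        // into a DatabricksArray/DatabricksMap; a List stands in for that here.
        return Arrays.asList(rawJson);
    }

    public static void main(String[] args) {
        System.out.println(dispatch("[1,2,3]", false) instanceof String); // prints true
        System.out.println(dispatch("[1,2,3]", true) instanceof List);    // prints true
    }
}
```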
Contributor
Pull request overview
This PR fixes complex-type handling on the SEA CloudFetch path by using the Arrow IPC schema metadata (e.g., Spark:DataType:SqlName) to detect ARRAY/MAP/STRUCT types when the SEA manifest reports a STRING wire type, ensuring values are returned as JSON strings when complex support is disabled and as DatabricksArray/DatabricksMap when enabled.
Changes:
- Derives an effectiveType from arrowMetadata in ArrowStreamResult.getObjectWithComplexTypeHandling() and uses it for subsequent branching.
- Adds unit tests covering CloudFetch complex-type wrapping behavior for ARRAY/MAP, including both complex-support enabled/disabled scenarios.
- Updates NEXT_CHANGELOG.md with a "Fixed" entry describing the CloudFetch complex-type handling fix.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/main/java/com/databricks/jdbc/api/impl/arrow/ArrowStreamResult.java | Uses Arrow schema metadata to correctly identify complex types on CloudFetch and apply the right conversion path. |
| src/test/java/com/databricks/jdbc/api/impl/arrow/ArrowStreamResultTest.java | Adds targeted unit tests to validate CloudFetch complex-type wrapping behavior. |
| NEXT_CHANGELOG.md | Documents the fix in the upcoming changelog. |
// When complex type support is enabled, the converter should get the raw value as ARRAY
when(mockIterator.getColumnObjectAtCurrentRow(
        eq(0), eq(ColumnInfoTypeName.ARRAY), eq(arrowMetadata), eq(columnInfo)))
    .thenReturn(new DatabricksArray(java.util.Arrays.asList("a", "b")));
Comment on lines +776 to +777
// Should return a formatted string representation, not a DatabricksMap
assertInstanceOf(String.class, result);
Contributor
Author
Closing — this was created in the wrong repo (JDBC instead of ADBC). The fix belongs in the ADBC C# driver.
Problem
On the CloudFetch path (SEA mode), ARRAY and MAP columns were not being wrapped as JSON strings and not being returned as DatabricksArray/DatabricksMap objects.

Root Cause
The getObjectWithComplexTypeHandling() method in ArrowStreamResult determines how to handle a column by inspecting requiredType (sourced from the column metadata in the SEA manifest). On the CloudFetch path, the SEA manifest reports ARRAY/MAP/STRUCT columns with a STRING wire type, because CloudFetch transmits these as Arrow UTF-8 strings in the IPC file. As a result, isComplexType(requiredType) returns false and the complex-type handling branch is never entered.

However, the Arrow IPC file embedded in each CloudFetch chunk carries richer metadata: the "Spark:DataType:SqlName" field metadata key (ARROW_METADATA_KEY) is set to the true SQL type, e.g. "ARRAY<INT>" or "MAP<STRING,INT>". This arrowMetadata string was already being extracted and passed into the method, but was only used after the complex-type check — too late.

Fix
Before the complex-type branch, derive an effectiveType from arrowMetadata using DatabricksTypeUtil.isComplexType(String). When requiredType is not already a complex type but arrowMetadata identifies the column as ARRAY/MAP/STRUCT, override effectiveType accordingly. All subsequent branching uses effectiveType instead of requiredType, so both the JSON-string path (EnableComplexDatatypeSupport=false) and the DatabricksArray/DatabricksMap path (EnableComplexDatatypeSupport=true) work correctly for CloudFetch ARRAY/MAP/STRUCT columns.
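A minimal sketch of the effectiveType derivation described above, under stated assumptions: the enum is a simplified stand-in for ColumnInfoTypeName, and the prefix matching approximates what DatabricksTypeUtil.isComplexType(String) does with the SQL type name — the real driver's classification logic may differ.

```java
// Simplified, hypothetical sketch of the effectiveType derivation; the enum
// stands in for ColumnInfoTypeName and the prefix checks approximate
// DatabricksTypeUtil.isComplexType(String).
public class EffectiveTypeSketch {
    enum TypeName { STRING, ARRAY, MAP, STRUCT }

    // True when the manifest-reported type is already a complex type.
    static boolean isComplexType(TypeName t) {
        return t == TypeName.ARRAY || t == TypeName.MAP || t == TypeName.STRUCT;
    }

    // requiredType comes from the SEA manifest; arrowMetadata is the
    // "Spark:DataType:SqlName" value from the Arrow IPC schema (may be null).
    static TypeName deriveEffectiveType(TypeName requiredType, String arrowMetadata) {
        if (!isComplexType(requiredType) && arrowMetadata != null) {
            String sql = arrowMetadata.trim().toUpperCase();
            if (sql.startsWith("ARRAY")) return TypeName.ARRAY;
            if (sql.startsWith("MAP")) return TypeName.MAP;
            if (sql.startsWith("STRUCT")) return TypeName.STRUCT;
        }
        return requiredType;
    }

    public static void main(String[] args) {
        // Manifest says STRING, Arrow metadata says ARRAY<INT>: the override kicks in.
        System.out.println(deriveEffectiveType(TypeName.STRING, "ARRAY<INT>")); // prints ARRAY
        // No metadata: fall back to the manifest type.
        System.out.println(deriveEffectiveType(TypeName.STRING, null)); // prints STRING
    }
}
```

All later branching keys off the returned value, so a column whose manifest type is STRING but whose Arrow metadata reads "MAP<STRING,INT>" takes the MAP path.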
Changes
- ArrowStreamResult.java: Added effectiveType derivation logic in getObjectWithComplexTypeHandling(); replaced all uses of requiredType with effectiveType in the complex-type handling branches. Added import for DatabricksTypeUtil.
- ArrowStreamResultTest.java: Added 4 unit tests covering:
  - ARRAY column (arrowMetadata="ARRAY<INT>"), complex support disabled → returns JSON String
  - ARRAY column (arrowMetadata="ARRAY<STRING>"), complex support enabled → returns DatabricksArray
  - MAP column (arrowMetadata="MAP<STRING,INT>"), complex support disabled → returns JSON String
  - ARRAY column via requiredType=ARRAY (existing path), complex support disabled → returns JSON String
- NEXT_CHANGELOG.md: Added entry under "### Fixed".

Test Plan
- New unit tests in ArrowStreamResultTest pass
- Existing ArrowStreamResultTest tests pass
- DatabricksArray/DatabricksMap (complex support enabled)

Fixes: PECO-3016