Skip to content

Proactive server-side statement close when results consumed#1444

Open
gopalldb wants to merge 9 commits into
databricks:mainfrom
gopalldb:feature/proactive-server-statement-close
Open

Proactive server-side statement close when results consumed#1444
gopalldb wants to merge 9 commits into
databricks:mainfrom
gopalldb:feature/proactive-server-statement-close

Conversation

@gopalldb
Copy link
Copy Markdown
Collaborator

Summary

Proactively closes the server-side operation when all results have been consumed or the ResultSet is explicitly closed, while keeping the client-side Statement open for reuse. This reduces server resource usage and warehouse cost for slow consumers who don't explicitly close statements.

When server operation is closed:

  1. next() returns false — all rows consumed
  2. ResultSet.close() called explicitly

What stays open:

  • Client-side Statement — fully reusable for re-execution
  • Statement.close() becomes a no-op for the server RPC (operation already closed), just cleans up client state

New flag: serverOperationClosed

  • Set after proactive closeStatement RPC
  • Prevents duplicate RPCs when Statement.close() is called later
  • Reset on re-execution (resetForNewExecution())
  • Skipped when directResultsReceived is already true (server already closed)
  • Best-effort: server errors during proactive close are swallowed (logged at DEBUG)

Test plan

  • testCloseServerOperation_closesServerAndSkipsRpcOnStatementClose — proactive close sends RPC, Statement.close() skips it
  • testCloseServerOperation_idempotent — double call sends only one RPC
  • testCloseServerOperation_skippedForDirectResults — no-op when server already closed
  • testCloseServerOperation_skippedWhenNoStatementId — no-op when no statement ID
  • testCloseServerOperation_resetsAfterFlagClear — flag prevents all subsequent RPCs
  • testCloseServerOperation_errorSwallowed — server errors don't propagate
  • Full suite: 193 tests pass (DatabricksStatementTest + DatabricksResultSetTest + DatabricksConnectionTest)

This pull request was AI-assisted by Isaac.

gopalldb added 9 commits May 12, 2026 11:37
Close the server-side operation proactively when:
1. next() returns false — all rows consumed
2. ResultSet.close() is called explicitly

The client-side Statement remains open for reuse. A new
serverOperationClosed flag prevents duplicate closeStatement RPCs
when Statement.close() is called later.

Behavior:
- next() exhausts rows → closeStatement RPC → serverOperationClosed=true
- ResultSet.close() → closeStatement RPC (if not already closed)
- Statement.close() → skips RPC if serverOperationClosed, just cleans up client
- Re-execution resets the flag

6 new tests covering:
- Proactive close skips RPC on Statement.close()
- Idempotent (double call sends one RPC)
- Skipped for direct results
- Skipped when no statementId
- Flag reset verified
- Server errors swallowed (best-effort)

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Add comment explaining why proactive server close in next() is safe:
by the time next() returns false, all chunks/batches are already
client-side. No further server RPCs needed for CloudFetch link
refresh, lazy getUpdateCount, or getMoreResults.

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
- cancel(): Skip cancelStatement RPC when server operation already
  closed proactively. Prevents 404 on cancel after results consumed.
- getExecutionResult(): Return cached result when server operation
  already closed. Prevents getStatementResult RPC on closed operation.

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
If proactive closeStatement RPC fails (transient network error),
keep serverOperationClosed=false so Statement.close() retries the
RPC. Prevents leaving server operations alive until session timeout.

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
New tests:
- cancel() is no-op after proactive close (cancel + close interaction)
- getExecutionResult() returns cached result after proactive close
- Statement is reusable for re-execution after proactive close
- Error swallowed test verifies flag NOT set on failure (retry behavior)

Covers review feedback databricks#8-databricks#13 (wiring tests, reuse, cancel interaction,
getExecutionResult guard, error retry semantics).

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Proactive closeServerOperation() in next() when returning false was too
aggressive: batch execution reuses statements, and the early close broke
subsequent batch commands (HTTP 404 on ExecuteStatement). Also consumed
WireMock stubs meant for final Statement.close().

Keep proactive close only in ResultSet.close() — this is the explicit
user signal that results are no longer needed. next() exhaustion alone
does not justify closing the server operation since the statement may
be reused for batch or re-execution.

Fixes integration test failures:
- ExecutionIntegrationTests.testStatementBatch_MixedDMLOperations
- ExecutionIntegrationTests.testStatementBatch_MultipleInserts
- NonRowcountQueryPrefixesIntegrationTests.testConfigured_MultiplePrefixes

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Root cause of integration test failures: resetForNewExecution() reset
directResultsReceived and serverOperationClosed BEFORE closing the old
ResultSet. When resultSet.close() triggered closeServerOperation(), the
flags were already false, causing an unnecessary closeStatement RPC on
a direct-results operation (server already closed it) → HTTP 404.

Fix: close the previous ResultSet BEFORE resetting flags so
closeServerOperation() sees the correct state and skips the RPC for
direct results.

Proactive close now fires for all statement types in next() — no
type-based exclusions. The guards in closeServerOperation()
(directResultsReceived, serverOperationClosed, null statementId)
handle all edge cases correctly.

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
gopalldb added a commit that referenced this pull request May 12, 2026
Fixes:
- LazyThriftResult.isCompletelyFetched(): null guard for currentResponse
  (set to null in close())
- LazyThriftInlineArrowResult.isAllDataFetched(): same null guard
- AbstractRemoteChunkProvider.isAllDataFetched(): require !hasNextChunk()
  in addition to all downloads submitted — ensures in-flight downloads
  complete before signaling server is no longer needed
- DatabricksResultSet.next(): throw meaningful exception when
  executionResult is null (pre-existing NPE, async path)
- DatabricksResultSet.close(): null guard for executionResult.close()
- IsAllDataFetchedTest: updated for tightened RemoteChunkProvider logic
- Removed proactive server close from close() — belongs to PR #1444

Co-authored-by: Isaac
Signed-off-by: Gopal Lal <gopal.lal@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant