[bug fix] set finish before notifying load in LanceArrowWriter setFinished method #1985
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
… method
Purpose
Linked issue: close #xxx
This commit is copied from lance-format/lance-spark#92.
I found that the Lance writer has a certain probability of hanging. After some troubleshooting, I discovered this is related to the LanceArrowWriter.setFinished method.
The original code appears to have a bug where it sets the finished status after notifying loadNextBatch, which could cause loadNextBatch to hang.
Root Cause
The ideal flow should be:
(thread 1) loadToken.release
(thread 1) finished = true
(thread 2) loadNextBatch
(thread 2) finished is true and count is 0 so return false
However, there's a chance it becomes:
(thread 1) loadToken.release
(thread 2) loadNextBatch
(thread 2) finished is false so return true and waiting
(thread 1) finished = false
If the second scenario occurs, thread 2 will hang indefinitely and cannot receive new notifications. jstack will show stacks hanging in LanceDataWriter.commit.
Brief change log
Tests
API and Format
Documentation