Commit 8cc07d1
authored
fix: set finish before notifying load in LanceArrowWriter setFinished method (#92)
Recently I was testing the integration between Lance and Spark.
I found that the Lance writer has a certain probability of hanging.
After some troubleshooting, I discovered this is related to the
LanceArrowWriter.setFinished method.
The original code appears to have a bug where it sets the finished
status after notifying loadNextBatch, which could cause loadNextBatch to
hang.
**Root Cause**
The ideal flow should be:
1. (thread 1) loadToken.release
2. (thread 1) finished = true
3. (thread 2) loadNextBatch
4. (thread 2) finished is true and count is 0 so return false
However, there's a chance it becomes:
1. (thread 1) loadToken.release
2. (thread 2) loadNextBatch
3. (thread 2) finished is false so return true and waiting
4. (thread 1) finished = false
If the second scenario occurs, thread 2 will hang indefinitely and
cannot receive new notifications. jstack will show stacks hanging in
LanceDataWriter.commit.
**Reproduction**
This issue is hard to reproduce. I encountered it in a very low-resource
environment (Spark executor with only 1 core 4g) when creating a new
table and writing 600 rows of data at once, where one column is a
1024-dimensional vector column.
It also occurs intermittently.
**Further Confirmation**
Although the current fix seems reasonable, I hope to get confirmation
from maintainers to avoid introducing new unknown issues.
Any comments about this. @jackye19951 parent 9c57adc commit 8cc07d1
File tree
2 files changed
+2
-5
lines changed- lance-spark-base_2.12/src
- main/java/com/lancedb/lance/spark/write
- test/java/com/lancedb/lance/spark/write
2 files changed
+2
-5
lines changedLines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
34 | 34 | | |
35 | 35 | | |
36 | 36 | | |
37 | | - | |
| 37 | + | |
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
| |||
70 | 70 | | |
71 | 71 | | |
72 | 72 | | |
73 | | - | |
74 | 73 | | |
| 74 | + | |
75 | 75 | | |
76 | 76 | | |
77 | 77 | | |
| |||
Lines changed: 0 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
29 | | - | |
30 | 29 | | |
31 | 30 | | |
32 | 31 | | |
| |||
35 | 34 | | |
36 | 35 | | |
37 | 36 | | |
38 | | - | |
39 | | - | |
40 | 37 | | |
41 | 38 | | |
42 | 39 | | |
| |||
0 commit comments