@ohadzeliger
Contributor

When multiple updates to the same record run in a single transaction, the partition's document count and the PK-segment index can get into an inconsistent state. The root cause is that the first update in the transaction clears the doc from both the Lucene index and the PK index. Because those changes are not yet flushed, the IndexWriter holds them in the NRT cache. The second update then fails to find the record in the PK index (the segment has changed, but the IndexReader does not yet reflect that), so the delete is skipped, along with the partition count update. Note that the code does attempt a delete-by-query, which actually removes the doc from the Lucene index, but since we cannot know whether it deleted anything, the partition count is not adjusted.
The solution is to refresh the DirectoryReader when doing an update, so that any previously written changes are visible. The refresh uses DirectoryReader.openIfChanged, which uses fewer resources than opening a brand-new reader.
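
To make the failure mode concrete, here is a minimal toy model of the scenario described above. It does not use Lucene or the actual code in this PR: ToyWriter, ToyReader, and runTwoUpdates are hypothetical names, and the "generation" check is an assumed simplification of how a PK-segment lookup can miss a document that moved to a segment the reader cannot see yet. The only thing borrowed from Lucene is the contract of DirectoryReader.openIfChanged, which returns null when nothing has changed.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these classes are Lucene or Record Layer APIs. */
public class StaleReaderSketch {

    /** Hypothetical writer: buffers changes and bumps a version on each one. */
    static final class ToyWriter {
        private final Map<String, Long> liveDocs = new HashMap<>(); // pk -> generation
        private long version = 0;

        void add(String pk) {
            version++;
            liveDocs.put(pk, version);
        }

        void delete(String pk) {
            if (liveDocs.remove(pk) != null) {
                version++;
            }
        }
    }

    /** Hypothetical point-in-time reader over the primary-key index. */
    static final class ToyReader {
        private final Map<String, Long> snapshot;
        private final long version;

        private ToyReader(ToyWriter writer) {
            this.snapshot = new HashMap<>(writer.liveDocs);
            this.version = writer.version;
        }

        static ToyReader open(ToyWriter writer) {
            return new ToyReader(writer);
        }

        /** Same contract as openIfChanged: null means "old reader is still current". */
        static ToyReader openIfChanged(ToyReader old, ToyWriter writer) {
            return old.version == writer.version ? null : new ToyReader(writer);
        }

        /** Assumed lookup rule: succeeds only if the entry this reader sees is still live. */
        boolean findByPrimaryKey(String pk, ToyWriter writer) {
            Long seen = snapshot.get(pk);
            return seen != null && seen.equals(writer.liveDocs.get(pk));
        }
    }

    /** Updates one record twice in a single "transaction" and returns the partition count. */
    public static int runTwoUpdates(boolean refreshBeforeUpdate) {
        ToyWriter writer = new ToyWriter();
        writer.add("rec-1");
        int partitionCount = 1;
        ToyReader reader = ToyReader.open(writer);

        for (int i = 0; i < 2; i++) {
            if (refreshBeforeUpdate) {
                ToyReader newer = ToyReader.openIfChanged(reader, writer);
                if (newer != null) {
                    reader = newer;
                }
            }
            if (reader.findByPrimaryKey("rec-1", writer)) {
                writer.delete("rec-1");
                partitionCount--;          // delete confirmed, count adjusted
            } else {
                writer.delete("rec-1");    // delete-by-query fallback, count left alone
            }
            writer.add("rec-1");           // re-insert the updated document
            partitionCount++;
        }
        return partitionCount;
    }

    public static void main(String[] args) {
        System.out.println("without refresh: " + runTwoUpdates(false));
        System.out.println("with refresh:    " + runTwoUpdates(true));
    }
}
```

In this model exactly one document is live at the end in both runs, but the run without a refresh reports a count of 2 because the second update's delete went through the fallback path. Refreshing before each update keeps the count at 1.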

Resolve #3704

@ohadzeliger ohadzeliger self-assigned this Nov 4, 2025
@ohadzeliger ohadzeliger added the bug fix Change that fixes a bug label Nov 4, 2025
@ohadzeliger ohadzeliger marked this pull request as ready for review November 7, 2025 20:06

Development

Successfully merging this pull request may close these issues.

Lucene partition record counts are inaccurate when a document is updated multiple times in the same transaction
