-
-
Notifications
You must be signed in to change notification settings - Fork 98
Fix OR filters returning duplicate documents when using multiple indexes #1184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Important Review skippedBot user detected. To trigger a single review, invoke the You can disable this status message by setting the Comment |
Co-authored-by: anidotnet <[email protected]>
Co-authored-by: anidotnet <[email protected]>
Co-authored-by: anidotnet <[email protected]>
Co-authored-by: anidotnet <[email protected]>
Changed the expected cursor size from 5 to 3 in CollectionFindByCompoundIndexTest for both mvstore and rocksdb adapters to reflect the correct number of results returned by the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a bug where OR filters with multiple indexed fields incorrectly return duplicate documents when a single document matches multiple OR clauses through different indexes. The fix ensures proper set union semantics by always deduplicating results for OR filters.
- Applied
DistinctStreamunconditionally for OR filters with subplans inReadOperations.java - Added regression test in
IssueTest.javato validate the fix - Updated existing tests that incorrectly expected duplicate behavior
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| nitrite/src/main/java/org/dizitart/no2/collection/operation/ReadOperations.java | Removed conditional check and always apply DistinctStream for OR filters to prevent duplicates |
| nitrite/src/test/java/org/dizitart/no2/integration/collection/IssueTest.java | Added test case reproducing the reported issue with OR filters across multiple indexes |
| nitrite/src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java | Updated expected result counts from 5 to 3 and 3 to 2 to reflect correct deduplication behavior |
| nitrite-rocksdb-adapter/src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java | Updated expected result count from 5 to 3 for OR filter test |
| nitrite-mvstore-adapter/src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java | Updated expected result count from 5 to 3 for OR filter test |
| assertEquals(1, cursor.toList().stream().filter(d -> | ||
| d.get("firstName", String.class).equals("fn3") | ||
| && d.get("lastName", String.class).equals("ln2")).count()); |
Copilot
AI
Oct 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling cursor.toList() twice (lines 106 and 110) iterates through the cursor multiple times and creates multiple list copies. Since cursors are typically forward-only, cache the list in a variable and reuse it for both filter operations to improve performance.
OR filters with multiple indexed fields return the same document multiple times when it matches multiple clauses through different indexes. For example:
Changes
DistinctStreamunconditionally for OR filters with subplans to deduplicate byNitriteIdThe fix ensures OR filters implement proper set union semantics where each document appears at most once regardless of how many conditions it satisfies.
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.