Fix OR filters returning duplicate documents when using multiple indexes #1184

Copilot · 2025-10-26T12:04:22Z

OR filters with multiple indexed fields return the same document multiple times when it matches multiple clauses through different indexes. For example:

NitriteCollection items = db.getCollection("items");
items.createIndex(IndexOptions.indexOptions(IndexType.NON_UNIQUE), "field_a");
items.createIndex(IndexOptions.indexOptions(IndexType.NON_UNIQUE), "field_b");

Document doc = Document.createDocument();
doc.put("field_a", "A");
doc.put("field_b", "B");
items.insert(doc);

Filter orFilter = Filter.or(
    FluentFilter.where("field_a").eq("A"),
    FluentFilter.where("field_b").eq("B")
);

DocumentCursor cursor = items.find(orFilter);
// Returns 2 results instead of 1 - same document appears twice

Changes

ReadOperations.java: Apply DistinctStream unconditionally for OR filters with subplans to deduplicate by NitriteId
IssueTest.java: Add test case validating single document matching multiple OR clauses returns once
CollectionFindByCompoundIndexTest.java: Update tests expecting duplicate behavior to reflect correct OR semantics

The fix ensures OR filters implement proper set union semantics where each document appears at most once regardless of how many conditions it satisfies.
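
As a rough illustration of the set union semantics described above, the sketch below models each OR sub-plan as a list of document ids produced by one index scan. The class name `OrUnionSketch` and both helper methods are hypothetical and are not Nitrite code; the actual fix applies `DistinctStream` inside ReadOperations.java, whose internals are not shown in this PR description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class OrUnionSketch {
    // Naive OR: concatenates the ids found by every sub-plan, so a document
    // matched through two indexes shows up twice.
    static List<Long> concatenate(List<List<Long>> subPlanResults) {
        List<Long> all = new ArrayList<>();
        for (List<Long> ids : subPlanResults) {
            all.addAll(ids);
        }
        return all;
    }

    // Union OR: keeps only the first occurrence of each id (assumption: the id
    // plays the role of NitriteId), preserving encounter order.
    static List<Long> distinctUnion(List<List<Long>> subPlanResults) {
        Set<Long> seen = new LinkedHashSet<>();
        for (List<Long> ids : subPlanResults) {
            seen.addAll(ids);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // One document (hypothetical id 1) found via the field_a index and
        // again via the field_b index.
        List<List<Long>> subPlans = Arrays.asList(
            Arrays.asList(1L),
            Arrays.asList(1L)
        );
        System.out.println(concatenate(subPlans).size());   // prints 2
        System.out.println(distinctUnion(subPlans).size()); // prints 1
    }
}
```

Deduplicating on the id rather than on document contents means two distinct documents with identical fields would still both be returned, which matches the "at most once per document" wording.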

Original prompt

This section details the original issue you should resolve

<issue_title>Multiple indexes make records appear multiple times in result</issue_title>
<issue_description>Using 'or' with filters for fields in different indexes will cause the matching document to be produced once for each index.

NitriteCollection items = db.getCollection("items");
items.createIndex(IndexOptions.indexOptions(IndexType.NON_UNIQUE), "field_a");
items.createIndex(IndexOptions.indexOptions(IndexType.NON_UNIQUE), "field_b");
  	
Document doc = Document.createDocument();
doc.put("field_a", "A");
doc.put("field_b", "B");
items.insert(doc);

Filter aFilter = FluentFilter.where("field_a").eq("A");
Filter bFilter = FluentFilter.where("field_b").eq("B");
  	
Filter orFilter = Filter.or(aFilter, bFilter);

DocumentCursor cursor = items.find(orFilter);
Iterator<Document> docIter = cursor.iterator();
  	
List<Long> matches = new ArrayList<>();
while (docIter.hasNext()) {
    Document match = docIter.next();
    long id = match.getId().getIdValue();
    matches.add(id);
}
assertEquals("Single document must yield single match", 1, matches.size());</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes Multiple indexes make records appear multiple times in result #1168

Co-authored-by: anidotnet <[email protected]>

Changed the expected cursor size from 5 to 3 in CollectionFindByCompoundIndexTest for both mvstore and rocksdb adapters to reflect the correct number of results returned by the query.

Copilot

Pull Request Overview

This PR fixes a bug where OR filters with multiple indexed fields incorrectly return duplicate documents when a single document matches multiple OR clauses through different indexes. The fix ensures proper set union semantics by always deduplicating results for OR filters.

Applied DistinctStream unconditionally for OR filters with subplans in ReadOperations.java
Added regression test in IssueTest.java to validate the fix
Updated existing tests that incorrectly expected duplicate behavior

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

File	Description
nitrite/src/main/java/org/dizitart/no2/collection/operation/ReadOperations.java	Removed conditional check and always apply `DistinctStream` for OR filters to prevent duplicates
nitrite/src/test/java/org/dizitart/no2/integration/collection/IssueTest.java	Added test case reproducing the reported issue with OR filters across multiple indexes
nitrite/src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java	Updated expected result counts from 5 to 3 and 3 to 2 to reflect correct deduplication behavior
nitrite-rocksdb-adapter/src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java	Updated expected result count from 5 to 3 for OR filter test
nitrite-mvstore-adapter/src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java	Updated expected result count from 5 to 3 for OR filter test

Copilot · 2025-10-27T06:20:20Z

...src/test/java/org/dizitart/no2/integration/collection/CollectionFindByCompoundIndexTest.java

+        assertEquals(1, cursor.toList().stream().filter(d ->
            d.get("firstName", String.class).equals("fn3")
                && d.get("lastName", String.class).equals("ln2")).count());


Calling cursor.toList() twice (lines 106 and 110) iterates through the cursor multiple times and creates multiple list copies. Since cursors are typically forward-only, cache the list in a variable and reuse it for both filter operations to improve performance.
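
The suggestion above can be sketched without Nitrite at all. In the example below, `CachedListSketch`, `Person`, and `countMatching` are hypothetical names, the list stands in for the result of a single `cursor.toList()` call, and the `fn1`/`ln1` row is made-up sample data; only the `fn3`/`ln2` values come from the test under review.

```java
import java.util.Arrays;
import java.util.List;

public class CachedListSketch {
    // Hypothetical stand-in for a document with firstName and lastName fields.
    static final class Person {
        final String firstName;
        final String lastName;

        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Counts entries in an already materialized list that match both names.
    static long countMatching(List<Person> people, String first, String last) {
        return people.stream()
            .filter(p -> p.firstName.equals(first) && p.lastName.equals(last))
            .count();
    }

    public static void main(String[] args) {
        // Materialize the results once (in the real test this would be the
        // single cursor.toList() call) and reuse the list for each assertion.
        List<Person> results = Arrays.asList(
            new Person("fn1", "ln1"),
            new Person("fn3", "ln2")
        );
        System.out.println(countMatching(results, "fn1", "ln1")); // prints 1
        System.out.println(countMatching(results, "fn3", "ln2")); // prints 1
    }
}
```

Holding the list in a local variable keeps both checks on the same snapshot of results and avoids building a second copy.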

Initial plan

4f0d084

Copilot AI assigned Copilot and anidotnet Oct 26, 2025

Copilot started work on behalf of anidotnet October 26, 2025 12:08

Copilot AI and others added 4 commits October 26, 2025 12:12

Add test case for OR filter duplicate results issue

64383e1

Co-authored-by: anidotnet <[email protected]>

Fix OR filter duplicate results with multiple indexes

06e34bd

Co-authored-by: anidotnet <[email protected]>

Update tests to reflect correct OR filter behavior

7356ad0

Co-authored-by: anidotnet <[email protected]>

Improve test comments with document field values

71c202a

Co-authored-by: anidotnet <[email protected]>

Copilot AI changed the title ~~[WIP] Fix multiple indexes causing duplicate records in result set~~ Fix OR filters returning duplicate documents when using multiple indexes Oct 26, 2025

Copilot AI requested a review from anidotnet October 26, 2025 12:25

Copilot finished work on behalf of anidotnet October 26, 2025 12:25

Update expected cursor size in compound index tests

0ff19cd

Changed the expected cursor size from 5 to 3 in CollectionFindByCompoundIndexTest for both mvstore and rocksdb adapters to reflect the correct number of results returned by the query.

anidotnet marked this pull request as ready for review October 27, 2025 06:19

Copilot AI review requested due to automatic review settings October 27, 2025 06:19

anidotnet merged commit 3616f9f into main Oct 27, 2025
15 checks passed

Copilot AI reviewed Oct 27, 2025
