Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] SSDV.nextOrd() should not make use of NO_MORE_DOCS #17628

Open
rishabhmaurya opened this issue Mar 18, 2025 · 5 comments · May be fixed by #17647
Open

[BUG] SSDV.nextOrd() should not make use of NO_MORE_DOCS #17628

rishabhmaurya opened this issue Mar 18, 2025 · 5 comments · May be fixed by #17647
Assignees
Labels
bug Something isn't working untriaged v3.0.0 Issues and PRs related to version 3.0.0

Comments

@rishabhmaurya
Copy link
Contributor

rishabhmaurya commented Mar 18, 2025

Describe the bug

Lucene 10 makes it illegal to call SortedSetDocValues#nextOrd() more than#docValueCount() times for the currently-positioned doc.
https://github.com/apache/lucene/blob/e7f9bc837e419672d9bfc829d01e643df667e9d4/lucene/core/src/java/org/apache/lucene/index/SortedSetDocValues.java#L36

Remove all occurrences of SSDV.nextOrd() != NO_MORE_DOCS and replace with use of SSDV.docValuesCount() like in #17626

I consider this as a blocker to OpenSearch 3.0 which is making use of lucene 10.

Related component

Search

To Reproduce

This was causing flakiness in MultiOrdinalsTests as encountered in PR #17446 with failure https://build.ci.opensearch.org/job/gradle-check/54686/testReport/junit/org.opensearch.index.fielddata.ordinals/MultiOrdinalsTests/testRandomValues/

Expected behavior

Remove all occurrences of SSDV.nextOrd() != NO_MORE_DOCS and replace with use of SSDV.docValuesCount()

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: [e.g. iOS]
  • Version [e.g. 22]

Additional context
Add any other context about the problem here.

@rishabhmaurya rishabhmaurya added bug Something isn't working untriaged v3.0.0 Issues and PRs related to version 3.0.0 labels Mar 18, 2025
@rishabhmaurya
Copy link
Contributor Author

@expani

@expani
Copy link
Contributor

expani commented Mar 20, 2025

Thanks for opening the issue @rishabhmaurya

Please assign this issue to me.

Was able to reproduce with

./gradlew ':server:test' --tests "org.opensearch.index.fielddata.ordinals.MultiOrdinalsTests.testRandomValues" -Dtests.seed=E4F93E607FE81E76 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=en-AU -Dtests.timezone=America/Indiana/Vevay -Druntime.java=21

Error

  2> java.lang.AssertionError: expected:<2147483647> but was:<0>
        at __randomizedtesting.SeedInfo.seed([E4F93E607FE81E76:9A5E63094A6A3168]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:647)
        at org.junit.Assert.assertEquals(Assert.java:633)
        at org.opensearch.index.fielddata.ordinals.MultiOrdinalsTests.testRandomValues(MultiOrdinalsTests.java:134)
  2> NOTE: leaving temporary files on disk at: /Users/anijainc/Desktop/GithubWorkspaces/OS_30_Aplha1/OpenSearch/server/build/testrun/test/temp/org.opensearch.index.fielddata.ordinals.MultiOrdinalsTests_E4F93E607FE81E76-001
  2> NOTE: test params are: codec=Asserting(Lucene101): {}, docValues:{}, maxPointsInLeafNode=446, maxMBSortInHeap=5.21462929723478, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=en-AU, timezone=America/Indiana/Vevay
  2> NOTE: Mac OS X 15.3.1 aarch64/Amazon.com Inc. 21.0.5 (64-bit)/cpus=10,threads=2,free=436266416,total=536870912
  2> NOTE: All tests run in this JVM: [MultiOrdinalsTests]

@msfroh
Copy link
Collaborator

msfroh commented Mar 20, 2025

Looks like there's also an occurrence in MissingValuesTests

I think the implementation of MissingValues may need some work. Specifically, it pretends that a doc that has no values instead has 1 value. Unfortunately, it always passes through values.docValueCount(), which may be incorrect if the call to advanceExact() failed:

public boolean advanceExact(int doc) throws IOException {
hasOrds = values.advanceExact(doc);
nextMissingOrd = missingOrd;
// always return true because we want to return a value even if
// the document does not have a value
return true;
}
@Override
public int docValueCount() {
return values.docValueCount();
}

I believe the correct implementation of docValuesCount() will return 1 if hasOrd == false.

We also have some redundant checks where we're checking both docValueCount() and NO_MORE_DOCS:

Those redundant checks aren't a problem AFAIK, but we should probably just check docValueCount().

@msfroh msfroh linked a pull request Mar 21, 2025 that will close this issue
3 tasks
@msfroh
Copy link
Collaborator

msfroh commented Mar 21, 2025

Oh -- sorry, @expani, I just fixed MissingValues: #17647

@expani
Copy link
Contributor

expani commented Mar 21, 2025

Thanks @msfroh opening the PR with fix and this draft PR #17649 to check if any callers are still misbehaving by throwing exceptions instead of NO_MORE_DOCS

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working untriaged v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants