Description
What is the bug?
When:
- using multiple script_score neural queries on multiple (different) vector fields, like in this comment
- each script references _score
- explain=true is set
Then, if a document is returned by some neural field queries (within the sub-query's top-k) but not by others, the query fails with a script runtime exception and the error: Null score for the docID: 2147483647
(At least I think this is why... I'm new to OpenSearch and neural search, so apologies - my explanation for why this happens is just my best guess!)
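One detail that may support this guess: 2147483647 is Integer.MAX_VALUE, which Lucene uses as its NO_MORE_DOCS sentinel for an exhausted iterator. The toy Python model below is only a sketch of the suspected mechanism, not the k-NN plugin's code. ToyTopKScorer and explain_score are hypothetical names, and the assumption that the explain path reads a score without first checking for a match is unverified.

```python
# Toy model only: this is NOT the k-NN plugin's code, just an illustration
# of the suspected mechanism described above.
NO_MORE_DOCS = 2**31 - 1  # 2147483647, Lucene's "iterator exhausted" sentinel


class ToyTopKScorer:
    """Holds scores only for the documents in one sub-query's top k."""

    def __init__(self, scores_by_doc):
        self.scores_by_doc = dict(scores_by_doc)
        self.matches = sorted(self.scores_by_doc)
        self.current = -1

    def advance(self, target):
        """Move to the first matching doc >= target, or to NO_MORE_DOCS."""
        self.current = next(
            (doc for doc in self.matches if doc >= target), NO_MORE_DOCS
        )
        return self.current

    def score(self):
        """Return the score of the current doc, failing if it has none."""
        score = self.scores_by_doc.get(self.current)
        if score is None:
            raise RuntimeError(f"Null score for the docID: {self.current}")
        return score


def explain_score(scorer, doc_id):
    """Hypothetical explain path: reads the score without checking for a match."""
    scorer.advance(doc_id)
    return scorer.score()


# Doc 7 is in the title sub-query's top k but not in the description one.
title_scorer = ToyTopKScorer({3: 0.9, 7: 0.8})
description_scorer = ToyTopKScorer({3: 0.5})
```

In this model, explaining doc 7 against title_scorer works, while explaining it against description_scorer moves the iterator past its last match and raises the same message as in the report.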
How can one reproduce the bug?
- Follow the docs instructions to set up neural search.
- Set up two fields like title_embedding and description_embedding.
- Ingest some documents (their embedding fields should be set in the ingest pipeline) - the index used by the example query below should have 100 documents
- Run a query like:
GET /myindex/_search?explain=true
{
  "from": 0,
  "size": 100,
  "query": {
    "bool": {
      "should": [
        {
          "script_score": {
            "query": {
              "neural": {
                "title_embedding": {
                  "query_text": "test",
                  "model_id": "xGbq_YcB3ggx1CR0Nfls",
                  "k": 10
                }
              }
            },
            "script": {
              "source": "_score * 1"
            }
          }
        },
        {
          "script_score": {
            "query": {
              "neural": {
                "description_embedding": {
                  "query_text": "test",
                  "model_id": "xGbq_YcB3ggx1CR0Nfls",
                  "k": 10
                }
              }
            },
            "script": {
              "source": "_score * 1"
            }
          }
        }
      ]
    }
  }
}
See an error like:
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "org.opensearch.knn.index.query.KNNScorer.score(KNNScorer.java:51)",
          "org.opensearch.script.ScoreScript.lambda$setScorer$4(ScoreScript.java:156)",
          "org.opensearch.script.ScoreScript.get_score(ScoreScript.java:168)",
          "_score * 1",
          "^---- HERE"
        ],
        "script": "_score * 1",
        "lang": "painless",
        "position": {
          "offset": 0,
          "start": 0,
          "end": 10
        }
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "opensearch_content",
        "node": "vnyA5s-aQUOmTj6IHosYXA",
        "reason": {
          "type": "script_exception",
          "reason": "runtime error",
          "script_stack": [
            "org.opensearch.knn.index.query.KNNScorer.score(KNNScorer.java:51)",
            "org.opensearch.script.ScoreScript.lambda$setScorer$4(ScoreScript.java:156)",
            "org.opensearch.script.ScoreScript.get_score(ScoreScript.java:168)",
            "_score * 1",
            "^---- HERE"
          ],
          "script": "_score * 1",
          "lang": "painless",
          "position": {
            "offset": 0,
            "start": 0,
            "end": 10
          },
          "caused_by": {
            "type": "runtime_exception",
            "reason": "Null score for the docID: 2147483647"
          }
        }
      }
    ]
  },
  "status": 400
}
Note the high size and low k. You might need to adjust the query_text or k to find a combination where a document is returned in one neural query's top k and not the other.
Remove explain=true from the query and notice it succeeds.
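To make it easier to vary k, size, and query_text while hunting for a failing combination, the request body from the steps above can be generated rather than edited by hand. This is a minimal sketch: build_neural_script_score_query and search_path are hypothetical helper names, and the script only prints the request so it can be pasted into Dev Tools or sent with any HTTP client.

```python
import json


def build_neural_script_score_query(fields, query_text, model_id, k=10, size=100):
    """Build one script_score-wrapped neural clause per embedding field."""
    should = [
        {
            "script_score": {
                "query": {
                    "neural": {
                        field: {
                            "query_text": query_text,
                            "model_id": model_id,
                            "k": k,
                        }
                    }
                },
                # Same pass-through script as in the report.
                "script": {"source": "_score * 1"},
            }
        }
        for field in fields
    ]
    return {"from": 0, "size": size, "query": {"bool": {"should": should}}}


def search_path(index, explain):
    """Return the _search path, with or without explain=true."""
    base = f"/{index}/_search"
    return f"{base}?explain=true" if explain else base


if __name__ == "__main__":
    body = build_neural_script_score_query(
        ["title_embedding", "description_embedding"],
        query_text="test",
        model_id="xGbq_YcB3ggx1CR0Nfls",  # model ID from the report
    )
    # Print both variants: the first should fail, the second should succeed.
    for explain in (True, False):
        print("GET", search_path("myindex", explain))
        print(json.dumps(body, indent=2))
```

Keeping size well above k is what makes it likely that some returned document falls outside one sub-query's top k.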
What is the expected behavior?
- The query succeeds - it does not throw an error.
- _score for the affected field is 0 or the affected field is excluded entirely - either way, the _explanation should accurately reflect this.
What is your host/environment?
OpenSearch 2.7, Ubuntu 22.04.
Do you have any additional context?
I'm not sure why it only happens with explain=true. (I can't explain it)
It also only happens if using script_score. If using multiple neural queries directly, there is no error. But then there is no per-field score in _explanation - the total is correct, but each field's score value is reported as 1. #875 describes this problem. My use case is: I'd like to try using the similarity scores of each field as features in a Learning to Rank model, which means I need to get each score individually.
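For context, this is roughly how I would expect to read the per-clause scores out of a hit once explain works. It is a sketch that relies only on the general shape of an _explanation object (value, description, details). clause_scores is a hypothetical helper, and the sample explanation, including its description strings and numbers, is invented for illustration rather than taken from a real response.

```python
def clause_scores(explanation):
    """Return (description, value) for each direct child of an explanation.

    For a bool query, the top-level explanation is expected to be a sum
    whose details hold one entry per scoring clause.
    """
    details = explanation.get("details") or []
    return [(d.get("description", ""), d.get("value")) for d in details]


# Invented sample in the general _explanation shape, NOT real output.
sample_explanation = {
    "value": 1.25,
    "description": "sum of:",
    "details": [
        {"value": 0.75, "description": "clause for title_embedding", "details": []},
        {"value": 0.5, "description": "clause for description_embedding", "details": []},
    ],
}

features = clause_scores(sample_explanation)
```

A clause that did not match the document may simply be absent from details, so I would not rely on list position alone to map a score back to its field.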