Open
Labels: Features, RFC, enhancement, v3.3.0
Description
Statement
The ColPali paper shows that multi-vector representations with late interaction improve NDCG for search over large documents.
We tried to implement multi-vector search with nested documents, issuing one KNN query per query vector and computing the ColPali score from the responses. This does not work, because each KNN query returns its own top-K set of documents:
[Q1->Res Document:1,2,3]
[Q2->Res Document:4,5,6]
....
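The score function is max(Q1, D) + max(Q2, D) + ..., and it has to be evaluated for every document in [1, 6]. A document returned by only one query is missing its terms for the other queries,
so we cannot get the complete score for a paper (parent document).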
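The gap can be illustrated with a small sketch. It assumes integer toy vectors and two hypothetical helpers, `max_sim` and `merged_knn_score`, that are not part of any plugin API. It only shows how a score rebuilt from disjoint per-query hit lists loses terms compared with the full late-interaction score.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_vectors, doc_vectors):
    """Full late-interaction score: for each query vector take the best
    dot product over the document's vectors, then sum."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def merged_knn_score(query_vectors, doc_vectors, doc_id, hits_per_query):
    """Score rebuilt from per-query top-K hit lists (hypothetical helper).
    Terms for queries that did not return doc_id are silently missing."""
    total = 0
    for q, hits in zip(query_vectors, hits_per_query):
        if doc_id in hits:
            total += max(dot(q, d) for d in doc_vectors)
    return total


query_vectors = [[1, 0], [0, 1]]          # Q1, Q2
doc_1 = [[3, 1], [1, 2]]                  # vectors of parent document 1
hits_per_query = [{1, 2, 3}, {4, 5, 6}]   # Q1 and Q2 returned disjoint top-K sets

full = max_sim(query_vectors, doc_1)
partial = merged_knn_score(query_vectors, doc_1, 1, hits_per_query)
print(full, partial)  # prints "5 3": the Q2 term for document 1 is lost
```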
📌 So I want to introduce a late-interaction
capability with a new multi-vector field.
Proposal
1. Introduce a new field type: multi_knn_vector
{
  "mappings": {
    "properties": {
      "my_multi_knn_vector_field": {
        "type": "multi_knn_vector",
        "dimension": 3
      }
    }
  }
}
PUT xx_index/_doc/1
{ "my_multi_knn_vector_field": [ [1,2,3], [4,5,6], ...]}
For multi_knn_vector we store the vectors only in binary doc values, which can be combined with derived source.
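As a rough illustration of what storing a list of vectors in one binary doc value could look like, here is a sketch. The layout (a 4-byte little-endian vector count followed by float32 components) and the function names are assumptions made for this example. They do not describe the format used in the WIP PR.

```python
import struct


def encode_multi_vector(vectors, dimension):
    """Pack a list of vectors into one byte string.
    Assumed layout: uint32 count, then count * dimension float32 values."""
    for v in vectors:
        if len(v) != dimension:
            raise ValueError(f"expected dimension {dimension}, got {len(v)}")
    flat = [x for v in vectors for x in v]
    return struct.pack(f"<I{len(flat)}f", len(vectors), *flat)


def decode_multi_vector(blob, dimension):
    """Inverse of encode_multi_vector."""
    (count,) = struct.unpack_from("<I", blob, 0)
    flat = struct.unpack_from(f"<{count * dimension}f", blob, 4)
    return [list(flat[i * dimension:(i + 1) * dimension]) for i in range(count)]


blob = encode_multi_vector([[1, 2, 3], [4, 5, 6]], dimension=3)
vectors = decode_multi_vector(blob, dimension=3)
```

Because the original vectors can be read back from the blob, a layout along these lines would not need a separate copy of the field in `_source`.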
2. Introduce a multi-KNN query
We can reuse the existing dot product calculation to implement the maximum dot product score described in the paper.
POST xx_index/_search
{
  "query": {
    "multi_knn_query": {
      "my_multi_knn_vector_field": {
        "score_mode": "max_sim_dot",
        "vectors": [ [1,2,3], [5,6,7] ...]
      }
    }
  }
}
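To make the intended `max_sim_dot` scoring concrete, here is a brute-force sketch over an in-memory dictionary. The `index` structure and the `search` function are hypothetical stand-ins for the stored field and the query. A real implementation would read the vectors from doc values rather than from a Python dict.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim_dot(query_vectors, doc_vectors):
    """Sum over query vectors of the maximum dot product against any
    vector of the document."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def search(index, query_vectors, k):
    """Score every document exhaustively and return the k best (score, doc_id) pairs."""
    scored = ((max_sim_dot(query_vectors, vecs), doc_id) for doc_id, vecs in index.items())
    return heapq.nlargest(k, scored)


# Hypothetical index: doc_id -> list of vectors in the multi-vector field.
index = {
    "1": [[1, 2, 3], [4, 5, 6]],
    "2": [[0, 1, 0], [1, 0, 0]],
}
top = search(index, [[1, 2, 3], [5, 6, 7]], k=2)
print(top)  # [(124, '1'), (8, '2')]
```

Every document receives all of its per-query terms here, which is the property the nested-document approach could not guarantee.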
3. Introduce rescoring with the multi-KNN query, which produces the late-interaction score
POST xx_index/_search
{
  "query": {
    "match_all": { }
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "multi_knn_query": {
        "my_multi_knn_vector_field": {
          "score_mode": "max_sim_dot",
          "vectors": [ [1,2,3], [5,6,7] ...]
        }
      }
    }
  }
}
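The rescoring step can be sketched as follows. The function `rescore` and its arguments are hypothetical. The sketch replaces the first-pass order inside the window with the late-interaction order, which is a simplification: how the proposed query would combine first-pass and rescore scores is not specified here.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim_dot(query_vectors, doc_vectors):
    """Sum over query vectors of the maximum dot product against the document."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)


def rescore(first_pass_hits, doc_vectors_by_id, query_vectors, window_size):
    """Reorder only the first window_size hits by late-interaction score.
    Hits outside the window keep their first-pass order."""
    window = first_pass_hits[:window_size]
    tail = first_pass_hits[window_size:]
    reordered = sorted(
        window,
        key=lambda doc_id: max_sim_dot(query_vectors, doc_vectors_by_id[doc_id]),
        reverse=True,
    )
    return reordered + tail


doc_vectors_by_id = {
    "a": [[1, 0, 0]],
    "b": [[1, 2, 3]],
    "c": [[9, 9, 9]],
}
first_pass_hits = ["a", "b", "c"]  # order produced by the cheap first-pass query
result = rescore(first_pass_hits, doc_vectors_by_id, [[1, 2, 3]], window_size=2)
print(result)  # ['b', 'a', 'c']
```

Document "c" has the highest late-interaction score but stays last because it falls outside the window, so `window_size` bounds both the cost and the recall of this step.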
Future Plan
We can apply binary quantization to the multi-vectors to improve performance.
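One way such a quantized variant might work is sketched below. It assumes sign-bit quantization (one bit per component) and a similarity equal to the number of matching bits. Both choices, and all function names, are assumptions for illustration rather than the planned design.

```python
def binarize(vector):
    """Pack sign bits into an int: bit i is set when component i is positive."""
    bits = 0
    for i, x in enumerate(vector):
        if x > 0:
            bits |= 1 << i
    return bits


def bit_sim(a, b, dimension):
    """Number of matching bits, i.e. dimension minus the Hamming distance."""
    return dimension - bin(a ^ b).count("1")


def max_sim_binary(query_codes, doc_codes, dimension):
    """Late-interaction score computed on binary codes instead of float vectors."""
    return sum(max(bit_sim(q, d, dimension) for d in doc_codes) for q in query_codes)


dimension = 3
doc_codes = [binarize(v) for v in [[1.0, -2.0, 3.0], [-4.0, 5.0, 6.0]]]
query_codes = [binarize(v) for v in [[0.5, -0.1, 0.2], [-1.0, -1.0, 2.0]]]
score = max_sim_binary(query_codes, doc_codes, dimension)
print(doc_codes, query_codes, score)  # [5, 6] [5, 4] 5
```

Each comparison becomes an XOR and a bit count, and each vector shrinks to one bit per dimension, at the cost of a coarser score.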
WIP PR: #2707
jmazanec15 and q-andy