langchain - qdrant interface is not correctly implemented to handle custom shards #30661
Closed
Example Code
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient, models
from qdrant_client.http.models import Distance, VectorParams
import os
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

embeddings = OpenAIEmbeddings(model="text-embedding-3-large", api_key=os.getenv("OPENAI_API_KEY"))
client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="trial_collection",
    vectors_config=VectorParams(size=3072, distance=Distance.COSINE),
    shard_number=4,
    replication_factor=2,
    sharding_method=models.ShardingMethod.CUSTOM,
)
client.create_shard_key(collection_name="trial_collection", shard_key="Movo")
client.create_shard_key(collection_name="trial_collection", shard_key="Bravo")

vector_store = QdrantVectorStore(
    client=client,
    collection_name="trial_collection",
    embedding=embeddings,
)

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    payload={"tenant": "Movo"},
    metadata={"source": "tweet"},
)
document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees Fahrenheit.",
    payload={"tenant": "Bravo"},
    metadata={"source": "news"},
)

vector_store.add_documents([document_1], kwargs={"shard_key": "Movo"})
vector_store.add_documents([document_2], kwargs={"shard_key": "Bravo"})
Error Message and Stack Trace (if applicable)
It seems clear to me that VectorStore.add_documents in base.py, which calls into langchain_qdrant/qdrant.py, has no way to accept shard_key_selector, so it is ignored.
File /opt/miniconda3/envs/python_3.13.1/lib/python3.13/site-packages/langchain_qdrant/qdrant.py:444, in QdrantVectorStore.add_texts(self, texts, metadatas, ids, batch_size, **kwargs)
440 added_ids = []
441 for batch_ids, points in self._generate_batches(
442 texts, metadatas, ids, batch_size
443 ):
--> 444 self.client.upsert(
445 collection_name=self.collection_name, points=points, **kwargs
446 )
447 added_ids.extend(batch_ids)
449 return added_ids
File /opt/miniconda3/envs/python_3.13.1/lib/python3.13/site-packages/qdrant_client/qdrant_client.py:1542, in QdrantClient.upsert(self, collection_name, points, wait, ordering, shard_key_selector, **kwargs)
1507 def upsert(
1508 self,
1509 collection_name: str,
(...) 1514 **kwargs: Any,
1515 ) -> types.UpdateResult:
1516 """
1517 Update or insert a new point into the collection.
1518
(...) 1540 Operation Result(UpdateResult)
1541 """
-> 1542 assert len(kwargs) == 0, f"Unknown arguments: {list(kwargs.keys())}"
1544 if (
1545 not isinstance(points, types.Batch)
1546 and len(points) > 0
1547 and isinstance(points[0], grpc.PointStruct)
1548 ):
1549 # gRPC structures won't support local inference feature, so we deprecated it
1550 show_warning_once(
1551 message="""
1552 Usage of `grpc.PointStruct` is deprecated. Please use `models.PointStruct` instead.
(...) 1556 stacklevel=4,
1557 )
AssertionError: Unknown arguments: ['kwargs']
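The name in the assertion message, `['kwargs']`, can be traced with a small stand-alone sketch. The three functions below are hypothetical stand-ins, not the real langchain_qdrant or qdrant_client code. They only copy the forwarding pattern visible in the traceback, where `add_texts` passes `**kwargs` on to `upsert`, and `upsert` accepts `shard_key_selector` but rejects any other leftover keyword. It is assumed here that `add_documents` forwards its keywords the same way.

```python
from typing import Any


def upsert(collection_name: str, points: list, shard_key_selector: Any = None, **kwargs: Any) -> dict:
    """Stand-in for the client call: rejects leftover keywords, like the guard in the traceback."""
    if kwargs:
        raise AssertionError(f"Unknown arguments: {list(kwargs.keys())}")
    return {
        "collection_name": collection_name,
        "count": len(points),
        "shard_key_selector": shard_key_selector,
    }


def add_texts(texts: list, **kwargs: Any) -> dict:
    """Stand-in for the vector store method: forwards every extra keyword to upsert."""
    return upsert(collection_name="trial_collection", points=list(texts), **kwargs)


def add_documents(documents: list, **kwargs: Any) -> dict:
    """Stand-in for the base-class method (assumed to forward keywords unchanged)."""
    return add_texts(documents, **kwargs)


if __name__ == "__main__":
    # Writing kwargs={...} creates ONE keyword argument literally named "kwargs".
    try:
        add_documents(["doc"], kwargs={"shard_key": "Movo"})
    except AssertionError as exc:
        print(exc)

    # Passing the keyword by its own name lets it bind to the matching parameter.
    print(add_documents(["doc"], shard_key_selector="Movo"))
```

If the real methods forward keywords as these stand-ins do, a call such as `vector_store.add_documents([document_1], shard_key_selector="Movo")` would reach the `shard_key_selector` parameter of `upsert`. That has not been verified here against a running Qdrant cluster.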
Description
It looks like shard_key_selector support, which is a great feature in Qdrant, is not available in the LangChain implementation. This makes it difficult to use in complex distributed deployments with many tenants. Let me know if you need anything more. The fix should be to allow passing shard_key_selector as an optional argument.