langchain - qdrant interface is not correctly implemented to handle custom shards #30661

Closed
5 tasks done
simpliatanu opened this issue Apr 4, 2025 · 1 comment
Labels
Ɑ: vector store Related to vector store module

Comments

@simpliatanu

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient, models
from qdrant_client.http.models import Distance, VectorParams
import os
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

embeddings = OpenAIEmbeddings(model="text-embedding-3-large", api_key=os.getenv("OPENAI_API_KEY"))
client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="trial_collection",
    vectors_config=VectorParams(size=3072, distance=Distance.COSINE),
    shard_number=4,
    replication_factor=2,
    sharding_method=models.ShardingMethod.CUSTOM,
)
client.create_shard_key(collection_name="trial_collection", shard_key="Movo")
client.create_shard_key(collection_name="trial_collection", shard_key="Bravo")

vector_store = QdrantVectorStore(
    client=client,
    collection_name="trial_collection",
    embedding=embeddings,
)

document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    payload={"tenant": "Movo"},
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees Fahrenheit.",
    payload={"tenant": "Bravo"},
    metadata={"source": "news"},
)

vector_store.add_documents([document_1], kwargs={"shard_key": "Movo"})
vector_store.add_documents([document_2], kwargs={"shard_key": "Bravo"})

Error Message and Stack Trace (if applicable)

It is very clear to me that VectorStore.add_documents in base.py, which calls langchain_qdrant/qdrant.py,
has no ability to take shard_key_selector, so it is ignored.

File /opt/miniconda3/envs/python_3.13.1/lib/python3.13/site-packages/langchain_qdrant/qdrant.py:444, in QdrantVectorStore.add_texts(self, texts, metadatas, ids, batch_size, **kwargs)
440 added_ids = []
441 for batch_ids, points in self._generate_batches(
442 texts, metadatas, ids, batch_size
443 ):
--> 444 self.client.upsert(
445 collection_name=self.collection_name, points=points, **kwargs
446 )
447 added_ids.extend(batch_ids)
449 return added_ids

File /opt/miniconda3/envs/python_3.13.1/lib/python3.13/site-packages/qdrant_client/qdrant_client.py:1542, in QdrantClient.upsert(self, collection_name, points, wait, ordering, shard_key_selector, **kwargs)
1507 def upsert(
1508 self,
1509 collection_name: str,
(...) 1514 **kwargs: Any,
1515 ) -> types.UpdateResult:
1516 """
1517 Update or insert a new point into the collection.
1518
(...) 1540 Operation Result(UpdateResult)
1541 """
-> 1542 assert len(kwargs) == 0, f"Unknown arguments: {list(kwargs.keys())}"
1544 if (
1545 not isinstance(points, types.Batch)
1546 and len(points) > 0
1547 and isinstance(points[0], grpc.PointStruct)
1548 ):
1549 # gRPC structures won't support local inference feature, so we deprecated it
1550 show_warning_once(
1551 message="""
1552 Usage of grpc.PointStruct is deprecated. Please use models.PointStruct instead.
(...) 1556 stacklevel=4,
1557 )

AssertionError: Unknown arguments: ['kwargs']
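
The assertion names 'kwargs' rather than 'shard_key' because writing kwargs={...} at the call site creates a single keyword argument literally called kwargs, which is then forwarded unchanged to upsert. The sketch below reproduces that forwarding with two hypothetical stand-in functions (upsert and add_texts here are simplified imitations written for illustration, not the real library code); only the assertion line is copied from the trace above.

```python
def upsert(collection_name, points, shard_key_selector=None, **kwargs):
    """Hypothetical stand-in for the client call: rejects unknown keywords."""
    # Same check as the line shown in the traceback.
    assert len(kwargs) == 0, f"Unknown arguments: {list(kwargs.keys())}"
    return {
        "collection": collection_name,
        "count": len(points),
        "shard": shard_key_selector,
    }


def add_texts(texts, **kwargs):
    """Hypothetical stand-in for the store method: forwards **kwargs as-is."""
    return upsert(collection_name="trial_collection", points=list(texts), **kwargs)


def try_call(**kwargs):
    """Run add_texts and return either its result or the assertion message."""
    try:
        return add_texts(["hello"], **kwargs)
    except AssertionError as exc:
        return str(exc)


# kwargs={...} becomes one keyword named "kwargs", which upsert rejects.
wrong = try_call(kwargs={"shard_key": "Movo"})

# A plain keyword matches the named parameter and is accepted.
right = try_call(shard_key_selector="Movo")

print(wrong)
print(right)
```

In this sketch the first call fails with the same message as the trace, while the second reaches the named shard_key_selector parameter.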

Description

It looks like shard_key_selector support, which is a great feature in Qdrant, is not available in the LangChain implementation. This makes it difficult to use in complex distributed deployments with many tenants. Let me know if you need anything more. The fix should be to allow passing shard_key_selector as an optional argument.

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 24.3.0: Thu Jan 2 20:23:36 PST 2025; root:xnu-11215.81.4~3/RELEASE_ARM64_T8112
Python Version: 3.13.1 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 10:35:08) [Clang 14.0.6 ]

Package Information

langchain_core: 0.3.40
langchain: 0.3.19
langchain_community: 0.3.18
langsmith: 0.3.11
langchain_anthropic: 0.3.8
langchain_cohere: 0.4.2
langchain_experimental: 0.3.4
langchain_groq: 0.2.4
langchain_openai: 0.3.7
langchain_qdrant: 0.2.0
langchain_text_splitters: 0.3.6
langgraph_sdk: 0.1.53

Optional packages not installed

langserve

Other Dependencies

aiohttp<4.0.0,>=3.8.3: Installed. No version info available.
anthropic<1,>=0.47.0: Installed. No version info available.
async-timeout<5.0.0,>=4.0.0;: Installed. No version info available.
cohere: 5.14.0
dataclasses-json<0.7,>=0.5.7: Installed. No version info available.
fastembed: 0.6.0
groq: 0.18.0
httpx: 0.28.1
httpx-sse<1.0.0,>=0.4.0: Installed. No version info available.
jsonpatch<2.0,>=1.33: Installed. No version info available.
langchain-anthropic;: Installed. No version info available.
langchain-aws;: Installed. No version info available.
langchain-cohere;: Installed. No version info available.
langchain-community;: Installed. No version info available.
langchain-core<1.0.0,>=0.3.34: Installed. No version info available.
langchain-core<1.0.0,>=0.3.35: Installed. No version info available.
langchain-core<1.0.0,>=0.3.37: Installed. No version info available.
langchain-core<1.0.0,>=0.3.39: Installed. No version info available.
langchain-deepseek;: Installed. No version info available.
langchain-fireworks;: Installed. No version info available.
langchain-google-genai;: Installed. No version info available.
langchain-google-vertexai;: Installed. No version info available.
langchain-groq;: Installed. No version info available.
langchain-huggingface;: Installed. No version info available.
langchain-mistralai;: Installed. No version info available.
langchain-ollama;: Installed. No version info available.
langchain-openai;: Installed. No version info available.
langchain-text-splitters<1.0.0,>=0.3.6: Installed. No version info available.
langchain-together;: Installed. No version info available.
langchain-xai;: Installed. No version info available.
langchain<1.0.0,>=0.3.19: Installed. No version info available.
langsmith-pyo3: Installed. No version info available.
langsmith<0.4,>=0.1.125: Installed. No version info available.
langsmith<0.4,>=0.1.17: Installed. No version info available.
numpy<2,>=1.26.4;: Installed. No version info available.
numpy<3,>=1.26.2;: Installed. No version info available.
openai<2.0.0,>=1.58.1: Installed. No version info available.
orjson: 3.10.15
packaging: 24.2
packaging<25,>=23.2: Installed. No version info available.
pydantic: 2.10.6
pydantic-settings<3.0.0,>=2.4.0: Installed. No version info available.
pydantic<3.0.0,>=2.5.2;: Installed. No version info available.
pydantic<3.0.0,>=2.7.4: Installed. No version info available.
pydantic<3.0.0,>=2.7.4;: Installed. No version info available.
pytest: Installed. No version info available.
PyYAML>=5.3: Installed. No version info available.
qdrant-client: 1.13.2
requests: 2.32.3
requests-toolbelt: 1.0.0
requests<3,>=2: Installed. No version info available.
rich: 13.9.4
SQLAlchemy<3,>=1.4: Installed. No version info available.
tenacity!=8.4.0,<10,>=8.1.0: Installed. No version info available.
tenacity!=8.4.0,<10.0.0,>=8.1.0: Installed. No version info available.
tiktoken<1,>=0.7: Installed. No version info available.
types-pyyaml: 6.0.12.20241230
typing-extensions>=4.7: Installed. No version info available.
zstandard: 0.23.0

@dosubot dosubot bot added the Ɑ: vector store Related to vector store module label Apr 4, 2025
@simpliatanu
Author

Figured out that the usage should be:
vector_store.add_documents([document_1], shard_key_selector="Movo")
vector_store.add_documents([document_2], shard_key_selector="Bravo")
Closing the issue; maybe the docs should be updated.
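
For a multi-tenant setup, the per-shard calls above could be wrapped in a small helper that groups documents by tenant and issues one add_documents call per shard key. This is only a sketch: add_by_shard, Doc and FakeStore are hypothetical names invented here so the example runs without a Qdrant server, and reading the tenant from metadata["tenant"] is an assumption. The only part taken from this thread is passing shard_key_selector as a plain keyword to add_documents.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical minimal document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class FakeStore:
    """Hypothetical stand-in for a vector store; it only records calls."""

    def __init__(self):
        self.calls = []

    def add_documents(self, documents, **kwargs):
        self.calls.append((len(documents), kwargs))
        return [doc.page_content for doc in documents]


def add_by_shard(store, documents, key="tenant"):
    """Group documents by metadata[key], then add each group to its shard."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc.metadata[key]].append(doc)

    added = []
    for shard_key, docs in groups.items():
        # Pass the shard key as a plain keyword, as in the comment above.
        added.extend(store.add_documents(docs, shard_key_selector=shard_key))
    return added


store = FakeStore()
docs = [
    Doc("pancakes", {"tenant": "Movo"}),
    Doc("weather", {"tenant": "Bravo"}),
    Doc("eggs", {"tenant": "Movo"}),
]
added = add_by_shard(store, docs)
print(store.calls)
```

With a real store in place of FakeStore, the same helper would make one add_documents call per tenant, assuming each shard key was created beforehand with create_shard_key.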
