This package contains the LangChain integration with OceanBase.
OceanBase Database is a distributed relational database developed entirely by Ant Group. It runs on a cluster of ordinary servers and, building on the Paxos protocol and its distributed architecture, provides high availability and linear scalability.
OceanBase can also store vectors. With SQL, users can perform operations such as the following:
- Create a table containing vector type fields;
- Create a vector index table based on the HNSW algorithm;
- Perform vector approximate nearest neighbor queries;
- ...
- Vector Storage: Store embeddings from any LangChain embedding model in OceanBase with automatic table creation and index management.
- Similarity Search: Perform efficient similarity searches on vector data with multiple distance metrics (L2, cosine, inner product).
- Hybrid Search: Combine vector search with sparse vector search and full-text search for improved results with configurable weights.
- Maximal Marginal Relevance: Filter for diversity in search results to avoid redundant information.
- Multiple Index Types: Support for HNSW, IVF, FLAT and other vector index types with automatic parameter optimization.
- Sparse Embeddings: Native support for sparse vector embeddings with BM25-like functionality.
- Advanced Filtering: Built-in support for metadata filtering and complex query conditions.
- Async Support: Full support for async operations and high-concurrency scenarios.
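The three distance metrics named in the feature list (L2, cosine, inner product) can be illustrated with a small, self-contained sketch. This is plain Python written for illustration only; the function names are hypothetical and it does not show how OceanBase or this package computes distances internally.

```python
import math
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    # All three metrics are only defined for vectors of equal dimension.
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance: smaller means closer."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: larger means more similar (sensitive to vector length)."""
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: 0 for vectors pointing in the same direction."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)
```

For example, `[1.0, 0.0]` and `[2.0, 0.0]` are 1.0 apart under L2 but have cosine distance 0.0, because cosine ignores vector length. This is one reason the choice of metric should match how the embedding model was trained.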
pip install -U langchain-oceanbase

- Python >=3.10
- langchain-core >=1.0.0
- pyobvector >=0.2.17
Tip: The current version supports langchain-core >=1.0.0
We recommend using Docker to deploy OceanBase:
docker run --name=oceanbase -e MODE=mini -e OB_SERVER_IP=127.0.0.1 -p 2881:2881 -d oceanbase/oceanbase-ce:latest

For AI Functions support, use OceanBase 4.4.1 or later:

docker run --name=oceanbase -e MODE=mini -e OB_SERVER_IP=127.0.0.1 -p 2881:2881 -d oceanbase/oceanbase-ce:4.4.1.0-100000032025101610

More methods to deploy OceanBase cluster
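After starting the container, it can help to confirm that the mapped port is accepting TCP connections before connecting a client. The following is a minimal sketch using only the Python standard library; the helper name `port_is_open` is hypothetical, and the defaults simply mirror the `-p 2881:2881` mapping and `OB_SERVER_IP=127.0.0.1` from the commands above.

```python
import socket


def port_is_open(host: str = "127.0.0.1", port: int = 2881, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open()` with no arguments checks `127.0.0.1:2881`. An open port only shows that something is listening. It does not guarantee that the database has finished bootstrapping, so a client may still need to retry its first connection.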
Choose your preferred format:
- Jupyter Notebook - Interactive notebook with executable code cells
- Markdown - Static documentation for easy reading
- Hybrid Search Guide - Interactive notebook for hybrid search features
- Hybrid Search Guide (Markdown) - Static documentation for hybrid search
- AI Functions Guide - Documentation for AI Functions (AI_EMBED, AI_COMPLETE, AI_RERANK)
- AI Functions Guide (Notebook) - Interactive notebook for AI Functions
- Setup - Deploy OceanBase and install packages
- Vector Search - Semantic similarity matching
- Sparse Vector Search - Keyword-based exact matching
- Full-text Search - Content-based text search
- Multi-modal Search - Combined search strategies
- Setup - Deploy OceanBase and configure AI models
- Initialization - Configure and create AI functions client
- AI_EMBED - Convert text to vector embeddings
- AI_COMPLETE - Generate text completions
- AI_RERANK - Rerank search results
- Model Configuration API - Setup AI models and endpoints
Get started quickly with the following sections:
- Setup - Deploy OceanBase and install dependencies
- Initialization - Configure and create vector store
- Manage vector store - Add, update, and delete vectors
- Query vector store - Search and retrieve vectors
- Build RAG (Retrieval-Augmented Generation) - Build powerful RAG applications
- Full-text Search - Implement full-text search capabilities
- Hybrid Search - Combine vector and text search for better results
- Advanced Filtering - Metadata filtering and complex query conditions
- Maximal Marginal Relevance - Filter for diversity in search results
- Multiple Index Types - Different vector index types (HNSW, IVF, FLAT)
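To make the Maximal Marginal Relevance item above more concrete, here is a minimal pure-Python sketch of the general MMR selection loop. It is an illustration of the technique, not this package's implementation: the function names are hypothetical, cosine similarity is assumed as the similarity measure, and the name `lambda_mult` is borrowed from the parameter commonly used by LangChain's MMR search methods.

```python
import math
from typing import List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick up to k candidate indices, trading relevance against redundancy.

    lambda_mult=1.0 ranks by relevance only; lower values favor diversity.
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], -math.inf
        for i in remaining:
            relevance = _cosine(query, candidates[i])
            # Redundancy is the highest similarity to anything already selected.
            redundancy = max(
                (_cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected
```

With the query `[1.0, 0.5]` and candidates `[[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]`, `lambda_mult=0.5` selects indices `[1, 2]`: the near-duplicate at index 0 is skipped in favor of the more diverse vector. With `lambda_mult=1.0` the result is `[1, 0]`, which is plain relevance ranking.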