Commit 29d676b

Add examples and change readme to reduce dependency on langchain docs (#44)
* add examples
* Update readme
1 parent a9e9021 commit 29d676b

File tree

5 files changed

+3162
-27
lines changed

README.md

Lines changed: 39 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,6 @@ Integrates LangChain with SAP HANA Cloud to make use of vector search, knowledge
1313
- **Python Environment**: Ensure you have Python 3.9 or higher installed.
1414
- **SAP HANA Cloud**: Access to a running SAP HANA Cloud instance.
1515

16-
1716
### Installation
1817

1918
Install the LangChain SAP HANA Cloud integration package using `pip`:
@@ -22,49 +21,62 @@ Install the LangChain SAP HANA Cloud integration package using `pip`:
2221
pip install -U langchain-hana
2322
```
2423

25-
### Setting Up Vectorstore
24+
### Vectorstore
2625

2726
The `HanaDB` class is used to connect to SAP HANA Cloud Vector Engine.
2827

28+
>[SAP HANA Cloud Vector Engine](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-vector-engine-guide/sap-hana-cloud-sap-hana-database-vector-engine-guide) is
29+
> a vector store fully integrated into the `SAP HANA Cloud` database.
30+
31+
See a [usage example](./examples/sap_hanavector.ipynb).
32+
33+
```python
34+
from langchain_hana import HanaDB
35+
```
36+
2937
> **Important**: You can use any embedding class that inherits from `langchain_core.embeddings.Embeddings`, **including** `HanaInternalEmbeddings`, which runs SAP HANA’s `VECTOR_EMBEDDING()` function internally. See [SAP Help](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-vector-engine-guide/vector-embedding-function-vector?locale=en-US) for more details.
3038
31-
Here’s how to set up the connection and initialize the vector store:
39+
### Self Query Retriever
40+
41+
>[SAP HANA Cloud Vector Engine](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-vector-engine-guide/sap-hana-cloud-sap-hana-database-vector-engine-guide)
42+
> also provides a Self Query Retriever implementation using the `HanaTranslator` Class.
43+
44+
See a [usage example](./examples/hanavector_self_query.ipynb).
45+
46+
```python
47+
from langchain_hana import HanaTranslator
48+
```
49+
50+
### Graph
51+
52+
>[SAP HANA Cloud Knowledge Graph Engine](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-knowledge-graph-guide/sap-hana-cloud-sap-hana-database-knowledge-graph-engine-guide)
53+
> provides support to utilise knowledge graphs through the `HanaRdfGraph` Class.
54+
55+
See a [usage example](./examples/sap_hana_rdf_graph.ipynb).
3256

3357
```python
34-
from langchain_hana import HanaDB, HanaInternalEmbeddings
35-
from langchain_openai import OpenAIEmbeddings
36-
from hdbcli import dbapi
37-
38-
# 1) HANA-internal embedding
39-
internal_emb = HanaInternalEmbeddings(internal_embedding_model_id="SAP_NEB.20240715")
40-
# 2) External embedding
41-
external_emb = OpenAIEmbeddings()
42-
43-
# Establish the SAP HANA Cloud connection
44-
connection = dbapi.connect(
45-
address="<hostname>",
46-
port=3<NN>MM,
47-
user="<username>",
48-
password="<password>"
49-
)
50-
51-
# Initialize the HanaDB vector store
52-
vectorstore = HanaDB(
53-
connection=connection,
54-
embedding=internal_emb, # or external_emb
55-
table_name="<table_name>" # Optional: Default is "EMBEDDINGS"
56-
)
58+
from langchain_hana import HanaRdfGraph
59+
```
5760

61+
### Chains
62+
63+
A `HanaSparqlQAChain` is also provided, which can be used with `HanaRdfGraph` for SPARQL-QA tasks.
64+
See a [usage example](./examples/sap_hana_sparql_qa_chain.ipynb).
65+
66+
```python
67+
from langchain_hana import HanaSparqlQAChain
5868
```
69+
5970
## Documentation
6071

61-
For a detailed guide on using the package, please refer to [Langchain Hana Docs](https://python.langchain.com/docs/integrations/providers/sap/).
72+
For a detailed guide on using the package, please refer to the [examples](./examples/).
6273

6374
## Support, Feedback, Contributing
6475

6576
This project is open to feature requests/suggestions, bug reports etc. via [GitHub issues](https://github.com/SAP/langchain-integration-for-sap-hana-cloud/issues). Contribution and feedback are encouraged and always welcome. For more information about how to contribute, the project structure, as well as additional contribution information, see our [Contribution Guidelines](CONTRIBUTING.md).
6677

6778
## Security / Disclosure
79+
6880
If you find any bug that may be a security problem, please follow the instructions in [our security policy](https://github.com/SAP/langchain-integration-for-sap-hana-cloud/security/policy) on how to report it. Please do not create GitHub issues for security-related doubts or problems.
6981

7082
## Code of Conduct
Lines changed: 246 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,246 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# SAP HANA Cloud Vector Engine\n",
8+
"\n",
9+
"For more information on how to set up the SAP HANA vector store, take a look at the [documentation](/docs/integrations/vectorstores/sap_hanavector.ipynb).\n",
10+
"\n",
11+
"We use the same setup here:"
12+
]
13+
},
14+
{
15+
"cell_type": "code",
16+
"execution_count": null,
17+
"metadata": {},
18+
"outputs": [],
19+
"source": [
20+
"import os\n",
21+
"\n",
22+
"# Use OPENAI_API_KEY env variable\n",
23+
"# os.environ[\"OPENAI_API_KEY\"] = \"Your OpenAI API key\"\n",
24+
"from hdbcli import dbapi\n",
25+
"\n",
26+
"# Use connection settings from the environment\n",
27+
"connection = dbapi.connect(\n",
28+
" address=os.environ.get(\"HANA_DB_ADDRESS\"),\n",
29+
" port=os.environ.get(\"HANA_DB_PORT\"),\n",
30+
" user=os.environ.get(\"HANA_DB_USER\"),\n",
31+
" password=os.environ.get(\"HANA_DB_PASSWORD\"),\n",
32+
" autocommit=True,\n",
33+
" sslValidateCertificate=False,\n",
34+
")"
35+
]
36+
},
37+
{
38+
"cell_type": "markdown",
39+
"metadata": {},
40+
"source": [
41+
"To be able to self query with good performance we create additional metadata fields\n",
42+
"for our vectorstore table in HANA:"
43+
]
44+
},
45+
{
46+
"cell_type": "code",
47+
"execution_count": null,
48+
"metadata": {},
49+
"outputs": [],
50+
"source": [
51+
"# Create custom table with attribute\n",
52+
"cur = connection.cursor()\n",
53+
"cur.execute(\"DROP TABLE LANGCHAIN_DEMO_SELF_QUERY\", ignoreErrors=True)\n",
54+
"cur.execute(\n",
55+
" (\n",
56+
" \"\"\"CREATE TABLE \"LANGCHAIN_DEMO_SELF_QUERY\" (\n",
57+
" \"name\" NVARCHAR(100), \"is_active\" BOOLEAN, \"id\" INTEGER, \"height\" DOUBLE,\n",
58+
" \"VEC_TEXT\" NCLOB, \n",
59+
" \"VEC_META\" NCLOB, \n",
60+
" \"VEC_VECTOR\" REAL_VECTOR\n",
61+
" )\"\"\"\n",
62+
" )\n",
63+
")"
64+
]
65+
},
66+
{
67+
"cell_type": "markdown",
68+
"metadata": {},
69+
"source": [
70+
"Let's add some documents."
71+
]
72+
},
73+
{
74+
"cell_type": "code",
75+
"execution_count": null,
76+
"metadata": {},
77+
"outputs": [],
78+
"source": [
79+
"from langchain_hana import HanaDB\n",
80+
"from langchain_core.documents import Document\n",
81+
"from langchain_openai import OpenAIEmbeddings\n",
82+
"\n",
83+
"embeddings = OpenAIEmbeddings()\n",
84+
"\n",
85+
"# Prepare some test documents\n",
86+
"docs = [\n",
87+
" Document(\n",
88+
" page_content=\"First\",\n",
89+
" metadata={\"name\": \"adam\", \"is_active\": True, \"id\": 1, \"height\": 10.0},\n",
90+
" ),\n",
91+
" Document(\n",
92+
" page_content=\"Second\",\n",
93+
" metadata={\"name\": \"bob\", \"is_active\": False, \"id\": 2, \"height\": 5.7},\n",
94+
" ),\n",
95+
" Document(\n",
96+
" page_content=\"Third\",\n",
97+
" metadata={\"name\": \"jane\", \"is_active\": True, \"id\": 3, \"height\": 2.4},\n",
98+
" ),\n",
99+
"]\n",
100+
"\n",
101+
"db = HanaDB(\n",
102+
" connection=connection,\n",
103+
" embedding=embeddings,\n",
104+
" table_name=\"LANGCHAIN_DEMO_SELF_QUERY\",\n",
105+
" specific_metadata_columns=[\"name\", \"is_active\", \"id\", \"height\"],\n",
106+
")\n",
107+
"\n",
108+
"# Delete already existing documents from the table\n",
109+
"db.delete(filter={})\n",
110+
"db.add_documents(docs)"
111+
]
112+
},
113+
{
114+
"cell_type": "markdown",
115+
"metadata": {},
116+
"source": [
117+
"## Self querying\n",
118+
"\n",
119+
"Now for the main act: here is how to construct a `SelfQueryRetriever` for the HANA vector store:"
120+
]
121+
},
122+
{
123+
"cell_type": "code",
124+
"execution_count": null,
125+
"metadata": {},
126+
"outputs": [],
127+
"source": [
128+
"from langchain.chains.query_constructor.schema import AttributeInfo\n",
129+
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
130+
"from langchain_hana import HanaTranslator\n",
131+
"from langchain_openai import ChatOpenAI\n",
132+
"\n",
133+
"llm = ChatOpenAI(model=\"gpt-3.5-turbo\")\n",
134+
"\n",
135+
"metadata_field_info = [\n",
136+
" AttributeInfo(\n",
137+
" name=\"name\",\n",
138+
" description=\"The name of the person\",\n",
139+
" type=\"string\",\n",
140+
" ),\n",
141+
" AttributeInfo(\n",
142+
" name=\"is_active\",\n",
143+
" description=\"Whether the person is active\",\n",
144+
" type=\"boolean\",\n",
145+
" ),\n",
146+
" AttributeInfo(\n",
147+
" name=\"id\",\n",
148+
" description=\"The ID of the person\",\n",
149+
" type=\"integer\",\n",
150+
" ),\n",
151+
" AttributeInfo(\n",
152+
" name=\"height\",\n",
153+
" description=\"The height of the person\",\n",
154+
" type=\"float\",\n",
155+
" ),\n",
156+
"]\n",
157+
"\n",
158+
"document_content_description = \"A collection of persons\"\n",
159+
"\n",
160+
"hana_translator = HanaTranslator()\n",
161+
"\n",
162+
"retriever = SelfQueryRetriever.from_llm(\n",
163+
" llm,\n",
164+
" db,\n",
165+
" document_content_description,\n",
166+
" metadata_field_info,\n",
167+
" structured_query_translator=hana_translator,\n",
168+
")"
169+
]
170+
},
171+
{
172+
"cell_type": "markdown",
173+
"metadata": {},
174+
"source": [
175+
"Let's use this retriever to prepare a (self) query for a person:"
176+
]
177+
},
178+
{
179+
"cell_type": "code",
180+
"execution_count": null,
181+
"metadata": {},
182+
"outputs": [],
183+
"source": [
184+
"query_prompt = \"Which person is not active?\"\n",
185+
"\n",
186+
"docs = retriever.invoke(input=query_prompt)\n",
187+
"for doc in docs:\n",
188+
" print(\"-\" * 80)\n",
189+
" print(doc.page_content, \" \", doc.metadata)"
190+
]
191+
},
192+
{
193+
"cell_type": "markdown",
194+
"metadata": {},
195+
"source": [
196+
"We can also take a look at how the query is being constructed:"
197+
]
198+
},
199+
{
200+
"cell_type": "code",
201+
"execution_count": null,
202+
"metadata": {},
203+
"outputs": [],
204+
"source": [
205+
"from langchain.chains.query_constructor.base import (\n",
206+
" StructuredQueryOutputParser,\n",
207+
" get_query_constructor_prompt,\n",
208+
")\n",
209+
"\n",
210+
"prompt = get_query_constructor_prompt(\n",
211+
" document_content_description,\n",
212+
" metadata_field_info,\n",
213+
")\n",
214+
"output_parser = StructuredQueryOutputParser.from_components()\n",
215+
"query_constructor = prompt | llm | output_parser\n",
216+
"\n",
217+
"sq = query_constructor.invoke(input=query_prompt)\n",
218+
"\n",
219+
"print(\"Structured query: \", sq)\n",
220+
"\n",
221+
"print(\"Translated for hana vector store: \", hana_translator.visit_structured_query(sq))"
222+
]
223+
}
224+
],
225+
"metadata": {
226+
"kernelspec": {
227+
"display_name": ".venv",
228+
"language": "python",
229+
"name": "python3"
230+
},
231+
"language_info": {
232+
"codemirror_mode": {
233+
"name": "ipython",
234+
"version": 3
235+
},
236+
"file_extension": ".py",
237+
"mimetype": "text/x-python",
238+
"name": "python",
239+
"nbconvert_exporter": "python",
240+
"pygments_lexer": "ipython3",
241+
"version": "3.10.14"
242+
}
243+
},
244+
"nbformat": 4,
245+
"nbformat_minor": 2
246+
}
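
The last notebook cell prints a structured query and its translation for the HANA vector store. As a rough illustration of what such a translation step involves, here is a minimal plain-Python sketch. It is not the `HanaTranslator` implementation: the `Comparison` and `Operation` classes are simplified, hypothetical stand-ins for the structured-query nodes, and the `$`-prefixed operator keys are an assumption about the shape of the resulting filter.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


# Hypothetical, simplified stand-ins for structured-query nodes.
@dataclass
class Comparison:
    comparator: str  # e.g. "eq", "gt", "lt"
    attribute: str   # metadata field name, e.g. "is_active"
    value: Any


@dataclass
class Operation:
    operator: str    # "and" or "or"
    arguments: List[Union["Comparison", "Operation"]]


def to_filter(node: Union[Comparison, Operation]) -> Dict[str, Any]:
    """Recursively turn a query tree into a nested filter dictionary.

    The "$eq" / "$and" key style is assumed for illustration only.
    """
    if isinstance(node, Comparison):
        return {node.attribute: {f"${node.comparator}": node.value}}
    if isinstance(node, Operation):
        return {f"${node.operator}": [to_filter(arg) for arg in node.arguments]}
    raise TypeError(f"Unsupported node type: {type(node).__name__}")


# "Which person is not active?" could plausibly parse to a single comparison.
not_active = Comparison(comparator="eq", attribute="is_active", value=False)
print(to_filter(not_active))

# A compound condition combines comparisons under a logical operator.
tall_and_active = Operation(
    operator="and",
    arguments=[
        Comparison("gt", "height", 5.0),
        Comparison("eq", "is_active", True),
    ],
)
print(to_filter(tall_and_active))
```

Because each comparison refers to a metadata field by name, a filter of this kind can target the dedicated columns listed in `specific_metadata_columns` in the notebook above.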
