
Commit c2efa26

Update LangChain examples to use upstream (#12388)
* Update LangChain examples to use upstream
* Update README and fix links
* Update LangChain CPU examples to use upstream
* Update LangChain CPU voice_assistant example
* Update CPU README
* Update GPU README
* Remove GPU Langchain vLLM example and fix comments
* Change langchain -> LangChain
* Add reference for both upstream llms and embeddings
* Fix comments
* Fix comments
* Fix comments
* Fix comments
* Fix comment
1 parent 24b46b2 commit c2efa26

File tree

11 files changed: +325 -290 lines changed

python/llm/example/CPU/LangChain/README.md

+96-45
@@ -1,90 +1,141 @@
-## Langchain Examples
+# LangChain Example
 
-This folder contains examples showcasing how to use `langchain` with `ipex-llm`.
+The examples in this folder show how to use [LangChain](https://www.langchain.com/) with `ipex-llm` on Intel CPU.
 
-### Install-IPEX LLM
+> [!NOTE]
+> Please refer to [here](https://python.langchain.com/docs/integrations/llms/ipex_llm) for upstream LangChain LLM documentation with ipex-llm and [here](https://python.langchain.com/docs/integrations/text_embedding/ipex_llm/) for upstream LangChain embedding documentation with ipex-llm.
 
-Ensure `ipex-llm` is installed by following the [IPEX-LLM Installation Guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_cpu.html).
+## 0. Requirements
+To run these examples with IPEX-LLM, we have some recommended requirements for your machine; please refer to [here](../README.md#recommended-requirements) for more information.
 
-### Install Dependences Required by the Examples
+## 1. Install
 
+We suggest using conda to manage the environment:
+
+On Linux:
 
 ```bash
-pip install langchain==0.0.184
-pip install -U chromadb==0.3.25
-pip install -U pandas==2.0.3
+conda create -n llm python=3.11
+conda activate llm
+
+# install ipex-llm with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 ```
 
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
+```
 
-### Example: Chat
+## 2. Run examples with LangChain
 
-The chat example ([chat.py](./chat.py)) shows how to use `LLMChain` to build a chat pipeline.
+### 2.1. Example: Streaming Chat
 
-To run the example, execute the following command in the current directory:
+Install LangChain dependencies:
 
 ```bash
-python chat.py -m <path_to_model> [-q <your_question>]
+pip install -U langchain langchain-community
 ```
-> Note: if `-q` is not specified, it will use `What is AI` by default.
-
-### Example: RAG (Retrival Augmented Generation)
-
-The RAG example ([rag.py](./rag.py)) shows how to load the input text into vector database, and then use `load_qa_chain` to build a retrival pipeline.
 
-To run the example, execute the following command in the current directory:
+In the current directory, run the example with the following command:
 
 ```bash
-python rag.py -m <path_to_model> [-q <your_question>] [-i <path_to_input_txt>]
+python chat.py -m MODEL_PATH -q QUESTION
 ```
-> Note: If `-i` is not specified, it will use a short introduction to Big-DL as input by default. if `-q` is not specified, `What is IPEX LLM?` will be used by default.
+**Additional Parameters for Configuration:**
+- `-m MODEL_PATH`: **required**, path to the model
+- `-q QUESTION`: question to ask. Default is `What is AI?`.
 
+### 2.2. Example: Retrieval Augmented Generation (RAG)
 
-### Example: Math
+The RAG example ([rag.py](./rag.py)) shows how to load the input text into a vector database, and then use LangChain to build a retrieval pipeline.
 
-The math example ([math.py](./llm_math.py)) shows how to build a chat pipeline specialized in solving math questions. For example, you can ask `What is 13 raised to the .3432 power?`
+Install LangChain dependencies:
 
-To run the exmaple, execute the following command in the current directory:
+```bash
+pip install -U langchain langchain-community langchain-chroma sentence-transformers==3.0.1
+```
+
+In the current directory, run the example with the following command:
 
 ```bash
-python llm_math.py -m <path_to_model> [-q <your_question>]
+python rag.py -m <path_to_llm_model> -e <path_to_embedding_model> [-q QUESTION] [-i INPUT_PATH]
 ```
-> Note: if `-q` is not specified, it will use `What is 13 raised to the .3432 power?` by default.
+**Additional Parameters for Configuration:**
+- `-m LLM_MODEL_PATH`: **required**, path to the LLM model.
+- `-e EMBEDDING_MODEL_PATH`: **required**, path to the embedding model.
+- `-q QUESTION`: question to ask. Default is `What is IPEX-LLM?`.
+- `-i INPUT_PATH`: path to the input doc.
 
 
-### Example: Voice Assistant
+### 2.3. Example: Low Bit
 
-The voice assistant example ([voiceassistant.py](./voiceassistant.py)) showcases how to use langchain to build a pipeline that takes in your speech as input in realtime, use an ASR model (e.g. [Whisper-Medium](https://huggingface.co/openai/whisper-medium)) to turn speech into text, and then feed the text into large language model to get response.
+The low_bit example ([low_bit.py](./low_bit.py)) showcases how to use LangChain with a low_bit optimized model.
+By `save_low_bit` we save the weights of the low_bit model into the target folder.
+> [!NOTE]
+> `save_low_bit` only saves the weights of the model.
+> Users could copy the tokenizer model into the target folder or specify `tokenizer_id` during initialization.
 
-To run the exmaple, execute the following command in the current directory:
+Install LangChain dependencies:
 
 ```bash
-python voiceassistant.py -m <path_to_model> [-q <your_question>]
+pip install -U langchain langchain-community
 ```
-**Runtime Arguments Explained**:
-- `-m MODEL_PATH`: **Required**, the path to the
-- `-r RECOGNITION_MODEL_PATH`: **Required**, the path to the huggingface speech recognition model
-- `-x MAX_NEW_TOKENS`: the max new tokens of model tokens input
-- `-l LANGUAGE`: you can specify a language such as "english" or "chinese"
-- `-d True|False`: whether the model path specified in -m is saved low bit model.
-
 
-### Example: Low Bit
+In the current directory, run the example with the following command:
 
-The low_bit example ([low_bit.py](./low_bit.py)) showcases how to use use langchain with low_bit optimized model.
-By `save_low_bit` we save the weights of low_bit model into the target folder.
-> Note: `save_low_bit` only saves the weights of the model.
-> Users could copy the tokenizer model into the target folder or specify `tokenizer_id` during initialization.
 ```bash
 python low_bit.py -m <path_to_model> -t <path_to_target> [-q <your question>]
 ```
-**Runtime Arguments Explained**:
+**Additional Parameters for Configuration:**
 - `-m MODEL_PATH`: **Required**, the path to the model
 - `-t TARGET_PATH`: **Required**, the path to save the low_bit model
-- `-q QUESTION`: the question
+- `-q QUESTION`: question to ask. Default is `What is AI?`.
+
+### 2.4. Example: Math
 
+The math example ([math.py](./llm_math.py)) shows how to build a chat pipeline specialized in solving math questions. For example, you can ask `What is 13 raised to the .3432 power?`
+
+Install LangChain dependencies:
+
+```bash
+pip install -U langchain langchain-community
+```
+
+In the current directory, run the example with the following command:
+
+```bash
+python llm_math.py -m <path_to_model> [-q <your_question>]
+```
+
+**Additional Parameters for Configuration:**
+- `-m MODEL_PATH`: **Required**, the path to the model
+- `-q QUESTION`: question to ask. Default is `What is 13 raised to the .3432 power?`.
 
+> [!NOTE]
+> If `-q` is not specified, it will use `What is 13 raised to the .3432 power?` by default.
 
-### Legacy (Native INT4 examples)
+### 2.5. Example: Voice Assistant
 
-IPEX-LLM also provides langchain integrations using native INT4 mode. Those examples can be foud in [native_int4](./native_int4/) folder. For detailed instructions of settting up and running `native_int4` examples, refer to [Native INT4 Examples README](./README_nativeint4.md).
+The voice assistant example ([voiceassistant.py](./voiceassistant.py)) showcases how to use LangChain to build a pipeline that takes in your speech as input in real time, uses an ASR model (e.g. [Whisper-Medium](https://huggingface.co/openai/whisper-medium)) to turn speech into text, and then feeds the text into a large language model to get a response.
 
+Install LangChain dependencies:
+```bash
+pip install -U langchain langchain-community
+pip install transformers==4.36.2
+```
+
+To run the example, execute the following command in the current directory:
+
+```bash
+python voiceassistant.py -m <path_to_model> -r <path_to_recognition_model> [-q <your_question>]
+```
+**Additional Parameters for Configuration:**
+- `-m MODEL_PATH`: **Required**, the path to the model
+- `-r RECOGNITION_MODEL_PATH`: **Required**, the path to the huggingface speech recognition model
+- `-x MAX_NEW_TOKENS`: the maximum number of new tokens to generate
+- `-l LANGUAGE`: you can specify a language such as "english" or "chinese"
+- `-d True|False`: whether the model path specified in `-m` points to a saved low-bit model.
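
`rag.py` itself is not part of this diff, so as orientation for section 2.2 above, here is a minimal sketch of the kind of retrieval pipeline the new README describes, built from the upstream integrations it names (`IpexLLM`, `IpexLLMBgeEmbeddings`, `langchain-chroma`). The model paths, chunking, and prompt below are illustrative assumptions, not the example's actual code:

```python
from langchain_chroma import Chroma
from langchain_community.embeddings import IpexLLMBgeEmbeddings
from langchain_community.llms import IpexLLM
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Hypothetical local model paths -- substitute your own.
llm = IpexLLM.from_model_id(
    model_id="path/to/llama-2-7b-chat-hf",
    model_kwargs={"temperature": 0, "max_length": 512, "trust_remote_code": True},
)
embeddings = IpexLLMBgeEmbeddings(
    model_name="path/to/bge-large-en-v1.5",
    model_kwargs={},
    encode_kwargs={"normalize_embeddings": True},
)

# Naive fixed-size chunking of the input doc, indexed in an in-memory Chroma store.
with open("input.txt") as f:
    text = f.read()
chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]
retriever = Chroma.from_texts(chunks, embeddings).as_retriever()

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

prompt = PromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)
chain = {"context": retriever | format_docs, "question": RunnablePassthrough()} | prompt | llm
print(chain.invoke("What is IPEX-LLM?"))
```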

python/llm/example/CPU/LangChain/chat.py

+14-13
@@ -20,10 +20,13 @@
 # only search the first bigdl package and end up finding only one sub-package.
 
 import argparse
+import warnings
 
-from ipex_llm.langchain.llms import TransformersLLM, TransformersPipelineLLM
-from langchain import PromptTemplate, LLMChain
-from langchain import HuggingFacePipeline
+from langchain.chains import LLMChain
+from langchain_community.llms import IpexLLM
+from langchain_core.prompts import PromptTemplate
+
+warnings.filterwarnings("ignore", category=UserWarning, message=".*padding_mask.*")
 
 
 def main(args):
@@ -38,20 +41,18 @@ def main(args):
 
     prompt = PromptTemplate(template=template, input_variables=["question"])
 
-    # llm = TransformersPipelineLLM.from_model_id(
-    #     model_id=model_path,
-    #     task="text-generation",
-    #     model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True},
-    # )
-
-    llm = TransformersLLM.from_model_id(
+    llm = IpexLLM.from_model_id(
         model_id=model_path,
-        model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True},
+        model_kwargs={
+            "temperature": 0,
+            "max_length": 64,
+            "trust_remote_code": True,
+        },
     )
 
-    llm_chain = LLMChain(prompt=prompt, llm=llm)
+    llm_chain = prompt | llm
 
-    output = llm_chain.run(question)
+    output = llm_chain.invoke(question)
     print("====output=====")
     print(output)
 
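
The migration above replaces `LLMChain` with the LCEL composition `prompt | llm`. One note on the final `invoke` call: LCEL chains canonically take a dict keyed by the prompt's input variables; `llm_chain.invoke(question)` with a bare string works because recent langchain-core versions coerce a single string into `{"question": ...}` when the prompt has exactly one input variable. A sketch of the equivalent explicit form, reusing the `prompt` and `llm` built in `chat.py`:

```python
# Explicit dict input for the LCEL chain built in chat.py.
llm_chain = prompt | llm
output = llm_chain.invoke({"question": "What is AI?"})
print(output)
```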

python/llm/example/CPU/LangChain/llm_math.py

+10-3
@@ -23,19 +23,26 @@
 # Code is adapted from https://python.langchain.com/docs/modules/chains/additional/llm_math
 
 import argparse
+import warnings
 
 from langchain.chains import LLMMathChain
-from ipex_llm.langchain.llms import TransformersLLM, TransformersPipelineLLM
+from langchain_community.llms import IpexLLM
+
+warnings.filterwarnings("ignore", category=UserWarning, message=".*padding_mask.*")
 
 
 def main(args):
 
     question = args.question
     model_path = args.model_path
 
-    llm = TransformersLLM.from_model_id(
+    llm = IpexLLM.from_model_id(
         model_id=model_path,
-        model_kwargs={"temperature": 0, "max_length": 1024, "trust_remote_code": True},
+        model_kwargs={
+            "temperature": 0,
+            "max_length": 1024,
+            "trust_remote_code": True,
+        },
     )
 
     llm_math = LLMMathChain.from_llm(llm, verbose=True)
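
The hunk ends where the chain is constructed. For orientation, a sketch of how the chain built above is then run (following `LLMMathChain`'s `question` input key and `answer` output key; the chain evaluates the model's generated expression with `numexpr`, so that package must be installed):

```python
# Run the math chain: the LLM emits a math expression, LLMMathChain evaluates
# it with numexpr and returns the result under the "answer" key.
result = llm_math.invoke({"question": "What is 13 raised to the .3432 power?"})
print(result["answer"])
```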

python/llm/example/CPU/LangChain/low_bit.py

+22-9
@@ -16,9 +16,13 @@
 
 
 import argparse
-from ipex_llm.langchain.llms import TransformersLLM, TransformersPipelineLLM
-from langchain import PromptTemplate, LLMChain
-from langchain import HuggingFacePipeline
+import warnings
+
+from langchain.chains import LLMChain
+from langchain_community.llms import IpexLLM
+from langchain_core.prompts import PromptTemplate
+
+warnings.filterwarnings("ignore", category=UserWarning, message=".*padding_mask.*")
 
 
 def main(args):
@@ -29,20 +33,29 @@ def main(args):
 
     prompt = PromptTemplate(template=template, input_variables=["question"])
 
-    llm = TransformersLLM.from_model_id(
+    llm = IpexLLM.from_model_id(
         model_id=model_path,
-        model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True},
+        model_kwargs={
+            "temperature": 0,
+            "max_length": 64,
+            "trust_remote_code": True,
+        },
    )
     llm.model.save_low_bit(low_bit_model_path)
     del llm
-    low_bit_llm = TransformersLLM.from_model_id_low_bit(
+    llm_lowbit = IpexLLM.from_model_id_low_bit(
         model_id=low_bit_model_path,
         tokenizer_id=model_path,
-        model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True}
+        # tokenizer_name=saved_lowbit_model_path,  # copy the tokenizers to saved path if you want to use it this way
+        model_kwargs={
+            "temperature": 0,
+            "max_length": 64,
+            "trust_remote_code": True,
+        },
     )
-    llm_chain = LLMChain(prompt=prompt, llm=low_bit_llm)
+    llm_chain = prompt | llm_lowbit
 
-    output = llm_chain.run(question)
+    output = llm_chain.invoke(question)
     print("====output=====")
     print(output)
 
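
The commented `tokenizer_name` line points at the alternative described in the README note: instead of passing `tokenizer_id`, the low-bit folder can be made self-contained by copying the tokenizer files into it. A sketch of that route, under the assumption that `from_model_id_low_bit` falls back to loading the tokenizer from `model_id` when no tokenizer id is given (plain `transformers` calls, variables as in `low_bit.py`):

```python
from transformers import AutoTokenizer

# save_low_bit() writes only the model weights, so store the original model's
# tokenizer next to them to make the folder self-contained.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
tokenizer.save_pretrained(low_bit_model_path)

# Assumed fallback: with the tokenizer files in place, no tokenizer_id is needed.
llm_lowbit = IpexLLM.from_model_id_low_bit(
    model_id=low_bit_model_path,
    model_kwargs={"temperature": 0, "max_length": 64, "trust_remote_code": True},
)
```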
