* modify rag.py
* update readme of gpu example
* update llamaindex cpu example and readme
* add llamaindex doc
* update note style
* import before instancing IpexLLMEmbedding
* update index in readme
* update links
* update link
* update related links
Ensure `ipex-llm` is installed by following the [IPEX-LLM Installation Guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install.html) before proceeding with the examples provided here.

> [!NOTE]
> - You can refer to [llama-index-llms-ipex-llm](https://docs.llamaindex.ai/en/stable/examples/llm/ipex_llm/) and [llama-index-embeddings-ipex-llm](https://docs.llamaindex.ai/en/stable/examples/embeddings/ipex_llm/) for more information.
> - The installation of `llama-index-llms-ipex-llm` or `llama-index-embeddings-ipex-llm` will also install `IPEX-LLM` and its dependencies.
> - `IpexLLMEmbedding` currently only provides optimization for Hugging Face BGE models.
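
As a quick check, here is a minimal sketch of using `IpexLLMEmbedding` on CPU (the BGE checkpoint name is an example assumption, not a requirement):

```python
# Minimal sketch, assuming llama-index-embeddings-ipex-llm is installed;
# the model name below is an example Hugging Face BGE checkpoint.
from llama_index.embeddings.ipex_llm import IpexLLMEmbedding

# IPEX-LLM optimizations are applied automatically when a BGE model is loaded.
embedding_model = IpexLLMEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Embed a single piece of text as a smoke test.
vector = embedding_model.get_text_embedding("IPEX-LLM accelerates LLMs on Intel hardware.")
print(len(vector))
```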

**python/llm/example/GPU/LlamaIndex/README.md**
This folder contains examples showcasing how to use [**LlamaIndex**](https://github.com/run-llama/llama_index) with `ipex-llm`.
## Retrieval-Augmented Generation (RAG) Example
The RAG example ([rag.py](./rag.py)) is adapted from the [official LlamaIndex RAG example](https://docs.llamaindex.ai/en/stable/examples/low_level/oss_ingestion_retrieval.html). It builds a pipeline that ingests data (e.g., the Llama 2 paper in PDF format) into a vector database (e.g., PostgreSQL), and then builds a retrieval pipeline on top of that vector database, as sketched below.
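
The full pipeline lives in [rag.py](./rag.py); the sketch below only illustrates the overall ingest-then-retrieve shape (the input file path, connection details, and table name are placeholder assumptions, not the script's actual defaults):

```python
# Illustrative sketch of the RAG shape; all parameters are placeholders.
from llama_index.core import Settings, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.embeddings.ipex_llm import IpexLLMEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

# Use an IPEX-LLM-optimized BGE model for embeddings.
Settings.embed_model = IpexLLMEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Ingest: load the PDF and store embedded chunks in PostgreSQL.
documents = SimpleDirectoryReader(input_files=["data/llama2.pdf"]).load_data()
vector_store = PGVectorStore.from_params(
    database="vector_db",
    host="localhost",
    port="5432",
    user="postgres",
    password="password",
    table_name="llama2_paper",
    embed_dim=384,  # must match the embedding model's output dimension
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# Retrieve: fetch the chunks most relevant to a question.
retriever = index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("How does Llama 2 compare to other open-source models?")
```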
### 1. Install Prerequisites
To benefit from IPEX-LLM on Intel GPUs, there are several prerequisite steps for tool installation and environment preparation.

If you are a Windows user, visit the [Install IPEX-LLM on Windows with Intel GPU Guide](../../../../../docs/mddocs/Quickstart/install_windows_gpu.md), and follow [Install Prerequisites](../../../../../docs/mddocs/Quickstart/install_windows_gpu.md#install-prerequisites) to update the GPU driver (optional) and install Conda.

If you are a Linux user, visit the [Install IPEX-LLM on Linux with Intel GPU](../../../../../docs/mddocs/Quickstart/install_linux_gpu.md), and follow [Install Prerequisites](../../../../../docs/mddocs/Quickstart/install_linux_gpu.md#install-prerequisites) to install the GPU driver, Intel® oneAPI Base Toolkit 2024.0, and Conda.

Follow the instructions in the [GPU Install Guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install.html) to install `ipex-llm`.

> [!NOTE]
> - You can refer to [llama-index-llms-ipex-llm](https://docs.llamaindex.ai/en/stable/examples/llm/ipex_llm_gpu/) and [llama-index-embeddings-ipex-llm](https://docs.llamaindex.ai/en/stable/examples/embeddings/ipex_llm_gpu/) for more information.
> - The installation of `llama-index-llms-ipex-llm` or `llama-index-embeddings-ipex-llm` will also install `IPEX-LLM` and its dependencies.
> - You can also use `https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/` as the `extra-index-url`.
> - `IpexLLMEmbedding` currently only provides optimization for Hugging Face BGE models.
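
Once installed, loading the LLM and the embedding model on an Intel GPU looks roughly like the sketch below (the model names and generation settings are example assumptions; adjust them to your setup):

```python
# Sketch of GPU ("xpu") usage; model names and settings are examples only.
import torch  # keep this import at the very top (see the Troubleshooting section)

from llama_index.embeddings.ipex_llm import IpexLLMEmbedding
from llama_index.llms.ipex_llm import IpexLLM

# Load the embedding model onto the Intel GPU.
embedding_model = IpexLLMEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    device="xpu",
)

# Load the LLM with IPEX-LLM low-bit optimizations onto the same device.
llm = IpexLLM.from_model_id(
    model_name="meta-llama/Llama-2-7b-chat-hf",
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    context_window=4096,
    max_new_tokens=256,
    generate_kwargs={"do_sample": False},
    device_map="xpu",
)
```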
* **Database Setup (using PostgreSQL)**:
  * Linux
- `-t TOKENIZER_PATH`: **Required**, path to the tokenizer model
### 6. Example Output
A query such as **"How does Llama 2 compare to other open-source models?"** with the Llama 2 paper as the data source, using the `Llama-2-7b-chat-hf` model, will produce output like the following:
```
However, it's important to note that the performance of Llama 2 can vary depending...

In conclusion, while Llama 2 performs well on most benchmarks compared to other open-source models, its performance
```
### 7. Troubleshooting

#### 7.1 Core dump
If you encounter a core dump error in your Python code, it is crucial to verify that the `import torch` statement is placed at the top of your Python file, just as is done in `rag.py`.
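
A minimal sketch of the expected import order (the LlamaIndex imports below merely stand in for "any other imports"):

```python
# Keep `import torch` as the very first import to avoid the core dump.
import torch

# All other imports come afterwards, for example:
from llama_index.embeddings.ipex_llm import IpexLLMEmbedding
from llama_index.llms.ipex_llm import IpexLLM
```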