Commit 56cb992

Zephyr596 and xiangyuT authored
LLM: Modify CPU Installation Command for most examples (#11049)
* init
* refine
* refine
* refine
* modify hf-agent example
* modify all CPU model example
* remove readthedoc modify
* replace powershell with cmd
* fix repo
* fix repo
* update
* remove comment on windows code block
* update
* update
* update
* update

---------

Co-authored-by: xiangyuT <[email protected]>
1 parent f1156e6 commit 56cb992
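
The same one-line change repeats across the touched READMEs; as a quick summary (commands copied verbatim from the diffs below):

```bash
# before: CPU install command used in most examples
pip install ipex-llm[all]

# after, on Linux: nightly build, with pip pointed at the CPU-only PyTorch wheel index
pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu

# after, on Windows (the docs now show cmd blocks instead of powershell)
pip install --pre --upgrade ipex-llm[all]
```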

109 files changed (+1620 -224 lines changed)


Diff for: README.md (+1 -1)

@@ -110,7 +110,7 @@ See the demo of running [*Text-Generation-WebUI*](https://ipex-llm.readthedocs.i
 - LLM finetuning on Intel [GPU](python/llm/example/GPU/LLM-Finetuning), including [LoRA](python/llm/example/GPU/LLM-Finetuning/LoRA), [QLoRA](python/llm/example/GPU/LLM-Finetuning/QLoRA), [DPO](python/llm/example/GPU/LLM-Finetuning/DPO), [QA-LoRA](python/llm/example/GPU/LLM-Finetuning/QA-LoRA) and [ReLoRA](python/llm/example/GPU/LLM-Finetuning/ReLora)
 - QLoRA finetuning on Intel [CPU](python/llm/example/CPU/QLoRA-FineTuning)
 - Integration with community libraries
-- [HuggingFace tansformers](python/llm/example/GPU/HF-Transformers-AutoModels)
+- [HuggingFace transformers](python/llm/example/GPU/HF-Transformers-AutoModels)
 - [Standard PyTorch model](python/llm/example/GPU/PyTorch-Models)
 - [DeepSpeed-AutoTP](python/llm/example/GPU/Deepspeed-AutoTP)
 - [HuggingFace PEFT](python/llm/example/GPU/LLM-Finetuning/HF-PEFT)

Diff for: docs/readthedocs/source/doc/LLM/Overview/install_cpu.md (+1 -1)

@@ -97,4 +97,4 @@ Then for running a LLM model with IPEX-LLM optimizations (taking an `example.py`
 # e.g. for a server with 48 cores per socket
 export OMP_NUM_THREADS=48
 numactl -C 0-47 -m 0 python example.py
-```
+```
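
For context on the `numactl` line above: `-C 0-47` pins the process to the first 48 cores and `-m 0` binds its memory allocations to NUMA node 0. As a hedged sketch (not part of this commit), the same server's second socket would typically be addressed as follows; core-to-node numbering varies by machine, so check first:

```bash
# sketch: run on socket 1 of a two-socket, 48-cores-per-socket server,
# assuming cores 48-95 belong to NUMA node 1 (verify with `numactl -H` / `lscpu`)
export OMP_NUM_THREADS=48
numactl -C 48-95 -m 1 python example.py
```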

Diff for: docs/readthedocs/source/index.rst (+1 -1)

@@ -162,7 +162,7 @@ Code Examples
 
 * Integration with community libraries
 
-* `HuggingFace tansformers <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels>`_
+* `HuggingFace transformers <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/HF-Transformers-AutoModels>`_
 * `Standard PyTorch model <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/PyTorch-Models>`_
 * `DeepSpeed-AutoTP <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/Deepspeed-AutoTP>`_
 * `HuggingFace PEFT <https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/GPU/LLM-Finetuning/HF-PEFT>`_

Diff for: python/llm/example/CPU/Applications/hf-agent/README.md (+14 -2)

@@ -9,14 +9,26 @@ To run this example with IPEX-LLM, we have some recommended requirements for you
 
 ### 1. Install
 We suggest using conda to manage environment:
+
+On Linux:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 
-pip install ipex-llm[all] # install ipex-llm with 'all' option
+# install ipex-llm with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 pip install pillow # additional package required for opening images
 ```
 
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
+pip install pillow
+```
+
 ### 2. Run
 ```
 python ./run_agent.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --image-path IMAGE_PATH
@@ -32,7 +44,7 @@ Arguments info:
 
 #### 2.1 Client
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./run_agent.py --image-path IMAGE_PATH
 ```

Diff for: python/llm/example/CPU/Applications/streaming-llm/README.md (+10 -0)

@@ -9,10 +9,20 @@ model = AutoModelForCausalLM.from_pretrained(model_name_or_path, load_in_4bit=Tr
 
 ## Prepare Environment
 We suggest using conda to manage environment:
+
+On Linux:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+```
+
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
 pip install --pre --upgrade ipex-llm[all]
 ```
 
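
After either install above, a quick import check confirms the environment is usable before running the streaming example. A minimal sketch, assuming the `ipex_llm.transformers` module path behind the `AutoModelForCausalLM` loading snippet quoted in this README's hunk header:

```bash
# sanity check: the optimized AutoModelForCausalLM should import cleanly
python -c "from ipex_llm.transformers import AutoModelForCausalLM; print('ipex-llm import OK')"
```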

Diff for: python/llm/example/CPU/Deepspeed-AutoTP/install.sh (+1 -1)

@@ -20,4 +20,4 @@ pip install deepspeed==0.11.1
 # 4. exclude intel deepspeed extension, which is only for XPU
 pip uninstall intel-extension-for-deepspeed
 # 5. install ipex-llm
-pip install --pre --upgrade ipex-llm[all]
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
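
Because `install.sh` pins deepspeed and now pulls torch from the CPU-only wheel index, a post-install check catches version mismatches before the first run. A minimal sketch, not part of the commit:

```bash
# verify the pinned deepspeed and the CPU-only torch wheel
# (torch versions from that index typically carry a "+cpu" suffix)
python -c "import torch, deepspeed; print(torch.__version__, deepspeed.__version__)"
```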

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Advanced-Quantizations/AWQ/README.md (+17 -2)

@@ -33,16 +33,31 @@ In the example [generate.py](./generate.py), we show a basic use case for a AWQ
 
 We suggest using conda to manage environment:
 
+On Linux:
+
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 
 pip install autoawq==0.1.8 --no-deps
-pip install --pre --upgrade ipex-llm[all] # install ipex-llm with 'all' option
+# install ipex-llm with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+pip install transformers==4.35.0
+pip install accelerate==0.25.0
+pip install einops
+```
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install autoawq==0.1.8 --no-deps
+pip install --pre --upgrade ipex-llm[all]
 pip install transformers==4.35.0
 pip install accelerate==0.25.0
 pip install einops
 ```
+
 **Note: For Mixtral model, please use transformers 4.36.0:**
 ```bash
 pip install transformers==4.36.0
@@ -68,7 +83,7 @@ Arguments info:
 
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
 
-```powershell
+```cmd
 python ./generate.py
 ```

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Advanced-Quantizations/GGUF/README.md (+15 -2)

@@ -24,19 +24,32 @@ In the example [generate.py](./generate.py), we show a basic use case to load a
 We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
 
 After installing conda, create a Python environment for IPEX-LLM:
+
+On Linux:
 ```bash
 conda create -n llm python=3.11 # recommend to use Python 3.11
 conda activate llm
 
-pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
+# install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 pip install transformers==4.36.0 # upgrade transformers
 ```
+
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
+pip install transformers==4.36.0
+```
+
 ### 2. Run
 After setting up the Python environment, you could run the example by following steps.
 
 #### 2.1 Client
 On client Windows machines, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./generate.py --model <path_to_gguf_model> --prompt 'What is AI?'
 ```
 More information about arguments can be found in [Arguments Info](#23-arguments-info) section. The expected output can be found in [Sample Output](#24-sample-output) section.

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Advanced-Quantizations/GPTQ/README.md (+17 -2)

@@ -8,16 +8,31 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
 In the example [generate.py](./generate.py), we show a basic use case for a Llama2 model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations.
 ### 1. Install
 We suggest using conda to manage environment:
+
+On Linux:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 
-pip install ipex-llm[all] # install ipex-llm with 'all' option
+# install ipex-llm with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 pip install transformers==4.34.0
 BUILD_CUDA_EXT=0 pip install git+https://github.com/PanQiWei/AutoGPTQ.git@1de9ab6
 pip install optimum==0.14.0
 ```
 
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
+pip install transformers==4.34.0
+set BUILD_CUDA_EXT=0
+pip install git+https://github.com/PanQiWei/AutoGPTQ.git@1de9ab6
+pip install optimum==0.14.0
+```
+
 ### 2. Run
 ```
 python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --n-predict N_PREDICT
@@ -34,7 +49,7 @@ Arguments info:
 
 #### 2.1 Client
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./generate.py
 ```
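
One detail in the hunk above is worth calling out: bash lets you scope an environment variable to a single command (`BUILD_CUDA_EXT=0 pip install ...`), while cmd has no per-command form, hence the separate `set BUILD_CUDA_EXT=0` line in the Windows block. The closest bash equivalent to the Windows two-step is:

```bash
# unlike the one-line form, export keeps BUILD_CUDA_EXT set for later commands in this shell
export BUILD_CUDA_EXT=0
pip install git+https://github.com/PanQiWei/AutoGPTQ.git@1de9ab6
```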

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Model/README.md (+1 -1)

@@ -9,6 +9,6 @@ For OS, IPEX-LLM supports Ubuntu 20.04 or later (glibc>=2.17), CentOS 7 or later
 ## Best Known Configuration on Linux
 For better performance, it is recommended to set environment variables on Linux with the help of IPEX-LLM:
 ```bash
-pip install ipex-llm
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 source ipex-llm-init
 ```
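
`ipex-llm-init` is the script the section above refers to; it is sourced rather than executed so that the performance-related environment variables it sets persist in the current shell. A hedged usage sketch, assuming the wheel puts the script on `PATH`:

```bash
# source the script so its exported variables stay in this shell,
# then inspect what it set (hypothetical inspection step, not from the commit)
source ipex-llm-init
env | grep -iE 'omp|malloc' || true
```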

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila/README.md (+15 -2)

@@ -15,11 +15,24 @@ In the example [generate.py](./generate.py), we show a basic use case for a Aqui
 We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
 
 After installing conda, create a Python environment for IPEX-LLM:
+
+On Linux:
+
 ```bash
 conda create -n llm python=3.11 # recommend to use Python 3.11
 conda activate llm
 
-pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
+# install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+```
+
+On Windows:
+
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
 ```
 
 ### 2. Run
@@ -31,7 +44,7 @@ After setting up the Python environment, you could run the example by following
 
 #### 2.1 Client
 On client Windows machines, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./generate.py --prompt 'AI是什么?'
 ```
 More information about arguments can be found in [Arguments Info](#23-arguments-info) section. The expected output can be found in [Sample Output](#24-sample-output) section.

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Model/aquila2/README.md (+15 -2)

@@ -15,11 +15,24 @@ In the example [generate.py](./generate.py), we show a basic use case for a Aqui
 We suggest using conda to manage the Python environment. For more information about conda installation, please refer to [here](https://docs.conda.io/en/latest/miniconda.html#).
 
 After installing conda, create a Python environment for IPEX-LLM:
+
+On Linux:
+
 ```bash
 conda create -n llm python=3.11 # recommend to use Python 3.11
 conda activate llm
 
-pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
+# install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+```
+
+On Windows:
+
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
 ```
 
 ### 2. Run
@@ -31,7 +44,7 @@ After setting up the Python environment, you could run the example by following
 
 #### 2.1 Client
 On client Windows machines, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./generate.py --prompt 'AI是什么?'
 ```
 More information about arguments can be found in [Arguments Info](#23-arguments-info) section. The expected output can be found in [Sample Output](#24-sample-output) section.

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan/README.md (+3 -1)

@@ -9,12 +9,14 @@ In the example [generate.py](./generate.py), we show a basic use case for a Baic
 ### 1. Install
 We suggest using conda to manage environment:
 
+
 On Linux:
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 
-pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu # install ipex-llm with 'all' option
+# install ipex-llm with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
 pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
 ```
 

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Model/baichuan2/README.md (+16 -2)

@@ -8,14 +8,28 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
 In the example [generate.py](./generate.py), we show a basic use case for a Baichuan model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations.
 ### 1. Install
 We suggest using conda to manage environment:
+
+On Linux:
+
 ```bash
 conda create -n llm python=3.11
 conda activate llm
 
-pip install ipex-llm[all] # install ipex-llm with 'all' option
+# install ipex-llm with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+
 pip install transformers_stream_generator # additional package required for Baichuan-13B-Chat to conduct generation
 ```
 
+On Windows:
+```cmd
+conda create -n llm python=3.11
+conda activate llm
+
+pip install --pre --upgrade ipex-llm[all]
+pip install transformers_stream_generator
+```
+
 ### 2. Run
 ```
 python ./generate.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --prompt PROMPT --n-predict N_PREDICT
@@ -32,7 +46,7 @@ Arguments info:
 
 #### 2.1 Client
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./generate.py
 ```

Diff for: python/llm/example/CPU/HF-Transformers-AutoModels/Model/bluelm/README.md (+15 -2)

@@ -8,11 +8,24 @@ To run these examples with IPEX-LLM, we have some recommended requirements for y
 In the example [generate.py](./generate.py), we show a basic use case for a BlueLM model to predict the next N tokens using `generate()` API, with IPEX-LLM INT4 optimizations.
 ### 1. Install
 We suggest using conda to manage environment:
+
+On Linux:
+
 ```bash
+conda create -n llm python=3.11 # recommend to use Python 3.11
+conda activate llm
+
+# install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
+```
+
+On Windows:
+
+```cmd
 conda create -n llm python=3.11
 conda activate llm
 
-pip install --pre --upgrade ipex-llm[all] # install the latest ipex-llm nightly build with 'all' option
+pip install --pre --upgrade ipex-llm[all]
 ```
 
 ### 2. Run
@@ -31,7 +44,7 @@ Arguments info:
 
 #### 2.1 Client
 On client Windows machine, it is recommended to run directly with full utilization of all cores:
-```powershell
+```cmd
 python ./generate.py
 ```