Commit fc72193

Improve exporting Llama docs
1 parent 5994207 commit fc72193

1 file changed: +29 -13 lines changed
docs/docs/guides/exporting-llama.mdx

@@ -1,23 +1,39 @@
 ---
-title: Exporting LLaMa
+title: Exporting Llama
 sidebar_position: 2
 ---
 
 In order to make the process of export as simple as possible for you, we created a script that runs a Docker container and exports the model.
 
-1. Get a [HuggingFace](https://huggingface.co/) account. This will allow you to download needed files. You can also use the [official LLaMa website](https://www.llama.com/llama-downloads/).
-2. Pick the model that suits your needs. Before you download it, you'll need to accept a license. For best performance, we recommend using Spin-Quant or QLoRA versions of the model:
-- [LLaMa 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main/original)
-- [LLaMa 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B/tree/main/original)
-- [LLaMa 3.2 3B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8/tree/main)
-- [LLaMa 3.2 1B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8/tree/main)
-- [LLaMa 3.2 3B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8/tree/main)
-- [LLaMa 3.2 1B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8/tree/main)
-3. Download the `consolidated.00.pth`, `params.json` and `tokenizer.model` files. If you can't see them, make sure to check the `original` directory. Sometimes the files might have other names, for example `original_params.json`.
-4. Run `mv tokenizer.model tokenizer.bin`. The library expects the tokenizers to have .bin extension.
-5. Run `./build_llama_binary.sh --model-path /path/to/consolidated.00.pth --params-path /path/to/params.json script that's located in the `llama-export` directory.
-6. The script will pull a Docker image from docker hub, and then run it to export the model. By default the output (llama3_2.pte file) will be saved in the `llama-export/outputs` directory. However, you can override that behavior with the `--output-path [path]` flag.
+## Steps to export Llama
+### 1. Create an Account:
+Get a [HuggingFace](https://huggingface.co/) account. This will allow you to download needed files. You can also use the [official Llama website](https://www.llama.com/llama-downloads/).
 
+### 2. Select a Model:
+Pick the model that suits your needs. Before you download it, you'll need to accept a license. For best performance, we recommend using Spin-Quant or QLoRA versions of the model:
+- [Llama 3.2 3B](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct/tree/main/original)
+- [Llama 3.2 1B](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main/original)
+- [Llama 3.2 3B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-SpinQuant_INT4_EO8/tree/main)
+- [Llama 3.2 1B Spin-Quant](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8/tree/main)
+- [Llama 3.2 3B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct-QLORA_INT4_EO8/tree/main)
+- [Llama 3.2 1B QLoRA](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8/tree/main)
+
+### 3. Download Files:
+Download the `consolidated.00.pth`, `params.json` and `tokenizer.model` files. If you can't see them, make sure to check the `original` directory.
+
+### 4. Rename the Tokenizer File:
+Rename the `tokenizer.model` file to `tokenizer.bin` as required by the library:
+```bash
+mv tokenizer.model tokenizer.bin
+```
+
+### 5. Run the Export Script:
+Navigate to the `llama_export` directory and run the following command:
+```bash
+./build_llama_binary.sh --model-path /path/to/consolidated.00.pth --params-path /path/to/params.json
+```
+
+The script will pull a Docker image from docker hub, and then run it to export the model. By default the output (llama3_2.pte file) will be saved in the `llama-export/outputs` directory. However, you can override that behavior with the `--output-path [path]` flag.
 
 :::note[Note]
 This Docker image was tested on MacOS with ARM chip. This might not work in other environments.
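
Steps 3 to 5 of the updated guide can be strung together in a small shell helper. The following is a minimal sketch and not part of this commit: the function names `check_model_files`, `rename_tokenizer` and `export_cmd` are hypothetical, and the sketch assumes paths without spaces. Only the file names, the `mv` step, and the `--model-path`, `--params-path` and `--output-path` flags come from the documented text.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the documented export steps; not part of the commit.

# Step 3: succeed only if the three downloaded files exist in directory $1.
# tokenizer.bin is also accepted, in case the rename in step 4 was already done.
check_model_files() {
  local dir="$1"
  [ -f "$dir/consolidated.00.pth" ] || { echo "missing consolidated.00.pth" >&2; return 1; }
  [ -f "$dir/params.json" ] || { echo "missing params.json" >&2; return 1; }
  [ -f "$dir/tokenizer.model" ] || [ -f "$dir/tokenizer.bin" ] || { echo "missing tokenizer" >&2; return 1; }
}

# Step 4: rename the tokenizer, doing nothing if it was already renamed.
rename_tokenizer() {
  local dir="$1"
  if [ -f "$dir/tokenizer.model" ]; then
    mv "$dir/tokenizer.model" "$dir/tokenizer.bin"
  fi
}

# Step 5: print the export command for model directory $1.
# An optional second argument is passed through as --output-path.
export_cmd() {
  local dir="$1" out="${2:-}"
  local cmd="./build_llama_binary.sh --model-path $dir/consolidated.00.pth --params-path $dir/params.json"
  if [ -n "$out" ]; then
    cmd="$cmd --output-path $out"
  fi
  echo "$cmd"
}
```

Printing the command instead of running it directly lets you inspect the paths first; the printed line is then run from the directory that contains `build_llama_binary.sh`.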
