Commit 67c30e1: imatrix: add docs (parent: ae0fcc2)

docs/imatrix.md (+59 lines)
# Importance Matrix (imatrix) Quantization

## What is an Importance Matrix?
Quantization reduces the precision of a model's weights, decreasing its size and computational requirements. However, this can lead to a loss of quality. An importance matrix helps mitigate this by identifying which weights are *most* important for the model's performance. During quantization, these important weights are preserved with higher precision, while less important weights are quantized more aggressively. This allows for better overall quality at a given quantization level.
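The intuition can be sketched with a small formula. In llama.cpp's implementation, importance is estimated from activation statistics collected at runtime; roughly speaking, the quantizer minimizes a weighted squared error rather than a plain one (this is a simplification, and the exact weighting differs per quantization type):

$$
E = \sum_j w_j \left(x_j - \hat{x}_j\right)^2, \qquad w_j \propto \langle a_j^2 \rangle
$$

where $x_j$ are the original weights, $\hat{x}_j$ their quantized values, and $\langle a_j^2 \rangle$ is the mean squared activation observed for weight $j$ during generation. Weights that consistently see large activations contribute more to the output, so their rounding error is penalized more heavily.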
This originates from work done with language models in [llama.cpp](https://github.com/ggml-org/llama.cpp/blob/master/examples/imatrix/README.md).

## Usage
The imatrix feature involves two main steps: *training* the matrix and *using* it during quantization.
### Training the Importance Matrix
To generate an imatrix, run stable-diffusion.cpp with the `--imat-out` flag, specifying the output filename. This process runs alongside normal image generation.
```bash
sd.exe [same exact parameters as normal generation] --imat-out imatrix.dat
```
* **`[same exact parameters as normal generation]`**: Use the same command-line arguments you would normally use for image generation (e.g., prompt, dimensions, sampling method, etc.).
* **`--imat-out imatrix.dat`**: Specifies the output file for the generated imatrix.
You can generate multiple images at once using the `-b` flag to speed up the training process.
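For example, a training run might look like the following (the model path and prompt are illustrative placeholders for your usual generation parameters; only `--imat-out` is specific to this feature):

```bash
# Generate a batch of 8 images while collecting importance statistics.
# -m/-p stand in for whatever generation flags you normally use.
sd.exe -m model.safetensors -p "a photo of a cat in a garden" -b 8 --imat-out imatrix.dat
```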
### Continuing Training an Existing Matrix
If you want to refine an existing imatrix, use the `--imat-in` flag *in addition* to `--imat-out`. This will load the existing matrix and continue training it.
```bash
sd.exe [same exact parameters as normal generation] --imat-out imatrix.dat --imat-in imatrix.dat
```
This way, you can keep training and refining the imatrix as part of your normal image-generation workflow.
### Using Multiple Matrices
You can load and merge multiple imatrices together:
```bash
sd.exe [same exact parameters as normal generation] --imat-out imatrix.dat --imat-in imatrix.dat --imat-in imatrix2.dat
```
### Quantizing with an Importance Matrix
To quantize a model using a trained imatrix, use the `-M convert` option (or equivalent quantization command) and the `--imat-in` flag, specifying the imatrix file.
```bash
sd.exe -M convert [same exact parameters as normal quantization] --imat-in imatrix.dat
```
* **`[same exact parameters as normal quantization]`**: Use the same command-line arguments you would normally use for quantization (e.g., target quantization method, input/output filenames).
* **`--imat-in imatrix.dat`**: Specifies the imatrix file to use during quantization. You can specify multiple `--imat-in` flags to combine multiple matrices.
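A concrete invocation might look like this (the filenames and `--type` value are illustrative; check `sd.exe --help` for the exact conversion flags in your build):

```bash
# Quantize to q4_0, steering per-weight precision with the trained imatrix.
sd.exe -M convert -m model.safetensors -o model-q4_0.gguf --type q4_0 --imat-in imatrix.dat
```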
## Important Considerations
* The quality of the imatrix depends on the prompts and settings used during training; for best results, use ones representative of the images you intend to generate.
* Experiment with different training parameters (e.g., number of images, prompt variations) to optimize the imatrix for your specific use case.
* The performance impact of training an imatrix during image generation or using an imatrix for quantization is negligible.
* Training the imatrix on an already quantized model appears to work fine.
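Putting the steps above together, a typical end-to-end workflow might look like this (paths, prompts, and the quantization type are illustrative; only the `--imat-out`/`--imat-in` flags are specific to this feature):

```bash
# 1. Collect statistics over a representative generation.
sd.exe -m model.safetensors -p "a photorealistic portrait" --imat-out imatrix.dat
# 2. Refine the same matrix with a different prompt.
sd.exe -m model.safetensors -p "an oil painting of a forest" --imat-out imatrix.dat --imat-in imatrix.dat
# 3. Quantize using the accumulated matrix.
sd.exe -M convert -m model.safetensors -o model-q4_0.gguf --type q4_0 --imat-in imatrix.dat
```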
