I'm quantizing my first model in llama.cpp, and I understand that an importance matrix (imatrix) can improve the process.

I'm starting with a 2B-parameter model that is fairly multilingual (36 languages, I believe). I see there is a tool to generate the imatrix from a calibration dataset (I imagine I run it against the fp16 GGUF, then use the result when quantizing).
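To make sure I have the pipeline right, here's roughly what I'm planning to run (the model and file names are just placeholders I made up):

```sh
# Generate the importance matrix from the fp16 GGUF using a calibration text file
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Then apply it when quantizing
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```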
How many samples from each language should I use with the llama-imatrix tool to generate the imatrix? And should the sample count grow with larger models?
Also, what should the samples look like? Do they need to follow a particular format, or does the structure not really matter?