I was converting Gemma 2 2b to GGUF (llama.cpp release b3496) and measuring perplexity with the following commands, using wikitext-2-raw/wiki.test.raw as the dataset:
```
# Download the HF checkpoint (llama.cpp b3496 cloned and built alongside it)
git clone https://huggingface.co/google/gemma-2-2b

# Convert to GGUF: an F32 reference plus a Q8_0 conversion
python ./llama.cpp/convert_hf_to_gguf.py gemma-2-2b --outtype f32 --outfile gemma-2-2b.FP32.gguf
python ./llama.cpp/convert_hf_to_gguf.py gemma-2-2b --outtype q8_0 --outfile gemma-2-2b-Q8_0.gguf

# Quantize the F32 reference into the K-quant variants
cd llama.cpp
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q6_K.gguf Q6_K
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q5_K_M.gguf Q5_K_M
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q5_K_S.gguf Q5_K_S
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q4_K_M.gguf Q4_K_M
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q4_K_S.gguf Q4_K_S
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q3_K_L.gguf Q3_K_L
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q3_K_M.gguf Q3_K_M
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q3_K_S.gguf Q3_K_S
./llama-quantize ../gemma-2-2b.FP32.gguf ../gemma-2-2b-Q2_K.gguf Q2_K

# Measure perplexity of every model on the same test file
./llama-perplexity -m ../gemma-2-2b.FP32.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q8_0.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q6_K.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q5_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q5_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q4_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q4_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q3_K_L.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q3_K_M.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q3_K_S.gguf -f ../wikitext-2-raw/wiki.test.raw
./llama-perplexity -m ../gemma-2-2b-Q2_K.gguf -f ../wikitext-2-raw/wiki.test.raw
```
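(The same sweep can be written as a loop; this is just a minimal sketch assuming the directory layout above, run from inside llama.cpp with the GGUF files and wikitext-2-raw/ in the parent directory.)

```
#!/usr/bin/env bash
set -e  # stop on the first failing command

# Quantize the F32 reference into each K-quant variant
for q in Q6_K Q5_K_M Q5_K_S Q4_K_M Q4_K_S Q3_K_L Q3_K_M Q3_K_S Q2_K; do
  ./llama-quantize ../gemma-2-2b.FP32.gguf "../gemma-2-2b-$q.gguf" "$q"
done

# Evaluate every model on the same test file
./llama-perplexity -m ../gemma-2-2b.FP32.gguf -f ../wikitext-2-raw/wiki.test.raw
for q in Q8_0 Q6_K Q5_K_M Q5_K_S Q4_K_M Q4_K_S Q3_K_L Q3_K_M Q3_K_S Q2_K; do
  ./llama-perplexity -m "../gemma-2-2b-$q.gguf" -f ../wikitext-2-raw/wiki.test.raw
done
```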
The results I am obtaining for the FP32 version are very strange: they are the worst of the whole set. For the other versions, the numbers look OK relative to each other, with the expected gradual degradation as the quantization gets more aggressive.
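For reference, as I understand it, llama-perplexity reports the exponentiated mean negative log-likelihood over the evaluation chunks, so lower is better:

$$
\mathrm{PPL} = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\right)
$$

Since every quantized model is only an approximation of the F32 weights, I would expect the F32 model to have the lowest perplexity of the set, not the highest.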