HuggingFace Gemma-2-9b/27b incorrect #53
Hi, I was able to verify the MMLU score for HuggingFace gemma-2-9b-it to within 0.2 points. However, for gemma-2-27b-it, the score (52.3% overall) is way off. Is there some mistake in the repo there? Or is it particularly sensitive to bfloat16?

Comments
Hi @cinjon, the MMLU scores for the Gemma-2-9B-IT and Gemma-2-27B-IT models are 71.3% and 75.2%, respectively; for details, please refer to this paper. The performance degradation observed with the Gemma-2-27B-IT model is likely due to its sensitivity to bfloat16 precision settings, which can impact inference quality if not handled properly. For more detailed insights and related discussion, please check the following references: Thank you.
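For context, a minimal sketch of the kind of setup this points at, assuming the HuggingFace transformers API; the model id, prompt, and generation settings here are illustrative, not taken from the thread. Note that Gemma-2 applies attention logit soft-capping, and the eager attention implementation is the usual way to keep it active, so dtype is not the only knob that can move eval scores:

```python
# Sketch: load Gemma-2 with an explicit dtype and eager attention.
# Everything below is illustrative; adjust model id and device to taste.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,    # swap for torch.float32 to rule out precision
    attn_implementation="eager",   # keeps Gemma-2's attention logit soft-capping,
                                   # which some fused attention kernels skip
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```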
Hi again. I am struggling with this and made a reproduction for you to look at: https://gist.github.com/cinjon/de9a22f57cfa0dc9ccb2afc255a8093e. The main problem is the results below, which show a rough reproduction on gemma-2-27b, slight degradation on gemma-2-27b-it, slight degradation on gemma-2-9b, and a terrible result on gemma-2-9b-it. What am I doing wrong? Thanks.
To be clear, it's not the bfloat16 in the gist either; I get roughly the same results with float32 too.
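One way to check that claim directly is to score a single multiple-choice item under both dtypes and see whether the preferred answer changes. The sketch below is not taken from the gist; it assumes the HuggingFace transformers API, and the model id, question, and choices are made up for illustration:

```python
# Hypothetical dtype A/B check on one multiple-choice item.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"  # assumed checkpoint name
question = "Question: What is 2 + 2?\nAnswer:"
choices = [" 3", " 4", " 5", " 6"]

tokenizer = AutoTokenizer.from_pretrained(model_id)

def choice_scores(dtype):
    """Sum of log-probs of each choice's tokens, conditioned on the question."""
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, attn_implementation="eager"
    )
    scores = []
    for choice in choices:
        ids = tokenizer(question + choice, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        # Logits at position t predict token t + 1, so shift by one.
        logprobs = torch.log_softmax(logits[0, :-1].float(), dim=-1)
        targets = ids[0, 1:]
        per_token = logprobs[torch.arange(targets.numel()), targets]
        n_choice = len(tokenizer(choice, add_special_tokens=False).input_ids)
        scores.append(per_token[-n_choice:].sum().item())
    del model  # free memory before loading the next dtype
    return scores

for dtype in (torch.bfloat16, torch.float32):
    scores = choice_scores(dtype)
    best = choices[max(range(len(scores)), key=scores.__getitem__)].strip()
    print(f"{dtype}: best choice = {best}")
```

If the argmax choice flips between dtypes on many items, precision sensitivity is a plausible culprit; if it does not, the discrepancy more likely lives in the harness (prompt format, attention implementation, or scoring).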