Mistral Nemo Quantized Support #2727

Open
leflambeur opened this issue Jan 18, 2025 · 0 comments

leflambeur commented Jan 18, 2025

Hi,

I have recently been learning more about ML and Rust, and I love Candle for giving people the opportunity to use Rust directly.

I have been testing a couple of models and hit a couple of issues, mainly with quantized Nemo 2407 models, since a Q8 Nemo model seems to be the most my device can handle.

At first I tried writing my own code based on the Mistral example, since it mentions 2407. I then learned from another issue (which I can't find right now) that the 'quantized' example is recommended instead, because the Mistral example was built for a very specific set of models.

The error is exactly the same whether I run my own code modelled on the quantized example or run the 'quantized' example directly:

Error: shape mismatch in reshape, lhs: [1, 11, 4096], rhs: [1, 11, 32, 160]

I tested with multiple Nemo 2407 models, notably TheBloke's. Since the error is the same both in my code and when running the example, I am guessing that 2407 isn't supported with quantization.

Your README and this line in the example confirm that:
https://github.com/huggingface/candle/blob/main/candle-examples/examples/mistral/main.rs#L266

Unfortunately, I don't know enough about tensors or how to deconstruct a GGUF model to work out the fix on my own without some guidance.
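
One possible reading of the numbers in the error, offered as a sketch rather than a confirmed diagnosis: the 160 in `[1, 11, 32, 160]` looks like a head dimension derived as `hidden_size / n_heads`, while the 4096 in `[1, 11, 4096]` looks like 32 heads times an explicit head dimension of 128. The sketch below uses plain Rust with no Candle calls. The hidden size of 5120 and head dimension of 128 are assumptions taken from the published Mistral Nemo configuration, and the helper names `derived_head_dim` and `can_reshape` are hypothetical. That the quantized loader derives the head dimension this way is also an assumption inferred from the error message.

```rust
/// Head dimension computed the way many Llama-style loaders do it:
/// hidden size divided by the number of attention heads.
/// (Hypothetical helper; assumed to mirror the loader's behaviour.)
fn derived_head_dim(hidden_size: usize, n_heads: usize) -> usize {
    hidden_size / n_heads
}

/// A reshape is only valid when both shapes hold the same number of elements.
fn can_reshape(src: &[usize], dst: &[usize]) -> bool {
    src.iter().product::<usize>() == dst.iter().product::<usize>()
}

fn main() {
    // Values taken from the error message.
    let (batch, seq_len, n_heads) = (1usize, 11usize, 32usize);
    let q_proj_out = 4096usize; // last dim of the lhs shape

    // Assumption: Mistral Nemo's hidden size, per its published config.
    let hidden_size = 5120usize;

    let q = [batch, seq_len, q_proj_out];

    // Deriving head_dim from hidden_size gives 160, which does not fit.
    let derived = derived_head_dim(hidden_size, n_heads);
    let target = [batch, seq_len, n_heads, derived];
    println!(
        "reshape {:?} -> {:?} valid: {}",
        q,
        target,
        can_reshape(&q, &target)
    );

    // Taking head_dim from the projection width gives 128, which fits.
    let explicit = q_proj_out / n_heads;
    let fixed = [batch, seq_len, n_heads, explicit];
    println!(
        "reshape {:?} -> {:?} valid: {}",
        q,
        fixed,
        can_reshape(&q, &fixed)
    );
}
```

If this reading is right, the model would need its head dimension read explicitly (for example from the GGUF metadata, if the file carries it) instead of being derived from the hidden size. Whether the 'quantized' example can do that is for the maintainers to confirm.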
