Hi,
I have recently been learning more about ML, and also Rust, and I love that Candle gives people the opportunity to do this work in Rust directly.
I have been testing a couple of models and hit some issues, mainly with quantised Nemo 2407 models, since a Q8 Nemo model is about the most my device can handle.
At first I tried writing my own code based on the Mistral example, since it mentions 2407. Then I realised from another issue (which I can't find right now) that the 'quantized' example is recommended instead, as the Mistral example was built for a very specific set of models.
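For reference, this is roughly how I invoke the example; the flags are from memory and the model filename is a placeholder for the local GGUF I downloaded, so treat it as approximate:

```
cargo run --example quantized --release -- \
  --model ./mistral-nemo-instruct-2407.Q8_0.gguf \
  --prompt "Hello"
```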
The error I get is exactly the same whether I run my own code emulating the quantized example or run the 'quantized' example directly:
I tested with multiple Nemo 2407 models, notably TheBloke's. Since the error is identical in my code and when running the example, I am guessing 2407 isn't supported with quantization.
Your README and this line in the example seem to confirm as much:
https://github.com/huggingface/candle/blob/main/candle-examples/examples/mistral/main.rs#L266
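For completeness, my own code emulating the example boils down to roughly the minimal sketch below. The path is a placeholder, and I'm using the same calls the quantized example uses as far as I can tell:

```rust
use candle_core::quantized::gguf_file;
use candle_core::Device;
use candle_transformers::models::quantized_llama::ModelWeights;

fn main() -> anyhow::Result<()> {
    // Placeholder path to a local Q8 Nemo 2407 GGUF file.
    let path = "mistral-nemo-instruct-2407.Q8_0.gguf";
    let mut file = std::fs::File::open(path)?;

    // Parse the GGUF header/metadata, then build the quantized model
    // weights. Somewhere along this path the Nemo GGUFs fail for me
    // (I haven't been able to narrow down exactly where).
    let content = gguf_file::Content::read(&mut file)?;
    let _model = ModelWeights::from_gguf(content, &mut file, &Device::Cpu)?;
    Ok(())
}
```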
Unfortunately, I don't know enough about tensors or how to deconstruct a GGUF model to figure out the fix on my own without some guidance.
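If it helps with pinpointing the problem, I can at least dump the GGUF metadata and tensor list from one of the failing files; roughly like this sketch:

```rust
use candle_core::quantized::gguf_file;

fn main() -> anyhow::Result<()> {
    // Placeholder path to the same failing GGUF file.
    let mut file = std::fs::File::open("mistral-nemo-instruct-2407.Q8_0.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;

    // Print every metadata key/value from the GGUF header.
    for (key, value) in content.metadata.iter() {
        println!("{key}: {value:?}");
    }
    // List tensor names with their shapes and quantisation types.
    for (name, info) in content.tensor_infos.iter() {
        println!("{name}: {:?} {:?}", info.shape, info.ggml_dtype);
    }
    Ok(())
}
```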