Llama freezes when loading model Llama-4-Scout-17B-16E-Instruct GGUF with Multimodal support #15500
CalculonPrime asked this question in Q&A · Unanswered · 0 replies
I originally filed this bug against Ooba's Web UI, but after digging into that code a bit, it looks like it just calls into llama.cpp to do the loading, so the bug is likely in llama.cpp itself. Should I file an issue here instead?
Copied from the bug report on Ooba's site:
You'll hit the problem with either of these Unsloth quants of Llama-4-Scout-17B-16E-Instruct:
You can use either of the mmproj files:
All files were verified via SHA256 hashes.
The quants work fine when multimodal is not loaded.
If I run FileMon to watch file activity when the bug occurs, I can see that once the main model file finishes loading, it starts loading the mmproj file, reads about 25 KB of it, then closes the file and hangs.
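For reference, a minimal way to reproduce this outside the Web UI would be llama.cpp's own multimodal CLI, which loads the main GGUF and the mmproj projector in one invocation. This is a sketch: the file names below are placeholders, not the exact Unsloth quant and mmproj files from the report, and the binary name assumes a current llama.cpp build.

```shell
# Placeholder paths: substitute the actual Unsloth quant and mmproj files.
# If the bug is in llama.cpp, this should hang the same way shortly after
# the mmproj file begins loading.
./llama-mtmd-cli \
  -m Llama-4-Scout-17B-16E-Instruct-Q4_K_M.gguf \
  --mmproj mmproj-model-f16.gguf \
  --image test.png \
  -p "Describe this image."
```

Reproducing with the CLI directly would also confirm whether the freeze is independent of Ooba's wrapper code.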