-
My machine is a MacBook Air M3 with 24 GB of RAM. I've noticed that the same models other tools can run fail with out-of-memory errors under llama.cpp. With the default memory limit of 16384 MB the model won't load; after raising the limit with 20G it runs at a speed which is on par with mlx-lm. I can't compare to Ollama, as this model's support only landed in a pre-release version. I wouldn't recommend stretching the limit further, though.
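In case it helps others reproduce this: the knob I raised is the GPU wired memory limit. A sketch, assuming a recent macOS where the sysctl is named iogpu.wired_limit_mb (older releases used debug.iogpu.wired_limit); the value is in MB and resets on reboot:

```sh
# Raise the GPU wired memory limit from the ~16 GB default to 20 GB (20480 MB).
# Requires sudo and does not persist across reboots.
sudo sysctl iogpu.wired_limit_mb=20480
```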
Replies: 2 comments
-
I have seen on non-Apple iGPUs that memory use is much higher if not using flash attention (-fa). Someone who uses Apple's devices should be able to tell if this is necessary or not.
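A rough way anyone could check is to load the model with and without the flag and compare the buffer sizes llama.cpp reports at startup (the exact log wording below is an assumption, as is the 16384 context size):

```sh
# Hypothetical A/B check: same model and context, with and without flash
# attention; compare the reported compute buffer sizes.
./llama-cli -m model.gguf -c 16384 -n 1 -p "hi" 2>&1 | grep -i "compute buffer"
./llama-cli -m model.gguf -c 16384 -n 1 -p "hi" -fa 2>&1 | grep -i "compute buffer"
```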
-
With Metal, you always want to add -fa. Increasing the memory limit is needed for such big models because they cannot fit in the default memory limit - not sure you can do anything else here.
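For reference, a minimal invocation with flash attention enabled (the model path and context size are placeholders, not from the thread):

```sh
# Run the server with flash attention; adjust -m and -c for your setup.
./llama-server -m ./model.gguf -c 16384 -fa
```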