I am playing around with mergekit, and trying out some unique models.
I am currently testing a 3x12B MoE, which fits really nicely on a 24GB card. When I make the clowncar MoE, I get a warning that the number of experts (3) is not a power of two. However, when I test the model (specifically a Q4_K_M quant made with "GGUF my repo"), it seems to load and work fine.
But I am curious whether there is some detriment I am not aware of, or some other issue here. I cannot find much documentation or discussion on the topic. Thanks!
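For reference, my mergekit-moe config looks roughly like the sketch below. The model names are placeholders rather than the exact ones I'm merging; the point is just that the experts list has three entries:

```yaml
# Rough sketch of the clowncar MoE config (model names are hypothetical placeholders).
base_model: example-org/base-12b            # donor for attention/shared weights
gate_mode: hidden                            # router initialized from hidden-state prompts
dtype: bfloat16
experts:
  - source_model: example-org/expert-a-12b
    positive_prompts:
      - "Example prompt for what expert A is good at"
  - source_model: example-org/expert-b-12b
    positive_prompts:
      - "Example prompt for what expert B is good at"
  - source_model: example-org/expert-c-12b
    positive_prompts:
      - "Example prompt for what expert C is good at"
```

Building it with `mergekit-moe config.yml ./output-model` is what produces the "not a power of two" warning for me.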