I am playing around with mergekit, and trying out some unique models.
I am currently testing a 3x12B MoE, which fits really nicely on a 24GB card. When I make the clowncar MoE, I get a warning that the number of experts (3) is not a power of two. However, when I test the model (specifically a Q4_K_M quant made with "GGUF my repo"), it seems to load and work fine.
But I am curious whether there is some detriment I am not aware of, or some other issue here. I cannot find much documentation or discussion on the topic. Thanks!
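For reference, my mergekit-moe config looks roughly like the sketch below. The model names are placeholders rather than the exact ones I'm merging; the point is just that the experts list has three entries:

```yaml
# Rough sketch of the clowncar MoE config (model names are hypothetical placeholders).
base_model: example-org/base-12b            # donor for attention/shared weights
gate_mode: hidden                            # router initialized from hidden-state prompts
dtype: bfloat16
experts:
  - source_model: example-org/expert-a-12b
    positive_prompts:
      - "Example prompt for what expert A is good at"
  - source_model: example-org/expert-b-12b
    positive_prompts:
      - "Example prompt for what expert B is good at"
  - source_model: example-org/expert-c-12b
    positive_prompts:
      - "Example prompt for what expert C is good at"
```

Building it with `mergekit-moe config.yml ./output-model` is what produces the "not a power of two" warning for me.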