fix: generate correct top_k columns when no shared expert#3

Open
Mikezhang001 wants to merge 1 commit into Aleph-Alpha:main from Mikezhang001:ZB

@Mikezhang001

Bug

When running jit_moe.py with --no-shared-expert, topk_ids is
generated with top_k-1 columns instead of the expected top_k columns.

This happens because generate_topk_ids is always called with top_k-1,
but the if args.shared_expert block that appends the missing column is
skipped when there is no shared expert.

Fix

Move the generate_topk_ids call inside the if/else block:

  • With shared expert: generate top_k-1 routed experts, then append
    1 fixed shared expert (index E-1) → total top_k columns
  • Without shared expert: generate full top_k experts from all E
    experts → total top_k columns
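
The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code from jit_moe.py: the `generate_topk_ids` signature shown here, the `build_topk_ids` wrapper, and the use of plain Python lists are all assumptions made for the example, and drawing the routed ids from the first E-1 experts in the shared-expert case is likewise an assumption.

```python
import random


def generate_topk_ids(num_tokens, num_experts, k, rng):
    """Hypothetical stand-in: pick k distinct expert ids per token."""
    return [rng.sample(range(num_experts), k) for _ in range(num_tokens)]


def build_topk_ids(num_tokens, E, top_k, shared_expert, seed=0):
    """Hypothetical wrapper showing where the call sits after the fix."""
    rng = random.Random(seed)
    if shared_expert:
        # Assumption: routed ids come from the first E-1 experts,
        # and the shared expert (index E-1) is appended as the last column.
        routed = generate_topk_ids(num_tokens, E - 1, top_k - 1, rng)
        return [row + [E - 1] for row in routed]
    # No shared expert: draw all top_k ids from the full set of E experts.
    return generate_topk_ids(num_tokens, E, top_k, rng)
```

In this sketch both branches return rows with exactly top_k entries, which is the property the fix is meant to restore for --no-shared-expert.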

@Mikezhang001
Author

Hi @SzymonOzog, I noticed a bug in jit_moe.py when using --no-shared-expert.
Could you take a look at this fix when you have time? Thanks!
