RuntimeError: wrong! device_gemm with the specified compilation parameters does not support this GEMM problem
(2) Tuning for fused_moe with inter_dim 384 failed.
token,model_dim,inter_dim,expert,topk,act_type,dtype,q_dtype_a,q_dtype_w,q_type,use_g1u1,doweight_stage1
2048,3072,384,256,8,ActivationType.Silu,torch.bfloat16,torch.float4_e2m1fn_x2,torch.float4_e2m1fn_x2,QuantType.per_1x32,1,0
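For anyone trying to reproduce this, the row above can be loaded into typed values with a few lines of standard-library Python. This is only a minimal sketch: `parse_case`, `INT_FIELDS`, and `GROUP` are hypothetical names used for illustration and are not part of the tuning script. The group size of 32 is an assumption read off the `1x32` in `QuantType.per_1x32`.

```python
import csv
import io

# Header and row copied verbatim from the report above.
HEADER = ("token,model_dim,inter_dim,expert,topk,act_type,dtype,"
          "q_dtype_a,q_dtype_w,q_type,use_g1u1,doweight_stage1")
ROW = ("2048,3072,384,256,8,ActivationType.Silu,torch.bfloat16,"
       "torch.float4_e2m1fn_x2,torch.float4_e2m1fn_x2,QuantType.per_1x32,1,0")

# Columns that hold integers; the remaining columns stay as strings.
INT_FIELDS = ("token", "model_dim", "inter_dim", "expert", "topk",
              "use_g1u1", "doweight_stage1")


def parse_case(header: str, row: str) -> dict:
    """Parse one tuning case into a dict, converting integer columns."""
    reader = csv.DictReader(io.StringIO(header + "\n" + row))
    case = next(reader)
    for name in INT_FIELDS:
        case[name] = int(case[name])
    return case


case = parse_case(HEADER, ROW)

# Assumption: per_1x32 means one scale per 32 contiguous elements.
GROUP = 32
summary = {
    "inter_dim": case["inter_dim"],
    "inter_dim_groups": case["inter_dim"] // GROUP,
    "inter_dim_remainder": case["inter_dim"] % GROUP,
    "model_dim_groups": case["model_dim"] // GROUP,
}
print(summary)
```

Under that assumption, both `inter_dim` and `model_dim` divide evenly into 32-element groups, so the quantization group size by itself does not appear to rule out this shape.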
Thank you.