Why do we need nxb when wave_tile_n already exists? And why is gemm_n split into nxb?