[Feature Request] Support for MatMul with broadcasting (e.g. 3Dx2D matmul) for Xnnpack execution provider #24107
Labels
ep:Xnnpack
issues related to XNNPACK EP
feature request
request for unsupported feature or enhancement
Describe the feature request
Currently, the MatMul op can be mapped to XNNPACK fully connected kernels, but only 2Dx2D matrix multiplication is supported.
Many models have MatMul layers where the first input is a 3D tensor. These require broadcasting: the 2D matmul is applied to each 2D slice of the input, and the result is a 3D tensor.
Since the XNNPACK execution provider does not support this, execution falls back to the MLAS implementation, which can handle this case (sgemm.cpp:MlasGemmPackB()).
From what I gather, would it be possible to add broadcasting functionality in the XNNPACK execution provider layer?
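To make the request concrete, here is a minimal sketch in plain Python of the shape handling I have in mind. It is not XNNPACK or ONNX Runtime code, and the function names are made up for illustration. It only shows that a [B, M, K] x [K, N] MatMul with a shared 2D second input gives the same result as flattening the first input to [B*M, K], running one 2D matmul, and reshaping the output to [B, M, N]. Whether the execution provider can reuse the fully connected kernel this way is an assumption on my part.

```python
def matmul_2d(a, b):
    """Plain 2D matrix multiply: a is [M, K], b is [K, N]."""
    k, n = len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    return [[sum(row[i] * b[i][j] for i in range(k)) for j in range(n)]
            for row in a]


def matmul_3d_by_2d(a, b):
    """[B, M, K] x [K, N] -> [B, M, N] using a single 2D multiply.

    Hypothetical helper: flattens the batch dimension into the row
    dimension, multiplies once, then splits the rows back per batch.
    """
    batch, m = len(a), len(a[0])
    flat = [row for mat in a for row in mat]   # [B*M, K]
    out = matmul_2d(flat, b)                   # [B*M, N]
    return [out[i * m:(i + 1) * m] for i in range(batch)]


def matmul_3d_reference(a, b):
    """Reference: broadcast b over the batch, one 2D multiply per slice."""
    return [matmul_2d(mat, b) for mat in a]


a = [[[1, 2, 3], [4, 5, 6]],
     [[7, 8, 9], [10, 11, 12]]]     # shape [2, 2, 3]
b = [[1, 0], [0, 1], [1, 1]]        # shape [3, 2]

result = matmul_3d_by_2d(a, b)      # shape [2, 2, 2]
assert result == matmul_3d_reference(a, b)
```

This flattening only works when the second input is 2D and shared across the batch. If the second input also has a batch dimension, each batch would need its own 2D multiply.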
Describe scenario use case
Support for models that contain non-2Dx2D MatMul nodes, so that the XNNPACK execution provider complies with the ONNX spec for the MatMul operation.
See image of layer cutout from a model