Add a script to run a single matmul configuration with custom MatmulParams #3918
!test
Looks good! Thanks for adding this. Comments are pretty minor I think.
    for shape in [[m, k], [n, k], [m, n]]:
        total_in_gbs += _estimate_size(shape, dtype)
This is the bare minimum size required to compute the GEMM. If validation is enabled, memory use will be higher because of the extra tensors: one for the eager output, plus the intermediates required by torch.allclose. So you might want to add a fudge factor.
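One way to read this suggestion is sketched below. It is only an illustration: the body of `_estimate_size` is not shown in this thread, so `estimate_tensor_gbs`, `estimate_gemm_gbs`, the `bytes_per_element` argument, and the default `fudge_factor` of 1.5 are all assumptions, not the script's actual code.

```python
def estimate_tensor_gbs(shape, bytes_per_element):
    """Size of one dense tensor in GB (taking 1 GB = 1024**3 bytes)."""
    numel = 1
    for dim in shape:
        numel *= dim
    return numel * bytes_per_element / 1024**3


def estimate_gemm_gbs(m, n, k, bytes_per_element=2, validate=False, fudge_factor=1.5):
    """Rough memory estimate for one GEMM problem (hypothetical helper)."""
    # Bare minimum: operands [m, k] and [n, k], plus the [m, n] output.
    total_in_gbs = 0.0
    for shape in [[m, k], [n, k], [m, n]]:
        total_in_gbs += estimate_tensor_gbs(shape, bytes_per_element)

    if validate:
        # Validation keeps a second [m, n] tensor for the eager reference output.
        total_in_gbs += estimate_tensor_gbs([m, n], bytes_per_element)
        # The intermediates allocated by torch.allclose are not counted exactly;
        # the fudge factor (an assumed value) pads the estimate to cover them.
        total_in_gbs *= fudge_factor

    return total_in_gbs


base = estimate_gemm_gbs(1752, 4720, 584)
padded = estimate_gemm_gbs(1752, 4720, 584, validate=True)
print(f"minimum: {base:.4f} GB, with validation: {padded:.4f} GB")
```

The padding is applied only when validation is requested, so the plain profiling path keeps the tighter bound.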
Force-pushed from 4654545 to 6f56b47.
@jacobhinkle I added the following usage:

    profile_matmul.py [-h] [--verbose] [--validate] m n k {NN,NT,TN,TT}

    Run through a combination of matmul parameters and compare relative performance against nvjet for a single problem.

    positional arguments:
      m                The size of M dimension
      n                The size of N dimension
      k                The size of K dimension
      {NN,NT,TN,TT}    The layout for matmul problem.

    options:
      -h, --help       show this help message and exit
      --verbose        Print matmul parameters and exceptions.
      --validate       Validate nvfuser against pytorch matmul.

How to run script:

    NVFUSER_ENABLE=fuse_matmul NVFUSER_DISABLE=matmul_expr_eval python single_matmul.py nvjet_pybench.json 1752 4720 584 NN --verbose --validate
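For reference, an `argparse` setup along the following lines would produce help text of this shape. It is a sketch reconstructed from the help output alone, not the script's source: `build_parser` and the destination name `layout` are assumptions, and the JSON file that appears as the first positional argument in the run command is left out because the help text does not list it.

```python
import argparse


def build_parser():
    """Build a parser matching the help text above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="profile_matmul.py",
        description=(
            "Run through a combination of matmul parameters and compare "
            "relative performance against nvjet for a single problem."
        ),
    )
    parser.add_argument("m", type=int, help="The size of M dimension")
    parser.add_argument("n", type=int, help="The size of N dimension")
    parser.add_argument("k", type=int, help="The size of K dimension")
    # With `choices` and no metavar, argparse displays {NN,NT,TN,TT} in the usage line.
    parser.add_argument(
        "layout",
        choices=["NN", "NT", "TN", "TT"],
        help="The layout for matmul problem.",
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="Print matmul parameters and exceptions.",
    )
    parser.add_argument(
        "--validate",
        action="store_true",
        help="Validate nvfuser against pytorch matmul.",
    )
    return parser


# Parse the problem from the run command, minus the JSON file argument.
args = build_parser().parse_args(["1752", "4720", "584", "NN", "--verbose", "--validate"])
print(args.m, args.n, args.k, args.layout, args.verbose, args.validate)
```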
!build
This PR adds a script to run a single matmul configuration with custom MatmulParams. It profiles the nvfuser kernel and compares its runtime against the nvjet kernel runtime. The nvjet kernel runtimes are stored in a JSON file generated by the Python benchmarks.