Add a script to run a single matmul configuration with custom MatmulParams #3918

Merged
rdspring1 merged 7 commits into main from single_problem_matmul on Feb 21, 2025

Conversation

rdspring1
Collaborator

This PR adds a script to run a single matmul configuration with custom MatmulParams. It profiles the nvFuser kernel and compares its runtime against the nvjet kernel runtime. The nvjet kernel runtimes are stored in a JSON file generated by the Python benchmarks; a rough sketch of this comparison appears below.

  • Update Python bindings to support constructing matmul options.
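
As a rough sketch of the comparison described above, assuming a hypothetical JSON layout where each problem key maps to an nvjet runtime in milliseconds (the real nvjet_pybench.json format is produced by the Python benchmarks and may differ):

    import json

    def relative_performance(nvjet_json_path, problem_key, nvfuser_time_ms):
        # Hypothetical JSON layout: {"m_n_k_layout": runtime_in_ms, ...}.
        # The actual file written by the Python benchmarks may differ.
        with open(nvjet_json_path) as f:
            nvjet_times = json.load(f)
        nvjet_time_ms = nvjet_times[problem_key]
        # A ratio above 1.0 means nvFuser is slower than nvjet for this problem.
        return nvfuser_time_ms / nvjet_time_ms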

@rdspring1
Collaborator Author

!test


github-actions bot commented Feb 18, 2025

Review updated until commit f269852

Description

  • Add script for profiling single matmul configuration

  • Update Python bindings for MatmulParams

  • Implement custom scheduler for matmul

  • Validate nvFuser against PyTorch matmul (a small validation sketch follows below)
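
A minimal sketch of the kind of check the last bullet describes, assuming a and b are the input matrices and nvf_output is the nvFuser result; the tolerances and any transpose handling for non-NN layouts are assumptions, not the script's actual code.

    import torch

    def validate(nvf_output, a, b):
        # Illustrative check against eager PyTorch matmul; profile_matmul.py
        # may use different tolerances or handle transposed layouts here.
        eager_output = torch.matmul(a, b)
        return torch.allclose(nvf_output, eager_output, rtol=1e-2, atol=1e-2)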


Changes walkthrough 📝

Relevant files

Enhancement

python_bindings.cpp: Update Python bindings for MatmulParams
csrc/python_frontend/python_bindings.cpp

  • Add constructors for GemmTile, MatMulTileOptions, CircularBufferOptions, SupportedVectorization, ClusterDims, and MmaMacroEncode
  • Update class definitions to use py::class_ instead of DEFINECLASS macro

  +23/-4

profile_matmul.py: Add script for profiling a single matmul configuration
doc/dev/python_scheduling/profile_matmul.py

  • Add script to run and profile a single matmul configuration
  • Implement functions for estimating matmul size, getting kernel time, and defining the matmul fusion
  • Implement a custom scheduler with custom parameters
  • Add a main function to parse arguments and run profiling

  +209/-0
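
As a rough illustration of how the new constructors might be used from the Python frontend, here is a hypothetical snippet; the import path, the MatmulParams attribute names (for example tile_sizes), and the tile values are assumptions based on the classes listed above rather than an excerpt from the PR.

    # Hypothetical usage of the newly bound matmul option classes.
    # Import path and attribute names are assumptions; see the PR's
    # doc/dev/python_scheduling/profile_matmul.py for the real API.
    from nvfuser import GemmTile, MatMulTileOptions, MatmulParams  # assumed path

    params = MatmulParams()
    # Illustrative CTA and warp tile sizes (m, n, k).
    params.tile_sizes = MatMulTileOptions(
        GemmTile(128, 128, 64),  # cta_tile
        GemmTile(64, 64, 64),    # warp_tile
    )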

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review

    Initialization Order

    The order of initialization for MatmulParams classes should be consistent with the class definitions to avoid potential issues.

        .def(py::init<int64_t, int64_t, int64_t>())
        .PARAM(GemmTile, m)
        .PARAM(GemmTile, n)
        .PARAM(GemmTile, k)
        .TOSTRINGTOPLEVEL(GemmTile);
    
    DEFINECLASS(MatMulTileOptions)
        .def(py::init<GemmTile, GemmTile>())
        .PARAM(MatMulTileOptions, cta_tile)
        .PARAM(MatMulTileOptions, warp_tile)
        .TOSTRINGTOPLEVEL(MatMulTileOptions);
    
    py::class_<MatmulParams::CircularBufferOptions>(
    Missing Definitions

    The DEFINECLASS macro is used instead of py::class_ for some classes, which might lead to missing bindings or incorrect behavior.

        .def(py::init<int64_t, int64_t, int64_t>())
        .PARAM(GemmTile, m)
        .PARAM(GemmTile, n)
        .PARAM(GemmTile, k)
        .TOSTRINGTOPLEVEL(GemmTile);
    
    DEFINECLASS(MatMulTileOptions)
        .def(py::init<GemmTile, GemmTile>())
        .PARAM(MatMulTileOptions, cta_tile)
        .PARAM(MatMulTileOptions, warp_tile)
        .TOSTRINGTOPLEVEL(MatMulTileOptions);
    
    py::class_<MatmulParams::CircularBufferOptions>(
    Error Handling

    The error handling in test_matmul_nvf could be improved to provide more informative messages or handle specific exceptions.

    try:
        nvf_outputs = scheduled_fd.execute([a, b], profile=True)
    except Exception as e:
        if verbose:
            print(e)
        return -1
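
    One possible refinement along the lines of this suggestion is to separate expected failures from unexpected ones; the exception types below are illustrative assumptions, not nvFuser's actual exception hierarchy, and scheduled_fd, a, b, and verbose are taken from the quoted snippet.

        # Sketch of more targeted error handling (exception types are assumed).
        try:
            nvf_outputs = scheduled_fd.execute([a, b], profile=True)
        except RuntimeError as e:
            # Likely a scheduling or compilation failure for this configuration.
            if verbose:
                print(f"nvFuser execution failed: {e}")
            return -1
        except Exception as e:
            # Unexpected errors should be surfaced rather than silently skipped.
            print(f"Unexpected error while profiling: {e}")
            raise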

    Collaborator

    @jacobhinkle jacobhinkle left a comment


    Looks good! Thanks for adding this. Comments are pretty minor I think.

    Comment on lines 82 to 83
    for shape in [[m, k], [n, k], [m, n]]:
        total_in_gbs += _estimate_size(shape, dtype)

    This is the bare minimum size required to compute the GEMM. If validation is enabled, there will be more than that because of additional outputs: one for the eager output plus the intermediates required for torch.allclose. So you might want to add a fudge factor.
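
    A minimal sketch of how such a fudge factor could be folded into the estimate; _estimate_size is reimplemented here for illustration and may not match the helper in profile_matmul.py.

        import math

        def _estimate_size(shape, dtype):
            # Illustrative: tensor footprint in gigabytes for a torch dtype.
            return math.prod(shape) * dtype.itemsize / (1024**3)

        def estimate_problem_gbs(m, n, k, dtype, validate=False):
            # A, B, and the nvFuser output are the bare minimum for the GEMM.
            total_in_gbs = 0.0
            for shape in [[m, k], [n, k], [m, n]]:
                total_in_gbs += _estimate_size(shape, dtype)
            if validate:
                # The eager output plus torch.allclose intermediates need extra
                # memory; pad with roughly two more output-sized buffers.
                total_in_gbs += 2 * _estimate_size([m, n], dtype)
            return total_in_gbs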

    @rdspring1 rdspring1 force-pushed the single_problem_matmul branch from 4654545 to 6f56b47 on February 20, 2025 23:29
    @rdspring1
    Collaborator Author

    @jacobhinkle I added the following argparse interface.

    usage: profile_matmul.py [-h] [--verbose] [--validate] m n k {NN,NT,TN,TT}
    
    Run through a combination of matmul parameters and compare relative performance against nvjet for a single problem.
    
    positional arguments:
      m              The size of M dimension
      n              The size of N dimension
      k              The size of K dimension
      {NN,NT,TN,TT}  The layout for matmul problem.
    
    options:
      -h, --help     show this help message and exit
      --verbose      Print matmul parameters and exceptions.
      --validate     Validate nvfuser against pytorch matmul.
    
    How to run the script: NVFUSER_ENABLE=fuse_matmul NVFUSER_DISABLE=matmul_expr_eval python single_matmul.py nvjet_pybench.json 1752 4720 584 NN --verbose --validate
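
    For reference, a minimal argparse sketch that reproduces the usage text above; the actual wiring in profile_matmul.py may differ (for example, the run command above also passes an nvjet JSON file that the usage text does not list), so treat this as an illustration.

        import argparse

        def parse_args():
            parser = argparse.ArgumentParser(
                description=(
                    "Run through a combination of matmul parameters and compare "
                    "relative performance against nvjet for a single problem."
                )
            )
            parser.add_argument("m", type=int, help="The size of M dimension")
            parser.add_argument("n", type=int, help="The size of N dimension")
            parser.add_argument("k", type=int, help="The size of K dimension")
            parser.add_argument(
                "layout",
                choices=["NN", "NT", "TN", "TT"],
                help="The layout for matmul problem.",
            )
            parser.add_argument(
                "--verbose",
                action="store_true",
                help="Print matmul parameters and exceptions.",
            )
            parser.add_argument(
                "--validate",
                action="store_true",
                help="Validate nvfuser against pytorch matmul.",
            )
            return parser.parse_args()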

    @rdspring1
    Collaborator Author

    !build

    @rdspring1 rdspring1 merged commit 2b5ea2a into main Feb 21, 2025
    16 checks passed
    @rdspring1 rdspring1 deleted the single_problem_matmul branch February 21, 2025 03:55