Leaderboard • Getting Started • Submit • Documentation • Contributing • Benchmark/Results Paper
Unlike benchmarks that focus on model architecture or hardware, the AlgoPerf benchmark isolates the training algorithm itself, measuring how quickly it can achieve target performance levels on a fixed set of representative deep learning tasks. These tasks span various domains, including image classification, speech recognition, machine translation, and more, all running on standardized hardware (8x NVIDIA V100 GPUs). The benchmark includes 8 fully specified base workloads. In addition, it defines "randomized" workloads, which are variations of the fixed workloads designed to discourage overfitting. These randomized workloads were used for scoring the AlgoPerf competition but will not be used for future scoring.
Submissions are evaluated based on their "time-to-result", i.e., the wall-clock time it takes to reach predefined validation and test set performance targets on each workload. Submissions are scored under one of two tuning rulesets. The external tuning ruleset allows a limited amount of hyperparameter tuning (20 quasirandom trials) per workload. The self-tuning ruleset allows no external tuning, so any tuning is done "on the clock". For each submission, a single, overall benchmark score is computed by integrating its "performance profile" across all fixed workloads. The performance profile captures the submission's training time on each workload relative to the best submission on that workload, so each submission's score depends on the other submissions in the pool. The higher the benchmark score, the better the submission's overall performance.
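As a rough, illustrative sketch of this scoring idea (this is not the official scoring code, and the runtimes and the maximum ratio below are made up), the following computes per-workload runtime ratios relative to the fastest submission, turns them into performance profiles, and averages the profiles into a single score:

```python
import numpy as np

# Hypothetical "time-to-result" in hours for two submissions on four workloads.
# These numbers are invented purely for illustration.
times = {
    "submission_a": np.array([10.0, 4.0, 7.0, 12.0]),
    "submission_b": np.array([8.0, 5.0, 9.0, 15.0]),
}

# Fastest time per workload across all submissions in the pool.
best = np.min(np.stack(list(times.values())), axis=0)

def performance_profile(t, tau):
    """Fraction of workloads trained within a factor tau of the fastest submission."""
    return np.mean(t / best <= tau)

def benchmark_score(t, max_tau=4.0, num_points=1000):
    """Approximate the normalized integral of the performance profile over [1, max_tau]."""
    taus = np.linspace(1.0, max_tau, num_points)
    profile = np.array([performance_profile(t, tau) for tau in taus])
    return profile.mean()  # mean over an evenly spaced grid ~ normalized integral

for name, t in times.items():
    print(f"{name}: {benchmark_score(t):.3f}")
```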
This is the repository for the AlgoPerf: Training Algorithms benchmark, which measures neural network training speedups due to algorithmic improvements. It is developed by the MLCommons Algorithms Working Group. This repository holds the benchmark code, the benchmark's technical documentation, and getting started guides. For a detailed description of the benchmark design, see our introductory paper; for the results of the inaugural competition, see our results paper.
See our AlgoPerf Leaderboard for the latest results of the benchmark and to submit your algorithm.
Important
For future iterations of the AlgoPerf: Training Algorithms benchmark competition, we are switching to a rolling leaderboard, making a few changes to the competition rules, and running all selected submissions on our hardware. To submit your algorithm to the next iteration of the benchmark, please see our How to Submit section and the submission repository, which hosts the up-to-date AlgoPerf leaderboard.
- Installation
- Getting Started
- How to Submit
- Contributing
- License
- Paper and Citing the AlgoPerf Benchmark
Tip
If you have any questions about the benchmark competition or run into any issues, please feel free to contact us. Either file an issue, ask a question on our Discord, or join our weekly meetings.
You can install this package and its dependencies in a Python virtual environment or use a Docker/Singularity/Apptainer container (recommended). We recommend using a Docker container (or alternatively, a Singularity/Apptainer container) to ensure an environment similar to our scoring and testing environments. Both options are described in detail in the Getting Started document.
TL;DR to install the JAX version for GPU, run:
pip3 install -e '.[pytorch_cpu]'
pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'
pip3 install -e '.[full]'
TL;DR to install the PyTorch version for GPU, run:
pip3 install -e '.[jax_cpu]'
pip3 install -e '.[pytorch_gpu]' -f 'https://download.pytorch.org/whl/cu121'
pip3 install -e '.[full]'
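After either install, you can optionally verify that the GPU-enabled framework detects the accelerators, for example with a snippet like this (the other framework is intentionally installed CPU-only):

```python
# Optional sanity check after installation: confirm the GPU-enabled framework
# actually sees the accelerators.
import jax
import torch

print("JAX devices:", jax.devices())                 # lists GPU devices if jax_gpu is installed
print("CUDA available:", torch.cuda.is_available())  # True if pytorch_gpu is installed
```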
For detailed instructions on developing your own algorithm in the benchmark see the Getting Started document.
TL;DR running a JAX workload:
python3 submission_runner.py \
--framework=jax \
--workload=mnist \
--experiment_dir=$HOME/experiments \
--experiment_name=my_first_experiment \
--submission_path=reference_algorithms/paper_baselines/adamw/jax/submission.py \
--tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json
TL;DR running a PyTorch workload:
python3 submission_runner.py \
--framework=pytorch \
--workload=mnist \
--experiment_dir=$HOME/experiments \
--experiment_name=my_first_experiment \
--submission_path=reference_algorithms/paper_baselines/adamw/pytorch/submission.py \
--tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json
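The --submission_path flag above points to a Python module implementing the benchmark's submission interface. As a rough sketch of what such a module contains (the function names follow the benchmark API, but the signatures here are simplified and may not match the current spec exactly; see the Documentation page for the authoritative definitions):

```python
# Illustrative skeleton of a submission module. Signatures are simplified for
# readability; consult the Documentation page for the exact, authoritative API.

def get_batch_size(workload_name):
    """Return the batch size the submission wants to use for this workload."""
    return 128  # placeholder value

def init_optimizer_state(workload, model_params, model_state, hyperparameters, rng):
    """Create the initial optimizer state (e.g., step count, momenta)."""
    return None  # placeholder: build and return your optimizer state here

def update_params(workload, current_param_container, current_params_types,
                  model_state, hyperparameters, batch, loss_type,
                  optimizer_state, eval_results, global_step, rng):
    """Run one training step and return the updated optimizer state and parameters."""
    raise NotImplementedError  # the core of the training algorithm goes here

def data_selection(workload, input_queue, optimizer_state, current_param_container,
                   model_state, hyperparameters, global_step, rng):
    """Pick the next batch of training data (typically just the next element)."""
    return next(input_queue)
```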
Once you have developed your training algorithm, you can submit it to the benchmark by creating a pull request to the submission repository, which hosts the AlgoPerf leaderboard. The AlgoPerf working group will review your PR. Based on our available resources and the perceived potential of the method, your submission may be selected for a free evaluation. If selected, we will run your algorithm on our hardware and update the leaderboard with the results.
We provide technical documentation of the benchmark and answers to frequently asked questions on a separate Documentation page, including which types of submissions are allowed. Please ensure that your submission complies with these rules before submitting. Suggestions, clarifications, and questions can be raised via pull requests, by creating an issue, or by sending an email to the working group.
We invite everyone to look through our rules, documentation, and codebase and to submit issues and pull requests, e.g., for rule changes, clarifications, or any bugs you might encounter. If you are interested in contributing to the work of the working group and influencing the benchmark's design decisions, please join the weekly meetings and consider becoming a member of the working group.
Our Contributing document provides further MLCommons contributing guidelines and additional setup and workflow instructions.
The AlgoPerf codebase is licensed under the Apache License 2.0.
In our paper "Benchmarking Neural Network Training Algorithms" we motivate, describe, and justify the AlgoPerf: Training Algorithms benchmark.
If you are using the AlgoPerf benchmark, its codebase, baselines, or workloads, please consider citing our paper:
Dahl, Schneider, Nado, et al.
Benchmarking Neural Network Training Algorithms
arXiv 2306.07179
@Misc{Dahl2023AlgoPerf,
title = {{Benchmarking Neural Network Training Algorithms}},
author = {Dahl, George E. and Schneider, Frank and Nado, Zachary and Agarwal, Naman and Sastry, Chandramouli Shama and Hennig, Philipp and Medapati, Sourabh and Eschenhagen, Runa and Kasimbeg, Priya and Suo, Daniel and Bae, Juhan and Gilmer, Justin and Peirson, Abel L. and Khan, Bilal and Anil, Rohan and Rabbat, Mike and Krishnan, Shankar and Snider, Daniel and Amid, Ehsan and Chen, Kongtao and Maddison, Chris J. and Vasudev, Rakshith and Badura, Michal and Garg, Ankush and Mattson, Peter},
year = {2023},
archiveprefix = {arXiv},
eprint = {2306.07179},
}
If you use the results from the first AlgoPerf competition, please consider citing the results paper, as well as the relevant submissions:
@inproceedings{Kasimbeg2025AlgoPerfResults,
title = {Accelerating neural network training: An analysis of the {AlgoPerf} competition},
author = {Kasimbeg, Priya and Schneider, Frank and Eschenhagen, Runa and Bae, Juhan and Sastry, Chandramouli Shama and Saroufim, Mark and Boyuan, Feng and Wright, Less and Yang, Edward Z. and Nado, Zachary and Medapati, Sourabh and Hennig, Philipp and Rabbat, Michael and Dahl, George E.},
booktitle = {The Thirteenth International Conference on Learning Representations},
year = {2025},
url = {https://openreview.net/forum?id=CtM5xjRSfm}
}