Faster training allows researchers to build more capable machine learning (ML) models, but unlocking the most valuable capabilities requires improvements in every part of the training pipeline. The MLPerf™ Training benchmark suite has been extremely successful in encouraging innovation in neural network training systems, but more work needs to be done to encourage innovation in training algorithms. Improved training algorithms could save time and computational resources, and lead to better, more accurate models. Unfortunately, as a community, we are currently unable to reliably identify training algorithm improvements, or even determine the state-of-the-art training algorithm. To accelerate this work, the MLCommons® Algorithms Working Group is delighted to announce the AlgoPerf: Training algorithms competition, which is designed to measure neural network training speedups due to algorithmic improvements (e.g. better optimizers or hyperparameter tuning protocols).
The AlgoPerf: Training algorithms benchmark is a competitive, time-to-result benchmark that runs on a fixed system and compares training algorithms across multiple deep learning workloads (see Table 1 below). This is in contrast to MLPerf Training, where submitters typically compete on ML training systems. For AlgoPerf, the hardware and lower-level software environment are fixed, so submitters must compete by developing more efficient training algorithms.
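To make concrete what "competing on algorithms" means, the sketch below shows, in simplified form, the shape of a submission: the submitter supplies the training-algorithm logic, while the benchmark harness owns the models, data pipelines, and the clock. The function names follow the benchmark's submission template, but the signatures and the plain SGD body here are simplified assumptions for illustration, not the official API.

```python
import torch

# Simplified sketch of an AlgoPerf-style submission (signatures are
# illustrative assumptions, not the official API). The harness builds the
# model and data pipeline, then measures how long these functions take to
# reach a fixed validation target on each workload.


def init_optimizer_state(model_params, hyperparameters):
    # Create whatever state the training algorithm needs; here, plain SGD
    # with momentum driven by externally tuned hyperparameters.
    return torch.optim.SGD(
        model_params,
        lr=hyperparameters["learning_rate"],
        momentum=hyperparameters.get("momentum", 0.9),
    )


def update_params(model, optimizer_state, batch, loss_fn):
    # One training step, called repeatedly by the harness while the clock runs.
    inputs, targets = batch
    optimizer_state.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer_state.step()
    return model, optimizer_state
```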
To ensure that the benchmark incentivizes generally useful training algorithms, submissions must perform well simultaneously on multiple workloads, including some randomized ones. The workloads span a wide variety of tasks and domains so that the results are broadly applicable and relevant to as many ML practitioners as possible. The competition will determine the best general-purpose method, as measured by an aggregated score across all workloads. The current workloads are listed in Table 1 below.
| Task | Dataset | Model |
| --- | --- | --- |
| Clickthrough rate prediction | Criteo 1TB | DLRMSmall |
| MRI reconstruction | FastMRI | U-Net |
| Image classification | ImageNet | ResNet-50, ViT |
| Speech recognition | LibriSpeech | Conformer, DeepSpeech |
| Molecular property prediction | OGBG | GNN |
| Translation | WMT | Transformer |

*Table 1: The workloads of the AlgoPerf: Training algorithms benchmark.*
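As a rough illustration of how an aggregated score across all workloads can be computed, the sketch below follows the spirit of the performance profiles described in the technical report: each submission's time-to-target on a workload is divided by the fastest time any submission achieved on that workload, and the final score summarizes these ratios across workloads. The numbers, workload names, and the range of ratios are made up for illustration; the official scoring procedure is defined in the benchmark rules.

```python
import numpy as np

# Hypothetical times-to-target (in seconds) for two submissions on three
# workloads; all numbers are made up for illustration.
times = {
    "submission_A": {"imagenet_resnet": 900.0, "wmt_transformer": 400.0, "ogbg_gnn": 120.0},
    "submission_B": {"imagenet_resnet": 1200.0, "wmt_transformer": 350.0, "ogbg_gnn": 150.0},
}
workloads = ["imagenet_resnet", "wmt_transformer", "ogbg_gnn"]

# Ratio of each submission's time to the best time on that workload (>= 1).
best = {w: min(times[s][w] for s in times) for w in workloads}
ratios = {s: {w: times[s][w] / best[w] for w in workloads} for s in times}


def profile(submission, tau):
    # Fraction of workloads on which `submission` is within a factor tau
    # of the fastest submission.
    return np.mean([ratios[submission][w] <= tau for w in workloads])


# Aggregate score: mean height of the performance profile over a range of
# tau values, so a submission that is fastest on every workload scores 1.0.
taus = np.linspace(1.0, 4.0, 301)
for s in times:
    print(s, round(float(np.mean([profile(s, tau) for tau in taus])), 3))
```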
To further encourage generality, competition submissions must automate—and strictly account for—any workload-specific hyperparameter tuning they perform. Submissions may compete under one of two separate tuning rulesets: an external tuning ruleset that simulates tuning with a fixed amount of parallel resources, or a self-tuning ruleset that simulates tuning on a single machine.
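For example, under the external tuning ruleset a submission might declare a search space from which the harness samples a fixed number of hyperparameter trials per workload. The field names and structure below are hypothetical assumptions for illustration only; the official format is specified in the competition rules.

```python
# Hypothetical external-tuning search space; the harness would sample a
# fixed number of trials from it for every workload. Field names and
# structure are illustrative assumptions, not the official schema.
tuning_search_space = {
    "learning_rate": {"distribution": "log_uniform", "min": 1e-4, "max": 1e-1},
    "one_minus_momentum": {"distribution": "log_uniform", "min": 1e-3, "max": 1e-1},
    "warmup_fraction": {"distribution": "categorical", "values": [0.02, 0.05, 0.1]},
}
```

Under the self-tuning ruleset, by contrast, a submission would not receive externally sampled trials and would need to handle any hyperparameter adaptation on its own.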
The Competition is Open NOW!
The AlgoPerf: Training algorithms benchmark competition opens on November 28, 2023, and is scheduled to close on March 28, 2024. To enter the competition, please see the instructions on the competition website. Additionally, the accompanying technical report motivates and explains the design choices of the benchmark.
Sponsorship & Prize Money
MLCommons is offering a total prize pool of $50,000, to be awarded by a committee, for the top-performing submissions in each tuning ruleset.
We would also like to express our gratitude to Google for generously providing the computational resources to score the top submissions, as well as additional resources to help score promising submissions from submitters with more limited compute.
About the MLCommons Algorithms Working Group
The MLCommons AlgoPerf: Training algorithms benchmark was developed by the MLCommons Algorithms Working Group. Researchers from a variety of academic institutions and industry labs serve on the working group. The group’s mission is to create a set of rigorous and relevant benchmarks to measure neural network training speedups due to algorithmic improvements. For additional information on the Algorithms Working Group and details on how to become a member or contribute to the benchmarks, please visit the working group website or reach out to [email protected].