We are thrilled to announce the results and winners of the first MLCommons® AlgoPerf: Training Algorithms benchmark competition, which was designed to find better training algorithms that speed up neural network training across a diverse set of workloads.
The AlgoPerf: Training Algorithms Competition
To make building useful neural network models less time-consuming and costly, we need better training algorithms. The MLCommons Algorithms working group has developed the open-source AlgoPerf: Training Algorithms benchmark to measure how much faster neural networks can be trained through advancements in underlying training algorithms, such as better optimizers or more effective hyperparameter choices.
The AlgoPerf: Training Algorithms benchmark evaluates the training time required by different training algorithms across multiple realistic deep learning workloads when running on a fixed hardware configuration. To encourage generally useful methods, submissions must fully specify any required workload-specific tuning. Participants could choose to submit under two separate tuning rulesets: the external tuning ruleset, designed to simulate tuning with a limited amount of parallel resources, or the self-tuning ruleset, designed to simulate fully automated tuning on a single machine.
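To make the time-to-result idea concrete, below is a minimal, illustrative sketch of the two measurement steps involved: timing how long a training algorithm takes to reach a fixed validation target on a single workload, and aggregating those per-workload times across submissions. This is not the official AlgoPerf scoring code; all names (`time_to_target`, `train_step`, `evaluate`, `relative_scores`) are hypothetical placeholders, and the official benchmark aggregates per-workload training times with performance profiles rather than the simple average shown here.

```python
from typing import Callable, Dict, Optional
import time

# Hypothetical sketch of a single-workload measurement: run the submitted
# training algorithm on fixed hardware and record the wall-clock time it
# needs to reach a workload-specific validation target.
def time_to_target(
    train_step: Callable[[], None],   # one update step of the submitted algorithm
    evaluate: Callable[[], float],    # returns the current validation metric
    target: float,                    # workload-specific validation target
    max_seconds: float,               # per-workload time budget
    eval_every: int = 100,            # evaluate periodically, not after every step
) -> Optional[float]:
    # Assumes a metric where higher is better (e.g., accuracy).
    start = time.perf_counter()
    step = 0
    while time.perf_counter() - start < max_seconds:
        train_step()
        step += 1
        if step % eval_every == 0 and evaluate() >= target:
            return time.perf_counter() - start
    return None  # target not reached within the budget


# Hypothetical aggregation across workloads: compare each submission's
# training time to the fastest submission on every workload, then average
# the resulting ratios. A missed target contributes zero.
def relative_scores(
    times: Dict[str, Dict[str, Optional[float]]],  # submission -> workload -> seconds (or None)
) -> Dict[str, float]:
    workloads = sorted({w for per_workload in times.values() for w in per_workload})
    scores: Dict[str, float] = {}
    for submission, per_workload in times.items():
        ratios = []
        for w in workloads:
            finished = [other[w] for other in times.values() if other.get(w) is not None]
            if not finished:
                continue  # no submission reached the target on this workload
            best = min(finished)
            t = per_workload.get(w)
            ratios.append(best / t if t is not None else 0.0)
        scores[submission] = sum(ratios) / len(ratios) if ratios else 0.0
    return scores
```

Roughly speaking, a higher aggregated score indicates a submission that reached the per-workload targets faster across more of the workloads.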
Participation
The first iteration of the AlgoPerf: Training Algorithms competition attracted 18 submissions (15 of which were scorable) from 10 different teams. Scoring involved over 4,000 individual training runs across the 14 workloads used in the benchmark. Participants included researchers from Concordia University, ELLIS Tübingen, Google, Max Planck Institute for Intelligent Systems, Meta AI, Meta Platforms, Michigan State University, Mila, Samsung AI, UCLA, UT Austin, the University of Cambridge, the University of the West Indies, and the Vector Institute.
The submissions collectively explored many interesting techniques and implementation choices, including submissions using both of our supported frameworks, JAX and PyTorch. As required by the rules, all submissions are released publicly under an Apache 2.0 open-source license.
The Winners & Results
Congratulations to Aaron Defazio (Meta), Alice Yang (Meta), and Konstantin Mishchenko (Samsung AI), who came in first place in the self-tuning ruleset with their “Schedule Free AdamW” submission (see Table 2, below). For the external tuning ruleset (see Table 1, below), first place goes to the “Distributed Shampoo” submission of Hao-Jun Michael Shi, Tsung-Hsien Lee, Anna Cai, Shintaro Iwasaki, Wenyin Fu, Yuchen Hao, and Mike Rabbat (all Meta).
In the external tuning ruleset, five submissions beat the challenging prize-qualification baseline, improving over the state-of-the-art training algorithm. The “Distributed Shampoo” submission trains models an impressive 28% faster than the baseline. “Schedule Free AdamW” was the only submission in the self-tuning ruleset to beat the prize-qualification baseline, training neural networks 8% faster than the baseline.
Congratulations to the winners and all participants for their contributions to advancing neural network training algorithms!
| Score | Submission | Submitters | Institutions | Framework |
|---|---|---|---|---|
| 0.78 | Shampoo Submission | Hao-Jun Shi, Tsung-Hsien Lee, Anna Cai, Shintaro Iwasaki, Wenyin Fu, Yuchen Hao, Mike Rabbat | Meta Platforms | PyTorch |
| 0.71 | Schedule Free AdamW | Aaron Defazio, Alice Yang, Konstantin Mishchenko | Meta AI, Samsung AI | PyTorch |
| 0.64 | Generalized Adam | George Dahl, Sourabh Medapati, Zack Nado, Rohan Anil, Shankar Krishnan, Naman Agarwal, Priya Kasimbeg, Vlad Feinberg | Google DeepMind | JAX |
| 0.63 | Cyclic LR | Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping | MPI-IS, ELLIS Tübingen | PyTorch |
| 0.59 | NadamP | George Dahl, Sourabh Medapati, Zack Nado, Rohan Anil, Shankar Krishnan, Naman Agarwal, Priya Kasimbeg, Vlad Feinberg | Google DeepMind | JAX |
| 0.57 | Prize Qualification Baseline | | | |
| 0.49 | Amos | Ran Tian | Google DeepMind | JAX |
| 0.47 | Casper Adaptive | Sai Surya Duvvuri, Inderjit Dhillon, Cho-Jui Hsieh | UT Austin, Google, UCLA | JAX |
| 0.37 | Lawa Queue | Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping | MPI-IS, ELLIS Tübingen | PyTorch |
| 0.34 | Lawa EMA | Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping | MPI-IS, ELLIS Tübingen | PyTorch |
| 0.00 | Schedule Free Prodigy | Aaron Defazio, Alice Yang, Konstantin Mishchenko | Meta AI, Samsung AI | PyTorch |

Table 1: Leaderboard of the external tuning ruleset.
| Score | Submission | Submitters | Institutions | Framework |
|---|---|---|---|---|
| 0.85 | Schedule Free AdamW | Aaron Defazio, Alice Yang, Konstantin Mishchenko | Meta AI, Samsung AI | PyTorch |
| 0.82 | Prize Qualification Baseline | | | |
| 0.33 | NadamW Sequential | George Dahl, Sourabh Medapati, Zack Nado, Rohan Anil, Shankar Krishnan, Naman Agarwal, Priya Kasimbeg, Vlad Feinberg | Google DeepMind | JAX |
| 0.14 | sinv6_75 | Abhinav Moudgil | Mila, Concordia University | JAX |
| 0.09 | sinv6 | Abhinav Moudgil | Mila, Concordia University | JAX |
| 0.00 | AdamG | Yijiang Pang | Michigan State University | PyTorch |

Table 2: Leaderboard of the self-tuning ruleset.
For a cash prize to be awarded, the competition rules require that at least one other submission outperform the prize-qualification baseline for the relevant ruleset, and that none of the authors of this competing submission share an affiliation with either of the two MLCommons Algorithms working group chairs. This condition was met for the external tuning ruleset, so MLCommons will award a cash prize of $25,000 for the first-place submission. Despite the outstanding performance of the first-place submission in the self-tuning ruleset, the prize requirement was not met there: several competing submissions involved overlapping affiliations with the working group chairs, and the prize-qualification baselines were quite difficult to beat. In designing the AlgoPerf: Training Algorithms benchmark competition, the working group’s goal was, first and foremost, to ensure that any submission that performed well under our rules had achieved something truly impressive, and we are delighted that the first-place submissions in both rulesets produced such exceptional results.
To view the full results of the AlgoPerf: Training Algorithms competition, including the workload-specific performances of each submission, please visit the AlgoPerf results page. We plan to release a paper with a more in-depth discussion of the results after we are done analyzing them in detail.
The next steps for AlgoPerf
The first iteration of AlgoPerf: Training Algorithms demonstrated that neural network training can be accelerated significantly by improving the underlying training algorithms. This iteration was only the first step in driving innovation in machine learning algorithms. Now that we can reliably measure progress in training algorithms, we anticipate rapid progress in the field, both in new research and in better practical methods. The working group is already hard at work planning the future of the benchmark. If you are interested in shaping that future, developing or scoring submissions, or collaborating on research that builds on the benchmark, please consider joining the working group.
Acknowledgments
We extend our sincere thanks to Google for their generous support in providing computational resources to score and evaluate all submissions across the workloads. Our gratitude also goes to the entire MLCommons organization for supporting the Algorithms working group and funding the $50,000 prize pool. Special thanks are due to the members of the Algorithms Working Group who developed, implemented, and managed the benchmark competition. We particularly want to thank Priya Kasimbeg, the Engineering Lead of the working group, who led the scoring process.
About MLCommons and the Algorithms Working Group
MLCommons is the world leader in building benchmarks for AI. It is an open engineering consortium with a mission to make AI better for everyone through benchmarks and data.
The AlgoPerf: Training Algorithms benchmark was developed by the MLCommons Algorithms Working Group. Researchers from a variety of academic institutions and industry labs serve on the working group. The group’s mission is to create a set of rigorous and relevant benchmarks to measure neural network training speedups due to algorithmic improvements. For additional information on the Algorithms Working Group, and details on how to become a member or contribute to the benchmarks, please visit the working group website or reach out to [email protected].
This blog has been updated with corrected scores on 10/23/24.