MLCommons

Research Working Group

Algorithms Working Group

Mission

Create a set of rigorous and relevant benchmarks to measure neural network training speedups due to algorithmic improvements.

Purpose

We need a more scientifically sound methodology for evaluating training speedups due to new algorithms, including both new optimizers and new model architectures. Cutting-edge machine learning (ML) models are exceeding the compute budgets of many researchers, and ML compute is becoming an increasingly large cost in industry. To reduce the compute cost of ML research and practice, we need rigorous benchmarking of efficiency. Such benchmarks will guide us in selecting the best directions in which to evolve existing techniques and ultimately enable progress toward models that produce not only better results, but better results at lower cost.

To drive innovation in machine learning algorithms that reduce the time needed to create useful models, we propose a new set of benchmarks to evaluate the training time for different algorithms (models, optimizers, preprocessing, etc.) on a fixed hardware configuration; future iterations can adopt new hardware configurations as needed. Our proposal includes two tracks: (1) a model track and (2) a training algorithm track. The goal of the model track is to find models that can be trained to reach the target solution quality (out-of-sample error) in the least amount of time on each benchmark dataset. Similarly, the goal of the training algorithm track is to find training algorithms (optimizers, etc.) that train benchmark models to reach the target out-of-sample error rate as fast as possible.
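The time-to-target measurement described above can be illustrated with a minimal Python sketch. It is not the working group's harness: the names `time_to_target`, `TimeToTargetResult`, and `ToyRun` are hypothetical, the toy "training run" simply shrinks its error by a constant factor per step, and the choice to include evaluation time in the measured wall-clock time is an assumption made only for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TimeToTargetResult:
    steps: int      # training steps taken when the target was first reached
    seconds: float  # wall-clock time elapsed up to that point


def time_to_target(
    train_step: Callable[[], None],
    evaluate: Callable[[], float],
    target_error: float,
    max_steps: int,
    eval_every: int = 1,
) -> Optional[TimeToTargetResult]:
    """Run train_step until evaluate() <= target_error.

    Returns the step count and elapsed time, or None if the target is
    not reached within max_steps. Evaluation time is included in the
    measurement here (an illustrative assumption).
    """
    start = time.perf_counter()
    for step in range(1, max_steps + 1):
        train_step()
        if step % eval_every == 0 and evaluate() <= target_error:
            return TimeToTargetResult(step, time.perf_counter() - start)
    return None


class ToyRun:
    """Hypothetical stand-in for a training run: error decays each step."""

    def __init__(self, decay: float) -> None:
        self.error = 1.0
        self.decay = decay

    def step(self) -> None:
        self.error *= self.decay

    def held_out_error(self) -> float:
        return self.error


# Compare two hypothetical submissions against the same target error.
for name, decay in [("submission_a", 0.5), ("submission_b", 0.9)]:
    run = ToyRun(decay)
    result = time_to_target(run.step, run.held_out_error,
                            target_error=0.1, max_steps=100)
    print(name, result)
```

In this sketch, both tracks would share the same measurement: a model-track entry would vary the model behind `train_step` and `evaluate`, while a training-algorithm-track entry would vary the update rule and keep the model fixed.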

Deliverables

  1. Rules: We will produce a set of rules for algorithmic efficiency benchmarking that specifies an initial 2-3 benchmarks.
  2. Harness: We will produce a testing harness that is executable on commonly available clouds using MLCube™.
  3. Baseline training algorithm/model implementations: We will produce a baseline training algorithm and model implementation for each benchmark, which can also serve as submission skeletons.
  4. Call for participation.
  5. Initial submission round: Once the rules and harness/references are developed, we will call for participation by the research/industry community.
  6. Additional submission rounds on a regular schedule.

Meeting Schedule

Weekly on Thursdays from 11:30 AM to 12:30 PM Pacific Time.

Mailing List

algorithms@mlcommons.org

Working Group Chair Emails

George Dahl (gdahl@google.com), Frank Schneider (f.schneider@uni-tuebingen.de)

Working Group Chair Bios

George Dahl received his Ph.D. from the University of Toronto under the supervision of Geoff Hinton, where he worked on deep learning approaches to problems in speech recognition, computational chemistry, and natural language text processing. Along with his collaborators, he created the first successful deep acoustic models for speech recognition, technology that now forms the basis for modern speech recognition.

He has been a research scientist at Google on the Brain team since 2015. His current research focuses on improving our empirical understanding of neural network training as well as on deep learning applications to linguistic, perceptual, chemical, biological, and medical data.


Frank Schneider is a Ph.D. student in the Methods of Machine Learning group supervised by Prof. Dr. Philipp Hennig at the University of Tübingen. His research focuses on helping the community move beyond the unsatisfactory user experience of current optimization methods for deep learning. He holds Bachelor's and Master's degrees in Simulation Technology from the University of Stuttgart as well as a Master's degree in Industrial and Applied Mathematics from the Eindhoven University of Technology.
