MLPerf Inference Launched

Today a consortium involving more than 40 leading companies and university researchers introduced MLPerf™ Inference v0.5, the first industry-standard machine learning benchmark suite for measuring system performance and power efficiency. The benchmark suite covers models applicable to a wide range of applications, including autonomous driving and natural language processing, on a variety of form factors, including smartphones, PCs, edge servers, and cloud computing platforms in the data center. MLPerf Inference v0.5 uses a combination of carefully selected models and data sets to ensure that the results are relevant to real-world applications. It will stimulate innovation within the academic and research communities and push the state of the art forward.

By measuring inference, this benchmark suite will give valuable information on how quickly a trained neural network can process new data to provide useful insights. Previously, MLPerf released the companion Training v0.5 benchmark suite, which led to 29 different results measuring the performance of cutting-edge systems for training deep neural networks.

MLPerf Inference v0.5 consists of five benchmarks, focused on three common ML tasks:

Image Classification – predicting a “label” for a given image from the ImageNet dataset, such as identifying items in a photo.
Object Detection – picking out an object using a bounding box within an image from the MS-COCO dataset, a task commonly used in robotics, automation, and automotive applications.
Machine Translation – translating sentences between English and German using the WMT English-German benchmark, similar to auto-translate features in widely used chat and email applications.

MLPerf provides benchmark reference implementations that define the problem, model, and quality target, and provide instructions to run the code. The reference implementations are available in ONNX, PyTorch, and TensorFlow frameworks. The MLPerf inference benchmark working group follows an “agile” benchmarking methodology: launching early, involving a broad and open community, and iterating rapidly. The mlperf.org website provides a complete specification with guidelines on the reference code and will track future results.

The inference benchmarks were created thanks to the contributions and leadership of our members over the last 11 months, including representatives from: Arm, Cadence, Centaur Technology, Dividiti, Facebook, General Motors, Google, Habana Labs, Harvard University, Intel, MediaTek, Microsoft, Myrtle, Nvidia, Real World Insights, University of Illinois at Urbana-Champaign, University of Toronto, and Xilinx.

General Chair Peter Mattson and Inference Working Group Co-Chairs Christine Cheng, David Kanter, Vijay Janapa Reddi, and Carole-Jean Wu issued the following statement:

“The new MLPerf inference benchmarks will accelerate the development of hardware and software to unlock the full potential of ML applications. They will also stimulate innovation within the academic and research communities. By creating common and relevant metrics to assess new machine learning software frameworks, hardware accelerators, and cloud and edge computing platforms in real-life situations, these benchmarks will establish a level playing field that even the smallest companies can use.”

Now that the new benchmark suite has been released, organizations can submit results that demonstrate the benefits of their ML systems on these benchmarks. Interested organizations should contact [email protected].