Today, a group of researchers and engineers released MLPerf™, a benchmark for measuring the speed of machine learning software and hardware. MLPerf measures speed based on the time it takes to train deep neural networks to perform tasks including recognizing objects, translating languages, and playing the ancient game of Go. The effort is supported by a broad coalition of experts from tech companies and startups including AMD (NASDAQ: AMD), Baidu (NASDAQ: BIDU), Google (NASDAQ: GOOGL), Intel (NASDAQ: INTC), SambaNova, and Wave Computing and researchers from educational institutions including Harvard University, Stanford University, University of California Berkeley, University of Minnesota, and University of Toronto.

The promise of AI has sparked an explosion of work in machine learning. As this sector expands, systems need to evolve rapidly to meet its demands. According to ML pioneer Andrew Ng, “AI is transforming multiple industries, but for it to reach its full potential, we still need faster hardware and software.” With researchers pushing the bounds of computers’ capabilities and system designers beginning to hone machines for machine learning, there is a need for a new generation of benchmarks.

MLPerf aims to accelerate improvements in ML system performance just as the SPEC benchmark helped accelerate improvements in general-purpose computing. SPEC was introduced in 1988 by a consortium of computing companies, and CPU performance improved 1.6X per year for the next 15 years. MLPerf combines best practices from previous benchmarks, including: SPEC’s use of a suite of programs, SORT’s use of one division to enable comparisons and another division to foster innovative ideas, DeepBench’s coverage of software deployed in production, and DAWNBench’s time-to-accuracy metric.
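As a rough illustration of the time-to-accuracy idea (this is a minimal sketch, not MLPerf reference code; the model interface and the quality threshold below are hypothetical), a benchmark run trains until a fixed quality target is reached and reports the elapsed wall-clock time:

```python
import time

ACCURACY_TARGET = 0.75  # hypothetical quality threshold for the task


def time_to_accuracy(model, train_data, eval_data):
    """Train until the quality target is met; return elapsed wall-clock seconds."""
    start = time.time()
    while True:
        model.train_one_epoch(train_data)    # hypothetical training step
        accuracy = model.evaluate(eval_data)  # hypothetical evaluation step
        if accuracy >= ACCURACY_TARGET:
            return time.time() - start
```

Measuring time to a fixed quality target, rather than raw throughput, rewards systems that train both quickly and to the required accuracy.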

Benchmarks like SPEC and MLPerf catalyze technological improvement by aligning research and development efforts and guiding investment decisions.

  • “Good benchmarks enable researchers to compare different ideas quickly, which makes it easier to innovate,” summarizes researcher David Patterson, author of Computer Architecture: A Quantitative Approach.
  • According to Gregory Stoner, CTO of Machine Learning, Radeon Technologies Group, AMD: “AMD is at the forefront of building high-performance solutions, and benchmarks such as MLPerf are vital for providing a solid foundation for hardware and system software idea exploration, thereby giving our customers a more robust solution to measure Machine Learning system performance and underscoring the power of the AMD portfolio.”
  • “MLPerf is a critical benchmark that showcases how our dataflow processor technology is optimized for ML workload performance,” remarks Chris Nicol, CTO of the startup Wave Computing.
  • “AI powers an array of products and services at Baidu. A benchmark like MLPerf allows us to compare platforms and make better datacenter investment decisions,” reports Haifeng Wang, Vice President of Baidu, who oversees the AI Group.

Because ML is such a fast-moving field, the team is developing MLPerf as an “agile” benchmark: launching early, involving a broad community, and iterating rapidly. The mlperf.org website provides a complete specification with reference code and will track future results. MLPerf invites hardware vendors and software framework providers to submit results before the July 31st deadline.