MLPerf Inference Delivers Power Efficiency and Performance Gains

Today, MLCommons®, the leading open AI engineering consortium, announced new results from the industry-standard MLPerf™ Inference v3.0 and Mobile v3.0 benchmark suites, which measure the performance and power-efficiency of applying a trained machine learning model to new data. The latest benchmark results illustrate the industry’s emphasis on power efficiency, with 50% more power efficiency results, and significant gains in performance by over 60% in some benchmark tests.

Inference is the critical operational step in machine learning, where a trained model is deployed for actual use, bringing intelligence into a vast array of applications and systems. Machine learning inference is behind everything from the latest generative AI chatbots to safety features in vehicles such as automatic lane-keeping, and speech-to-text interfaces. Improving performance and power efficiency will lead the way for deploying more capable AI systems that benefit society.

The MLPerf benchmark suites are comprehensive system tests that stress machine learning models including the underlying software and hardware and in some cases, optionally measuring power efficiency. The open-source and peer-reviewed benchmark suites create a level playing ground for competition, which fosters innovation and benefits society at large through better performance and power efficiency for AI and ML applications.

The MLPerf Inference benchmarks primarily focus on datacenter and edge systems. This round featured even greater participation across the community with a record-breaking 25 submitting organizations, over 6,700 performance results, and more than 2,400 performance and power efficiency measurements. The submitters include Alibaba, ASUSTeK, Azure, cTuning, Deci.ai, Dell, Gigabyte, H3C, HPE, Inspur, Intel, Krai, Lenovo, Moffett, Nettrix, NEUCHIPS, Neural Magic, NVIDIA, Qualcomm Technologies, Inc., Quanta Cloud Technology, rebellions, SiMa, Supermicro, VMware, and xFusion, with nearly half of the submitters also measuring power efficiency.

MLCommons congratulates our many first time MLPerf Inference submitters on their outstanding results and accomplishments. cTuning, Quanta Cloud Technology, rebellions, SiMa, and xFusion all debuted their first performance results. cTuning, NEUCHIPS, and SiMa also weighed in with their first power efficiency measurements. Lastly, HPE, NVIDIA, and Qualcomm all submitted their first results for inference over the network.

The MLPerf Mobile benchmark suite is tailored for smartphones, tablets, notebooks, and other client systems. The MLPerf Mobile application for Android and iOS is expected to be available shortly.

To view the results and find additional information about the benchmarks please visit https://mlcommons.org/en/inference-datacenter-30, https://mlcommons.org/en/inference-edge-30 and https://mlcommons.org/en/inference-mobile-30.

About MLCommons

MLCommons is an open engineering consortium with a mission to make machine learning better for everyone through benchmarks and data. The foundation for MLCommons began with the MLPerf benchmark in 2018, which rapidly scaled as a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. In collaboration with its 50+ founding partners – global technology providers, academics and researchers, MLCommons is focused on collaborative engineering work that builds tools for the entire machine learning industry through benchmarks and metrics, public datasets and best practices.

For additional information on MLCommons and details on becoming a Member or Affiliate of the organization, please visit http://mlcommons.org and contact [email protected].

Press Contact:
Kelly Berschauer
[email protected]