Announcing the New MLPerf Client Working Group

Today we are announcing the formation of a new MLPerf™ Client working group. Its goal is to produce machine learning benchmarks for client systems such as desktops, laptops, and workstations based on Microsoft Windows and other operating systems. The MLPerf suite of benchmarks is the gold standard for AI benchmarks in the data center, and we are now bringing our collaborative, community-focused development approach and deep technical understanding of machine learning (ML) to the creation of a benchmark suite for consumer client systems.

As the impact of AI grows and offers new capabilities to everyone, it is increasingly an integral part of the computing experience. Silicon for client systems incorporates AI-specific hardware acceleration capabilities of various types, and OS and application vendors are adding AI-driven features into software to boost productivity and to unleash the creativity of millions of end users. As these hardware and software capabilities proliferate, many ML models will execute locally on client systems. The industry will require reliable, standard ways to measure the performance and efficiency of AI acceleration solutions on client systems.

The MLPerf Client benchmarks will be scenario-driven, focusing on real end-user use cases and grounded in feedback from the community. The first benchmark will focus on a large language model (LLM), specifically Llama 2. By incorporating Llama 2-based workloads into the MLCommons training and inference benchmark suites, the MLCommons community has already navigated many of the challenges LLMs present in client systems, such as balancing performance against output quality, licensing issues involving datasets and models, and safety concerns. This experience will help jump-start the new client work.

Initial MLPerf Client working group participants include representatives from AMD, Arm, ASUSTeK, Dell Technologies, Intel, Lenovo, Microsoft, NVIDIA, and Qualcomm Technologies, Inc., among others.

“The time is ripe to bring MLPerf to client systems, as AI is becoming an expected part of computing everywhere,” said David Kanter, Executive Director at MLCommons®. “Large language models are a natural and exciting starting point for our MLPerf Client working group. We look forward to teaming up with our members to bring the excellence of MLPerf into client systems and drive new capabilities for the broader community.”

We’re happy to announce that Ramesh Jaladi, Senior Director of Engineering in the IP Performance group at Intel; Yannis Minadakis, Partner GM, Software Development at Microsoft; and Jani Joki, Director of Performance Benchmarking at NVIDIA, have agreed to serve as co-chairs of the MLPerf Client working group. Additionally, Vinesh Sukumar, Senior Director, AI/ML Product Management at Qualcomm, has agreed to lead a benchmark development task force within the working group.

“Good measurements are the key to advancing AI acceleration,” said Jaladi. “They allow us to set targets, track progress, and deliver improved end-user experiences in successive product generations. The whole industry benefits when benchmarks are well aligned with customer needs, and that’s the role we expect the MLPerf Client suite to play in consumer computing.”

“Microsoft recognizes the need for quality benchmarking tools tailored to the AI acceleration capabilities of Windows client systems, and we welcome the opportunity to collaborate with the MLCommons community to tackle this challenge,” said Minadakis.

“The MLPerf benchmarks have served as a measuring stick for substantial advances in machine learning performance and efficiency in data center solutions,” said Joki. “We look forward to contributing to the creation of benchmarks that will serve a similar role in client systems.”

“Qualcomm is proud to advance the client ecosystem and looks forward to the innovative benchmarks that this MLPerf Working Group will establish for machine learning,” said Sukumar. “Benchmarks remain an important tool in the development and fine tuning of silicon, and MLCommons’ focus on end-user use cases will be key to on-device AI testing.”

We encourage all interested parties to participate in our effort. For more information on the MLPerf Client working group, including information on how to join and contribute to the benchmarks, please visit the working group page or contact the chairs via email at [email protected].