MLPerf Inference

Create a set of fair and representative inference benchmarks.

Purpose


Demand for machine-learning (ML) hardware and software systems is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded: over 100 organizations are building ML inference chips, and the systems that run existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance, ranging from embedded devices to data-center solutions. Fueling this hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make it challenging to assess ML-system performance in an architecture-neutral, representative, and reproducible manner. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call.

Deliverables


  • Inference benchmark rules and definitions
  • Inference benchmark reference software
  • Inference benchmark submission rules
  • Inference benchmark roadmap
  • Inference benchmark results published every ~6 months

Meeting Schedule

Tuesday, June 24, 2025, 08:35–10:00 Pacific Time

Submission Date

Friday, June 6, 2025


Technical Resources

Documentation


How to Join and Access MLPerf Inference Resources 


Inference Working Group Chairs 

To contact all MLPerf Inference working group chairs email [email protected].

Miro Hodak

Miro Hodak is a Senior Member of Technical Staff at AMD, where he works on AI performance, strategy, and solutions. Before joining AMD, he was an AI Architect in the Lenovo Infrastructure Solutions Group, and prior to that he was a Research Assistant Professor of Physics at North Carolina State University. Miro has participated in MLPerf/MLCommons activities since 2020, including submitting multiple rounds of Inference and Training benchmarks. He has journal publications spanning AI, computer science, materials science, physics, and biochemistry, and his work has been cited over 2,000 times.

Frank Han

Frank Han is a seasoned benchmarking expert and Senior Principal Systems Development Engineer at Dell Technologies, with a strong background in GPU optimization and high-performance computing (HPC). He has been an active contributor to MLCommons from its early days, participating in the MLPerf Training, Inference, HPC, Storage, and Client benchmarks. Frank has submitted results since MLPerf Training v1.0 and Inference v0.7, consistently contributing across multiple rounds. He also brings valuable experience with other industry-standard benchmarks, including TOP500 and STAC.