Inference Working Group


Create a set of fair and representative inference benchmarks.


Demand for machine-learning (ML) hardware and software systems is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance, ranging from embedded devices to data-center solutions. A dozen or more software frameworks and libraries fuel this hardware. The myriad combinations of ML hardware and ML software make it challenging to assess ML-system performance in an architecture-neutral, representative, and reproducible manner. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call.


  1. Inference benchmark rules and definitions
  2. Inference benchmark reference software
  3. Inference benchmark submission rules
  4. Inference benchmark roadmap
  5. Publish inference benchmark results every ~6 months

Meeting Schedule

Weekly on Tuesdays, 8:30-10:30 AM Pacific.

Mailing List

Working Group Resources

Google Drive (Members only)

Working Group Chair Emails

Christine Cheng

Guenther Schmuelling

Working Group Chair Bios

Christine Cheng is one of the engineering leads for deep learning benchmarking and optimization at Intel. Before joining MLPerf, she worked as a data scientist in sports analytics. She received her M.S. from Stanford University and her B.S. from Caltech.


Guenther Schmuelling is a principal software engineer in Microsoft’s Azure AI Platform division, where he works on ONNX and AI platforms.