Inference Working Group
Mission
Create a set of fair and representative inference benchmarks.
Purpose
Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf™ Inference answers that call.
Deliverables
- Inference benchmark rules and definitions
- Inference benchmark reference software
- Inference benchmark submission rules
- Inference benchmark roadmap
- Inference benchmark results, published every ~6 months
Meeting Schedule
Weekly on Tuesday from 8:35-10:00AM Pacific.
How to Join and Access Working Group Resources
This group is open exclusively to Members and Affiliates. If you are not already a Member/Affiliate or part of a Member/Affiliate company, you can learn more about Membership here.
- To sign up for the group mailing list, receive the meeting invite, and access shared documents and meeting minutes:
- Associate a Google account with your organizational email address.
- Request to join the Inference Google Group. Requests are manually reviewed, so please be patient.
- Once your request to join the Inference Google Group is approved, you'll be able to access the Inference folder in the Members Google Drive.
- To engage in group discussions:
- Join the group's channels on the MLCommons Discord server.
- To access the GitHub repository (public):
- If you want to contribute code, please sign our CLA first.
- Visit the GitHub repository.
Working Group Chairs
To contact all Inference working group chairs, email inference-chairs@mlcommons.org.
Miro Hodak (miro@mlcommons.org) - LinkedIn
Miro Hodak is a Senior Member of Technical Staff at AMD, where he works on AI performance, strategy, and solutions. Before joining AMD, he worked as an AI Architect at Lenovo Infrastructure Solutions Group, and, prior to that, he was a Research Assistant Professor in Physics at North Carolina State University. Miro has participated in MLPerf/MLCommons activities since 2020, including submitting multiple rounds of Inference and Training benchmarks. Miro has journal publications spanning AI, computer science, materials science, physics, and biochemistry. His work has been cited over 2,000 times.
Mitchelle Rasquinha (mrasquinha@mlcommons.org) - LinkedIn
Mitchelle Rasquinha is a Senior Software Engineer working on the ML Performance Team within Google. She is interested in accurately capturing innovations in system architectures through robust benchmarking. Mitchelle has a background in Computer Architecture from the Georgia Institute of Technology.