Working Groups
MLCommons working groups are collaborative groups of experts who define, develop, and conduct the MLPerf benchmarks and related research projects.
Benchmarks Working Groups
MLPerf Training
The MLPerf Training working group defines, develops, and conducts the MLPerf Training benchmarks.
MLPerf HPC
The MLPerf HPC working group creates MLPerf HPC benchmarks based on science applications to run on large-scale supercomputers.
MLPerf Inference
The MLPerf Inference working group creates a set of fair and representative inference benchmarks.
MLPerf Mobile
The MLPerf Mobile working group creates a set of fair inference benchmarks for mobile consumer devices such as smartphones, tablets, and notebooks that are representative of the end-user experience.
MLPerf Automotive
The MLPerf Automotive working group defines and develops an industry-standard ML benchmark suite for automotive systems.
MLPerf Tiny
The MLPerf Tiny working group develops TinyML benchmarks to evaluate inference performance on ultra-low-power systems.
Infra
The Infra working group makes machine learning more reproducible and easier to manage for the broader community by building logging tools and recommending approaches for tracking and operating machine learning systems.
Power
The Power working group creates power measurement techniques for the various MLPerf benchmarks, enabling reporting and comparison of the performance, power, and energy consumption of benchmarks run on submission systems.
MLPerf Storage
The MLPerf Storage working group defines and develops the MLPerf Storage benchmarks to characterize performance of storage systems that support machine learning workloads.
MLPerf Client
The MLPerf Client working group defines and develops an application that contains a set of fair and representative machine learning benchmarks for client consumer systems.
AI Risk & Reliability Working Group
AI Risk & Reliability
The AI Risk & Reliability working group supports community development of safety tests for AI and organizes the definition of research- and industry-standard AI safety benchmarks based on those tests.
Data Working Groups
Datasets
The Datasets working group creates new datasets to fuel innovation in machine learning.
Medical
The Medical working group develops benchmarks and best practices to help accelerate AI development in healthcare.
MLCube
The MLCube working group aims to make AI easier to use and to scale AI to more people.
Croissant
The Croissant working group develops a metadata format to standardize how ML datasets are described.
Research Working Groups
Algorithms
The Algorithms working group creates a set of rigorous and relevant benchmarks to measure neural network training speedups due to algorithmic improvements.
Chakra
The Chakra working group advances performance benchmarking and co-design using standardized execution traces.
Data-centric ML
The Data-centric Machine Learning Research (DMLR) working group accelerates innovation and increases scientific rigor in machine learning by defining, developing, and operating benchmarks for datasets and data-centric algorithms, facilitated by a flexible ML benchmarking platform.
Science
The Science working group evaluates, organizes, curates, and integrates artifacts around applications, models/algorithms, infrastructure, benchmarks, and datasets.