Research

Democratizing new technological capabilities and ensuring widespread adoption require an open approach. MLCommons regularly publishes and presents at top conferences and industry events alongside our broad community, so that all researchers, scientists, and professionals in AI and ML can access and learn from our work.

Publications

MLCommons is a community-driven effort. We regularly co-author papers with community members to share our collective learnings with the broader community.

AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons

ARXIV 2025

MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI

The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

Introducing v0.5 of the AI Safety Benchmark from MLCommons

DataPerf: Benchmarks for Data-Centric AI Development

Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models

MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation

Speech Wikimedia: A 77 Language Multilingual Speech Dataset

The Dollar Street Dataset: Images Representing the Geographic and Socioeconomic Diversity of the World

MLPerf Mobile Inference Benchmark

MLPerf Tiny Benchmark

Benchmarking TinyML Systems: Challenges and Direction

The People’s Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage

Multilingual Spoken Words Corpus

MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems

Software/hardware co-optimization on the IPU: An MLPerf™ case study

Data Engineering for Everyone

LSH methods for data deduplication in a Wikipedia artificial dataset

MLPerf Training and Inference

MLPerf Training Benchmark

MLPerf Inference Benchmark

MLPerf: A Benchmark Suite for Machine Learning from an Academic-Industry Cooperative

Talks

Driving ML Forward in Automotive

What is MLCube

MLPerf Automotive Overview

Medical Imaging Benchmark using MLPerf

MLPerf HPC: A Benchmark Suite for Large scale ML on HPC Systems

Data-centric Speech for Machine Learning Systems

Mobile AI Performance Benchmarking & Analysis with the MLPerf App

MLPerf Inference Benchmark Suite

MLPerf Training Benchmark Suite

Multilingual Spoken Words Corpus