Machine learning innovation to benefit everyone.
What’s New
- AVCC and MLCommons Join Forces to Develop an Automotive Industry Standard
- Launching GaNDLF for Scalable End-to-End Medical AI Workflows
  A powerful framework combining AI with medical research to enable medical professionals to more effectively diagnose and treat patients.
- New ML Benchmarks for Scientific Discovery
  Four new open source benchmarks aim to uncover novel ML solutions to improve scientific discovery.
- MLPerf Inference Delivers Power Efficiency and Performance Gains
  Record participation in MLCommons’ benchmark suite showcases improvements in efficiency and capabilities for deploying machine learning.
MLCommons aims to accelerate machine learning innovation to benefit everyone. Machine learning has tremendous potential to save lives in areas like healthcare and automotive safety, and to improve information access and understanding through technologies like voice interfaces, automatic translation, and natural language processing. However, machine learning differs fundamentally from conventional software: developers train an application rather than program it. It therefore requires a new set of techniques, analogous to the breakthroughs in precision measurement, raw materials, and manufacturing that drove the industrial revolution.
MLCommons aims to answer the needs of the nascent machine learning industry through open, collaborative engineering in three areas:
Benchmarking
Benchmarks provide consistent measurements of accuracy, speed, and efficiency. Consistent measurements enable engineers to design reliable products and services, and enable researchers to compare innovations and choose the best ideas to drive the solutions of tomorrow.
Datasets
Datasets are the raw materials for all of machine learning. Models are only as good as the data they are trained on. Academics and entrepreneurs in particular depend on public datasets to create new technologies and new companies.
Best Practices
Best Practices empower researchers and engineers to more easily exchange models, reproduce experiments, and build applications that leverage machine learning. Improving best practices accelerates progress in, and grows the market for, machine learning.
People’s Speech
The People’s Speech Dataset is among the world’s largest English speech recognition corpora licensed for academic and commercial use under CC-BY-SA and CC-BY 4.0. It includes 30,000+ hours of transcribed English speech from a diverse set of speakers. This open dataset is large enough to train speech-to-text systems and, crucially, is available under a permissive license. Just as ImageNet catalyzed machine learning for vision, the People’s Speech will unleash innovation in speech research and products that are available to users across the globe.
MLCube
MLCube is a set of best practices for creating ML software that can just "plug-and-play" on many different systems. MLCube makes it easier for researchers to share innovative ML models, for developers to experiment with many different models, and for software companies to create infrastructure for models. It creates opportunities by putting ML in the hands of more people. MLCube isn’t a new framework or service; it is a consistent interface to machine learning models in containers like Docker. Models published with the MLCube interface can be run on local machines, on a variety of major clouds, or in Kubernetes clusters, all using the same code.
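To make the "same code, many systems" idea concrete, here is a minimal Python sketch of the general pattern: one description of a containerized task, handed to different platform launchers. This is an illustration only and is not the MLCube API. The names `ContainerTask`, `docker_command`, and `kubectl_command`, the image name `example/mnist:latest`, and the `--epochs` parameter are all hypothetical. The functions only build command lines and do not execute anything.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContainerTask:
    """Hypothetical description of one model task packaged in a container image."""
    image: str                      # container image that holds the model code
    task: str                       # task name the image's entrypoint is assumed to accept
    params: Dict[str, str] = field(default_factory=dict)

    def entrypoint_args(self) -> List[str]:
        # The arguments passed to the container are identical on every platform.
        args = [self.task]
        for key in sorted(self.params):
            args.append(f"--{key}={self.params[key]}")
        return args


def docker_command(t: ContainerTask) -> List[str]:
    # Command line for running the task with a local Docker daemon.
    return ["docker", "run", "--rm", t.image, *t.entrypoint_args()]


def kubectl_command(t: ContainerTask, pod_name: str) -> List[str]:
    # Command line for running the same task as a one-off Kubernetes pod.
    return [
        "kubectl", "run", pod_name,
        f"--image={t.image}", "--restart=Never",
        "--", *t.entrypoint_args(),
    ]


if __name__ == "__main__":
    # Hypothetical image and parameter, for illustration only.
    task = ContainerTask(image="example/mnist:latest", task="train",
                         params={"epochs": "2"})
    print(" ".join(docker_command(task)))
    print(" ".join(kubectl_command(task, "mnist-train")))
```

The point of the sketch is that the task description and its arguments never change; only the thin launcher differs per platform.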