We are pleased to announce the Data-centric Machine Learning Research (DMLR) working group. The goal of the new working group is to improve datasets that matter, in the ways that matter to the machine learning community.
Benchmarks for machine learning solutions based on static datasets have well-known issues: they saturate quickly, are susceptible to overfitting, contain exploitable annotator artifacts, and have unclear or imperfect evaluation metrics. To help overcome these and other data-centric challenges, the DMLR working group brings together two key projects, DataPerf and Dynabench, which deliver novel approaches to data-centric methodology benchmarks, data-quality benchmarks, and benchmark data curation through dynamic adversarial data collection.
“I’m excited to see the DMLR working group accelerate machine learning through defining, developing, and operating benchmarks for datasets and data-centric algorithms,” said David Kanter, MLCommons Executive Director. “This flexible ML benchmarking platform will unlock new approaches and methodologies in AI.”
The new paradigm of data-centric benchmarking is powered by Dynabench, a research platform for dynamic data collection and benchmarking. Dynabench challenges existing ML benchmarking dogma by embracing dynamic dataset generation with a human in the loop: people are tasked with finding examples that fool a state-of-the-art AI model. This dynamic approach enables rapid iteration, because the examples that fool the current model become training data for an even stronger successor.
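To make that loop concrete, here is a minimal, runnable sketch of human-in-the-loop adversarial data collection. Everything in it (the keyword "model", the simulated annotator, and the memorization-style "retraining") is an illustrative assumption invented for this post, not Dynabench's actual interface.

```python
# A minimal, hypothetical sketch of dynamic adversarial data collection.
# The toy model and simulated "annotators" below are illustrative
# assumptions, not Dynabench's actual interfaces.
import random

POSITIVE_WORDS = {"great", "love", "wonderful"}


def toy_model(text: str, learned: dict) -> str:
    """Stand-in for a state-of-the-art sentiment classifier."""
    if text in learned:                        # "retrained" examples are memorized
        return learned[text]
    words = set(text.lower().split())
    return "positive" if words & POSITIVE_WORDS else "negative"


def simulated_annotator(rng: random.Random) -> tuple[str, str]:
    """A human trying to fool the model, e.g. with negation or sarcasm."""
    candidates = [
        ("not great at all", "negative"),      # negation fools the keyword model
        ("I love waiting in line, said no one", "negative"),
        ("this is wonderful", "positive"),     # easy example: model gets it right
    ]
    return rng.choice(candidates)


def dynamic_round(learned: dict, rng: random.Random, budget: int = 50) -> dict:
    """One round: keep only the examples the current model misclassifies."""
    fooling: dict = {}
    for _ in range(budget):
        text, gold = simulated_annotator(rng)
        if text not in fooling and toy_model(text, learned) != gold:
            fooling[text] = gold               # model was fooled -> keep example
    return fooling


if __name__ == "__main__":
    rng = random.Random(0)
    learned: dict = {}                         # the model's accumulated training data
    for round_no in range(3):
        new_data = dynamic_round(learned, rng)
        learned.update(new_data)               # "retrain": absorb the hard examples
        print(f"round {round_no}: collected {len(new_data)} fooling examples")
```

Running the sketch shows each round collecting fewer fooling examples as the model absorbs them, which is exactly the dynamic that lets successive rounds yield progressively harder data.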
DataPerf complements Dynabench by bringing an increased research focus to data quality and excellence. DataPerf evaluates the quality of training and test data, as well as the algorithms for constructing or optimizing datasets. By enabling the construction and optimization of test sets, DataPerf plays a critical role in evaluating future AI systems for bias and advancing equity. This recent NeurIPS 2023 paper on DataPerf describes the first round of five DataPerf challenges across multiple modalities.
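As a hedged illustration of the data-centric framing, the sketch below holds the model and evaluation fixed and scores submissions only on the training data they select, so that cleaner data, not a bigger model, wins. The centroid classifier, selection functions, and scoring harness are stand-ins invented for this example, not DataPerf's actual tasks or API.

```python
# Hypothetical sketch of a data-centric benchmark: the model and evaluation
# are fixed, and submissions compete on *which training data they select*.
# Nothing here is DataPerf's real API; it only illustrates the idea.
import random


def train_centroid_model(points):
    """Fixed reference model: per-class mean of a 1-D feature."""
    sums, counts = {}, {}
    for x, label in points:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def accuracy(centroids, test_set):
    correct = 0
    for x, label in test_set:
        pred = min(centroids, key=lambda c: abs(x - centroids[c]))
        correct += pred == label
    return correct / len(test_set)


def score_submission(select_fn, train_pool, test_set, budget):
    """Benchmark harness: only the data selection differs between entrants."""
    subset = select_fn(train_pool, budget)
    assert len(subset) <= budget, "submissions must respect the data budget"
    return accuracy(train_centroid_model(subset), test_set)


if __name__ == "__main__":
    rng = random.Random(0)
    # Noisy pool: class "a" clusters near 0, class "b" near 1, plus mislabels.
    pool = [(rng.gauss(0, 0.3), "a") for _ in range(200)]
    pool += [(rng.gauss(1, 0.3), "b") for _ in range(200)]
    pool += [(rng.gauss(1, 0.3), "a") for _ in range(40)]   # label noise

    test = [(rng.gauss(0, 0.3), "a") for _ in range(100)]
    test += [(rng.gauss(1, 0.3), "b") for _ in range(100)]

    def random_selection(data, budget):
        return rng.sample(data, budget)

    def filtered_selection(data, budget):
        # Toy cleaning rule: drop points whose label disagrees with position.
        clean = [(x, y) for x, y in data if (y == "a") == (x < 0.5)]
        return rng.sample(clean, min(budget, len(clean)))

    for name, fn in [("random", random_selection), ("filtered", filtered_selection)]:
        print(name, round(score_submission(fn, pool, test, budget=100), 3))
```

On this toy pool, the selection that filters out mislabeled points typically outscores random selection under the same data budget, which is the kind of improvement a data-quality benchmark is designed to surface.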
“We are excited to further DataPerf’s impact in close collaboration with Dynabench. Together, we will continue to address pressing research questions as part of one unified effort to advance data-centric approaches to machine learning,” said Lilith Bat-Leah, who co-chairs the DMLR working group along with Max Bartolo and Praveen Paritosh.
Measurement, tracked on leaderboards, is core to understanding progress. DataPerf and other Dynabench challenges will continue to be hosted by the new DMLR working group, and the platform will continue to evolve in service of the working group's mission. Researchers from Coactive.AI, Cohere, Common Crawl, ETH Zurich, Factored, Google, Harvard, Meta, Mod Op, Oxford, Stanford, and many other institutions have contributed to prior challenges. New collaborations are underway with Common Crawl, DataComp, NASA, and the University of Texas at Austin.
We invite others to get involved and help shape the future of the DMLR working group.