Research Working Group
Drive innovation in ML datasets by defining, developing, and operating benchmarks for datasets and data-centric algorithms.
We are building DataPerf, a benchmark suite for ML datasets and for the algorithms that construct and work with them. Historically, ML research has focused primarily on models, typically reusing the largest available dataset for each common task without examining its breadth, difficulty, or fidelity to the underlying problem. This underemphasis on data has led to a range of problems, from data cascades in real applications to the saturation of existing dataset-driven model-quality benchmarks, which impedes research progress. To catalyze research on data quality and foster data excellence, we created DataPerf: a suite of benchmarks that evaluate the quality of training and test data, as well as the algorithms for constructing or optimizing such datasets (e.g., core-set selection and labeling-error debugging), across common ML tasks such as image classification. We run the DataPerf benchmarks through challenges and leaderboards.
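To make the idea concrete, here is a minimal, hypothetical sketch of a data-centric benchmark in the spirit described above: the benchmark fixes the model and the test set, a "submission" is a subset-selection algorithm operating under a budget, and the score is the test accuracy of the fixed model trained on the chosen subset. All names, the toy data, and the nearest-class-mean "model" are illustrative assumptions, not the DataPerf API.

```python
# Hypothetical data-centric benchmark sketch (not the DataPerf API):
# submissions select a training subset; the benchmark scores the subset
# by training a fixed, simple model on it and measuring test accuracy.
import random

random.seed(0)

def make_data(n=200):
    """Synthetic 1-D two-class data: class 0 near -1, class 1 near +1."""
    data = []
    for _ in range(n):
        label = random.randint(0, 1)
        x = random.gauss(2 * label - 1, 0.5)
        data.append((x, label))
    return data

def train_and_score(train_subset, test_set):
    """Fixed 'model': nearest class-mean classifier; returns test accuracy."""
    means = {}
    for cls in (0, 1):
        xs = [x for x, y in train_subset if y == cls]
        if not xs:
            return 0.0  # degenerate subset: a class is missing entirely
        means[cls] = sum(xs) / len(xs)
    correct = sum(
        1 for x, y in test_set
        if min(means, key=lambda c: abs(x - means[c])) == y
    )
    return correct / len(test_set)

def random_selection(pool, budget):
    """Baseline submission: a uniform random subset of the pool."""
    return random.sample(pool, budget)

def balanced_selection(pool, budget):
    """A slightly smarter submission: keep the class balance in the subset."""
    by_class = {0: [], 1: []}
    for example in pool:
        by_class[example[1]].append(example)
    half = budget // 2
    return by_class[0][:half] + by_class[1][:budget - half]

if __name__ == "__main__":
    pool, test_set, budget = make_data(200), make_data(100), 20
    for name, algo in [("random", random_selection),
                       ("balanced", balanced_selection)]:
        subset = algo(pool, budget)
        print(name, round(train_and_score(subset, test_set), 2))
```

Because the model and test set are held fixed, differences in score isolate the quality of the selected data, which is the inversion of model-centric benchmarking that DataPerf pursues.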
- Data benchmarking roadmap
- Data benchmarking rules
- Data benchmarking evaluation harnesses
- Data benchmarking reference implementations
- Leaderboards and challenges on an online platform
Weekly on Thursdays from 12:00-12:30pm Pacific.
Working Group Resources
Working Group Chair Emails
Newsha Ardalani firstname.lastname@example.org
Praveen Paritosh email@example.com
Working Group Chair Bios
Newsha Ardalani is a Research Scientist at Facebook AI Research (FAIR), working on three thrusts of data: data scalability, data perishability, and data valuation, and their implications for large-scale AI system design. She received her Ph.D. from UW-Madison in 2016.
Praveen Paritosh is a senior research scientist at Google, leading research on data excellence and evaluation for AI systems. He designed the large-scale human curation systems for Freebase and the Google Knowledge Graph. He co-organized and chaired the AAAI Rigorous Evaluation workshops, Crowdcamp 2016, the SIGIR WebQA 2015 workshop, Crowdsourcing at Scale 2013, the shared task challenge at HCOMP 2013, and Connecting Online Learning and Work at HCOMP 2014, CSCW 2015, and CHI 2016, with the goal of galvanizing research at the intersection of crowdsourcing, natural language understanding, knowledge representation, and rigorous evaluation for artificial intelligence.