Research Working Group

DataPerf Working Group


Drive innovation in ML datasets by defining, developing, and operating benchmarks for datasets and data-centric algorithms.


We are building DataPerf, a benchmark suite for ML datasets and for the algorithms that work with them. Historically, ML research has focused primarily on models and has simply used the largest existing dataset for common ML tasks, without considering the dataset’s breadth, difficulty, and fidelity to the underlying problem. This under-focus on data has led to a range of issues, from data cascades in real applications to the saturation of existing dataset-driven benchmarks for model quality, which impedes research progress. To catalyze increased research focus on data quality and foster data excellence, we created DataPerf: a suite of benchmarks that evaluate the quality of training and test data, as well as the algorithms for constructing or optimizing such datasets (for example, core set selection or labeling error debugging), across a range of common ML tasks such as image classification. We operate the DataPerf benchmarks through challenges and leaderboards.


Data benchmarking roadmap
Data benchmarking rules
Data benchmarking evaluation harnesses
Data benchmarking reference implementations
Leaderboards and challenges on an online platform

Meeting Schedule

Weekly on Thursday from 12:00-12:30pm Pacific.

How to Join

Use this link to request to join the group/mailing list, and receive the meeting invite:
Dataperf Google Group.
Requests are manually reviewed, so please be patient.

Working Group Resources

Shared documents and meeting minutes:

  1. Associate a Google account with your e-mail address.
  2. Ask to join our Public Google Group.
  3. Once approved, go to the Dataperf folder in our Public Google Drive.

Working Group Chair Emails

Newsha Ardalani

Praveen Paritosh

Working Group Chair Bios

Newsha Ardalani is a Research Scientist at Facebook AI Research (FAIR), working on three thrusts of data: data scalability, data perishability, and data valuation, and their implications for large-scale AI system design. She received her Ph.D. from UW-Madison in 2016.


Praveen Paritosh is a senior research scientist at Google, leading research on data excellence and evaluation for AI systems. He designed the large-scale human curation systems for Freebase and the Google Knowledge Graph. He was the co-organizer and chair for the AAAI Rigorous Evaluation workshops, Crowdcamp 2016, the SIGIR WebQA 2015 workshop, Crowdsourcing at Scale 2013, the shared task challenge at HCOMP 2013, and Connecting Online Learning and Work at HCOMP 2014, CSCW 2015, and CHI 2016. These efforts share the goal of galvanizing research at the intersection of crowdsourcing, natural language understanding, knowledge representation, and rigorous evaluations for artificial intelligence.