Data-centric ML Research Working Group
Mission
Accelerate research innovation and increase scientific rigor in machine learning by defining, developing, and operating benchmarks for datasets and data-centric algorithms, facilitated by a flexible ML benchmarking platform.
Purpose
The Data-centric ML Research (DMLR) working group challenges existing ML benchmarking dogma by driving novel approaches to benchmark curation, such as dynamic adversarial data collection. Benchmarks based on static datasets have well-known issues: they saturate quickly, are susceptible to overfitting, contain exploitable annotator artifacts, and have unclear or imperfect evaluation metrics. This new paradigm of data-centric benchmarking is powered by the Dynabench platform. The key scientific question we investigate in this working group is: can we make faster progress if data is collected dynamically, with humans and models in the loop, rather than in the traditional static way? This paradigm further enables an ecosystem of other ML benchmarks in areas such as ML datasets and algorithms for working with datasets. We leverage these benchmarks through challenges and leaderboards.
Deliverables
- DataPerf
  - Organize three workshops at major ML conferences per year
  - Recruit more participants to the working group
  - Measure and improve the quality of Common Crawl (based on human annotation)
  - Domains for impact challenges:
    - LLMs + Common Crawl
    - Safety
    - Science
  - Research challenges beyond the chosen domains
- Dynabench
  - Support existing tasks; add three new tasks per year
  - A major task, such as Safety
  - Support academic research using Dynabench
  - Product improvements for LLM experiments
- Shared objectives:
  - Develop a sustainable funding model for ongoing research
  - Develop an approach to product management of Dynabench
  - Define standard processes and decision-making criteria for accepting new challenges
  - Establish standard processes for promoting new challenges and driving engagement
Meeting Schedule
Second and fourth Thursdays each month from 10:30-11:30 AM Pacific.
Related Blog
- Unveiling the PRISM Alignment Project: Prioritizing the Data-Centric Human Factors for Aligning Large Language Models
- Announcing the Formation of the MLCommons Data-centric Machine Learning Research Working Group: DataPerf and Dynabench provide the foundation for new data-centric innovation
DMLR Working Group Projects
- DataPerf: Benchmarking Data-Centric AI
- Dynabench
- DMLR
How to Join and Access DMLR Working Group Resources
- To sign up for the group mailing list, receive the meeting invite, and access shared documents and meeting minutes:
  - Fill out our subscription form and indicate that you’d like to join the DMLR Working Group.
  - Associate a Google account with your organizational email address.
  - Once your request to join the DMLR Working Group is approved, you’ll be able to access the DMLR folder in the Public Google Drive.
- To engage in working group discussions, join the group’s channels on the MLCommons Discord server.
- To access the GitHub repositories (public):
  - If you want to contribute code, please submit your GitHub ID via our subscription form.
  - Visit the GitHub repositories:
DMLR Working Group Chairs
To contact all DMLR working group chairs, email [email protected].
Lilith Bat-Leah
Lilith Bat-Leah is Vice President, Data Services at Mod Op, responsible for consulting on use cases for data analytics, data science, and machine learning. Lilith has over 11 years of experience managing, delivering, and consulting on identification, preservation, collection, processing, review, annotation, analysis, and production of data in legal proceedings. She also has experience leading research and development of AI/machine learning software. She speaks and writes about various topics such as evaluation of machine learning systems, ESI protocols, and discovery of databases. Lilith holds a BSGS in Organization Behavior from Northwestern University, where she graduated magna cum laude. She formerly served as Co-Trustee of the EDRM Analytics and Machine Learning project, as a member of the EDRM Global Advisory Council, as Vice President of the Chicago ACEDS chapter, and as President of the New York Metro ACEDS Chapter.
Max Bartolo
Max leads the Command modelling team at Cohere, working on improving the adversarial robustness and overall capabilities of large language models. He is one of the original contributors to Dynabench, currently co-leads the working group, and lectures at UCL.
Praveen Paritosh
Praveen Paritosh is a senior research scientist at Google, leading research on data excellence and evaluation for AI systems. He designed the large-scale human curation systems for Freebase and the Google Knowledge Graph. He was the co-organizer and chair of the AAAI Rigorous Evaluation workshops, Crowdcamp 2016, the SIGIR WebQA 2015 workshop, the Crowdsourcing at Scale shared task challenge at HCOMP 2013, and Connecting Online Learning and Work at HCOMP 2014, CSCW 2015, and CHI 2016, toward the goal of galvanizing research at the intersection of crowdsourcing, natural language understanding, knowledge representation, and rigorous evaluation for artificial intelligence.