Science Working Group
Mission
Evaluate, organize, curate, and integrate artifacts around applications, models/algorithms, infrastructure, benchmarks, and datasets. These artifacts are open source and accessible through the MLCommons GitHub. Our input comes from independently funded activities and experts in industry, government, and research.
Purpose
Encourage and support the curation of large-scale experimental and scientific datasets and the engineering of ML benchmarks operating on those datasets. The working group will engage with scientists, academics, and national laboratories, such as synchrotrons, in securing, engineering, curating, and publishing datasets and machine learning benchmarks that operate on experimental scientific datasets. This will entail working across different domains of sciences, including material, life, environmental, and earth sciences, particle physics, and astronomy, to mention a few. We will include traditional observational and computer-generated data.
Although scientific data is widespread, curating, maintaining, and distributing large-scale, useful datasets for public consumption is a challenging process, covering various aspects of data (from FAIR principles to distribution to versioning). With large data products, various ML techniques have to be evaluated against different architectures and different datasets. Without these benchmarking efforts, the community has no clear pathway for utilizing these advanced models. We expect that the collection will have significant tutorial value, as examples from one field and one observational or computational experiment can be modified to advance other fields and experiments.
The working group’s goal is to assemble and distribute scientific datasets relevant to a scientific campaign in a systematic manner and pose quantifiable targets (“science benchmarks”). A benchmark involves (i) a dataset, (ii) objective criteria to meet, and (iii) an example implementation. The objective criteria depend on the scientific problem at hand. The metric should be well defined on the data but could come from a diverse set of measures (one or more of: accuracy targets, top-1 or top-5 error, time to convergence, cross-validation rates, confusion matrices, type-1/type-2 error rates, inference times, surrogate accuracy, control stability measure, etc.). Although we compile system performance numbers across a variety of architectures, our goal is not performance measurements but rather improving scientific discovery performance.
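As a rough illustration of how the three parts of a benchmark fit together, the sketch below pairs a toy dataset with a metric and a target value. It is not taken from the working group's repository: the `ScienceBenchmark` class, the function names, the toy scores, and the 0.25 target are all hypothetical, chosen only to show how two of the listed measures (top-k error and type-1/type-2 error rates) could be computed and checked against a target.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


def top_k_error(scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1) -> float:
    """Fraction of samples whose true label is not among the k highest-scoring classes."""
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


def type_1_type_2_rates(predicted: Sequence[int], actual: Sequence[int]) -> Tuple[float, float]:
    """Return (type-1 rate, type-2 rate) for binary 0/1 labels.

    Type-1 is the false positive rate; type-2 is the false negative rate.
    """
    false_pos = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    false_neg = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    negatives = sum(1 for a in actual if a == 0)
    positives = sum(1 for a in actual if a == 1)
    return false_pos / negatives, false_neg / positives


@dataclass
class ScienceBenchmark:
    """Hypothetical container for the parts of a benchmark (illustration only)."""
    dataset_name: str             # (i) the dataset
    metric: Callable[..., float]  # (ii) how results are scored
    target: float                 # (ii) value to meet; lower is better in this sketch

    def meets_target(self, *args, **kwargs) -> bool:
        return self.metric(*args, **kwargs) <= self.target


# (iii) Stand-in for an example implementation: fixed class scores for four samples.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 2, 0, 1]

top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
fpr, fnr = type_1_type_2_rates(predicted=[1, 0, 1, 1, 0], actual=[1, 0, 0, 1, 1])

benchmark = ScienceBenchmark("toy-classification", top_k_error, target=0.25)
print(f"top-1 error={top1:.2f}, top-2 error={top2:.2f}")
print(f"type-1 rate={fpr:.2f}, type-2 rate={fnr:.2f}")
print("meets target with k=2:", benchmark.meets_target(scores, labels, k=2))
```

Keeping the metric and the target separate from the implementation means the same dataset can be scored against different criteria, which fits the point above that the criteria depend on the scientific problem.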
Deliverables
- Develop a number of science benchmarks.
- Allow for open category benchmarks.
- Focus on scientific improvement.
Meeting Schedule
Bi-weekly on Wednesday from 8:05-9:00AM Pacific.
Related Blog
- New ML Benchmarks for Scientific Discovery: Four new open source benchmarks aim to uncover novel ML solutions to improve scientific discovery
How to Join and Access Science Working Group Resources
- To sign up for the group mailing list, receive the meeting invite, and access shared documents and meeting minutes:
  - Fill out our subscription form and indicate that you’d like to join the Science Working Group.
  - Associate a Google account with your organizational email address.
  - Once your request to join the Science Working Group is approved, you’ll be able to access the Science folder in the Public Google Drive.
- To engage in group discussions, join the group’s channels on the MLCommons Discord server.
- To access the GitHub repository (public):
  - If you want to contribute code, please submit your GitHub ID to our subscription form.
  - Visit the GitHub repository.
- Policy document
- Submission rules
Science Working Group Chairs
To contact all Science Working Group chairs, email [email protected].
Geoffrey Fox
Fox received a Ph.D. in Theoretical Physics from Cambridge University, where he was Senior Wrangler. He is now a Professor in the Biocomplexity Institute & Initiative and Computer Science Department at the University of Virginia. He previously held positions at Caltech, Syracuse University, Florida State University, and Indiana University, after being a postdoc at the Institute for Advanced Study at Princeton, Lawrence Berkeley Laboratory, and Peterhouse College Cambridge. He has supervised 77 Ph.D. students. He has an h-index of 87 with over 42,000 citations. He received the High-Performance Parallel and Distributed Computing (HPDC) Achievement Award and the ACM-IEEE CS Ken Kennedy Award for foundational contributions to parallel computing in 2019. He is a Fellow of APS (Physics) and ACM (Computing) and works on the interdisciplinary interface between computing and applications. His current focus is on algorithms and software systems needed for the AI Science revolution.
Jeyan Thiyagalingam
Juri Papay
Juri Papay is a senior data scientist at STFC Rutherford Appleton Laboratory UK. He received a PhD in Computer Science from Warwick University. His current work focuses on benchmarking machine learning applications and investigating the performance of large-scale GPU systems. Previously, he worked as a research scientist at Southampton University on numerous EU funded projects, covering a wide range of topics such as HPC, security modeling, discrete event simulations, image generation, and semantic research.