AI Safety Working Group
Mission
Support community development of AI safety tests, and organize the definition of research- and industry-standard AI safety benchmarks based on those tests.
Purpose
Our goal is for these benchmarks to:
1. Guide responsible development: Computing performance benchmarks such as MLPerf have repeatedly shown that concretely defining a common objective such as “faster” accelerates overall progress towards that objective. Similarly, AI safety benchmarks can help define “safer” and thereby accelerate the development of safer AI systems.
2. Support consumer/purchaser decision making: AI systems are complex, and determining whether a given system is suitable for a particular use case is challenging. The AI safety benchmarks should help individual consumers and corporate purchasers make more informed decisions.
3. Enable technically sound, risk-based policy and regulation: Spurred by public concern, governments in the EU, UK, US, and elsewhere are increasingly examining the safety of AI systems. The safety benchmarks should enable data-driven decision making that informs those regulations.
Deliverables
Specifically, the working group has the following four major tasks:
- Tests: Curate a pool of safety tests from diverse sources, and facilitate the development of better tests and testing methodologies.
- Benchmarks: Define benchmarks for specific AI use cases, each of which runs a subset of the tests and summarizes the results in a way that enables decision making by non-experts (see the illustrative sketch after this list).
- Platform: Develop a community platform for safety testing of AI systems that supports registration of tests, definition of benchmarks, testing of AI systems, management of test results, and viewing of benchmark scores.
- Governance: Define a set of principles and policies and initiate a broad multi-stakeholder process to ensure trustworthy decision making.
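To make the relationship between tests and benchmarks concrete, here is a minimal sketch of how a benchmark might aggregate per-test results into a single consumer-facing grade. All names, thresholds, and the grading scheme here are illustrative assumptions for exposition, not the working group’s actual design.

```python
# Illustrative sketch only: names, thresholds, and the grading scheme are
# hypothetical assumptions, not the working group's actual benchmark design.
from dataclasses import dataclass

@dataclass
class TestResult:
    test_name: str    # e.g. a hazard category such as "violent speech"
    pass_rate: float  # fraction of prompts handled safely, in [0, 1]

def benchmark_grade(results: list[TestResult]) -> str:
    """Summarize a subset of test results as a non-expert-friendly grade."""
    if not results:
        raise ValueError("a benchmark must include at least one test")
    # Grade on the minimum pass rate rather than the average, on the
    # assumption that a system is only as safe as its weakest hazard area.
    worst = min(r.pass_rate for r in results)
    if worst >= 0.99:
        return "Excellent"
    if worst >= 0.90:
        return "Good"
    if worst >= 0.75:
        return "Fair"
    return "Poor"

# Example: a hypothetical chat-assistant benchmark built from two tests.
results = [
    TestResult("violent speech", 0.97),
    TestResult("self-harm advice", 0.92),
]
print(benchmark_grade(results))  # -> "Good"
```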
Meeting Schedule
Weekly on Fridays, 8:35-9:30 AM Pacific.
Related Blogs and News
- Announcing MLCommons AI Safety v0.5 Proof of Concept: achieving a major milestone towards standard benchmarks for evaluating AI safety.
- The AI Safety Ecosystem Needs Standard Benchmarks: an IEEE Spectrum contributed blog excerpt, authored by the MLCommons AI Safety working group.
- Our comments to the NTIA on Open Foundation Models: open foundation models play an important role in developing AI safety benchmarks.
AI Safety Working Group Projects
- MLCommons AI Safety
- MLCommons AI Safety Benchmarks
How to Join and Access AI Safety Working Group Resources
- To sign up for the group mailing list and receive the meeting invite:
  1. Fill out our subscription form and indicate that you’d like to join the AI Safety Working Group.
  2. Associate a Google account with your organizational email address.
  3. Once your request to join the AI Safety Working Group is approved, you’ll be able to access the AI Safety folder in the Public Google Drive.
AI Safety Working Group Chairs
To contact all AI Safety Working Group chairs, email [email protected].
Joaquin Vanschoren
Joaquin Vanschoren is an Associate Professor of Computer Science at the Eindhoven University of Technology. His research focuses on understanding machine learning algorithms and turning insights into progressively more automated and efficient AI systems. He founded and leads OpenML.org, initiated and chaired the NeurIPS Datasets and Benchmarks track, and has won the Dutch Data Prize, an Amazon Research Award, and an ECML PKDD Best Demo Award. He has given over 30 invited talks, was a tutorial speaker at NeurIPS 2018 and AAAI 2021, and has authored over 150 scientific papers, as well as reference books on Automated Machine Learning and Meta-learning. He is editor-in-chief of DMLR, an action editor of JMLR, and a moderator for arXiv. He is a founding member of the European AI networks ELLIS and CLAIRE.
Percy Liang
Percy Liang is an Associate Professor of Computer Science at Stanford University (B.S. from MIT, 2004; Ph.D. from UC Berkeley, 2011) and the director of the Center for Research on Foundation Models. His research spans many topics in machine learning and natural language processing, including robustness, interpretability, semantics, and reasoning. He is also a strong proponent of reproducibility, which he supports through the creation of CodaLab Worksheets. His awards include the Presidential Early Career Award for Scientists and Engineers (2019), the IJCAI Computers and Thought Award (2016), an NSF CAREER Award (2016), a Sloan Research Fellowship (2015), a Microsoft Research Faculty Fellowship (2014), and multiple paper awards at ACL, EMNLP, ICML, and COLT.
Peter Mattson
Peter Mattson is a Senior Staff Engineer at Google. He co-founded and is President of MLCommons®, and co-founded and was General Chair of the MLPerf consortium that preceded it. Previously, he founded the Programming Systems and Applications Group at NVIDIA Research, was VP of software infrastructure at Stream Processors, Inc. (SPI), and was a managing engineer at Reservoir Labs. His research focuses on understanding machine learning models and data through quantitative metrics and analysis. Peter holds a PhD and an MS from Stanford University and a BS from the University of Washington.
AI Safety Workstream Chairs
- Benchmarks & Tests – James Goel (Qualcomm Technologies, Inc.) and Ahmed Ahmed (Stanford)
- Platform Technology – Kurt Bollacker (MLCommons), Besmira Nushi (Microsoft), and Forough Poursabzi (Microsoft)
- Stakeholder Engagement – Rebecca Weiss (MLCommons) and Simeon Campos (SaferAI)
- Cross-group Coordination – Bertie Vidgen (MLCommons) and Eleonora Presani (Meta)