We are excited to announce that the MLCommons® Association has formed the Dynabench Working Group to develop the open-source Dynabench Platform to support benchmarking of datasets, data-centric algorithms, and models. Dynabench allows groups of researchers to host online benchmarks for ML with a focus on data and rigorous science. MLCommons aims to use Dynabench to do for ML software what the MLPerf™ benchmark suites have done for ML hardware: define field-standard metrics that catalyze constructive competition and rapidly advance the state of the art.
Dynabench was launched by Facebook AI Research (FAIR) in 2020 to collect adversarial test examples for tasks such as sentiment analysis (examples that are hard for models but relatively easy for humans), and then automatically apply that adversarial data to evaluate models, potentially highlighting weaknesses and enabling the development of better approaches. Since then, Dynabench has attracted over 1,800 registered platform users. Collectively, the Dynabench community has contributed more than 500K examples of new data, all freely accessible to the public, to help refine and improve ML models.
We believe that MLCommons can take the Dynabench platform to the next level, further cementing its role as an important benchmarking and data collection platform by and for the greater ML community. MLCommons engineers and the Dynabench community are collaborating on an upcoming release of the platform that will significantly improve ease of use for researchers and enable a broader range of benchmark types.
We are especially excited that this new Dynabench platform will host the upcoming MLCommons DataPerf competition. The DataPerf competition is creating a set of data-centric challenges to evaluate the quality of datasets and data-centric algorithms, and their impact on ML capabilities and efficiency. Dynabench offers an easy-to-use platform for hosting this and other challenges, enabling researchers from across the globe to submit their datasets, models, and other ML artifacts for evaluation across both training and inference.
MLCommons is an open engineering consortium with a mission to benefit society by accelerating innovation in machine learning. The foundation for MLCommons began with the MLPerf benchmark in 2018, which rapidly scaled into a set of industry metrics to measure machine learning performance and promote transparency of machine learning techniques. In collaboration with its 50+ founding partners (global technology providers, academics, and researchers), MLCommons is focused on collaborative engineering work that builds tools for the entire machine learning industry through benchmarks and metrics, public datasets, and best practices.
To get involved with Dynabench, please join the Working Group, and follow @DynabenchAI and @MLCommons on Twitter. Help us keep growing this community effort, and don’t hesitate to get in touch if you would like to be involved.