The Widening Gap Between AI Safety and AI Security

As Large Language Models (LLMs) and Vision-Language Models (VLMs) become part of safety-critical systems—from finance and healthcare to transportation—a gap: AI systems that perform well when used as intended can fail when faced with adversarial attacks.

The AILuminate Benchmark is a family of safety and security benchmarks that assesses genAI across 12 hazard categories. They include:

  • Safety Text-to-Text (T2T) in English, French, and Chinese.
  • Jailbreak T2T, Text-plus-Image-to-Text (T+I2T) in English.

The benchmarks help guide development, inform purchasers, and support international standards organizations, government bodies & policymakers.

59,624

test prompts

477

test images

109

models benchmarked

The AILuminate Family of Benchmarks

Safety

The AILuminate Safety benchmark assesses the safety of general chatbot gen AI systems to help guide development, inform purchasers and consumers, and support standards bodies and policymakers.

Security: Jailbreaks

The AILuminate Jailbreak benchmark is a multimodal framework for evaluating AI systems in security-relevant conditions, including both text-to-text (T2T) security evaluations and text+image-to-text (T+I2T) attack evaluations.

Agentic

The Agentic Workstream is responsible for advancing a new agentic reliability evaluation standard, including design principles, benchmark factory, publications, and demonstrations.

Read the latest news about the AILuminate benchmark:

Texch Arena logo

MLCommons is a proud partner with the AI Verify Foundation. The AI Verify Foundation is a non-profit organisation and a wholly-owned subsidiary of IMDA. The mission of the AI Verify Foundation is to foster a community to contribute to the development of AI testing frameworks, code base, standards and best practices. AI Verify testing framework is aligned with other frameworks from the European Union, the Organization for Economic Co-operation and Development (OECD), US NIST AI Risk Management Framework, and ISO42001.MLCommons is a proud partner with the AI Verify Foundation. The AI Verify Foundation is a non-profit organisation and a wholly-owned subsidiary of IMDA. The mission of the AI Verify Foundation is to foster a community to contribute to the development of AI testing frameworks, code base, standards and best practices. AI Verify testing framework is aligned with other frameworks from the European Union, the Organization for Economic Co-operation and Development (OECD), US NIST AI Risk Management Framework, and ISO42001.

AIVerify logo

MLCommons is proud to announce a strategic partnership with Nasscom — India’s premier technology trade association — to advance global standards for AI reliability.

Nasscom logo

MLCommons is a 501c6 non-profit organization committed to supporting a long-term effort for this important work. We welcome additional funding and working group contributions. 

The AILuminate benchmarks would not be possible without our generous sponsors:

The MLCommons AI Risk & Reliability working group is composed of a global group of industry leaders, practitioners, researchers, and civil society experts committed to building a harmonized approach to AI risk and reliability. People from the following organizations have collaborated within the working group to advance the public’s understanding of AI risk and reliability.

  • Accenture
  • ActiveFence
  • Amazon
  • Anthropic
  • Argonne National Laboratory
  • Artifex Labs
  • Asenion
  • AVERI
  • Bain & Company
  • Berkeley National Laboratory
  • Blue Yonder
  • Bocconi University
  • Broadcom
  • Carnegie Mellon Center for Security and Emerging Technology
  • ChatFriend
  • cKnowledge, cTuning Foundation
  • Clarkson University
  • Coactive AI
  • Cohere
  • Columbia University
  • Common Crawl Foundation
  • Common GroundAI
  • Context Fund
  • Credo AI
  • Deloitte
  • Digital Safety Research Institute
  • Dotphoton
  • Duke University
  • ETH Zurich
  • EleutherAI
  • Ethriva
  • Febus
  • Futurewei Technologies
  • Georgia Institute of Technology
  • Google
  • Google DeepMind
  • Hewlett Packard Enterprise
  • Humanitas AI
  • IIT Delhi
  • Illinois Institute of Technology
  • Inflection
  • Intel
  • Kaggle
  • Lawrence Livermore National Laboratory
  • Learn Prompting
  • Lenovo
  • MIT
  • Meta FAIR
  • Microsoft
  • NASA
  • Nebius
  • NVIDIA Corporation
  • NewsGuard
  • Nutanix
  • OpenAI
  • Polytechnique Montreal (DEEL Project)
  • Process Dynamics
  • Protecto.ai
  • Protiviti
  • Qualcomm Technologies, Inc.
  • RAND
  • Reins AI
  • SAP
  • SaferAI
  • Stanford
  • Surescripts LLC
  • Tangentic
  • Telecommunications Technology Association
  • Think Evolve Labs
  • Toloka
  • Top Health Tech
  • TU Eindhoven
  • Turaco Strategy
  • University College London
  • University Mohamed First Oujda Morocco
  • University of Alabama in Huntsville
  • University of British Columbia (UBC)
  • University of Birmingham
  • University of California, Irvine
  • University of Cambridge
  • University of Chicago
  • University of Illinois at Urbana-Champaign
  • University of Oxford
  • University of Southern California (USC)
  • University of Trento
  • University of Warwick
  • Working Paper

Join Us

The MLCommons AI Risk & Reliability working group is a highly collaborative, diverse set of experts committed to building a safer AI ecosystem. We welcome others to join us.

Privacy Overview

This website uses cookies so that we can provide you with the best user experience possible. Cookie information is stored in your browser and performs functions such as recognising you when you return to our website and helping our team to understand which sections of the website you find most interesting and useful.