By Peter Mattson, Aarush Selvan, David Kanter, Vijay Janapa Reddi, Roger Roberts and Jacomo Corbo

Over the past 10 years, machine learning (ML) has grown exponentially, from a small research field to a broad community spanning universities, technology platforms, industries, nonprofits, and governments. Global private equity and venture capital (PE/VC) funding in AI companies increased by 390% from $16 billion in 2015 to $79 billion in 2022.[1] Recent advances in ML have made it possible for machines not only to consume content but also to create it at a level approaching what humans can do. This capability is sometimes called Generative AI.[2] It is primarily enabled by foundation models:[3] large models pre-trained on copious amounts of diverse data through self-supervised learning that can be adapted to a wide array of downstream tasks. The disruptive potential of foundation models lies in their ability to achieve state-of-the-art performance on tasks with little or no task-specific training (though such “fine-tuning” further improves capabilities).

While considerable innovation is taking place across the ML research-to-production life cycle, it often occurs organically and in silos, resulting in uneven impact and a low transfer of innovation across industry verticals. A handful of technology companies are experiencing benefits from deploying cutting-edge ML at scale, while others are still learning to operationalize it: to quote one study, nearly 80 percent of AI solutions are abandoned because companies struggle to scale them.[4] Even in companies with advanced ML capabilities, there is considerable friction and uncertainty within the ML development cycle. Reasons for this include:

Exhibit 1: ML Development Flywheel

Foundation Model paradigm

While researching this article, we, like many others, watched the rise of Foundation Models such as PaLM and ChatGPT. In each section we have included a few thoughts on how our takeaways might apply when thinking about the development paradigm for Foundation Models.

When it comes to the ML Development Flywheel, we see ML developers broadly following the same iterative process when developing Foundation Models from scratch. However, Foundation Models have the potential to make this flywheel spin much faster for those building applications on top of them: users can get started with far less ML engineering knowledge and get great results with just prompting and fine-tuning. Even so, very few companies have figured out the outermost loop of this flywheel, namely how to put these models into production economically and safely.

The result is that the ML field today resembles the internet of the late 1990s. It has high potential but uneven impact, knowledge is spread across myriad organizations and individuals, and programming requires a collection of custom, often vertically integrated tech stacks.

Motivated by the challenge of increasing ML’s impact in society, the MLCommons Association collaborated with several partners on a high-level analysis of how to develop a healthier ML ecosystem (Exhibit 2). We wanted to answer two key questions—where are the opportunities for unlocking ML’s greatest impact on society, and what are the best ways to address them—with the goal of providing a perspective on how various ML ecosystem players, including enterprises and technology hyperscalers,[6] can contribute to driving greater ML adoption.

Exhibit 2: Types of ML Ecosystem players


We interviewed 32 experts across these ML ecosystem players[7]—including engineers and product managers from leading tech companies, business leaders, the heads of industry and civil-society groups, and academics—to explore the challenges they face in developing ML and attempt to prioritize solutions for addressing them.

Opportunities to improve ML

Exhibit 3: Opportunities to improve ML fall into three broad categories

Opportunities to unlock the full impact of ML at scale fall into three categories: increasing the capabilities of ML technology, broadening access to and adoption of ML, and ensuring that the results of ML benefit society (Exhibit 3). All ecosystem players have a role to play for ML to reach its full potential, and contributions can be in the form of tools, data, funding, education, training, benchmarks, best practices, partnerships and coalitions, and governance and policy.

Deep dives on critical ecosystem areas to improve

In this section, we discuss three key opportunities that emerged in our discussions with business leaders and ML practitioners: data discoverability, access, and quality; operational scaling and hardening; and ethical applied ML.

Data discoverability, access, and quality: ML developer productivity is data-limited, but public data is primitive and data tools are poor

ML developers face two broad categories of data challenges:

These challenges are exacerbated by the relatively low investment companies are making in public ML data (compared with the substantial investment in model development), the nonexistent to primitive nature of metrics and standards for ML data sets, and the lack of a widely used tool stack for working with data. Opportunities for improving the ML data ecosystem include:

Foundation Model paradigm

Both the challenges and recommendations hold true with regard to Foundation Models. On the one hand, once trained, Foundation Models can achieve impressive task-specific results with minimal fine-tuning data. On the other hand, the base Foundation Models are trained on massive amounts of data from the open internet, incorporating all the associated inaccuracies and risks (e.g., bias, leaking sensitive information). A better ML data ecosystem, built on better public datasets, data sharing, standards and tooling, will go a long way toward improving the safety and quality of Foundation Models. Over time, we see open source “Foundation Datasets” supporting open source Foundation Models used as starting points for research and development.

Operational scaling and hardening: deploying models from research into production too often relies on costly and fragile bespoke solutions

Delivering stellar results using a prototype model trained on an experimental data set is a significant accomplishment, but putting the model into production to deliver business value brings new challenges that generally fall into two categories:

ML tooling innovation is addressing some of these challenges, in the process attracting high levels of VC funding and some of the best talent in the industry. An ecosystem perspective suggests new solutions are needed to accelerate ML adoption:

Foundation Model paradigm

Foundation Models are orders of magnitude larger than any model in production today, requiring a highly custom stack, from programming frameworks down to the supercomputers they are trained and served on. For many users, the cost and complexity of developing these models will be too great, and they will instead rely on training and serving services from Foundation Model providers. Enterprises using these models will have to choose their selection criteria for Foundation Model providers carefully and develop protocols for fine-tuning the models to their use cases and customers. In return, they may no longer have to build and manage the part of the ML toolchain focused on developing new, large models from scratch, which could actually simplify their operational scaling story. Even so, Foundation Models are incredibly expensive to serve (costing as much as several cents per query[14]) and have high latency. Improving serving efficiency through techniques like fine-tuning, distillation and quantization is an area of active research, and we hope that over time these innovations become more accessible via best practices and tooling. This will be critical to making Foundation Models more accessible to users.
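To make one of the serving-efficiency techniques named above more concrete, here is a minimal sketch of quantization, assuming the simplest common variant: symmetric, per-tensor rounding of floating-point weights to 8-bit integer codes. The function names and the sample weights are hypothetical and chosen only for illustration; this is not a description of how any particular Foundation Model provider implements quantization.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor: any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest step and clamp to the int8 range used here.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]  # hypothetical weights, illustration only
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    # The rounding error per weight is at most about half a quantization step.
    worst_error = max(abs(w - r) for w, r in zip(weights, restored))
    print(codes, scale, worst_error)
```

Each weight is stored in one byte instead of four (for 32-bit floats), at the price of a small rounding error. Production systems typically add refinements such as per-channel scales or calibration data, which this sketch leaves out.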

Ethical applied ML: developers lack universally accepted standards, technologies and benchmarks to build responsible ML applications

ML has driven phenomenal advances in many areas, from autonomous vehicles and drug discovery to financial fraud detection and retail product recommendations. However, it has also had negative effects, including perpetuating systemic biases,[15] which has made many companies cautious about adopting it. Building a trust-based AI architecture will expand the scope of industries and use cases where ML can be applied and will require work across several dimensions: bias and fairness, explainability and transparency, human oversight and accountability, privacy and data ethics, performance and safety, and security and sustainability.[16] These can be addressed through several approaches:

Foundation Model paradigm

Foundation Models carry with them all of the same ethical and safety challenges that traditional ML models do, but since they are primarily used for generative tasks, there is greater scope for producing harmful content. Lately, we have seen evidence that these models can be used for generating malicious code, phishing, and spreading misinformation. They also present some unique new challenges such as hallucination (confidently generating inaccurate information) and intellectual property/plagiarism concerns. Additionally, since these models are trained with large amounts of data from the open web, they can further perpetuate societal biases reflected in the content. Significant research has yet to be conducted on how to improve the safety and fairness of Foundation Models, and we need better dissemination of best practices on how to use these types of generative models responsibly.

Concluding thoughts

The ML field is still nascent—barely 10 years old—and significant work is needed to transform it into an established discipline that achieves its full potential of social and economic impact. Mainstream ML adopters and tech hyperscalers will be key to this endeavor.

Key takeaways for Enterprise adopters of ML

In recent years, the momentum of R&D in AI has been exponential; more than 30 times as many patents were filed in the field in 2021 as in 2015.[19] Given this momentum, as well as other factors such as increased investment, AI is not only performing better but also becoming more affordable, leading to increased adoption across industries. According to McKinsey & Company’s 2022 Global Survey on the state of AI,[20] 50 percent of respondents reported their organizations had adopted AI in at least one function in 2021, compared with 20 percent in 2017. Although AI adoption has grown 2.5 times since 2017, it has more recently plateaued. And although the consensus is that AI will become a mainstream technology at companies, a 2020 survey found that only 76 percent of organizations were close to breaking even on their AI investments when considering the costs and not just the benefits.[21] To address these challenges and realize the full potential of AI for themselves and other ecosystem players, enterprise adopters could adopt the following strategies:

Foundation Model paradigm

With the emergence of open models, enterprise ML adopters have the opportunity to fine-tune many of the generative models for relevant business use cases instead of training a model from scratch. According to a recent article by McKinsey & Company, although these models are in the early days of scaling, businesses have already started adopting them across marketing and sales, IT/engineering, operations, risk and legal, R&D, and other functions.[22] Mainstream ML adopters can begin by thinking through questions such as: Where in their business value chain will generative models have the most impact? What might be a low-risk MVP for their technology teams to test? What are the associated risks they should be wary of? What legal and community standards should they adhere to in order to ensure trust? How should they reimagine talent and empower their workforce with generative AI tools to enhance productivity and creativity? While the potential to create business value is immense, enterprises have yet to form a perspective on some of these questions before taking these models to scale.

Key takeaways for tech hyperscalers

As some of the largest investors in ML research, infrastructure, and applications, tech hyperscalers such as Google and Meta are critical to advancing the ML ecosystem. In addition to the significant contributions they already make through university collaborations and open-source projects, we would encourage them to invest in the following:

Foundation Model paradigm

Broadly spread benefits and involvement in shaping positive outcomes: at present, only technology hyperscalers and a handful of well-funded startups have the resources and expertise to develop Foundation Models. Over time, these models may be as economically transformative as the internet. To sustain public support for their development, these pioneering organizations should support the evolution of Foundation Models into a widely accessible and enabling platform similar to the web. Further, Foundation Models are a significant step towards true artificial general intelligence (AGI). It is critical to keep a broad community of researchers, policy makers, and non-specialist representatives involved in understanding and designing the fine-tuning datasets and feedback methodologies that will direct their behavior, as well as the comprehensive benchmarks that track their progress, to ensure positive outcomes for everyone.


Peter Mattson is a Senior Staff Engineer at Google. He co-founded and is President of MLCommons®, and co-founded and was General Chair of the MLPerf™ consortium that preceded it.

Aarush Selvan is a Product Manager at Google where he works on Natural Language Processing for the Google Assistant.

David Kanter is a co-founder and the Executive Director of MLCommons, where he helps lead the MLPerf benchmarks and other initiatives.

Vijay Janapa Reddi is an Associate Professor at Harvard University. His research interests include computer architecture and runtime systems, specifically in the context of autonomous machines and mobile and edge computing systems.

Roger Roberts is a Partner at McKinsey & Company’s Bay Area office and advises clients in a broad range of industries, especially retail and consumer industries, as they address their toughest technology challenges.

Jacomo Corbo is Founder and co-CEO of PhysicsX, a company building AI generative models for physics and engineering. He was previously Chief Scientist and Founder of QuantumBlack and a Partner at McKinsey & Company.

The authors would like to thank Bryan Richardson, Chaitanya Adabala Viswa, Chris Anagnostopoulos, Christopher McGrillen, Daniel Herde, David Harvey, Kasia Tokarska, Liz Grennan, Martin Harrysson, Medha Bankhwal, Michael Chui, Michael Rudow, Rasmus Kastoft-Christensen, Rory Walsh, Pablo Illanes, Saurabh Sanghvi, Stephen Xu, Tinashe Handina, Tom Taylor-Vigrass, Yetunde Dada, and Douwe Kiela, as well as Adam Yala from the University of California, Berkeley; Kasia Chmielinski from the Data Nutrition Project; and contributors from Google, Harvard University, Hugging Face, Meta AI, and Partnership on AI for their contributions to this article.

  1. Pitchbook ↩︎

  2. ↩︎

  3. Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, R., Chatterji, N., Chen, A., Creel, K., Davis, J. Q., Demszky, D., … Liang, P. (2022, July 12). On the opportunities and risks of Foundation models. Retrieved January 20, 2023, from ↩︎

  4. “How to operationalize machine learning for maximum business impact: Frameworks for machine learning operationalisation are key,” Wired, in partnership with QuantumBlack, June 12, 2021, ↩︎

  5. “Usage statistics of PHP for websites,” W3 Techs, n.d., ↩︎

  6. Tech hyperscalers comprise technology companies such as Google, Microsoft, and Amazon that are not only scaling their software products and services but also expanding into numerous industry verticals. ↩︎

  7. Data Nutrition Project, Google, Harvard University, Hugging Face, Meta AI, Partnership on AI, and the University of California, Berkeley ↩︎

  8. ↩︎

  9. ↩︎

  10. White, Olivia, Anu Madgavkar, Zac Townsend, James Manyika, Tunde Olanrewaju, Tawanda Sibanda, and Scott Kaufman. “Financial Data Unbound: The Value of Open Data for Individuals and Institutions.” McKinsey & Company June 29, 2021. ↩︎

  11. Alexandros Karargyris et al., “MedPerf: Open benchmarking platform for medical artificial intelligence using federated evaluation,” n.d., ↩︎

  12. ↩︎

  13. “The value of a shared understanding of AI models,” n.d.; Michael Hind, “IBM FactSheets further advances trust in AI,” July 9, 2020; Timnit Gebru et al., “Datasheets for datasets,” Cornell University, March 23, 2018; “The Data Nutrition Project,” n.d. ↩︎

  14. ↩︎

  15. Silberg, Jake, and James Manyika. “Tackling Bias in Artificial Intelligence (and in Humans).” McKinsey & Company, July 22, 2020. ↩︎

  16. ↩︎

  17. ↩︎

  18. “Machine learning life cycle,” Partnership on AI, n.d., ↩︎

  19. Daniel Zhang et al., “Artificial Intelligence Index Report 2022,” Stanford Institute for Human-Centered AI, Stanford University, March 2022, ↩︎

  20. ↩︎

  21. “2021 AI Predictions: No uncertainty here,” PwC, October 2020, ↩︎

  22. ↩︎