Tiny Working Group


Develop “tiny ML” benchmarks to evaluate inference performance on ultra-low-power systems.


ML inference on the edge is increasingly attractive because it improves the energy efficiency, privacy, responsiveness, and autonomy of edge devices. Recently, both academia and industry have made significant strides toward expanding the scope of edge machine learning to a new class of ultra-low-power computational platforms. “Tiny ML,” or machine learning on extremely constrained devices, breaks the traditional paradigm of energy- and compute-hungry machine learning and, by eliminating networking overhead, allows for greater overall efficiency than a cloud-centric approach. This effort extends the accessibility and ubiquity of machine learning, whose reach has traditionally been limited by the cost of larger computing platforms.

To enable the development and understanding of new, tiny machine learning devices, the MLPerf™ Tiny working group will extend the existing inference benchmark to include microcontrollers and other resource-constrained computing platforms.


Deliverables

  1. 3-4 benchmarks with defined datasets and reference models for the closed division
  2. Software framework to load inputs and measure latency
  3. Rules for benchmarking latency and energy
  4. Power and energy measurement with partners

Meeting Schedule

Weekly on Monday from 9:35-10:30AM Pacific.

How to Join

To join the group and its mailing list and to receive the meeting invite, submit a request through the Tiny Google Group.
Requests are reviewed manually, so please be patient.

Working Group Chairs

Jeremy Holleman

Jeremy Holleman is the Chief Scientist at Syntiant Corp. and an Associate Professor of Electrical and Computer Engineering at the University of North Carolina, Charlotte. He has held positions at the University of Tennessee as well as at Data I/O and National Semiconductor. He received his Ph.D. from the University of Washington, where he studied micro-power integrated circuits for neural interfaces. His research interests span several disciplines, including low-power circuit design, machine learning, and resource-constrained intelligent systems.

Csaba Kiraly

Dr. Csaba Kiraly is a senior research engineer with more than 18 years of professional R&D experience in the design, implementation, and evaluation of cutting-edge technologies in the fields of ultra-low-power machine learning (TinyML), IoT, wireless communications, networking, P2P, and privacy. He previously worked in research groups at Digital Catapult, Fondazione Bruno Kessler, the University of Trento, Rome Tor Vergata, Politecnico di Torino, and Budapest University of Technology and Economics, where he received both his Ph.D. and M.Sc. He has co-authored more than 40 scientific papers and received several IEEE Best Demo and Best Paper awards. He is a senior member of IEEE, an Eclipse Foundation open-source committer, and a past member of the Eclipse IoT working group.