Chakra

Advance performance benchmarking and co-design using standardized execution traces.

Purpose


Benchmarking and co-design are essential for driving optimization and innovation around AI models, software, and next-generation hardware. Full-stack benchmarks (such as MLPerf) play a crucial role in enabling fair comparisons across different software and hardware stacks on current systems. However, the fast pace of AI innovation demands a more agile methodology for reproduction of workload behavior in current production systems and extension to future AI SW/HW system co-design. Examples include: 

  • Enable production teams at cloud providers/hyperscalers and/or vendors to quickly reproduce a bug or performance regression based on behavior observed in production
  • Project workload behavior for co-designing future AI systems, whether for use cases internal to cloud providers/hyperscalers, for sharing with vendors (with or without NDAs), or for sharing broadly across industry and academia in an open manner.

Existing benchmarking methodologies fall short of these requirements. Creating a small stand-alone reproducer is time-consuming and requires non-trivial effort. Furthermore, for external use cases it is critical that we accomplish the above without sharing proprietary code or model details. The challenge stems in part from (i) the lack of standard mechanisms to define and share workload behavior without sharing the actual code or infrastructure, and (ii) the lack of interpretability or comparison across the wide range of proprietary simulation and emulation tools preferred by different vendors.

We are developing Chakra: an open and interoperable graph-based representation of AI/ML workloads focused on enabling and accelerating AI SW/HW co-design. Chakra execution traces (ETs) represent key operations (such as compute, memory, and communication), data and control dependencies, timing, and resource constraints. Additionally, Chakra includes a complementary set of tools and capabilities to enable the collection, analysis, generation, and adoption of Chakra ETs by a broad range of simulators, emulators, and replay tools.
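
To make the idea of a graph-based execution trace concrete, here is a minimal Python sketch. It is not the Chakra schema: the names `TraceNode`, `NodeType`, `duration_us`, `data_deps`, `ctrl_deps`, and `critical_path_us`, along with the toy four-node trace, are all hypothetical and chosen only for illustration. It shows how a downstream tool might take nodes annotated with operation type, timing, and dependencies and estimate a lower bound on execution time, assuming unlimited resources.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class NodeType(Enum):
    # Hypothetical operation categories, mirroring the kinds named in the text.
    COMPUTE = "compute"
    MEMORY = "memory"
    COMMUNICATION = "communication"


@dataclass
class TraceNode:
    # Illustrative node layout; field names are assumptions, not the real schema.
    node_id: int
    name: str
    node_type: NodeType
    duration_us: int
    data_deps: List[int] = field(default_factory=list)
    ctrl_deps: List[int] = field(default_factory=list)


def critical_path_us(nodes: List[TraceNode]) -> int:
    """Return the longest dependency chain in microseconds.

    Assumes the graph is acyclic and that independent nodes can overlap
    freely (no resource constraints are modeled).
    """
    by_id: Dict[int, TraceNode] = {n.node_id: n for n in nodes}
    finish: Dict[int, int] = {}

    def finish_time(node_id: int) -> int:
        if node_id not in finish:
            node = by_id[node_id]
            deps = node.data_deps + node.ctrl_deps
            # A node starts once every data and control dependency has finished.
            start = max((finish_time(d) for d in deps), default=0)
            finish[node_id] = start + node.duration_us
        return finish[node_id]

    return max((finish_time(n.node_id) for n in nodes), default=0)


# A toy trace for one training step (made-up names and durations).
trace = [
    TraceNode(0, "load_weights", NodeType.MEMORY, 40),
    TraceNode(1, "matmul_fwd", NodeType.COMPUTE, 120, data_deps=[0]),
    TraceNode(2, "all_reduce_grads", NodeType.COMMUNICATION, 200, data_deps=[1]),
    TraceNode(3, "optimizer_step", NodeType.COMPUTE, 30, data_deps=[2], ctrl_deps=[1]),
]

print(critical_path_us(trace))  # 40 + 120 + 200 + 30 = 390
```

Because the trace carries only operation metadata and dependencies, a graph like this can be analyzed or replayed without access to the model code that produced it, which is the property the text above relies on.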

The Chakra working group is working on the following components: 

  • Schema standardization 
  • Execution trace collection – PyTorch, TF/XLA, etc. 
  • Chakra trace synthesis using ML models (for trace obfuscation and future workload projection) 
  • Support tools, e.g. analyzers, visualization, etc. 
  • Downstream tools enablement, e.g. simulators, emulators, replay 
  • Benchmark suite development and metrics 

Deliverables


  • Defining the Chakra schema
  • Support for collecting Chakra traces in commonly used AI frameworks
  • Benchmark definition, methodology, and scoring representing different classes of AI workloads
  • Building consensus on tasks, models, rules, first submissions, and leaderboard

Meeting Schedule

Monday November 18, 2024 Weekly – 11:05 – 12:00 Pacific Time


How to Join and Access Chakra Resources


Chakra Working Group Chairs

To contact all Chakra working group chairs email [email protected].

Srinivas Sridharan 

Srinivas Sridharan is a Research Scientist at Meta, where his work has been instrumental in designing and deploying state-of-the-art networking, communication libraries, and observability tools for Meta’s AI clusters. Before joining Meta, he was a Research Scientist at Intel Labs, where he played a pivotal role in scaling AI training on Intel platforms and earned 10+ Group/Division-level awards. Srinivas collaborates extensively with numerous academic groups and has published widely in Tier-1 conferences. He holds 17 patents. Srinivas received his PhD from the University of Notre Dame.

Tushar Krishna 

Tushar Krishna is an Associate Professor in the School of Electrical and Computer Engineering at Georgia Tech. He also serves as an Associate Director for the Center for Research into Novel Computing Hierarchies (CRNCH). He has a Ph.D. in Electrical Engineering and Computer Science from MIT (2014), an M.S.E. in Electrical Engineering from Princeton University (2009), and a B.Tech in Electrical Engineering from the Indian Institute of Technology (IIT) Delhi (2007). Before joining Georgia Tech in 2015, Dr. Krishna spent a year as a researcher at the VSSAD group at Intel, Massachusetts. Dr. Krishna’s research spans computer architecture, interconnection networks, networks-on-chip (NoC), and deep learning accelerators, with a focus on optimizing data movement in modern computing systems. His research is funded via multiple awards from NSF, DARPA, IARPA, SRC, Department of Energy, Intel, Google, Meta/Facebook, Qualcomm, and TSMC. His papers have been cited over 14,000 times. Three of his papers have been selected for IEEE Micro’s Top Picks from Computer Architecture, one more received an honorable mention, and four have won best paper awards. He was inducted into the HPCA Hall of Fame in 2022. He was program vice-chair for ISCA 2023.