Performance and Algorithms Research

Performance Analysis of AI Hardware and Software

The performance characteristics of AI training and inference can be quite distinct from those of HPC applications, even though both rely on similar computational methods (large and small matrix multiplications, stencils, gather/scatter, etc.), albeit at reduced precision (single, half, BFLOAT16) in the AI case.  Where possible, vendors are attempting to create specialized architectures tailored to the subset of computations used in AI training and inference.  Understanding the interplay between science, AI method, framework, and architecture is essential not only for quantifying the computational potential of current and future architectures running AI models, but also for identifying the bottlenecks and the ultimate limits of today's models.

Research Topics



  • Samuel Williams
  • Nick Wright
  • Khaled Ibrahim
  • Hai Ah Nam
  • Leonid Oliker
  • Tan Nguyen
  • Nan Ding
  • Steve Farrell
  • Wahid Bhimji


Conference Paper


Nan Ding, Samuel Williams, Hai Ah Nam, Taylor Groves, Muaaz Gul Awan, Christopher Delay, Oguz Selvitopi, Leonid Oliker, Nicholas Wright, "Methodology for Evaluating the Potential of Disaggregated Memory Systems", RESDIS, November 18, 2022.

K. Ibrahim, L. Oliker, "Preprocessing Pipeline Optimization for Scientific Deep-Learning Workloads", IPDPS 22, June 3, 2022.


Khaled Z. Ibrahim, Tan Nguyen, Hai Ah Nam, Wahid Bhimji, Steven Farrell, Leonid Oliker, Michael Rowan, Nicholas J. Wright, Samuel Williams, "Architectural Requirements for Deep Learning Workloads in HPC Environments", (BEST PAPER), Performance Modeling, Benchmarking, and Simulation (PMBS), November 2021.