Performance and Algorithms Research

SC21 Tutorial

The Roofline performance model offers an insightful and intuitive method for extracting the key execution characteristics of HPC applications and comparing them against the performance bounds of modern CPUs and GPUs. Its ability to abstract the complexity of memory hierarchies and identify the most profitable optimization techniques has made Roofline-based analysis increasingly popular in the HPC community. The tutorial will introduce the fundamental aspects behind different Roofline modeling principles and provide several practical use-case scenarios that highlight their efficacy for application optimization on CPUs and GPUs. This tutorial presents a unique combination of instruction in the Roofline model, hands-on instruction in using Roofline within Intel and NVIDIA production performance tools by Intel and NVIDIA staff, and discussions of real-world Roofline use cases at the ALCF, NERSC, and INESC computing centers. The tutorial presenters have a long history of collaborating on the Roofline model and have presented several Roofline-based tutorials.
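As a rough illustration of the basic idea (a sketch, not taken from the tutorial materials), the classic Roofline bound caps attainable performance at the smaller of the peak compute rate and the product of arithmetic intensity and peak memory bandwidth. The function names and the machine numbers below are hypothetical and chosen only to make the example concrete.

```python
def roofline_bound(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Return the attainable GFLOP/s under the basic Roofline model.

    arithmetic_intensity: FLOPs performed per byte moved (FLOP/byte)
    peak_gflops:          peak compute rate of the machine (GFLOP/s)
    peak_bandwidth_gbs:   peak memory bandwidth of the machine (GB/s)
    """
    if arithmetic_intensity < 0:
        raise ValueError("arithmetic intensity must be non-negative")
    # Performance is limited by whichever ceiling is lower:
    # the compute roof or the bandwidth-scaled memory roof.
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / peak_bandwidth_gbs


# Hypothetical machine parameters (assumptions, not measured values).
PEAK_GFLOPS = 7000.0
PEAK_BW_GBS = 900.0

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, PEAK_BW_GBS)
    for ai in (0.25, 1.0, 8.0, 32.0):
        bound = roofline_bound(ai, PEAK_GFLOPS, PEAK_BW_GBS)
        regime = "memory-bound" if ai < ridge else "compute-bound"
        print(f"AI = {ai:6.2f} FLOP/byte: at most {bound:8.1f} GFLOP/s ({regime})")
```

Kernels whose arithmetic intensity falls below the ridge point sit under the sloped (bandwidth) part of the roof, so reducing data movement is likely to help more than adding vectorization; above the ridge point the opposite tends to hold.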


  • Sunday, November 14th, 8am-5pm CST
    • 8:00am: Welcome and Administration
    • 8:05am: Introduction to the Roofline Model (55mins)
    • 9:00am: Cache-Aware Roofline Model (25mins)
    • 9:30am: Session I Q&A and Break (30mins)
    • 10:00am: Intel Advisor Roofline Hands-On (120mins)
    • 12:00pm: Session II Q&A and Lunch (60mins)
    • 1:00pm: NVIDIA Nsight Compute Roofline Hands-On (120mins)
    • 3:00pm: Session III Q&A and Break (30mins)
    • 3:30pm: INESC Application Use Cases (20mins)
    • 3:50pm: ALCF Application Use Cases (35mins)
    • 4:25pm: NERSC Application Use Cases (35mins)
    • 5:00pm: Session IV Q&A


We have created a Slack space for tutorial discussions:



The slides for this tutorial are available from:



Intel Advisor Hands-on

Attendees who wish to participate in the Intel Advisor Roofline tutorial are strongly encouraged to follow these instructions. Download the Intel Advisor build (Linux, Windows, or macOS) and the hands-on materials. Users do not need any hardware other than a computer (laptop, desktop, or server) with an internet connection and a recent browser (Firefox and Chrome recommended).


Alternatively (although the approach above is highly recommended), users can access the Intel cloud ("devcloud"), which provides CPU and GPU hardware resources and all hands-on software.

Go to to request a devcloud account. Enter your name and email address, and you will receive an email response within minutes with setup instructions. There is a detailed video explaining the process at the URL above.

Follow to access devcloud after your account is created.

The SC21 hands-on materials are located in /data/oneapi_workshop/sc21.

To set up Advisor, run the following: source /data/oneapi_workshop/sc21/advisor_2022.0/





NVIDIA Nsight Compute Hands-on

For the NVIDIA profiling tools tutorial, users will be provided with access to cloud nodes provisioned with NVIDIA V100 GPUs. Users do not need any hardware other than a computer with an internet connection and a recent browser (Firefox and Chrome recommended). Access to the cloud GPUs will be granted through the NVIDIA Deep Learning Institute Platform. Prior to the event, attendees should create an account on this platform (this requires an NVIDIA Developer Zone account, and you will be prompted to create one if you do not already have one). You will need an email address that you can access to confirm your account. On the day of the event, the instructor will provide instructions on how to access the content with your account.

Please also verify that you can use WebSockets over port 80 by visiting If you have questions about the technical requirements, please visit

It is also recommended, but not required, that you download and install the NVIDIA Nsight Systems and Nsight Compute developer tools on the system you will be using on the day of the event. These tools are available for Linux, macOS, and Windows and are free to download, although they require an NVIDIA Developer Zone account (which you should have created above anyway). We will be collecting profile reports on the remote GPUs in the cloud, but having the tools installed locally will let you interact with their user interfaces at lower latency than the remote desktop solution that we will use during the event. This is strictly optional, but we recommend it because you will then be better prepared to use these tools on your own after the event.