


CAL: Computer Architecture Laboratory for Design Space Exploration

The Computer Architecture Laboratory (CAL) will advance Exascale Design Space Exploration to develop energy efficient and effective processor and memory architecture R&D for DOE’s Exascale program. Read More »


LBNL is a key member of the Combustion Co-Design Center (ExaCT). Our work represents a collaboration between applied mathematicians and computational scientists who have developed the Low Mach number combustion code (LMC) and computer scientists focused on performance optimization through auto-tuning and DSLs, performance modeling, and architectural simulation. Read More »


Please see our main webpage here. One of the emerging challenges in designing HPC systems is understanding and projecting the requirements of exascale applications. To determine the performance consequences of different hardware designs, analytic models are essential because they provide fast feedback to the co-design centers and chip designers without costly simulations. However, current attempts to analytically model program performance typically rely on the user manually… Read More »


Green Flash

Our researchers have proposed an innovative way to improve global climate change predictions by using a supercomputer with low-power embedded microprocessors, an approach that would overcome limitations posed by today’s conventional supercomputers. Read More »

Characterization of DOE Mini-apps

The Computer Architecture Laboratory (CAL) will advance Exascale Design Space Exploration to develop energy efficient and effective processor and memory architecture R&D for DOE’s Exascale program. Read More »


CoDEx: Co-Design for Exascale

The next decade will see a rapid evolution of HPC node architectures as power and cooling constraints are limiting increases in microprocessor clock speeds and constraining data movement. Applications and algorithms will need to change and adapt as node architectures evolve. A key element of the strategy as we move forward is the co-design of applications, architectures and programming environments, to navigate the increasingly daunting constraint space for feasible exascale system designs. We… Read More »


Continuing the Scaling of Digital Computing Post Moore’s Law

Traditional CMOS technology scaling, which until now has followed Moore's law, is coming to an end in the next decade. However, the DOE has come to depend on the rapid, predictable, and cheap scaling of computing performance to meet mission needs for scientific theory, large-scale experiments, and national security. Moving forward, performance scaling of digital computing will need to originate from energy and cost reductions that are a result of novel architectures, devices,… Read More »

Rambutan TaskGraph


Rambutan is a performance modeling and analysis tool for understanding the behavior of asynchronous, task-based execution models.  It consists of a deeply-instrumented runtime that collects statistics during the execution of a task-based application across distributed memory machines.  The tool keeps track of application task execution, communication costs, and runtime overheads such as task creation and deletion, queue management, dependency satisfaction (possibly remote), remote data… Read More »



In order to model the behavior of AMR solvers that run in an asynchronous fashion, we have developed a tool that builds a skeleton task dependency graph for a variety of AMR algorithms.   The task dependency graph generated contains critical performance information, such as compute time estimates and required communication traffic volume.  The task graph exposes the true data dependencies of the constituent tasks and removes false dependencies that are often introduced as a byproduct of… Read More »
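
As a rough illustration of how a task dependency graph annotated with compute time estimates and communication volumes can be used, the sketch below computes the critical path length of a small graph. It is not the tool's actual data format or API: the function name `critical_path_length`, the dictionary layout, and the linear `seconds_per_byte` communication cost are all assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_length(compute_time, edges, seconds_per_byte):
    """Return the longest compute-plus-communication path through a task graph.

    compute_time:     {task: estimated seconds}            (hypothetical layout)
    edges:            {(producer, consumer): bytes moved}  (hypothetical layout)
    seconds_per_byte: assumed linear cost of moving one byte
    """
    # Collect the predecessors of every task.
    preds = {task: set() for task in compute_time}
    for producer, consumer in edges:
        preds[consumer].add(producer)

    finish = {}
    # static_order() yields each task only after all of its predecessors.
    for task in TopologicalSorter(preds).static_order():
        # A task may start once every input has been produced and delivered.
        start = max(
            (finish[p] + edges[(p, task)] * seconds_per_byte for p in preds[task]),
            default=0.0,
        )
        finish[task] = start + compute_time[task]
    return max(finish.values(), default=0.0)


# Four tasks: "a" feeds "b" and "c", which both feed "d".
compute_time = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 2.0}
edges = {("a", "b"): 100, ("a", "c"): 0, ("b", "d"): 0, ("c", "d"): 400}
length = critical_path_length(compute_time, edges, seconds_per_byte=0.01)
```

Because only true data dependencies appear as edges, the result is a lower bound on execution time under the assumed cost model, regardless of how many workers are available.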


Mota Mapper

Mota is a library that provides several heuristics for the purpose of AMR task placement.  It is multi-objective in the sense that it simultaneously balances the computational load on each rank as well as the communication traffic between the boxes.  We are investigating a variety of approaches to do the task placement and utilizing modeling and simulation tools to evaluate these approaches.  The heuristics used for mapping include algorithms such as greedy list assignment and space-filling… Read More »
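
To make the greedy list assignment idea concrete, here is a minimal sketch of the textbook heuristic: boxes are visited in order of decreasing cost and each is placed on the rank with the smallest accumulated load. This is not Mota's API. The function name `greedy_list_assignment` and the input layout are assumptions for illustration, and the sketch balances computational load only, ignoring the communication objective described above.

```python
import heapq


def greedy_list_assignment(box_costs, num_ranks):
    """Place each box on the least-loaded rank, heaviest boxes first.

    box_costs: {box_id: estimated compute cost}  (hypothetical layout)
    Returns (placement, loads), where placement maps box_id -> rank.
    """
    # Min-heap of (accumulated load, rank id); ties go to the lower rank id.
    heap = [(0.0, rank) for rank in range(num_ranks)]
    heapq.heapify(heap)

    placement = {}
    # Sort by decreasing cost; break ties by box id so the result is deterministic.
    for box, cost in sorted(box_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, rank = heapq.heappop(heap)
        placement[box] = rank
        heapq.heappush(heap, (load + cost, rank))

    # Recompute the per-rank load from the final placement.
    loads = [0.0] * num_ranks
    for box, rank in placement.items():
        loads[rank] += box_costs[box]
    return placement, loads


box_costs = {0: 5.0, 1: 4.0, 2: 3.0, 3: 3.0, 4: 1.0}
placement, loads = greedy_list_assignment(box_costs, num_ranks=2)
```

A multi-objective mapper would additionally weigh the communication traffic between boxes when choosing a rank; that part is left out here.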