
Epidemiological Modeling in the Exascale Era

Berkeley Lab researchers will lead a three-year, $12 million effort to create a generalized exascale tool for epidemiological modeling

September 8, 2023

By Linda Vu

Creative artwork featuring colorized 3D prints of influenza virus (surface glycoprotein hemagglutinin is blue and neuraminidase is orange; the viral membrane is a darker orange). Credit: National Institute of Allergy and Infectious Diseases (NIAID).

Epidemiological models are indispensable tools for predicting, understanding, and mitigating the impact of infectious diseases. In the early days of the COVID-19 pandemic, researchers at Lawrence Berkeley National Laboratory (Berkeley Lab) led a multi-institutional effort to develop an agent-based model that could effectively harness the power of cutting-edge exascale supercomputers to speed predictions of disease spread for the Centers for Disease Control and Prevention and other public health agencies.

With funding from the Department of Energy (DOE) Biopreparedness Research Virtual Environment (BRaVE) initiative’s National Virtual Biotechnology Laboratory, the collaboration created ExaEpi. This novel code uses AMReX — a block-structured adaptive mesh refinement (AMR) framework developed by an Exascale Computing Project (ECP) co-design center — to exploit the computing capabilities of an entire exascale supercomputer for COVID-19 epidemiological simulations.

Now, BRaVE will award the EMERGE (ExaEpi for Elucidating Multiscale Ecosystem Complexities for Robust, Generalized Epidemiology) team another $4 million per year over the next three years to build on their successes and enhance the capabilities of ExaEpi to target five new diseases and pathogens: influenza, cholera, Zika, Nipah virus, and Burkholderia pseudomallei (the bacterium that causes melioidosis). Ultimately, the goal is to make ExaEpi a generalized tool for epidemiology that is flexible enough to rapidly incorporate new diseases, including those that affect plants and other animals.

“To enhance the calibration, workflow, and optimal decision-making of ExaEpi, we must capture a wide enough range of disease types. With these additional contagions, we will have targeted airborne, waterborne, and vector-borne diseases, bacterial and viral diseases, and diseases that are seasonal or sensitive to local climate,” said Peter Nugent, a senior scientist in Berkeley Lab’s Applied Mathematics and Computational Research (AMCR) division and principal investigator of EMERGE.

In addition to Berkeley Lab, the EMERGE collaboration includes researchers from five other national laboratories: Argonne, Brookhaven, Livermore, Los Alamos, and Sandia, as well as Boston University and Morgan State University.

Enhancing Agent-Based Models for Epidemiology

Agent-based models (ABMs) are microscale computer models that can capture and predict the details of a complex system by simulating the actions and interactions of multiple autonomous agents over space and time, making them invaluable research tools for fields ranging from biology to business. In epidemiology, the agents can represent people, animals, cells, etc., making ABMs extremely useful for public health planning, policymaking, and preparedness, especially in the face of emerging infectious diseases or evolving community behaviors. 

By tracking how agents interact, researchers can use ABMs to understand the natural complexities of transmission in diverse populations, where factors like age, social networks, and behavior can significantly influence how a disease spreads. These interactions can also point to emerging phenomena that might not be immediately obvious, like a potential outbreak or superspreader event.
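To make the idea of tracking agent interactions concrete, here is a minimal sketch of an agent-based outbreak model in Python. It is a toy illustration only and is not how ExaEpi is implemented: the names (Agent, step, simulate), the random-mixing contact rule, and every parameter value are assumptions chosen for readability.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical three-state disease model:
    # "S" susceptible, "I" infectious, "R" recovered.
    state: str = "S"


def step(agents, contacts_per_day, p_transmit, p_recover, rng):
    """Advance every agent by one simulated day (synchronous update)."""
    newly_infected = set()
    newly_recovered = set()
    n = len(agents)
    for i, agent in enumerate(agents):
        if agent.state != "I":
            continue
        # Assumption: each infectious agent meets random agents each day.
        for _ in range(contacts_per_day):
            j = rng.randrange(n)
            if agents[j].state == "S" and rng.random() < p_transmit:
                newly_infected.add(j)
        if rng.random() < p_recover:
            newly_recovered.add(i)
    # Apply changes only after all interactions have been evaluated.
    for j in newly_infected:
        agents[j].state = "I"
    for i in newly_recovered:
        agents[i].state = "R"


def simulate(n_agents=1000, n_seed=5, days=60, contacts_per_day=8,
             p_transmit=0.05, p_recover=0.1, seed=0):
    """Run one scenario and return the daily counts of S, I, and R agents."""
    rng = random.Random(seed)
    agents = [Agent() for _ in range(n_agents)]
    for agent in agents[:n_seed]:
        agent.state = "I"
    history = []
    for _ in range(days):
        step(agents, contacts_per_day, p_transmit, p_recover, rng)
        history.append({s: sum(a.state == s for a in agents) for s in "SIR"})
    return history


if __name__ == "__main__":
    for day, counts in enumerate(simulate(), start=1):
        if day % 10 == 0:
            print(day, counts)
```

Even in this toy version, outcomes emerge from individual interactions rather than from a population-level equation. A production model would replace the random-mixing rule with structured contacts such as households, schools, and workplaces.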

Because agent-based modeling is a framework rather than a single fixed model, researchers can combine it with other modeling tools to gain insights across time and scales. For instance, if researchers identify a superspreader event in their simulation, they can incorporate a continuum model that describes population dynamics to analyze and predict how the actions of those agents will contribute to disease spread through their network geographically over time. ABMs are also known for their flexibility, which means researchers can easily add more agents to a model and iteratively tune the complexity (behavior, rules of interaction, etc.) to test different scenarios or improve the model as new data becomes available.

According to Nugent, the EMERGE team plans to use this new round of funding to extend ExaEpi’s framework with additional modalities like climate and transportation. The team will also add a subscale framework to address location-specific dynamics within schools, stores, factories, and more. By incorporating AMReX into ExaEpi, researchers can now include more agents and parameters in their models. AMReX also allows them to resolve short-range (e.g., grocery shopping) and long-range (e.g., weather) phenomena to create more accurate transmission predictions for public health planning.

“We know that climatic variables like temperature, humidity, and precipitation are key drivers for disease transmission, and these factors affect the survival and reproduction of many disease-causing organisms,” said Nugent. “Understanding these complex relationships is important for developing effective public health strategies to mitigate the impact of climate on disease transmission.”

Building on the Success of ExaEpi 


The Oak Ridge National Laboratory's (ORNL's) Frontier supercomputer is the first to achieve the level of computing performance known as exascale, a threshold of a quintillion calculations per second. Credit: ORNL

Despite the usefulness of epidemiological ABMs, these tools also have some limitations. Depending on the system’s complexity and population size, ABMs may require a lot of computing resources and time, which hinders their usefulness for real-time decision-making during outbreaks. 

With the first round of BRaVE funding, the EMERGE team tackled this problem by ensuring that their ExaEpi code could scale up and, if necessary, effectively use the resources of an entire exascale supercomputer for agent-based epidemiological simulations. This unprecedented capability means researchers could feasibly compute millions of different scenarios in real time.

“When the code can run so much faster, and you can generate so many models quickly, there are many more things you can do to address existing problems,” said Nugent. “These solutions would have been much more difficult or impossible to implement in the previous environment.” 

He notes that the ability to run more epidemiological ABMs will allow researchers to calibrate these models (fine-tune their parameters to represent real-world phenomena more accurately) and validate them. In epidemiology, the meager data that is generally available (like case counts, hospitalizations, and deaths) does not allow the parameters in agent-based models to be estimated with much confidence, if at all. Additionally, different parameters act on widely different scales. For instance, a parameter like the pathogen’s spread rate can have a global impact, while a parameter like an autonomous agent’s travel behavior has a more localized effect.

To tackle this calibration problem, the EMERGE team will isolate the few parameters where observations exist and use that data to calibrate the ABMs for the six pathogens. With ExaEpi and access to DOE’s world-class supercomputing facilities, the team will create surrogate models with novel machine-learning tools and simulations to systematically explore how different parameter combinations influence an ABM’s behavior. This work leverages the efforts of the ExaLearn ECP co-design center, which developed scientific machine-learning tools to advance surrogate modeling. Surrogate models that provide fast approximations of more compute-intensive simulations will help advance epidemiology by reducing the computational cost of studying new diseases in real-world scenarios.
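The surrogate idea described above can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the EMERGE team's method: a small deterministic compartment model stands in for a costly ABM run, and plain linear interpolation stands in for the machine-learning surrogate. The function names and parameter values are hypothetical.

```python
from bisect import bisect_right


def expensive_simulation(beta, gamma=0.1, days=200, i0=0.001):
    """Toy stand-in for a costly ABM run.

    Steps a discrete-time SIR model and returns the peak infectious
    fraction. beta is the transmission rate, gamma the recovery rate.
    """
    s, i = 1.0 - i0, i0
    peak = i
    for _ in range(days):
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        peak = max(peak, i)
    return peak


def train_surrogate(betas):
    """Run the simulation once per training point, then return a cheap
    function that interpolates linearly between the stored results.

    Assumes at least two distinct training values.
    """
    xs = sorted(betas)
    ys = [expensive_simulation(b) for b in xs]

    def surrogate(beta):
        if not xs[0] <= beta <= xs[-1]:
            raise ValueError("beta lies outside the training range")
        # Index of the right-hand neighbour among the training points.
        k = min(bisect_right(xs, beta), len(xs) - 1)
        x0, x1 = xs[k - 1], xs[k]
        y0, y1 = ys[k - 1], ys[k]
        t = (beta - x0) / (x1 - x0)
        return y0 + t * (y1 - y0)

    return surrogate


if __name__ == "__main__":
    training_betas = [0.15 + 0.025 * k for k in range(15)]
    surrogate = train_surrogate(training_betas)
    query = 0.3125
    print("surrogate: ", surrogate(query))
    print("simulation:", expensive_simulation(query))
```

The trade-off is the one the article describes: the simulations are paid for once, up front, and each later query costs only a lookup. Interpolation works here because there is a single smooth parameter; a space with many interacting parameters is where machine-learning surrogates become necessary.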

“The goal is to have these surrogate models replace the big simulations. We’ll train these surrogate models on thousands and thousands of simulations. When we’ve finished the training, we should be able to describe how a pandemic spreads under different conditions in mere seconds,” said Nugent. “Our team is also developing a reinforcement learning code to optimize decision-making to help advise public health agencies on the best approaches they can take in real-time to affect outcomes like lower hospitalization rates or death.”  

He adds that every collaborator brings a unique skill set to the project. “This project epitomizes the spirit of the BRaVE initiative by harnessing the DOE’s investments in the unique capabilities of its national laboratories and the exascale initiative, as well as the expertise of our collaborators in academia to meet some of the greatest scientific challenges facing our global community.”

Learn more about some of the other Berkeley Lab-led BRaVE awards.

About Berkeley Lab

Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 16 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.

DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit