Berkeley Lab Researchers Showcase Deep Learning for High Energy Physics at CHEP
August 17, 2018
Steve Farrell, a machine-learning engineer who recently joined the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (Berkeley Lab), gave an overview of the Lab’s expanding expertise in deep learning for science during a plenary talk at the 2018 CHEP (Computing in High Energy and Nuclear Physics) conference in July.
Farrell presented findings from a recent study in which a team of Berkeley Lab Computing Sciences researchers used generative adversarial networks (GANs) to speed up simulations in high energy physics (HEP) studies. His co-authors on the study, “Next-generation generative neural networks for HEP,” were Wahid Bhimji, Thorsten Kurth, Mustafa Mustafa and Debbie Bard of NERSC; Zarija Lukic of the Computational Research Division; Ben Nachman, an HEP postdoc in the ATLAS group in Berkeley Lab's Physics Division; and Harley Patton, who is studying Computer Science and Applied Mathematics at UC Berkeley and works as a researcher for CERN through Berkeley Lab.
Orchestrating particle collisions and observations at facilities like CERN, where groups of protons collide with one another 40 million times per second, is a massive scientific undertaking. The goal is to understand the interactions and decays of fundamental particles and look for deviations from the expected behavior. But analyzing the enormous amounts of data produced by the experiments is becoming an overwhelming challenge. Large event rates mean physicists must sift through tens of petabytes of data per year, including a large volume of simulated data essential for modeling those collisions and performing analysis. Traditionally, HEP simulations of collision events in experimental facilities take a particle-by-particle approach that is time-consuming and computationally expensive.
“In HEP, simulation is an essential part of data analysis, but it is very expensive,” said Farrell, who was in HEP on the ATLAS experiment for his PhD at UC Irvine and as a postdoc in Berkeley Lab's Physics Division, prior to joining NERSC. “We have these very large, very complex detectors with lots of different materials and technologies that are challenging to simulate. We also have high-fidelity simulation tools that can do things very accurately, but that is what is so expensive. So we are always trying to find ways to speed up the process, to supplement the expensive simulation with solutions that can get the job done faster.”
Generative Models for Whole Collision Events
One newer approach involves applying deep learning generative models to the data-simulation process. Some research groups have focused on single particles as they enter a certain section of the detector, using a deep learning generative model to simulate the particle’s “showering” – how it sprays out as it collides with material. But Farrell and his colleagues are taking a different approach: using deep learning to train a GAN to learn the distribution of full detector images. With this modeling approach, which has also been successful at generating large weak lensing cosmology convergence maps, the team was able to generate full particle physics events.
“Last year we did some work on a classification task that used entire collision events arranged into big cylindrical images containing the energy deposits of all the particles,” Farrell said. “More recently we found that we can use this generative model approach with that same image representation of the data to produce whole collision events, rather than just one particle. And we can train the model such that it produces the correct physics distributions.”
The approach to training a GAN model is like a two-player game in which one neural network is trying to generate realistic looking samples and the other is trying to distinguish which ones are generated and which ones are real, he added.
“You are feeding both real and fake samples to the discriminator and they train off against each other and get better and better,” Farrell said. “It is a powerful technique but difficult and unstable. Much like how two people may try to learn to play chess by only playing against each other; they may get great against each other but their skills may not generalize. They may not actually learn how to play chess.”
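The two-player game described above can be illustrated with a deliberately tiny example. The sketch below is not the model from the study: it assumes a one-dimensional toy problem in which "real" samples come from a Gaussian with a hypothetical mean REAL_MEAN, the generator has a single parameter (a shift added to random noise), and the discriminator is a logistic classifier with two weights. The function name train_toy_gan and all settings are illustrative choices, and gradients are written out by hand so that only the Python standard library is needed.

```python
import math
import random

REAL_MEAN = 2.0  # assumed toy target, not a physics quantity


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=3000, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on a 1-D toy problem."""
    rng = random.Random(seed)
    shift = 0.0      # generator parameter: fake sample = noise + shift
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        real = [rng.gauss(REAL_MEAN, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + shift for _ in range(batch)]

        # Discriminator step: minimise -log D(real) - log(1 - D(fake)).
        grad_w = 0.0
        grad_c = 0.0
        for x in real:
            err = sigmoid(w * x + c) - 1.0  # pushes D(real) toward 1
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = sigmoid(w * x + c)        # pushes D(fake) toward 0
            grad_w += err * x
            grad_c += err
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: minimise -log D(fake), the non-saturating loss.
        fake = [rng.gauss(0.0, 1.0) + shift for _ in range(batch)]
        grad_shift = sum((sigmoid(w * x + c) - 1.0) * w for x in fake) / batch
        shift -= lr * grad_shift

    return shift
```

Calling train_toy_gan() returns the learned shift, which under these settings should settle close to REAL_MEAN. Because this discriminator only sees a linear function of x, it can tell the two sample sets apart by their means alone; a model for detector images would need far more expressive networks on both sides, which is part of why training is harder to stabilize in practice.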
Despite the challenges, Farrell and colleagues have shown that their model can produce whole event images with reasonably good fidelity that exhibit the desired physical characteristics.
“We apply standard physics reconstruction techniques to these images, cluster the energy deposits, and create particle object representations,” Farrell said. “Then we look at the distributions of those and compare to real events to decide how well it is doing. And basically we get a model that can learn to capture the distributions pretty well.”
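One simple way to put a number on "how well it is doing" is a two-sample distance between a reconstructed quantity in real events and the same quantity in generated events. The sketch below uses the Kolmogorov-Smirnov statistic, the largest gap between the two empirical cumulative distributions. This metric is an assumption made for illustration; the article does not say which comparison the team used, and the function name ks_statistic is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # Both empirical CDFs are step functions that change only at data
    # points, so checking every observed value is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap
```

A value of 0 means the two empirical distributions coincide, and a value of 1 means the samples do not overlap at all. In a workflow like the one Farrell describes, the inputs might be lists of a reconstructed quantity, such as clustered energy per particle object, taken from real and generated event images.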
This sort of work is mission-critical for the future of the Large Hadron Collider and related experiments because of the enormous data rate and the stringent precision requirements of the analyses, according to Nachman.
“Ultrafast, high-fidelity simulations will be required for maximizing the scientific output of these unique datasets,” said Nachman, who worked on the first GAN application in HEP (https://arxiv.org/1701.0592), which was extended to single particle calorimeter simulations (https://arxiv.org/abs/1705.02355). “I think there are significant obstacles to using an event level GAN in practice, but there are certain use cases for which it could already make a big impact, such as the pileup project that Steve mentioned in his talk.”
The Berkeley Lab group is not alone in its use of GANs to speed up HEP simulations; there were at least four other papers on the topic at the CHEP conference, Farrell noted. But through this research the Berkeley Lab team has also demonstrated a framework that uses distributed computing on the Cori supercomputer at NERSC, launched via interactive Jupyter notebook sessions. This makes it possible to tackle high-resolution detector data, model selection and hyper-parameter tuning in a productive yet scalable deep learning environment.
“There is definitely a good amount of interest in this,” he said. “People in this field understand the need for these kinds of methods that can help solve really complex tasks in this way. And the resulting model that you get, whether it is for a whole event image or using GAN to simulate a single particle, the potential for speedup is enormous. By running a neural network to generate samples, it’s orders of magnitude faster than traditional expensive simulations.”