ATLAS Software Team Pushes Ahead Led by CRD's David Quarrie
September 11, 2008
GENEVA, Switzerland – When the ATLAS detector goes online in 2007 as one of two experiments on CERN's Large Hadron Collider, pairs of protons will hurtle around the 27-kilometer accelerator ring, smashing into each other at a rate of about one billion collisions per second.
This will translate into about 40 terabytes of data, only a small portion of which will be of interest to the team of 1,800 scientists working on the project. Still, the project envisions the need to store and make accessible about one petabyte of data each year. Creating the necessary software for the project is the responsibility of LBNL's David Quarrie.
Quarrie, who is the leader of the High Energy Physics Computing Group in the NERSC Center Division, is midway through the second year of his stint as software project lead for ATLAS.
“There’s something about seeing a project and the size of this equipment, and the numbers involved, like the 100 million electronic channels coming out of the detectors,” Quarrie says with a smile. “I oscillate between ‘It might actually work’ and ‘Oh, my gosh. Why am I doing this?’ In the end, I think we’ll have something that works on time.”
As with most aspects of ATLAS, the software is a project of nearly unprecedented scale. The initial data collection will take place in a huge chamber nearing completion 100 meters below the Swiss-French border. The chamber is 50 meters long and 32 meters wide. From the top of the 5-meter thick concrete floor to the top of the chamber is 35 meters.
“We like to tell our visitors from different countries that they can pick the cathedral of their choice and it will probably fit in here,” says Peter Jenni as he leads a small group on a tour. “ATLAS is huge, but rather light – the detector will only weigh about 7,000 tons.”
The big experiment is expected to find the Higgs boson, a particle predicted by theory but so far undetected. But, as Jenni notes, such experiments often turn up completely unexpected and exciting results as well.
All of the pieces are being lowered into place through two access holes in the ceiling. Tests with mockups of detector components show that due to the narrow clearances, even the largest sections will need to be guided into place with the precision of a Swiss watchmaker.
“It’s not just the scale of the project, but also the precision,” Quarrie said. That precision even extends to anticipating problems: the software for data collection, for example, needs to be able to adjust if the detectors are up to 50 microns out of alignment, Quarrie said. Another, less tangible adjustment to be incorporated deals with how data will be stored, accessed and analyzed over the 10- to 15-year lifespan of the experiment.
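To illustrate the kind of correction involved, here is a minimal sketch of shifting a recorded hit position by a surveyed detector offset within the 50-micron envelope. All names and the one-dimensional geometry are hypothetical for illustration; real ATLAS alignment applies full three-dimensional rotations and translations per detector module.

```python
# Hypothetical illustration only: real detector alignment is far more
# involved than a single one-dimensional offset per hit.

MAX_MISALIGNMENT_MM = 0.050  # 50 microns, expressed in millimeters


def correct_hit(raw_position_mm, measured_offset_mm):
    """Shift a raw hit coordinate by the surveyed module offset.

    raw_position_mm: coordinate reported by the detector electronics
    measured_offset_mm: module displacement found by the alignment
        procedure; must lie within the 50-micron design envelope
        the software is required to handle
    """
    if abs(measured_offset_mm) > MAX_MISALIGNMENT_MM:
        raise ValueError("offset exceeds the 50-micron design envelope")
    return raw_position_mm - measured_offset_mm


# A hit recorded at 123.456 mm on a module displaced by +30 microns
corrected = correct_hit(123.456, 0.030)
print(corrected)  # the hit shifted back by the module displacement
```

The point of the envelope check is the one Quarrie makes: the software must be designed up front to tolerate, and compensate for, a known bounded misalignment rather than assume a perfect detector.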
To make sure the analysis software is up to the task, the team is conducting four “data challenges,” with the third one taking place this summer. Each challenge is aimed at testing the software infrastructure by using an increasingly large volume of simulated data that would flow from the detectors to the computing processing center and out to physicists’ desktops for analysis.
This go-round will simulate one day’s worth of data when ATLAS goes on line, or 10 million physics events, along with background information. But the simulated analysis will be done over 10 days, or one-tenth the scale of the real thing. The next data challenge will seek to test everything at one-third scale. The challenge data, generated by some 50 sites around the world, simulates the underlying physics of ATLAS, with simulated responses from the detectors.
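The scale of the challenge can be checked with a little back-of-the-envelope arithmetic using the figures quoted above:

```python
# Back-of-the-envelope check of the data-challenge scale, using the
# figures quoted in the article.

EVENTS_PER_DAY = 10_000_000   # one nominal day of ATLAS physics events
CHALLENGE_DAYS = 10           # the simulated analysis runs over 10 days

# Sustained processing rate needed during the challenge
rate_per_second = EVENTS_PER_DAY / (CHALLENGE_DAYS * 24 * 3600)
print(f"{rate_per_second:.1f} events/s")  # roughly 11.6 events/s

# Stretching one day of data over ten days exercises the chain at
# one-tenth of the real-time rate; the next challenge targets one-third.
scale = 1 / CHALLENGE_DAYS
print(scale)  # 0.1
```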
“The goal is to get the data into the hands of physicists as quickly as possible,” Quarrie says. “Ultimately, we hope to get 50 percent of the data out to users within eight hours, and 90 percent out within 24 hours of initial collection.”
In the challenge, LBNL is responsible for validating the data generators and making sure it all works in the ATLAS environment. Quarrie describes the data generation as throwing the dice of the underlying physics. As the data pile up, background information is added, then it’s all digitized to represent an event stream. From the data, scientists will then reconstruct the event and recover the underlying physics.
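The chain Quarrie describes (generate, overlay background, digitize, reconstruct) can be sketched schematically. This is purely illustrative toy code with made-up "physics"; the actual ATLAS simulation chain is built on dedicated detector-simulation and reconstruction software, not anything like this.

```python
import random

# Toy version of the simulation chain described above: "throw the dice"
# to generate an event, overlay background, digitize it into an event
# stream, then reconstruct the underlying physics.


def generate_event(rng):
    """Throw the dice of the underlying physics: pick a true energy."""
    return {"true_energy": rng.uniform(10.0, 100.0)}


def add_background(event, rng):
    """Overlay background activity on top of the generated event."""
    event["background"] = rng.uniform(0.0, 5.0)
    return event


def digitize(event):
    """Turn the event into the integer counts the electronics report."""
    return int((event["true_energy"] + event["background"]) * 100)


def reconstruct(counts, avg_background=2.5):
    """Estimate the underlying physics back out of the raw counts."""
    return counts / 100.0 - avg_background


rng = random.Random(42)
event = add_background(generate_event(rng), rng)
stream = digitize(event)
estimate = reconstruct(stream)
# The retained "truth" record lets physicists tune the reconstruction:
print(event["true_energy"], estimate)
```

The last comparison is the point made below about the challenges: because the generator's "truth" is kept alongside the digitized stream, the reconstructed estimate can be checked against it.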
"One thing we haven’t done yet is to simulate a misaligned detector and include ‘hidden’ physics and then let the scientists try to find it,” Quarrie said.
Unlike the real data that will come off the detectors, the challenge data is accompanied by metadata recording the “truth” about each simulated event, which allows physicists to tune their algorithms.
Pulling it all together and making sure everything is ready to go when the experiment goes online involves far-flung collaborations, both in terms of software developers and hardware infrastructure.
Serving first as the ATLAS software architect and now, since 2003, as the software project lead, Quarrie is responsible for coordinating the development efforts by some 200 contributors from around the world, including five in his group at LBNL. It’s a job, he says, that carries plenty of responsibility but little direct authority over those who are contributing. Much of his time is spent on consensus building, he said, then coming to a decision and moving ahead.
The project finds Quarrie spending many of his mornings in either group meetings or a series of one-on-one conversations with collaborators at CERN. For example, he says he likes to meet at least once a week with the head of simulations to discuss various issues as they arise. To keep in touch with his group back in Berkeley, Quarrie holds a teleconference once a week, staying late in his office to accommodate the time difference.
Add in the required reports, reviews and milestones, and the idea of spending a couple of years working in Switzerland has turned out to be a very demanding assignment. Quarrie’s two-year assignment officially ends in February, but part of him would like to stay on. This is the ninth physics experiment he’s worked on during his career.
“Each one is always harder than the previous – bigger and harder. Just to get all the bugs out of this one requires calibrating 100 million channels – that’s an enormous technical challenge,” he said. “In a year, we should have the system in place so we can fine-tune it for another year before the experiment goes online. But we won’t really know if the software works until the first experiments are conducted.”
For many of the people working on ATLAS, which began design work 15 years ago and is expected to be in operation for 10 to 15 years, the project will constitute their entire career.
“There are times when I’d like to just sit down at my workstation and write some code, but it’s great to be working on something like this,” Quarrie said. “It’s just a great feeling to be working on something this big, this difficult, and finding out that it works.”