Seeing the Great Lights of Europe: A Study in Approaches to Synchrotron Data Management and Analysis
February 21, 2014
Contact: Jon Bashor, jbashor@lbl.gov, 510-486-5849
The 10-day tour of Europe was not your typical itinerary – Garching, Karlsruhe, Villigen, Hamburg and Oxford. In January.
But David Brown and Craig Tull of the Computational Research Division and Alex Hexemer of the Advanced Light Source weren’t touring to see the sights. They were more interested in seeing the lights: powerful scientific instruments known as light sources that use intense X-rays to study materials down to the macromolecular scale.
Scientific user facilities like Lawrence Berkeley National Laboratory’s Advanced Light Source (ALS) are becoming increasingly powerful tools for scientific discovery with the development of higher-resolution imaging devices, which not only capture higher-resolution images but do so at a much faster rate. As a result, beamline scientists are faced with unprecedented amounts of data, but do not always have the hardware or software needed to effectively manage, analyze and share that data.
To get a better picture of how this situation is being addressed elsewhere in the world, Brown, Tull and Hexemer visited some of Europe’s leading light sources, as well as other facilities. Stops included the Heinz Maier-Leibnitz Neutron Source and the Leibniz Computing Center at the Technical University of Munich in Garching; the Steinbuch Computing Center at the Karlsruhe Institute of Technology; the Swiss Light Source at the Paul Scherrer Institute in Villigen, Switzerland; the PETRA III light source at the German Electron-Synchrotron (DESY) in Hamburg; and the Diamond Light Source near Oxford, England.
“Among our goals were understanding what kind of hardware and software these facilities are using, how they manage their data and workflows and to what extent the facilities worked closely with high performance computing centers,” Brown said. “We also wanted to understand the funding models used by other facilities to support their IT infrastructure with the idea of possibly leveraging the most successful models here in the U.S.”
In general, Brown said, the European facilities have well-funded IT infrastructures that include computing hardware, networking and user software development and support.
“Many of our facilities here in the U.S. have some catching up to do in order to provide IT support commensurate with the emerging big data challenges,” Brown said. “On the other hand, in the area of new mathematics algorithms and software development, we appear to be ahead of the Europeans.”
At Berkeley Lab, for instance, mathematician James Sethian leads a project called CAMERA (The Center for Applied Mathematics in Energy Research Applications) to design and apply mathematical solutions to data and imaging problems at Berkeley Lab scientific user facilities supported by the Department of Energy’s Office of Basic Energy Sciences.
Tull said he found the discussions with the Europeans “extraordinarily interesting” and that the series of visits “validated many of our views, but also led me to rethink some of our views. There was a lot of commonality on the problems, and some commonality on the solutions.”
During the visits, Tull gave presentations on SPOT Suite, a Laboratory Directed Research and Development project he is leading between CRD, the ALS, NERSC, ESnet and the Materials Sciences Division. SPOT Suite is a collection of software providing a data portal, data management and processing, a database and workflow management. The system automatically transfers data from ALS beamlines to NERSC where the data is processed in real time, with the results automatically transferred back to the scientist working at the beamline.
Brown noted that while the European sites have their own computing infrastructure, the Berkeley Lab team did not see the kind of working relationships with supercomputing centers that Tull is developing.
Hexemer said that many of the people they met with were also interested in HipGISAXS, the high-performance software developed by CRD’s Slim Chourou, Abhinav Sarje and Sherry Li (PI), along with Hexemer. HipGISAXS is a massively parallel analysis code to support GISAXS (Grazing-Incidence Small-Angle X-ray Scattering), an experimental measurement technique for characterizing materials properties at the nanoscale.
“Most of the software we heard about did not have the same capabilities as HipGISAXS,” Hexemer said.
Brown said their hosts also expressed interest in better sharing of information and in holding joint workshops to understand and develop solutions to the big data challenges shared by the U.S. and European facilities.
“The Big Data challenge is just coming on for the light sources, due mainly to advanced detectors, and everyone is struggling with it,” Brown said.
About Berkeley Lab
Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 16 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.