Data Science Pioneer Deb Agarwal Named 2024 AGU Fellow
December 5, 2024
By Keri Troutman
Contact: cscomms@lbl.gov
Deb Agarwal’s enduring commitment to creating lasting scientific impact has earned her recognition as a 2024 American Geophysical Union (AGU) Fellow, a prestigious honor acknowledging her transformative contributions to Earth and environmental science. She hopes this recognition will help bring greater visibility to the remarkable work of informatics professionals within AGU’s ranks. Agarwal and the other honorees will be celebrated at AGU24, where more than 25,000 attendees from over 100 countries will gather in Washington, D.C., from December 9 to 13.
Agarwal, who retired in July 2023 as Director of Berkeley Lab’s Scientific Data Division, was honored for her “pioneering user-oriented data infrastructure and value-added products,” which have made complex environmental data more accessible and enabled the scientific community to advance its understanding of the global carbon cycle, ecosystems, and their role in mitigating climate change. Her technical innovations and collaborative efforts have successfully bridged the gap between computer science and Earth sciences.
Initially focused on advancing computer science research, her career pivoted toward a more specialized role in the Earth sciences in 2005 when she joined a Microsoft e-Science collaboration. Projects like the Berkeley Water Center, AmeriFlux, and FLUXNET laid the foundation for streamlined environmental data collection and faster climate research. These successes led to Department of Energy (DOE)-funded initiatives such as the Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE).
Through these efforts, Agarwal recognized the unique challenges faced by Earth scientists. Unlike physicists, who had already built large-scale data infrastructure, they were still in the early stages of working with large-scale field data. This gap presented an opportunity for Agarwal and her team to collaborate with Earth and environmental scientists and computer scientists to develop advanced data tools.
“What I found in the earth sciences was a greater need,” Agarwal said. “They were further away from adopting more advanced data techniques, partially because they were coming from being a field science.”
Agarwal’s work with the AmeriFlux and FLUXNET projects was transformative. Earth scientists initially relied on disparate and often inconsistent data systems and formats. Over the course of decades, she collaborated with scientists locally and globally to standardize data storage, ensuring that Earth science data was accessible and usable by the broader scientific community. Her focus was not on building the perfect system but on creating one that aligned with how scientists worked—recognizing that collaboration, flexibility, and iterative development were key to advancing the science.
“It really changed the whole course of my career because I started recognizing gaps between what we as computer scientists were building and what Earth sciences needed, and I began to work in those spaces where there were complete gaps,” Agarwal said. “I became very dedicated to advancing the Earth sciences through advanced computing and data science.”
Since 2006, through her work on the AmeriFlux and FLUXNET projects, Agarwal and her team have led the creation of high-value data products, developing workflows and infrastructure to manage sensor streams of ecosystem-atmosphere greenhouse gas (GHG) exchanges from a network of more than 400 stations worldwide.
This work led Agarwal and her team to advance the publication and sharing of climate data, especially carbon flux data, and to introduce new methods for managing vast, complex data sets. Her efforts were pivotal in the transition from isolated data silos to more integrated systems, supporting a new era in climate science.
The AmeriFlux network now consists of data from scientists who are making measurements of ecosystems throughout the Americas, with thousands of site years of data publicly available. The dataset has been downloaded more than 20,000 times to support education and studies of remote sensing, land surface modeling, or comparative ecosystem investigations. Hundreds of scientific papers have synthesized AmeriFlux data to make discoveries relevant to biogeosciences, hydrology, global environmental change, and climate change.
Agarwal describes this time as a key transition point in the scientific community because scientists are rethinking their data approach. She notes that field scientists were accustomed to gathering data, analyzing it, and then publishing their findings. “Now they’re asking, ‘Maybe we should also publish the data?’” she said. However, to make their data usable for other scientists and researchers, they need the right tools and a common data science “language.” That’s where Agarwal and her collaborators stepped in. Their work involved a blend of communication, mediation, and computer science, bringing together field scientists from around the world, understanding how and why they documented their data, gaining buy-in for a new common data storage method, and creating the tools to support this shift.
“I like to say it took 16 years to figure it out and move forward, and it was my favorite part of my career,” she added. “It had to be collaborative; it’s been this huge cross-agency and international team effort where everybody contributes different pieces.”
Following the success of AmeriFlux and FLUXNET, Agarwal became the founding principal investigator for the U.S. Department of Energy’s ESS-DIVE environmental data repository. Under her stewardship, ESS-DIVE began accepting data for publication within nine months of funding in 2017, became a DataONE member node, and to date has published 889 data packages from 372 users. Agarwal’s work also led to an appointment on the inaugural steering committee of the California Water Data Consortium (CWDC), where she helped develop a statewide integrated water data platform. Agarwal has held leadership roles in the Institute for the Design of Advanced Energy Systems (IDAES) project, focused on improving the efficiency of power plants, and DOE’s Carbon Capture Simulation Initiative (CCSI), which accelerates the development and deployment cycle for new carbon capture technologies. She also served briefly as the Deputy Task Area Lead for Modeling and Simulation for the new National Alliance for Water Innovation (NAWI), a $100 million hub to advance technologies to treat nontraditional water sources.
“Deb’s leadership on ESS-DIVE has enabled us to establish a repository of the future,” said Charuleka Varadharajan, who worked closely with Agarwal as a staff scientist and earth AI/data program lead in Berkeley Lab’s Earth and Environmental Sciences Area. “Through her years of experience with AmeriFlux and other projects, she recognized the pain points of data contributors and sought to balance those with the needs of data users.”
A constant in Agarwal’s career was her commitment and dedication to scientific collaboration. She consistently emphasized that her role was to contribute to a team effort that would have a lasting impact on the field. Her philosophy was simple: “I wasn’t looking for individual glory but for the opportunity to help advance a scientific field; working as a team enabled us to accomplish so much more than any one of us could have done individually,” she said. “I am so grateful to all of my collaborators, and I view this award as a recognition of what we accomplished together.”
Agarwal’s contributions extend beyond Earth sciences to the broader field of informatics. Her more recent work at AGU has advocated for better inclusion of data citations in papers and acknowledgment of the tools and methodologies that support modern science. Being named an AGU Fellow was an honor for Agarwal largely because it reflects the organization’s recognition of the importance of informatics. “There’s a whole informatics section at AGU full of amazing contributors to science, and it’s been growing,” she said. “But it has felt like it is treated as a second cousin to the science; we are not always viewed as helping to advance the science.”
As an AGU Fellow, Agarwal will focus on elevating the recognition of the importance of informatics within the broader community. She will also focus on issues surrounding data citation, advocating for a shift to citation models that enable recognition of the value of data sets, software, and methods. This work has included advocating for the proper recognition of data contributions and efforts to standardize how data is cited in scientific literature.
She has always been a passionate advocate for diversity, equity, and inclusion (DEI) in informatics and the sciences, pushing for greater representation and recognition of women and underrepresented groups in her field. She plans to continue this work as an AGU Fellow as well.
“Besides her technical achievements, she has provided an extraordinary amount of community service and been an outstanding mentor for the next generation of data scientists,” said Varadharajan. “She has a long history of advocating for diversity, inclusion, and advancement of women and minorities in computing and other sciences.”
Agarwal will be balancing her new role as an AGU Fellow with her other new role as a retiree. Since retiring, she has shifted toward a different pace of life, focusing on exploring the natural world. She and her wife spend much of their time traveling around the country in their RV, visiting state and national parks and appreciating Earth science from a whole different perspective.
About Berkeley Lab
Founded in 1931 on the belief that the biggest scientific challenges are best addressed by teams, Lawrence Berkeley National Laboratory and its scientists have been recognized with 16 Nobel Prizes. Today, Berkeley Lab researchers develop sustainable energy and environmental solutions, create useful new materials, advance the frontiers of computing, and probe the mysteries of life, matter, and the universe. Scientists from around the world rely on the Lab’s facilities for their own discovery science. Berkeley Lab is a multiprogram national laboratory, managed by the University of California for the U.S. Department of Energy’s Office of Science.
DOE’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science.