Usable Data Systems Group

Gilberto Pastorello

Research Scientist
Phone: +1 510 486 4613

Biographical Sketch

Gilberto Pastorello is a research scientist in the Usable Data Systems Group at the Lawrence Berkeley National Laboratory. His research is focused on managing the life cycle of scientific data, encompassing data and metadata structures and linkages, data quality and data uncertainty quantification, and end-to-end data systems. He is part of the data teams for the AmeriFlux and FLUXNET networks (with the AmeriFlux Management Project) and the data team for the NGEE-Tropics project. He implemented new methods for data integration at multiple stages of processing, data quality assurance for heterogeneous data sources, and algorithms for execution and evaluation of data processing pipelines for environmental data, in particular carbon, water, and energy fluxes and micro-meteorological data. He is part of the team creating the FLUXNET datasets, which are used in research ranging from soil microbiology to climate change. He has also developed data models and data behavior models for observational data, allowing more complete characterization of data quality and improved evaluation of data processing algorithms. He also contributed to the Deduce project, investigating data change dynamics, and to the Tigres project, investigating workflow specification APIs and workflow patterns for data-intensive pipelines. Before joining LBNL as a Postdoctoral Fellow in March 2013, he held postdoctoral positions at the University of Alberta, in Edmonton, AB, Canada, where he implemented end-to-end data management solutions for biodiversity data and micro-meteorological data from traditional and wireless sensor network systems; created data processing pipelines for optical data acquired from a variety of platforms (including, for instance, ground-based observations and airborne imaging spectrometry); and created Web-based data exploration portals. He holds PhD, MSc, and BSc degrees in Computer Science from the University of Campinas, in Campinas, SP, Brazil.

Journal Articles

D. A. Agarwal, J. Damerow, C. Varadharajan, D. S. Christianson, G. Z. Pastorello, Y.-W. Cheah, L. Ramakrishnan, "Balancing the needs of consumers and producers for scientific data collections", Ecological Informatics, 2021, 62:101251, doi: 10.1016/j.ecoinf.2021.101251

G. Z. Pastorello, C. Trotta, E. Canfora, H. Chu, D. Christianson, Y.-W. Cheah, C. Poindexter, J. Chen, A. Elbashandy, M. Humphrey, P. Isaac, D. Polidori, M. Reichstein, A. Ribeca, C. van Ingen, N. Vuichard, L. Zhang, B. Amiro, C. Ammann, M. A. Arain, J. Ardö, T. Arkebauer, S. K. Arndt, N. Arriga, M. Aubinet, M. Aurela, D. Baldocchi, A. Barr, E. Beamesderfer, L. B. Marchesini, O. Bergeron, J. Beringer, C. Bernhofer, D. Berveiller, D. Billesbach, T. A. Black, P. D. Blanken, G. Bohrer, J. Boike, P. V. Bolstad, D. Bonal, J.-M. Bonnefond, D. R. Bowling, R. Bracho, J. Brodeur, C. Brümmer, N. Buchmann, B. Burban, S. P. Burns, P. Buysse, P. Cale, M. Cavagna, P. Cellier, S. Chen, I. Chini, T. R. Christensen, J. Cleverly, A. Collalti, C. Consalvo, B. D. Cook, D. Cook, C. Coursolle, E. Cremonese, P. S. Curtis, E. D’Andrea, H. da Rocha, X. Dai, K. J. Davis, B. D. Cinti, A. de Grandcourt, A. D. Ligne, R. C. D. Oliveira, N. Delpierre, A. R. Desai, C. M. D. Bella, P. di Tommasi, H. Dolman, F. Domingo, G. Dong, S. Dore, P. Duce, E. Dufrêne, A. Dunn, J. Dušek, D. Eamus, U. Eichelmann, H. A. M. ElKhidir, W. Eugster, C. M. Ewenz, B. Ewers, D. Famulari, S. Fares, I. Feigenwinter, A. Feitz, R. Fensholt, G. Filippa, M. Fischer, J. Frank, M. Galvagno, M. Gharun, D. Gianelle, B. Gielen, B. Gioli, A. Gitelson, I. Goded, M. Goeckede, A. H. Goldstein, C. M. Gough, M. L. Goulden, A. Graf, A. Griebel, C. Gruening, T. Grünwald, A. Hammerle, S. Han, X. Han, B. U. Hansen, C. Hanson, J. Hatakka, Y. He, M. Hehn, B. Heinesch, N. Hinko-Najera, L. Hörtnagl, L. Hutley, A. Ibrom, H. Ikawa, M. Jackowicz-Korczynski, D. Janouš, W. Jans, R. Jassal, S. Jiang, T. Kato, M. Khomik, J. Klatt, A. Knohl, S. Knox, H. Kobayashi, G. Koerber, O. Kolle, Y. Kosugi, A. Kotani, A. Kowalski, B. Kruijt, J. Kurbatova, W. L. Kutsch, H. Kwon, S. Launiainen, T. Laurila, B. Law, R. Leuning, Y. Li, M. Liddell, J.-M. Limousin, M. Lion, A. J. Liska, A. Lohila, A. López-Ballesteros, E. López-Blanco, B. Loubet, D. Loustau, A. Lucas-Moffat, J. Lüers, S. Ma, C. Macfarlane, V. Magliulo, R. Maier, I. Mammarella, G. Manca, B. Marcolla, H. A. Margolis, S. Marras, W. Massman, M. Mastepanov, R. Matamala, J. H. Matthes, F. Mazzenga, H. McCaughey, I. McHugh, A. M. S. McMillan, L. Merbold, W. Meyer, T. Meyers, S. D. Miller, S. Minerbi, U. Moderow, R. K. Monson, L. Montagnani, C. E. Moore, E. Moors, V. Moreaux, C. Moureaux, J. W. Munger, T. Nakai, J. Neirynck, Z. Nesic, G. Nicolini, A. Noormets, M. Northwood, M. Nosetto, Y. Nouvellon, K. Novick, W. Oechel, J. E. Olesen, J.-M. Ourcival, S. A. Papuga, F.-J. Parmentier, E. Paul-Limoges, M. Pavelka, M. Peichl, E. Pendall, R. P. Phillips, K. Pilegaard, N. Pirk, G. Posse, T. Powell, H. Prasse, S. M. Prober, S. Rambal, U. Rannik, N. Raz-Yaseef, D. Reed, V. R. de Dios, N. Restrepo-Coupe, B. R. Reverter, M. Roland, S. Sabbatini, T. Sachs, S. R. Saleska, E. P. Sánchez-Cañete, Z. M. Sanchez-Mejia, H. P. Schmid, M. Schmidt, K. Schneider, F. Schrader, I. Schroder, R. L. Scott, P. Sedlák, P. Serrano-Ortíz, C. Shao, P. Shi, I. Shironya, L. Siebicke, L. Šigut, R. Silberstein, C. Sirca, D. Spano, R. Steinbrecher, R. M. Stevens, C. Sturtevant, A. Suyker, T. Tagesson, S. Takanashi, Y. Tang, N. Tapper, J. Thom, F. Tiedemann, M. Tomassucci, J.-P. Tuovinen, S. Urbanski, R. Valentini, M. van der Molen, E. van Gorsel, K. van Huissteden, A. Varlagin, J. Verfaillie, T. Vesala, C. Vincke, D. Vitale, N. Vygodskaya, J. P. Walker, E. Walter-Shea, H. Wang, R. Weber, S. Westermann, C. Wille, S. Wofsy, G. Wohlfahrt, S. Wolf, W. Woodgate, Y. Li, R. Zampedri, J. Zhang, G. Zhou, D. Zona, D. Agarwal, S. Biraud, M. Torn, D. Papale, "The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data", Scientific Data, 2020, 7:225, doi: 10.1038/s41597-020-0534-3

Gilberto Z. Pastorello, Dario Papale, Housen Chu, Carlo Trotta, Deb A. Agarwal, Eleonora Canfora, Dennis D. Baldocchi, M. S. Torn, "A new data set to keep a sharper eye on land-air exchanges", Eos, 2017, 98:28-32, doi: 10.1029/2017EO071597

Danielle S. Christianson, Charuleka Varadharajan, Bradley Christoffersen, Matteo Detto, Boris Faybishenko, Bruno O. Gimenez, Val C. Hendrix, Kolby J. Jardine, Robinson Negron-Juarez, Gilberto Z. Pastorello, Thomas L. Powell, Megha Sandesh, Jeffrey M. Warren, Brett T. Wolfe, Jeffrey Q. Chambers, Lara M. Kueppers, Nathan G. McDowell, Deborah A. Agarwal, "A metadata reporting framework (FRAMES) for synthesis of ecohydrological observations", Ecological Informatics, 2017, 42:148-158, doi: 10.1016/j.ecoinf.2017.06.002

Craig A. Emmerton, Vincent L. St. Louis, Elyn R. Humphreys, John A. Gamon, Joel D. Barker, Gilberto Z. Pastorello, "Net ecosystem exchange of CO2 with rapidly changing high Arctic landscapes", Global Change Biology, 2016, 22:1185-1200, doi: 10.1111/gcb.13064

Ran Wang, John A. Gamon, Craig A. Emmerton, Li Haitao, Enrica Nestola, Gilberto Z. Pastorello, Olaf Menzer, "Integrated Analysis of Productivity and Biodiversity in a Southern Alberta Prairie", Remote Sensing, 2016, 8:214, doi: 10.3390/rs8030214

Angela Harris, John A. Gamon, Gilberto Z. Pastorello, Christopher Y. S. Wong, "Retrieval of the photochemical reflectance index for assessing xanthophyll cycle activity: a comparison of near-surface optical sensors", Biogeosciences, 2014, 11:6277-6292, doi: 10.5194/bg-11-6277-2014

Gilberto Z. Pastorello, G. Arturo Sanchez-Azofeifa, Mario A. Nascimento, "Enviro-Net: from networks of ground-based sensor systems to a Web platform for sensor data management", Sensors, 2011, 11:6454-6479, doi: 10.3390/s110606454

John A. Gamon, Craig Coburn, Lawrence B. Flanagan, Karl F. Huemmrich, Cameron Kiddle, G. Arturo Sanchez-Azofeifa, Donnette R. Thayer, Loris Vescovo, Damiano Gianelle, Daniel A. Sims, Abdullah Faiz Rahman, Gilberto Z. Pastorello, "SpecNet Revisited: Bridging Flux and Remote Sensing Communities", Canadian Journal of Remote Sensing, 2010, 36:S376-S390, doi: 10.5589/m10-067

Gilberto Z. Pastorello, Jaudete Daltio, Claudia M. B. Medeiros, "A Mechanism for Propagation of Semantic Annotations of Multimedia Content", Journal of Multimedia, 2010, 5:332-342, doi: 10.4304/jmm.5.4.332-342

Gilberto Z. Pastorello, Rodrigo D. A. Senra, Claudia M. B. Medeiros, "A standards-based framework to foster geospatial data and process interoperability", Journal of the Brazilian Computer Society, 2009, 15:13-26, doi: 10.1007/BF03192574

Andre Santanche, Claudia M. B. Medeiros, Gilberto Z. Pastorello, "User-author centered multimedia building blocks", Multimedia Systems Journal, 2007, 12:403-421, doi: 10.1007/s00530-006-0050-0

Claudia M. B. Medeiros, Jose Perez-Alcazar, Luciano Digiampietri, Gilberto Z. Pastorello, Andre Santanche, Ricardo S. Torres, Edmundo Madeira, Evandro Bacarin, "WOODSS and the Web: Annotating and Reusing Scientific Workflows", SIGMOD Record, 2005, 34:18-23, doi: 10.1145/1084805.1084810

Conference Papers

Devarshi Ghoshal, Drew Paine, Gilberto Pastorello, Abdelrahman Elbashandy, Dan Gunter, Oluwamayowa Amusat, Lavanya Ramakrishnan, "Experiences with Reproducibility: Case Studies from Scientific Workflows", (P-RECS'21) Proceedings of the 4th International Workshop on Practical Reproducible Evaluation of Computer Systems, ACM, June 21, 2021, doi: 10.1145/3456287.3465478

Reproducible research is becoming essential for science to ensure transparency and for building trust. Additionally, reproducibility provides the cornerstone for sharing of methodology that can improve efficiency. Although several tools and studies focus on computational reproducibility, we need a better understanding about the gaps, issues, and challenges for enabling reproducibility of scientific results beyond the computational stages of a scientific pipeline. In this paper, we present five different case studies that highlight the reproducibility needs and challenges under various system and environmental conditions. Through the case studies, we present our experiences in reproducing different types of data and methods that exist in an experimental or analysis pipeline. We examine the human aspects of reproducibility while highlighting the things that worked, that did not work, and that could have worked better for each of the cases. Our experiences capture a wide range of scenarios and are applicable to a much broader audience who aim to integrate reproducibility in their everyday pipelines.

Payton A. Linton, William M. Melodia, Alina Lazar, Deborah Agarwal, Ludovico Bianchi, Devarshi Ghoshal, Kesheng Wu, Gilberto Pastorello, Lavanya Ramakrishnan, "Identifying Time Series Similarity in Large-Scale Earth System Datasets", 2019

Gilberto Z. Pastorello, Dan K. Gunter, Housen Chu, Danielle S. Christianson, Carlo Trotta, Eleonora Canfora, Boris Faybishenko, You-Wei Cheah, Norm Beekwilder, Stephen W. Chan, Sigrid Dengel, Trevor Keenan, Fianna O'Brien, Abdelrahman Elbashandy, Cristina M. Poindexter, Marty Humphrey, Dario Papale, Deb A. Agarwal, "Hunting Data Rogues at Scale: Data Quality Control for Observational Data in Research Infrastructures", Proceedings of the 13th IEEE International Conference on e-Science (e-Science 2017), Auckland, New Zealand, 2017, doi: 10.1109/eScience.2017.64

Gilberto Z. Pastorello, Deb A. Agarwal, Taghrid Samak, Dario Papale, Carlo Trotta, Alessio Ribeca, Cristina M. Poindexter, Boris Faybishenko, Dan K. Gunter, Rachel Hollowgrass, Eleonora Canfora, "Observational data patterns for time series data quality assessment", Proceedings of the 10th IEEE International Conference on e-Science (e-Science 2014), Guaruja, Brazil, 2014, doi: 10.1109/eScience.2014.45

Lavanya Ramakrishnan, Sarah S. Poon, Val C. Hendrix, Dan K. Gunter, Gilberto Z. Pastorello, Deb A. Agarwal, "Experiences with User-Centered Design for the Tigres Workflow API", Proceedings of the 10th IEEE International Conference on e-Science (e-Science 2014), Guaruja, Brazil, 2014, doi: 10.1109/eScience.2014.56

Gilberto Z. Pastorello, G. Arturo Sanchez-Azofeifa, Mario A. Nascimento, "Enviro-Net: A Network of Ground-based Sensors for Tropical Dry Forests in the Americas", Proceedings of the 34th International Symposium on Remote Sensing of Environment (ISRSE 2011), Sydney, Australia, 2011, 4p

Roger Curry, Cameron Kiddle, Rob Simmonds, Gilberto Z. Pastorello, "An On-line Collaborative Data Management System", Proceedings of the 6th Gateway Computing Environments Workshop (GCE 2010), New Orleans, LA, USA, 2010, doi: 10.1109/GCE.2010.5676120

Gilberto Z. Pastorello, Luiz C. Gomes Jr, Claudia M. B. Medeiros, Andre Santanche, "Sensor Data Publication on the Web for Scientific Applications", Proceedings of the 4th International Conference on Web Information Systems and Technologies (WEBIST 2008), Funchal, Madeira, Portugal, 2008, 137-142, doi: 10.5220/0001515301370142

Gilberto Z. Pastorello, Claudia M. B. Medeiros, Andre Santanche, "Accessing and Processing Sensing Data", Proceedings of the 11th IEEE International Conference on Computational Science and Engineering (CSE 2008), Sao Paulo, Brazil, 2008, 353-360, doi: 10.1109/CSE.2008.23

Gilberto Z. Pastorello, Jaudete Daltio, Claudia M. B. Medeiros, "Multimedia Semantic Annotation Propagation", Proceedings of the 1st IEEE International Workshop on Data Semantics for Multimedia Systems and Applications (DSMSA 2008), 10th IEEE International Symposium on Multimedia (ISM 2008), Berkeley, CA, USA, 2008, 509-514, doi: 10.1109/ISM.2008.77

Gilberto Z. Pastorello, Rodrigo D. A. Senra, Claudia M. B. Medeiros, "Bridging the gap between geospatial resource providers and model developers", Proceedings of the 16th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (GIS 2008), Irvine, CA, USA, 2008, 379-382, doi: 10.1145/1463434.1463489

Gilberto Z. Pastorello, C. M. B. Medeiros, Andre Santanche, "Applying Scientific Workflows to Manage Sensor Data", Proceedings of the 1st Brazilian e-Science Workshop (BreSci 2007), 22nd Brazilian Symposium on Databases (SBBD 2007), Joao Pessoa, Brazil, 2007, 9-18

Book Chapters

Gilberto Z. Pastorello, "Generation of Uniform Data Products for AmeriFlux and FLUXNET", The practice of reproducible research: case studies and lessons from the data-intensive sciences, (University of California Press: 2017) Pages: 305-309

Posters

P. Linton, W. Melodia, A. Lazar, D. Agarwal, L. Bianchi, D. Ghoshal, K. Wu, G. Pastorello, L. Ramakrishnan, "Identifying Time Series Similarity in Large-Scale Earth System Datasets", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19), 2019

Others

B. Faybishenko, R. Versteeg, G. Pastorello, D. Dwivedi, C. Varadharajan, D. Agarwal, "Challenging problems of quality assurance and quality control (QA/QC) of meteorological time series data", Stochastic Environmental Research and Risk Assessment, 2022, 1049-1062, doi: 10.1007/s00477-021-02106-w

Payton Linton, William Melodia, Alina Lazar, Deborah Agarwal, Ludovico Bianchi, Devarshi Ghoshal, Gilberto Pastorello, Lavanya Ramakrishnan, Kesheng Wu, "Understanding Data Similarity in Large-Scale Scientific Datasets", 2019 IEEE International Conference on Big Data (Big Data), 2019, 4525-4531