Gilberto Pastorello
Biographical Sketch
Gilberto Pastorello is a research scientist in the Usable Data Systems Group at the Lawrence Berkeley National Laboratory. His research focuses on managing the life-cycle of scientific data, encompassing data and metadata structures and linkages, data quality and data uncertainty quantification, and end-to-end data systems. He is part of the data teams for the AmeriFlux and FLUXNET networks (with the AmeriFlux Management Project) and the data team for the NGEE-Tropics project. He implemented new methods for data integration at multiple stages of processing, data quality assurance for heterogeneous data sources, and algorithms for execution and evaluation of data processing pipelines for environmental data, in particular carbon, water, and energy fluxes and micro-meteorological data. He is part of the team creating the FLUXNET datasets, which are used in research ranging from soil microbiology to climate change. He has also developed data models and data behavior models for observational data, allowing more complete characterization of data quality and improved evaluation of data processing algorithms. He contributed to the Deduce project, investigating data change dynamics, and to the Tigres project, investigating workflow specification APIs and workflow patterns for data-intensive pipelines. Before joining LBNL as a Postdoctoral Fellow in March 2013, he held postdoctoral positions at the University of Alberta, in Edmonton, AB, Canada, where he implemented end-to-end data management solutions for biodiversity data and micro-meteorological data from traditional and wireless sensor network systems; created data processing pipelines for optical data acquired from a variety of platforms (including, for instance, ground-based observations and airborne imaging spectrometry); and created Web-based data exploration portals. He holds PhD, MSc, and BSc degrees in Computer Science from the University of Campinas, in Campinas, SP, Brazil.
Journal Articles
D. A. Agarwal, J. Damerow, C. Varadharajan, D. S. Christianson, G. Z. Pastorello, Y.-W. Cheah, L. Ramakrishnan, "Balancing the needs of consumers and producers for scientific data collections", Ecological Informatics, 2021, 62:101251, doi: 10.1016/j.ecoinf.2021.101251
G. Z. Pastorello, C. Trotta, E. Canfora, H. Chu, D. Christianson, Y.-W. Cheah, C. Poindexter, J. Chen, A. Elbashandy, M. Humphrey, P. Isaac, D. Polidori, M. Reichstein, A. Ribeca, C. van Ingen, N. Vuichard, L. Zhang, B. Amiro, C. Ammann, M. A. Arain, J. Ardö, T. Arkebauer, S. K. Arndt, N. Arriga, M. Aubinet, M. Aurela, D. Baldocchi, A. Barr, E. Beamesderfer, L. B. Marchesini, O. Bergeron, J. Beringer, C. Bernhofer, D. Berveiller, D. Billesbach, T. A. Black, P. D. Blanken, G. Bohrer, J. Boike, P. V. Bolstad, D. Bonal, J.-M. Bonnefond, D. R. Bowling, R. Bracho, J. Brodeur, C. Brümmer, N. Buchmann, B. Burban, S. P. Burns, P. Buysse, P. Cale, M. Cavagna, P. Cellier, S. Chen, I. Chini, T. R. Christensen, J. Cleverly, A. Collalti, C. Consalvo, B. D. Cook, D. Cook, C. Coursolle, E. Cremonese, P. S. Curtis, E. D’Andrea, H. da Rocha, X. Dai, K. J. Davis, B. D. Cinti, A. de Grandcourt, A. D. Ligne, R. C. D. Oliveira, N. Delpierre, A. R. Desai, C. M. D. Bella, P. di Tommasi, H. Dolman, F. Domingo, G. Dong, S. Dore, P. Duce, E. Dufrêne, A. Dunn, J. Dušek, D. Eamus, U. Eichelmann, H. A. M. ElKhidir, W. Eugster, C. M. Ewenz, B. Ewers, D. Famulari, S. Fares, I. Feigenwinter, A. Feitz, R. Fensholt, G. Filippa, M. Fischer, J. Frank, M. Galvagno, M. Gharun, D. Gianelle, B. Gielen, B. Gioli, A. Gitelson, I. Goded, M. Goeckede, A. H. Goldstein, C. M. Gough, M. L. Goulden, A. Graf, A. Griebel, C. Gruening, T. Grünwald, A. Hammerle, S. Han, X. Han, B. U. Hansen, C. Hanson, J. Hatakka, Y. He, M. Hehn, B. Heinesch, N. Hinko-Najera, L. Hörtnagl, L. Hutley, A. Ibrom, H. Ikawa, M. Jackowicz-Korczynski, D. Janouš, W. Jans, R. Jassal, S. Jiang, T. Kato, M. Khomik, J. Klatt, A. Knohl, S. Knox, H. Kobayashi, G. Koerber, O. Kolle, Y. Kosugi, A. Kotani, A. Kowalski, B. Kruijt, J. Kurbatova, W. L. Kutsch, H. Kwon, S. Launiainen, T. Laurila, B. Law, R. Leuning, Y. Li, M. Liddell, J.-M. Limousin, M. Lion, A. J. Liska, A. Lohila, A. López-Ballesteros, E. López-Blanco, B. Loubet, D. Loustau, A. Lucas-Moffat, J. Lüers, S. Ma, C. Macfarlane, V. Magliulo, R. Maier, I. Mammarella, G. Manca, B. Marcolla, H. A. Margolis, S. Marras, W. Massman, M. Mastepanov, R. Matamala, J. H. Matthes, F. Mazzenga, H. McCaughey, I. McHugh, A. M. S. McMillan, L. Merbold, W. Meyer, T. Meyers, S. D. Miller, S. Minerbi, U. Moderow, R. K. Monson, L. Montagnani, C. E. Moore, E. Moors, V. Moreaux, C. Moureaux, J. W. Munger, T. Nakai, J. Neirynck, Z. Nesic, G. Nicolini, A. Noormets, M. Northwood, M. Nosetto, Y. Nouvellon, K. Novick, W. Oechel, J. E. Olesen, J.-M. Ourcival, S. A. Papuga, F.-J. Parmentier, E. Paul-Limoges, M. Pavelka, M. Peichl, E. Pendall, R. P. Phillips, K. Pilegaard, N. Pirk, G. Posse, T. Powell, H. Prasse, S. M. Prober, S. Rambal, U. Rannik, N. Raz-Yaseef, D. Reed, V. R. de Dios, N. Restrepo-Coupe, B. R. Reverter, M. Roland, S. Sabbatini, T. Sachs, S. R. Saleska, E. P. Sánchez-Cañete, Z. M. Sanchez-Mejia, H. P. Schmid, M. Schmidt, K. Schneider, F. Schrader, I. Schroder, R. L. Scott, P. Sedlák, P. Serrano-Ortíz, C. Shao, P. Shi, I. Shironya, L. Siebicke, L. Šigut, R. Silberstein, C. Sirca, D. Spano, R. Steinbrecher, R. M. Stevens, C. Sturtevant, A. Suyker, T. Tagesson, S. Takanashi, Y. Tang, N. Tapper, J. Thom, F. Tiedemann, M. Tomassucci, J.-P. Tuovinen, S. Urbanski, R. Valentini, M. van der Molen, E. van Gorsel, K. van Huissteden, A. Varlagin, J. Verfaillie, T. Vesala, C. Vincke, D. Vitale, N. Vygodskaya, J. P. Walker, E. Walter-Shea, H. Wang, R. Weber, S. Westermann, C. Wille, S. Wofsy, G. Wohlfahrt, S. Wolf, W. Woodgate, Y. Li, R. Zampedri, J. Zhang, G. Zhou, D. Zona, D. Agarwal, S. Biraud, M. Torn, D. Papale, "The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data", Scientific Data, 2020, 7:225, doi: 10.1038/s41597-020-0534-3
Bing Xu, M. Altaf Arain, Beverly E. Law, Gilberto Z. Pastorello, "Seasonal variability of forest sensitivity to heat and drought stresses: A synthesis based on carbon fluxes from North American forest ecosystems", Global Change Biology, September 2019, doi: 10.1111/gcb.14843
Gilberto Z. Pastorello, Dario Papale, Housen Chu, Carlo Trotta, Deb A. Agarwal, Eleonora Canfora, Dennis D. Baldocchi, M. S. Torn, "A new data set to keep a sharper eye on land-air exchanges", Eos, 2017, 98:28-32, doi: 10.1029/2017EO071597
Danielle S. Christianson, Charuleka Varadharajan, Bradley Christoffersen, Matteo Detto, Boris Faybishenko, Bruno O. Gimenez, Val C. Hendrix, Kolby J. Jardine, Robinson Negron-Juarez, Gilberto Z. Pastorello, Thomas L. Powell, Megha Sandesh, Jeffrey M. Warren, Brett T. Wolfe, Jeffrey Q. Chambers, Lara M. Kueppers, Nathan G. McDowell, Deborah A. Agarwal, "A metadata reporting framework (FRAMES) for synthesis of ecohydrological observations", Ecological Informatics, 2017, 42:148-158, doi: 10.1016/j.ecoinf.2017.06.002
Craig A. Emmerton, Vincent L. St. Louis, Elyn R. Humphreys, John A. Gamon, Joel D. Barker, Gilberto Z. Pastorello, "Net ecosystem exchange of CO2 with rapidly changing high Arctic landscapes", Global Change Biology, 2016, 22:1185-1200, doi: 10.1111/gcb.13064
Ran Wang, John A. Gamon, Craig A. Emmerton, Li Haitao, Enrica Nestola, Gilberto Z. Pastorello, Olaf Menzer, "Integrated Analysis of Productivity and Biodiversity in a Southern Alberta Prairie", Remote Sensing, 2016, 8:214, doi: 10.3390/rs8030214
Angela Harris, John A. Gamon, Gilberto Z. Pastorello, Christopher Y. S. Wong, "Retrieval of the photochemical reflectance index for assessing xanthophyll cycle activity: a comparison of near-surface optical sensors", Biogeosciences, 2014, 11:6277-6292, doi: 10.5194/bg-11-6277-2014
Gilberto Z. Pastorello, G. Arturo Sanchez-Azofeifa, Mario A. Nascimento, "Enviro-Net: from networks of ground-based sensor systems to a Web platform for sensor data management", Sensors, 2011, 11:6454-6479, doi: 10.3390/s110606454
John A. Gamon, Craig Coburn, Lawrence B. Flanagan, Karl F. Huemmrich, Kiddle, G. Arturo Sanchez-Azofeifa, Donnette R. Thayer, Loris Vescovo, Damiano Gianelle, Daniel A. Sims, Abdullah Faiz Rahman, Gilberto Z. Pastorello, "SpecNet Revisited: Bridging Flux and Remote Sensing Communities", Canadian Journal of Remote Sensing, 2010, 36:S376-S390, doi: 10.5589/m10-067
Gilberto Z. Pastorello, Jaudete Daltio, Claudia M. B. Medeiros, "A Mechanism for Propagation of Semantic Annotations of Multimedia Content", Journal of Multimedia, 2010, 5:332-342, doi: 10.4304/jmm.5.4.332-342
Gilberto Z. Pastorello, Rodrigo D. A. Senra, Claudia M. B. Medeiros, "A standards-based framework to foster geospatial data and process interoperability", Journal of the Brazilian Computer Society, 2009, 15:13-26, doi: 10.1007/BF03192574
Andre Santanche, Claudia M. B. Medeiros, Gilberto Z. Pastorello, "User-author centered multimedia building blocks", Multimedia Systems Journal, 2007, 12:403-421, doi: 10.1007/s00530-006-0050-0
Claudia M. B. Medeiros, Jose Perez-Alcazar, Luciano Digiampietri, Gilberto Z. Pastorello, Andre Santanche, Ricardo S. Torres, Edmundo Madeira, Evandro Bacarin, "WOODSS and the Web: Annotating and Reusing Scientific Workflows", SIGMOD Record, 2005, 34:18-23, doi: 10.1145/1084805.1084810
Conference Papers
Devarshi Ghoshal, Drew Paine, Gilberto Pastorello, Abdelrahman Elbashandy, Dan Gunter, Oluwamayowa Amusat, Lavanya Ramakrishnan, "Experiences with Reproducibility: Case Studies from Scientific Workflows", (P-RECS'21) Proceedings of the 4th International Workshop on Practical Reproducible Evaluation of Computer Systems, ACM, June 21, 2021, doi: 10.1145/3456287.3465478
Reproducible research is becoming essential for science to ensure transparency and for building trust. Additionally, reproducibility provides the cornerstone for sharing of methodology that can improve efficiency. Although several tools and studies focus on computational reproducibility, we need a better understanding about the gaps, issues, and challenges for enabling reproducibility of scientific results beyond the computational stages of a scientific pipeline. In this paper, we present five different case studies that highlight the reproducibility needs and challenges under various system and environmental conditions. Through the case studies, we present our experiences in reproducing different types of data and methods that exist in an experimental or analysis pipeline. We examine the human aspects of reproducibility while highlighting the things that worked, that did not work, and that could have worked better for each of the cases. Our experiences capture a wide range of scenarios and are applicable to a much broader audience who aim to integrate reproducibility in their everyday pipelines.