Dan Gunter

Biographical Sketch
Dan Gunter leads the Usable Data Systems (UDS) group in the Scientific Data Division (SciData). Dan's interests include usability for scientific interfaces and workflows, data management and data processing pipelines in heterogeneous environments, software engineering for distributed multidisciplinary scientific teams, and building usable interfaces to enable scientific exploration. He has led software efforts in projects for multiple domains, collaborating with science divisions at LBNL as well as other research institutions in the DOE and academia.
Journal Articles
Oluwamayowa O Amusat, Harshad Hegde, Christopher J Mungall, Anna Giannakou, Neil P Byers, Dan Gunter, Kjiersten Fagnan, Lavanya Ramakrishnan, "Automated annotation of scientific texts for ML-based keyphrase extraction and validation", Database, September 27, 2024, 2024:baae093, doi: 10.1093/database/baae093
Mohammed A. Alhussaini, Zachary M. Binger, Bianca M. Souza-Chaves, Oluwamayowa O. Amusat, Jangho Park, Timothy V. Bartholomew, Dan Gunter, Andrea Achilli, "Analysis of backwash settings to maximize net water production in an engineering-scale ultrafiltration system for water reuse", Journal of Water Process Engineering, 2023, 53, doi: 10.1016/j.jwpe.2023.103761
AP Arkin, RW Cottingham, CS Henry, NL Harris, RL Stevens, S Maslov, P Dehal, D Ware, F Perez, S Canon, MW Sneddon, ML Henderson, WJ Riehl, D Murphy-Olson, SY Chan, RT Kamimura, S Kumari, MM Drake, TS Brettin, EM Glass, D Chivian, D Gunter, DJ Weston, BH Allen, J Baumohl, AA Best, B Bowen, SE Brenner, CC Bun, JM Chandonia, JM Chia, R Colasanti, N Conrad, JJ Davis, BH Davison, M Dejongh, S Devoid, E Dietrich, I Dubchak, JN Edirisinghe, G Fang, JP Faria, PM Frybarger, W Gerlach, M Gerstein, A Greiner, J Gurtowski, HL Haun, F He, R Jain, MP Joachimiak, KP Keegan, S Kondo, V Kumar, ML Land, F Meyer, M Mills, PS Novichkov, T Oh, GJ Olsen, R Olson, B Parrello, S Pasternak, E Pearson, SS Poon, GA Price, S Ramakrishnan, P Ranjan, PC Ronald, MC Schatz, SMD Seaver, M Shukla, RA Sutormin, MH Syed, J Thomason, NL Tintle, D Wang, F Xia, H Yoo, S Yoo, D Yu, "KBase: The United States Department of Energy Systems Biology Knowledgebase", Nature Biotechnology, July 2018, 36:566--569, doi: 10.1038/nbt.4163
DC Miller, JD Siirola, D Agarwal, AP Burgard, A Lee, JC Eslick, B Nicholson, C Laird, LT Biegler, D Bhattacharyya, NV Sahinidis, IE Grossmann, CE Gounaris, D Gunter, "Next Generation Multi-Scale Process Systems Engineering Framework", Computer Aided Chemical Engineering, 2018, 44:2209--2214, doi: 10.1016/B978-0-444-64241-7.50363-3
Patrick Huck, Dan Gunter, Shreyas Cholia, Donald Winston, AT N'Diaye, Kristin Persson, "User applications driven by the community contribution framework MPContribs in the Materials Project", Concurrency and Computation: Practice and Experience, 2016, 28:1982--1993
P Huck, D Gunter, S Cholia, D Winston, AT N'Diaye, KA Persson, "User Applications Driven by the Community Contribution Framework MPContribs in the Materials Project", CoRR, 2015, abs/1510
SP Ong, S Cholia, A Jain, M Brafman, D Gunter, G Ceder, KA Persson, "The Materials Application Programming Interface (API): A simple, flexible and efficient API for materials data based on REpresentational State Transfer (REST) principles", Computational Materials Science, 2015, 97:209--215, doi: 10.1016/j.commatsci.2014.10.037
A Jain, SP Ong, G Hautier, W Chen, WD Richards, S Dacek, S Cholia, D Gunter, D Skinner, G Ceder, KA Persson, "Commentary: The materials project: A materials genome approach to accelerating materials innovation", APL Materials, 2013, 1, doi: 10.1063/1.4812323
SP Ong, WD Richards, A Jain, G Hautier, M Kocher, S Cholia, D Gunter, VL Chevrier, KA Persson, G Ceder, "Python Materials Genomics (pymatgen): A robust, open-source python library for materials analysis", Computational Materials Science, 2013, 68:314--319, doi: 10.1016/j.commatsci.2012.10.028
Scott Callaghan, Ewa Deelman, Dan Gunter, Gideon Juve, Philip Maechling, Christopher Brooks, Karan Vahi, Kevin Milner, Robert Graves, Edward Field, David Okaya, Thomas Jordan, "Scaling up Workflow-based Applications", Journal of Computer and System Sciences, 2010, 76:428--446
Conference Papers
Devarshi Ghoshal, Drew Paine, Gilberto Pastorello, Abdelrahman Elbashandy, Dan Gunter, Oluwamayowa Amusat, Lavanya Ramakrishnan, "Experiences with Reproducibility: Case Studies from Scientific Workflows", (P-RECS'21) Proceedings of the 4th International Workshop on Practical Reproducible Evaluation of Computer Systems, ACM, June 21, 2021, doi: 10.1145/3456287.3465478
Reproducible research is becoming essential for science to ensure transparency and for building trust. Additionally, reproducibility provides the cornerstone for sharing of methodology that can improve efficiency. Although several tools and studies focus on computational reproducibility, we need a better understanding about the gaps, issues, and challenges for enabling reproducibility of scientific results beyond the computational stages of a scientific pipeline. In this paper, we present five different case studies that highlight the reproducibility needs and challenges under various system and environmental conditions. Through the case studies, we present our experiences in reproducing different types of data and methods that exist in an experimental or analysis pipeline. We examine the human aspects of reproducibility while highlighting the things that worked, that did not work, and that could have worked better for each of the cases. Our experiences capture a wide range of scenarios and are applicable to a much broader audience who aim to integrate reproducibility in their everyday pipelines.
Reinhard Gentz, Sean Peisert, Joshua Boverhof, Daniel Gunter, "SPARCS: Stream-Processing Architecture applied in Real-time Cyber-physical Security", Proceedings of the 15th IEEE International Conference on e-Science (eScience), San Diego, CA, IEEE, September 2019, doi: 10.1109/eScience.2019.00028
Anna Giannakou, Daniel Gunter, Sean Peisert, "Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks", Workshop on Innovating the Network for Data-Intensive Science (INDIS), November 11, 2018, doi: 10.1109/INDIS.2018.00004
Gilberto Z. Pastorello, Dan K. Gunter, Housen Chu, Danielle S. Christianson, Carlo Trotta, Eleonora Canfora, Boris Faybishenko, You-Wei Cheah, Norm Beekwilder, Stephen W. Chan, Sigrid Dengel, Trevor Keenan, Fianna O'Brien, Abdelrahman Elbashandy, Cristina M. Poindexter, Marty Humphrey, Dario Papale, Deb A. Agarwal, "Hunting Data Rogues at Scale: Data Quality Control for Observational Data in Research Infrastructures", Proceedings of the 13th IEEE International Conference on e-Science (e-Science 2017), Auckland, New Zealand, 2017, doi: 10.1109/eScience.2017.64
L Ramakrishnan, D Gunter, "Ten principles for creating usable software for science", Proceedings - 13th IEEE International Conference on eScience, eScience 2017, 2017, 210--218, doi: 10.1109/eScience.2017.34
Gilberto Z. Pastorello, Deb A. Agarwal, Taghrid Samak, Dario Papale, Carlo Trotta, Alessio Ribeca, Cristina M. Poindexter, Boris Faybishenko, Dan K. Gunter, Rachel Hollowgrass, Eleonora Canfora, "Observational data patterns for time series data quality assessment", Proceedings of the 10th IEEE International Conference on e-Science (e-Science 2014), Guaruja, Brazil, 2014, doi: 10.1109/eScience.2014.45
Lavanya Ramakrishnan, Sarah S. Poon, Val C. Hendrix, Dan K. Gunter, Gilberto Z. Pastorello, Deb A. Agarwal, "Experiences with User-Centered Design for the Tigres Workflow API", Proceedings of the 10th IEEE International Conference on e-Science (e-Science 2014), Guaruja, Brazil, 2014, doi: 10.1109/eScience.2014.56
Elif Dede, Madhusudhan Govindaraju, Daniel Gunter, Richard Canon, Lavanya Ramakrishnan, "Semi-Structured Data Analysis using MongoDB and MapReduce: A Performance Evaluation", Proceedings of the 4th International Workshop on Scientific Cloud Computing, 2013
Karan Vahi, Ian Harvey, Taghrid Samak, Dan Gunter, Kieran Evans, David Rogers, Ian Taylor, Monte Goode, Fabio Silva, Eddie Al-Shakarchi, Gaurang Mehta, Andrew Jones, Ewa Deelman, "A General Approach to Real-time Workflow Monitoring", The Seventh Workshop on Workflows in Support of Large-Scale Science (WORKS12), 2012
Elif Dede, Zacharia Fadika, Jessica Hartog, Madhusudhan Govindaraju, Lavanya Ramakrishnan, Daniel Gunter, Richard Shane Canon, "MARISSA: MApReduce Implementation for Streaming Science Applications", IEEE eScience Conference, 2012
D Gunter, S Cholia, A Jain, M Kocher, K Persson, L Ramakrishnan, SP Ong, G Ceder, "Community accessible datastore of high-throughput calculations: Experiences from the materials project", Proceedings - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012, 2012, 1244--1251, doi: 10.1109/SC.Companion.2012.150
Dan Gunter, Raj Kettimuthu, Ezra Kissel, Martin Swany, Jun Yi, Jason Zurawski, "Exploiting Network Parallelism for Improving Data Transfer Performance", SC12 Companion, 2012
Taghrid Samak, Dan Gunter, Monte Goode, Ewa Deelman, Fabio Silva, Karan Vahi, "Failure Analysis of Distributed Scientific Workflows Executing in the Cloud", 8th International Conference on Network and Service Management (CNSM 2012), 2012
T Samak, D Gunter, V Hendrix, "Scalable analysis of network measurements with Hadoop and Pig", Proceedings of the 2012 IEEE Network Operations and Management Symposium, NOMS 2012, 2012, 1254--1259, doi: 10.1109/NOMS.2012.6212060
Ezra Kissel, Ahmed El-Hassany, Guilherme Fernandes, Martin Swany, Dan Gunter, Taghrid Samak, Jennifer M. Schopf, "Scalable Integrated Performance Analysis of Multi-Gigabit Networks", Fifth International Workshop on Distributed Autonomous Network Management Systems 2012 (DANMS 12), 2012
A Jain, G Hautier, SP Ong, C Moore, B Kang, H Chen, X Ma, JC Kim, M Kocher, D Gunter, S Cholia, A Greiner, DH Bailey, D Skinner, K Persson, G Ceder, "Materials Project: A public materials database and its application to lithium ion battery cathode design", Abstracts of Papers of the American Chemical Society, 2012, 243
Taghrid Samak, Dan Gunter, Ewa Deelman, Gideon Juve, Gaurang Mehta, Fabio Silva, Karan Vahi, "Online Fault and Anomaly Detection for Large-Scale Scientific Workflows", 13th IEEE Conference on High Performance Computing and Communications (HPCC-2011), 2011
Dan Gunter, Taghrid Samak, Ewa Deelman, Christopher H. Brooks, Monte Goode, Gideon Juve, Gaurang Mehta, Priscilla Moraes, Fabio Silva, Martin Swany, Karan Vahi, "Online Workflow Management and Performance Analysis with STAMPEDE", 7th International Conference on Network and Service Management (CNSM 2011), Paris, France, 2011
Taghrid Samak, Dan Gunter, Monte Goode, Ewa Deelman, Gaurang Mehta, Fabio Silva, Karan Vahi, "Failure Prediction and Localization in Large Scientific Workflows", The Sixth Workshop on Workflows in Support of Large-Scale Science (WORKS11), 2011
Elif Dede, Madhusudhan Govindaraju, Daniel Gunter, Lavanya Ramakrishnan, "Riding the Elephant: Managing Ensembles with Hadoop", 4th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS), 2011
Raj Kettimuthu, Alex Sim, Dan Gunter, Bill Allcock, Peer T. Bremer, John Bresnahan, Andrew Cherry, Lisa Childers, Eli Dart, Ian Foster, Kevin Harms, Jason Hick, Jason Lee, Michael Link, Jeff Long, Keith Miller, Vijaya Natarajan, Valerio Pascucci, Ken Raffenetti, David Ressman, Dean Williams, Loren Wilson, Linda Winkler, "Lessons learned from moving earth system grid data sets over a 20 Gbps wide-area network", HPDC 10, New York, NY, USA, ACM, 2010, 316--319, doi: 10.1145/1851476.1851519
Shoaib Kamil, Ali Pinar, Dan Gunter, Michael Lijewski, Leonid Oliker, John Shalf, "Reconfigurable hybrid interconnection for static and dynamic scientific applications", Conf. Computing Frontiers, 2007, 183--194, LBNL 60060