Houjun Tang
Houjun Tang (唐厚君) is a Computer Research Scientist in the Scientific Data Management Group at Berkeley Lab. His research interests include data management, storage systems, parallel I/O, and high-performance computing. Tang received his Ph.D. in Computer Science from North Carolina State University in 2016 and a B.Eng. in Computer Science from Shenzhen University, China, in 2012. His current projects include ECP-EQSIM (High-Performance, Multidisciplinary Simulations for Regional-Scale Earthquake Hazard/Risk Assessments), ECP-ExaIO (Advancing HPC I/O to Enable Scientific Discovery), and PDC (Proactive Data Containers for next-generation HPC storage).
Journal Articles
Jean Luca Bez, Houjun Tang, Scot Breitenfeld, Huihuo Zheng, Wei-Keng Liao, Kaiyuan Hou, Zanhua Huang, Suren Byna, "h5bench: Exploring HDF5 Access Patterns Performance in Pre-Exascale Platforms", Concurrency and Computation: Practice and Experience (CCPE), January 31, 2024
Houjun Tang, Quincey Koziol, John Ravi, and Suren Byna, "Transparent Asynchronous Parallel I/O using Background Threads", IEEE Transactions on Parallel and Distributed Systems, April 4, 2022, 33, doi: 10.1109/TPDS.2021.3090322
David McCallen, Houjun Tang, Suiwen Wu, Eric Eckert, Junfei Huang, N Anders Petersson, "Coupling of regional geophysics and local soil-structure models in the EQSIM fault-to-structure earthquake simulation framework", The International Journal of High Performance Computing Applications, May 25, 2021, doi: 10.1177/10943420211019118
David McCallen, Anders Petersson, Arthur Rodgers, Arben Pitarka, Mamun Miah, Floriana Petrone, Bjorn Sjogreen, Norman Abrahamson, Houjun Tang, "EQSIM—A multidisciplinary framework for fault-to-structure earthquake simulations on exascale computers part I: Computational models and workflow", Earthquake Spectra, May 1, 2021, 37:707-735, doi: 10.1177/8755293020970982
Suren Byna, M. Scot Breitenfeld, Bin Dong, Quincey Koziol, Elena Pourmal, Dana Robinson, Jerome Soumagne, Houjun Tang, Venkatram Vishwanath, and Richard Warren, "ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems", Journal of Computer Science and Technology 2020, 35(1): 145-160, February 2, 2020, doi: 10.1007/s11390-020-9822-9
Conference Papers
D.K. Sung, Y. Son, A. Sim, K. Wu, S. Byna, H. Tang, H. Eom, C. Kim, S. Kim, "A2FL: Autonomous and Adaptive File Layout in HPC through Real-time Access Pattern Analysis", 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS2024), 2024
Wei Zhang, Houjun Tang, Suren Byna, "IDIOMS: Index-powered Distributed Object-centric Metadata Search for Scientific Data Management", The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing. Philadelphia, 2024 (CCGrid 2024), May 9, 2024
- Download File: CCGrid-Paper-158-IDIOMS-Wei-Zhang-2024054155.pdf (pdf: 568 KB)
Cong Xu, Suparna Bhattacharya, Martin Foltin, Suren Byna, and Paolo Faraboschi, "Data-Aware Storage Tiering for Deep Learning", 6th International Parallel Data Systems Workshop (PDSW) 2021, held in conjunction with SC21, November 21, 2021
Houjun Tang, Bing Xie, Suren Byna, Phillip Carns, Quincey Koziol, Sudarsun Kannan, Jay Lofstead, and Sarp Oral, "SCTuner: An Auto-tuner Addressing Dynamic I/O Needs on Supercomputer I/O Sub-systems", 6th International Parallel Data Systems Workshop (PDSW) 2021, held in conjunction with SC21, November 21, 2021
Bing Xie, Houjun Tang, Suren Byna, Jesse Hanley, Quincey Koziol, Tonglin Li, Sarp Oral, "Battle of the Defaults: Extracting Performance Characteristics of HDF5 under Production Load", CCGrid 2021, May 31, 2021
Tonglin Li, Suren Byna, Quincey Koziol, Houjun Tang, Jean Luca Bez, Qiao Kang, "h5bench: HDF5 I/O Kernel Suite for Exercising HPC I/O Patterns", Cray User Group (CUG) 2021, January 1, 2021
Jean Luca Bez, Houjun Tang, Bing Xie, David Williams-Young, Rob Latham, Rob Ross, Sarp Oral, Suren Byna, "I/O Bottleneck Detection and Tuning: Connecting the Dots using Interactive Log Analysis", 2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW), January 1, 2021, 15-22, doi: 10.1109/PDSW54622.2021.00008
Houjun Tang, Suren Byna, Bin Dong, Quincey Koziol, "Parallel Query Service for Object-centric Data Management Systems", 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), IEEE, May 18, 2020, 406-415
Houjun Tang, Suren Byna, Stephen Bailey, Zarija Lukic, Jialin Liu, Quincey Koziol, Bin Dong, "Tuning Object-centric Data Management Systems for Large Scale Scientific Applications", 26th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2019), December 18, 2019
Richard Warren, Jerome Soumagne, Jingqing Mu, Houjun Tang, Suren Byna, Bin Dong, Quincey Koziol, "Analysis in the Data Path of an Object-centric Data Management System", 26th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2019), December 18, 2019
Houjun Tang, Quincey Koziol, Suren Byna, John Mainzer, Tonglin Li, "Enabling Transparent Asynchronous I/O using Background Threads", 2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW 2019), November 19, 2019, doi: 10.1109/PDSW49588.2019.00006
Wei Zhang, Suren Byna, Houjun Tang, Brody Williams, Yong Chen, "MIQS: Metadata Indexing and Querying Service for Self-Describing File Formats", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19), November 19, 2019
- Download File: 3295500.3356146.pdf (pdf: 1 MB)
Tonglin Li, Quincey Koziol, Houjun Tang, Jialin Liu, Suren Byna, "I/O Performance Analysis of Science Applications Using HDF5 File-level Provenance", Cray User Group (CUG) 2019, May 10, 2019
Jingqing Mu, Jerome Soumagne, Suren Byna, Quincey Koziol, Houjun Tang, Richard Warren, "Interfacing HDF5 with A Scalable Object-centric Storage System on Hierarchical Storage", Cray User Group (CUG) 2019, May 7, 2019
Bin Dong, Kesheng Wu, Suren Byna, Houjun Tang, "SLOPE: Structural Locality-Aware Programming Model for Composing Array Data Analysis", International Conference on High Performance Computing, January 1, 2019, 61-80
- Download File: slope-cr.pdf (pdf: 623 KB)
Jialin Liu, Quincey Koziol, Gregory Butler, Neil Fortner, Mohamad Chaarawi, Houjun Tang, Suren Byna, Glenn Lockwood, Ravi Cheema, Kristy Kallback-Rose, Damian Hazen, Prabhat, "Evaluation of HPC Application I/O on Object Storage Systems", 3rd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW-DISCS), November 12, 2018
Wei Zhang, Houjun Tang, Suren Byna, Yong Chen, "DART: Distributed Adaptive Radix Tree for Efficient Affix-based Keyword Search on HPC Systems", Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, November 1, 2018, 24
- Download File: 3243176.3243207.pdf (pdf: 1.1 MB)
Kimmy Mu, Jerome Soumagne, Houjun Tang, Suren Byna, Quincey Koziol, Richard Warren, "A Server-managed Transparent Object Storage Abstraction for HPC", 2018 IEEE International Conference on Cluster Computing (CLUSTER), September 10, 2018
Teng Wang, Suren Byna, Bin Dong, and Houjun Tang, "UniviStor: Integrated Hierarchical and Distributed Storage for HPC", IEEE Cluster 2018, September 1, 2018
Houjun Tang, Suren Byna, Francois Tessier, Teng Wang, Bin Dong, Jingqing Mu, Quincey Koziol, Jerome Soumagne, Venkatram Vishwanath, Jialin Liu, and Richard Warren, "Toward Scalable and Asynchronous Object-centric Data Management for HPC", 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2018, May 1, 2018
Bin Dong, Teng Wang, Houjun Tang, Quincey Koziol, Kesheng Wu, Suren Byna, "ARCHIE: Data analysis acceleration with array caching in hierarchical storage", 2018 IEEE International Conference on Big Data (Big Data), January 1, 2018, 211-220
- Download File: DataElevator-ARCHIE.pdf (pdf: 613 KB)