Dr. Houjun Tang
Computer Research Scientist
Berkeley Lab
1 Cyclotron Road, 50A-3135
Berkeley, California 94720, US
Houjun Tang (唐厚君) is currently a Computer Research Scientist in the Scientific Data Management Group at Berkeley Lab. His research interests include data management, storage systems, parallel I/O, and high performance computing. Tang received his Ph.D. in Computer Science from North Carolina State University in 2016, and a B.Eng. in Computer Science from Shenzhen University, China, in 2012. He is currently working on projects funded by the DOE Office of Science and Office of Cybersecurity, Energy Security, and Emergency Response.
Journal Articles
M Scot Breitenfeld, Houjun Tang, Huihuo Zheng, Jordan Henderson, Suren Byna, "HDF5 in the Exascale Era: Delivering Efficient and Scalable Parallel I/O for Exascale Applications", The International Journal of High Performance Computing Applications, October 16, 2024, doi: 10.1177/10943420241288244
David McCallen, Arben Pitarka, Houjun Tang, Ramesh Pankajakshan, Anders Petersson, Mamun Miah, "Transformational Regional-Scale Earthquake Simulations with the DOE EarthQuake SIMulation Exascale Framework", Scientific Impact of the Exascale Computing Project (ECP), August 1, 2024, doi: 10.1109/MCSE.2024.3397768
D McCallen, A Pitarka, H Tang, R Pankajakshan, NA Petersson, M Miah, "Transformational Regional-Scale Earthquake Simulations with the DOE EarthQuake SIMulation (EQSIM) Exascale Framework", Computing in Science & Engineering, May 8, 2024, doi: 10.1109/MCSE.2024.3397768
David McCallen, Arben Pitarka, Houjun Tang, Ramesh Pankajakshan, N Anders Petersson, Mamun Miah, Junfei Huang, "Regional-scale fault-to-structure earthquake simulations with the EQSIM framework: Workflow maturation and computational performance on GPU-accelerated exascale platforms", Earthquake Spectra, May 3, 2024, 40(3):1615-1652, doi: 10.1177/87552930241246235
R. Han, M. Zheng, S. Byna, H. Tang, B. Dong, D. Dai, Y. Chen, D. Kim, J. Hassoun, D. Thorsley, M. Wolf, "PROV-IO: A Cross-Platform Provenance Framework for Scientific Data on HPC Systems", IEEE Transactions on Parallel and Distributed Systems, March 14, 2024
Jean Luca Bez, Houjun Tang, Scot Breitenfeld, Huihuo Zheng, Wei-Keng Liao, Kaiyuan Hou, Zanhua Huang, Suren Byna, "h5bench: Exploring HDF5 Access Patterns Performance in Pre-Exascale Platforms", Concurrency and Computation: Practice and Experience (CCPE), January 31, 2024
Xiaoxia Zhang, Degang Chen, Hong Yu, Guoyin Wang, Houjun Tang, Kesheng Wu, "Improving nonnegative matrix factorization with advanced graph regularization", Information Sciences, June 1, 2022, 597:125-143, doi: 10.1016/j.ins.2022.03.008
Houjun Tang, Quincey Koziol, John Ravi, and Suren Byna, "Transparent Asynchronous Parallel I/O using Background Threads", IEEE Transactions on Parallel and Distributed Systems, April 4, 2022, 33, doi: 10.1109/TPDS.2021.3090322
David McCallen, Houjun Tang, Suiwen Wu, Eric Eckert, Junfei Huang, N Anders Petersson, "Coupling of regional geophysics and local soil-structure models in the EQSIM fault-to-structure earthquake simulation framework", The International Journal of High Performance Computing Applications, May 25, 2021, doi: 10.1177/10943420211019118
David McCallen, Anders Petersson, Arthur Rodgers, Arben Pitarka, Mamun Miah, Floriana Petrone, Bjorn Sjogreen, Norman Abrahamson, Houjun Tang, "EQSIM—A multidisciplinary framework for fault-to-structure earthquake simulations on exascale computers part I: Computational models and workflow", Earthquake Spectra, May 1, 2021, 37:707-735, doi: 10.1177/8755293020970982
Suren Byna, M. Scot Breitenfeld, Bin Dong, Quincey Koziol, Elena Pourmal, Dana Robinson, Jerome Soumagne, Houjun Tang, Venkatram Vishwanath, and Richard Warren, "ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems", Journal of Computer Science and Technology 2020, 35(1): 145-160, February 2, 2020, doi: 10.1007/s11390-020-9822-9
Conference Papers
Rajeev Jain, Houjun Tang, Akash Dhruv, Suren Byna, "Enabling Data Reduction for Flash-X Simulations", 10th International Workshop on Data Analysis and Reduction for Big Scientific Data (DRBSD), 2024
D.K. Sung, Y. Son, A. Sim, K. Wu, S. Byna, H. Tang, H. Eom, C. Kim, S. Kim, "A2FL: Autonomous and Adaptive File Layout in HPC through Real-time Access Pattern Analysis", 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS2024), 2024
Wei Zhang, Houjun Tang, Suren Byna, "IDIOMS: Index-powered Distributed Object-centric Metadata Search for Scientific Data Management", The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2024), Philadelphia, PA, USA, IEEE, May 9, 2024, doi: 10.1109/CCGrid59990.2024.00072
- Download File: 956600a598.pdf (pdf: 782 KB)
Daoce Wang, Jesus Pulido, Pascal Grosset, Jiannan Tian, Sian Jin, Houjun Tang, Jean Sexton, Sheng Di, Kai Zhao, Bo Fang, Zarija Lukić, Franck Cappello, James Ahrens, Dingwen Tao, "AMRIC: A novel in situ lossy compression framework for efficient I/O in adaptive mesh refinement applications", SC23: International Conference for High Performance Computing, Networking, Storage and Analysis, November 12, 2023, doi: 10.1145/3581784.3613212
Md Kamal Hossain Chowdhury, Houjun Tang, Jean Luca Bez, Purushotham V. Bangalore, Suren Byna, "Efficient Asynchronous I/O with Request Merging", 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), St. Petersburg, FL, USA, IEEE, 2023, 628-636, doi: 10.1109/IPDPSW59300.2023.00107
John Ravi, Suren Byna, Quincey Koziol, Houjun Tang, Michela Becchi, "Evaluating Asynchronous Parallel I/O on HPC Systems", 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 15, 2023, doi: 10.1109/IPDPS54959.2023.00030
Sian Jin, Dingwen Tao, Houjun Tang, Sheng Di, Suren Byna, Zarija Lukic, Franck Cappello, "Accelerating parallel write via deeply integrating predictive lossy compression with HDF5", SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, November 13, 2022, doi: 10.1109/SC41404.2022.00066
Rajeev Jain, Houjun Tang, Akash Dhruv, J Austin Harris, Suren Byna, "Accelerating flash-x simulations with asynchronous I/O", https://ieeexplore.ieee.org/abstract/document/10026923/, November 13, 2022, doi: 10.1109/PDSW56643.2022.00008
Runzhou Han, Suren Byna, Houjun Tang, Bin Dong, and Mai Zheng, "PROV-IO: An I/O-Centric Provenance Framework for Scientific Data on HPC Systems", HPDC 2022, June 23, 2022
Huihuo Zheng, Venkatram Vishwanath, Quincey Koziol, Houjun Tang, John Ravi, John Mainzer, Suren Byna, "HDF5 Cache VOL: Efficient and scalable parallel I/O through caching data on node-local storage", 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), May 16, 2022, doi: 10.1109/CCGrid54584.2022.00015
Houjun Tang, Bing Xie, Suren Byna, Phillip Carns, Quincey Koziol, Sudarsun Kannan, Jay Lofstead, and Sarp Oral, "SCTuner: An Auto-tuner Addressing Dynamic I/O Needs on Supercomputer I/O Sub-systems", 6th International Parallel Data Systems Workshop (PDSW) 2021, held in conjunction with SC21, November 21, 2021
Cong Xu, Suparna Bhattacharya, Martin Foltin, Suren Byna, and Paolo Faraboschi, "Data-Aware Storage Tiering for Deep Learning", 6th International Parallel Data Systems Workshop (PDSW) 2021, held in conjunction with SC21, November 21, 2021
Bing Xie, Houjun Tang, Suren Byna, Jesse Hanley, Quincey Koziol, Tonglin Li, Sarp Oral, "Battle of the Defaults: Extracting Performance Characteristics of HDF5 under Production Load", CCGrid 2021, May 31, 2021
Jean Luca Bez, Houjun Tang, Bing Xie, David Williams-Young, Rob Latham, Rob Ross, Sarp Oral, Suren Byna, "I/O Bottleneck Detection and Tuning: Connecting the Dots using Interactive Log Analysis", 2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW), January 1, 2021, 15-22, doi: 10.1109/PDSW54622.2021.00008
Tonglin Li, Suren Byna, Quincey Koziol, Houjun Tang, Jean Luca Bez, Qiao Kang, "h5bench: HDF5 I/O Kernel Suite for Exercising HPC I/O Patterns", Cray User Group (CUG) 2021, January 1, 2021
Houjun Tang, Suren Byna, Bin Dong, Quincey Koziol, "Parallel Query Service for Object-centric Data Management Systems", 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), IEEE, May 18, 2020, 406-415
Richard Warren, Jerome Soumagne, Jingqing Mu, Houjun Tang, Suren Byna, Bin Dong, Quincey Koziol, "Analysis in the Data Path of an Object-centric Data Management System", 26th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2019), December 18, 2019
Houjun Tang, Suren Byna, Stephen Bailey, Zarija Lukic, Jialin Liu, Quincey Koziol, Bin Dong, "Tuning Object-centric Data Management Systems for Large Scale Scientific Applications", 26th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2019), December 18, 2019
Wei Zhang, Suren Byna, Houjun Tang, Brody Williams, Yong Chen, "MIQS: Metadata Indexing and Querying Service for Self-Describing File Formats", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19), November 19, 2019
- Download File: 3295500.3356146.pdf (pdf: 1 MB)
Houjun Tang, Quincey Koziol, Suren Byna, John Mainzer, Tonglin Li, "Enabling Transparent Asynchronous I/O using Background Threads", 2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW 2019), November 19, 2019, doi: 10.1109/PDSW49588.2019.00006
Tonglin Li, Quincey Koziol, Houjun Tang, Jialin Liu, Suren Byna, "I/O Performance Analysis of Science Applications Using HDF5 File-level Provenance", Cray User Group (CUG) 2019, May 10, 2019
Jingqing Mu, Jerome Soumagne, Suren Byna, Quincey Koziol, Houjun Tang, Richard Warren, "Interfacing HDF5 with A Scalable Object-centric Storage System on Hierarchical Storage", Cray User Group (CUG) 2019, May 7, 2019
Bin Dong, Kesheng Wu, Suren Byna, Houjun Tang, "SLOPE: Structural Locality-Aware Programming Model for Composing Array Data Analysis", International Conference on High Performance Computing, January 1, 2019, 61-80
- Download File: slope-cr.pdf (pdf: 623 KB)
Jialin Liu, Quincey Koziol, Gregory Butler, Neil Fortner, Mohamad Chaarawi, Houjun Tang, Suren Byna, Glenn Lockwood, Ravi Cheema, Kristy Kallback-Rose, Damian Hazen, Prabhat, "Evaluation of HPC Application I/O on Object Storage Systems", 3rd Joint International Workshop on Parallel Data Storage and Data Intensive Scalable Computing Systems (PDSW-DISCS), November 12, 2018
Wei Zhang, Houjun Tang, Suren Byna, Yong Chen, "DART: Distributed Adaptive Radix Tree for Efficient Affix-based Keyword Search on HPC Systems", Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, November 1, 2018, 24
- Download File: 3243176.3243207.pdf (pdf: 1.1 MB)
Kimmy Mu, Jerome Soumagne, Houjun Tang, Suren Byna, Quincey Koziol, Richard Warren, "A Server-managed Transparent Object Storage Abstraction for HPC", 2018 IEEE International Conference on Cluster Computing (CLUSTER), September 10, 2018
Teng Wang, Suren Byna, Bin Dong, and Houjun Tang, "UniviStor: Integrated Hierarchical and Distributed Storage for HPC", IEEE Cluster 2018, September 1, 2018
Houjun Tang, Suren Byna, Francois Tessier, Teng Wang, Bin Dong, Jingqing Mu, Quincey Koziol, Jerome Soumagne, Venkatram Vishwanath, Jialin Liu, and Richard Warren, "Toward Scalable and Asynchronous Object-centric Data Management for HPC", 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2018, May 1, 2018
Bin Dong, Teng Wang, Houjun Tang, Quincey Koziol, Kesheng Wu, Suren Byna, "ARCHIE: Data analysis acceleration with array caching in hierarchical storage", 2018 IEEE International Conference on Big Data (Big Data), January 1, 2018, 211-220
- Download File: DataElevator-ARCHIE.pdf (pdf: 613 KB)