Wei Zhang
Dr. Wei Zhang (张威) is a Computer Science Researcher at Lawrence Berkeley National Laboratory (LBNL), specializing in advancing data management for scientific applications in heterogeneous computing environments. His research focuses on bridging the gap between high-performance computing (HPC) and artificial intelligence (AI) by developing innovative solutions for managing and discovering large-scale scientific data. Dr. Zhang's key contributions include:
- Graph-Based Metadata Management: His work on GraphMeta, IOGP, and AKIN has laid the foundation for efficient metadata organization and retrieval in complex scientific computing environments.
- Scientific Data Discovery: He has made substantial advancements through projects like DART, MIQS, and IDIOMS, significantly improving metadata indexing and querying in parallel object-centric storage environments.
- Activeness-Based Data Retention: His novel approach, ActiveDR, optimizes storage based on user activity and access patterns, addressing long-term storage challenges in data-intensive, heterogeneous environments.
Currently, Dr. Zhang is leading research initiatives at LBNL on I/O optimization for GNN training, accelerating AI-powered data search, and LLM/RAG-powered scientific data discovery.
Prior to joining LBNL, Dr. Zhang held positions as a Senior Member of Technical Staff at Oracle Corporation and a Research Assistant at Texas Tech University. He has authored numerous publications in top-tier conferences and journals, including SC, PACT, CCGrid, and IEEE TPDS. He actively serves as an invited paper reviewer or program committee member for journals and conferences such as TPDS, SC, IPDPS, CCGrid, and HiPC.
Dr. Zhang obtained his Ph.D. in Computer Science from Texas Tech University and his BSc in Computer Science from Hebei University of Science and Technology. With his strong expertise in data management, HPC, and AI, he is committed to advancing scientific computing infrastructure to support groundbreaking research and discovery.
Journal Articles
Naizhuo Zhao, Guofeng Cao, Wei Zhang, Eric L. Samson, Yong Chen, "A comparison study between nighttime lights and location-based social media at the 500 m spatial resolution", International Journal of Applied Earth Observation and Geoinformation, May 1, 2020, 87
Dong Dai, Yong Chen, Philip Carns, John Jenkins, Wei Zhang, Robert Ross, "Managing Rich Metadata in High-Performance Computing Systems Using a Graph Model", IEEE Transactions on Parallel and Distributed Systems, July 1, 2019, 30, doi: 10.1109/TPDS.2018.2887380
Naizhuo Zhao, Guofeng Cao, Wei Zhang, Eric L. Samson, "Tweets or nighttime lights: Comparison for preeminence in estimating socioeconomic factors", ISPRS Journal of Photogrammetry and Remote Sensing, December 1, 2018, 146:1-10
Naizhuo Zhao, Wei Zhang, Ying Liu, Eric L. Samson, Yong Chen, Guofeng Cao, "Improving Nighttime Light Imagery With Location-Based Social Media Data", IEEE Transactions on Geoscience and Remote Sensing, October 24, 2018, 2161, doi: 10.1109/TGRS.2018.2871788
Conference Papers
Hyunju Oh, Wei Zhang, Christopher D. Rickett, Sreenivas R. Sukumar, Suren Byna, "Evaluating Performance Trade-offs of Caching Strategies for AI-Powered Querying Systems", 2024 IEEE International Conference on Big Data (IEEE BigData 2024), Washington DC, USA, 2024
Camera-ready in preparation
Wei Zhang, Houjun Tang, Suren Byna, "BULKI - Binary Unified Layout for Key-value Interchange", 9th International Parallel Data Systems Workshop (PDSW), 2024
Wei Zhang, Houjun Tang, Suren Byna, "IDIOMS: Index-powered Distributed Object-centric Metadata Search for Scientific Data Management", The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2024), Philadelphia, PA, USA, IEEE, May 9, 2024, doi: 10.1109/CCGrid59990.2024.00072
- Download File: 956600a598.pdf (pdf: 782 KB)
Chenxu Niu, Wei Zhang, Suren Byna, Yong Chen, "PSQS: Parallel Semantic Querying Service for Self-describing File Formats", 2023 IEEE International Conference on Big Data (BigData), December 1, 2023, doi: 10.1109/BigData59044.2023.10386205
Chenxu Niu, Wei Zhang, Suren Byna, Yong Chen, "Kv2vec: A Distributed Representation Method for Key-value Pairs from Metadata Attributes", 2022 IEEE Conference on High Performance Extreme Computing (HPEC), September 19, 2022, doi: 10.1109/HPEC55821.2022.9926389
Wei Zhang, Suren Byna, Hyogi Sim, Sangkeun Lee, Sudharshan Vazhkudai, Yong Chen, "Exploiting User Activeness for Data Retention in HPC Systems", International Conference for High Performance Computing, Networking, Storage and Analysis (SC '21), November 21, 2021, doi: 10.1145/3458817.3476201
- Download File: 3458817.3476201-2.pdf (pdf: 1.5 MB)
Wei Zhang, Suren Byna, Chenxu Niu, Yong Chen, "Exploring Metadata Search Essentials for Scientific Data Management", 26th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2019), December 17, 2019
Wei Zhang, Suren Byna, Houjun Tang, Brody Williams, Yong Chen, "MIQS: Metadata Indexing and Querying Service for Self-Describing File Formats", The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19), November 19, 2019
- Download File: 3295500.3356146.pdf (pdf: 1 MB)
Wei Zhang, Houjun Tang, Suren Byna, Yong Chen, "DART: Distributed Adaptive Radix Tree for Efficient Affix-based Keyword Search on HPC Systems", Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, November 1, 2018, 24
- Download File: 3243176.3243207.pdf (pdf: 1.1 MB)
Wei Zhang, Yong Chen, Dong Dai, "AKIN: A Streaming Graph Partitioning Algorithm for Distributed Graph Storage Systems", 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), May 4, 2018
- Download File: AKIN.pdf (pdf: 314 KB)