Bin Dong (董斌)
Research Scientist (career), Scientific Data Management (SDM) Group, LBNL, 2018 - Present
Research Scientist (temp-career), Scientific Data Management (SDM) Group, LBNL, 2016 - 2018
Postdoctoral Research Fellow, Scientific Data Management (SDM) Group, LBNL, 2013 - 2016.
Ph.D. in Computer Science and Technology, Beihang University, China, January 2013.
B.S. in Computer Science and Technology, University of Electronic Science and Technology of China, June 2008.
Bin's research interests are in big scientific data management and analysis, parallel computing, and machine learning.
Currently, Bin is exploring new, scalable algorithms and data structures for sorting, organizing, indexing, searching, and analyzing big scientific data (mostly in array form) on supercomputers.
Temporary repositories for the software I am working on:
SDS framework (ask permission): https://code.lbl.gov/svn/sds/
ArrayUDF : https://bitbucket.org/arrayudf/
DataElevator : https://bitbucket.org/sbyna/dataelevator
SDS-Sort: coming soon.
The full publication list is available on Google Scholar.
Bin Dong, Xiuqiao Li, Limin Xiao, Li Ruan, "Towards minimizing disk I/O contention: A partitioned file assignment approach", Future Generation Computer Systems, Volume 37, July 2014, Pages 178-190, 2014,
Bin Dong, Xiuqiao Li, Qimeng Wu, Limin Xiao, Li Ruan, "A dynamic and adaptive load balancing strategy for parallel file system with large-scale I/O servers", Journal of Parallel and Distributed Computing (JPDC), Volume 72, Issue 10, October 2012, Pages 1254-1268, 2012,
Bin Dong, Kesheng Wu, Suren Byna, Houjun Tang, "SLOPE: Structural Locality-aware Programming Model for Composing Array Data Analysis", ISC 2019 (Acceptance rate: 24%), June 16, 2019,
Bin Dong, Teng Wang, Houjun Tang, Quincey Koziol, Kesheng Wu, and Suren Byna, "ARCHIE: Data Analysis Acceleration with Array Caching in Hierarchical Storage", IEEE BigData 2018, December 10, 2018,
- Download File: DataElevator-ARCHIE.pdf (pdf: 613 KB)
Xin Xing, Bin Dong, Jonathan Ajo-Franklin, Kesheng Wu, "Automated Parallel Data Processing Engine with Application to Large-Scale Feature Extraction", 2018 IEEE/ACM Machine Learning in HPC Environments (MLHPC) in SC 2018, November 10, 2018,
- Download File: arrayudf-das.pdf (pdf: 2.7 MB)
Teng Wang, Suren Byna, Bin Dong, and Houjun Tang, "UniviStor: Integrated Hierarchical and Distributed Storage for HPC", IEEE Cluster 2018, September 1, 2018,
Weijie Zhao, Florin Rusu, Bin Dong, Kesheng Wu, Anna Ho, and Peter Nugent, "Distributed Caching for Processing Raw Arrays", SSDBM, 2018,
Houjun Tang, Suren Byna, Francois Tessier, Teng Wang, Bin Dong, Jingqing Mu, Quincey Koziol, Jerome Soumagne, Venkatram Vishwanath, Jialin Liu, and Richard Warren, "Toward Scalable and Asynchronous Object-centric Data Management for HPC", 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2018, May 1, 2018,
Tzu-Hsien Wu, Jerry Chou, Shyng Hao, Bin Dong, Kesheng Wu, Scott Klasky, "Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling", The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'17), November 13, 2017,
Houjun Tang, Suren Byna, Bin Dong, Jialin Liu, and Quincey Koziol, "SoMeta: Scalable Object-centric Metadata Management for High Performance Computing", IEEE Cluster 2017, September 5, 2017,
Bin Dong, Kesheng Wu, Surendra Byna, Jialin Liu, Weijie Zhao, Florin Rusu, "ArrayUDF: User-Defined Scientific Data Analysis on Arrays", The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) 2017 (Acceptance rate: 19%), June 26, 2017,
- Download File: hpdc02.pdf (pdf: 921 KB)
Weijie Zhao, Florin Rusu, Bin Dong, Kesheng Wu, and Peter Nugent, "Incremental View Maintenance over Array Data", In Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD '17) (Acceptance rate: 20%). ACM, New York, NY, USA, May 14, 2017,
Bin Dong, Suren Byna, Kesheng Wu, Prabhat, Hans Johansen, Jeffrey N. Johnson, and Noel Keen, "Data Elevator: Low-contention Data Movement in Hierarchical Storage System", The 23rd annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC) (Acceptance rate: 25%), December 19, 2016,
- Download File: 201612-DataElevator-HiPC2016-Bin-Byna.pdf (pdf: 765 KB)
Wenzhao Zhang, Houjun Tang, Stephen Ranshous, Surendra Byna, Daniel F Martín, Kesheng Wu, Bin Dong, Scott Klasky, Nagiza F Samatova, "Exploring memory hierarchy and network topology for runtime AMR data sharing across scientific applications", 2016 IEEE International Conference on Big Data (Big Data) (Acceptance rate: 19.39% as short papers.), December 5, 2016,
Houjun Tang, Suren Byna, Steve Harenberg, Wenzhao Zhang, Xiaocheng Zou, Daniel F Martin, Bin Dong, Dharshi Devendran, Kesheng Wu, David Trebotich, et al., "In Situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses", 2016 45th International Conference on Parallel Processing (ICPP) (Acceptance rate: 21.1%), August 16, 2016, Pages 406-415,
Bin Dong, Suren Byna, and Kesheng Wu, "SDS-Sort: Scalable Dynamic Skew-aware Parallel Sorting", The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) 2016, July 1, 2016,
- Download File: SDS-Sort.pdf (pdf: 450 KB)