Julian is a senior at UC Berkeley studying Computer Science. His primary research interests are parallel programming tools for scientific computing, data management techniques for large scientific datasets, and dense linear algebra. His current work at LBL focuses on symPACK, a UPC++ based solver for sparse symmetric matrices.
Julian Bellavita, Esmond Ng, Mathias Jacquelin, Dan Bonachea, Johnny Corbino, Paul H. Hargrove, "symPACK: A GPU-Capable Fan-Out Sparse Cholesky Solver", IEEE/ACM Parallel Applications Workshop, Alternatives To MPI+X (PAW-ATM) (to appear), ACM, November 13, 2023,
Large sparse symmetric positive definite systems of linear equations are ubiquitous in scientific workloads and applications. Parallel sparse Cholesky factorization is the method of choice for solving these sorts of linear systems. Therefore, the development of parallel sparse Cholesky codes that can efficiently run on today's large-scale heterogeneous distributed memory platforms is of vital importance. Modern supercomputers, such as the Perlmutter supercomputer at NERSC or the Frontier supercomputer at ORNL, offer nodes that contain a mix of CPUs and GPUs. To take full advantage of the computing power of these nodes, scientific codes need to be adapted to offload expensive computations to GPUs.
We present symPACK, a GPU-capable parallel sparse Cholesky factorization code that uses one-sided communication primitives and remote procedure calls provided by the UPC++ library. We also utilize a UPC++ feature known as "memory kinds", which allows communication and memory movement between host and device via a simple, portable interface. We show that on a number of large problems, symPACK outperforms comparable state-of-the-art GPU-capable Cholesky factorization codes by up to 10x. symPACK's GPU functionality and its high performance make it well suited to the heterogeneous HPC systems of the modern era.
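To illustrate the numerical kernel at the heart of a solver like symPACK, here is a minimal sketch of (dense, serial) Cholesky factorization in Python. This is only the textbook algorithm for reference; symPACK itself operates on sparse symmetric matrices and distributes the computation across nodes and GPUs via UPC++.

```python
import math

def cholesky(A):
    """Return the lower-triangular L such that A = L * L^T.

    A must be a symmetric positive definite matrix, given as a
    list of row lists. Dense, serial, illustrative only.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # Diagonal entry: sqrt of the remaining pivot.
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                # Off-diagonal entry: scaled by the pivot column's diagonal.
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

# Example: A = [[4, 2], [2, 3]] factors as L = [[2, 0], [1, sqrt(2)]].
L = cholesky([[4.0, 2.0], [2.0, 3.0]])
```

In the sparse distributed setting, the same column-by-column dependency structure is what a fan-out solver parallelizes: once a column of L is computed, its updates are sent (here, via one-sided UPC++ communication) to the processes owning the columns that depend on it.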
J. Bellavita, A. Sim, K. Wu, I. Monga, C. Guok, F. Würthwein, D. Davila, "Studying Scientific Data Lifecycle in On-demand Distributed Storage Caches", 5th ACM International Workshop on System and Network Telemetry and Analysis (SNTA) 2022, in conjunction with The 31st ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2022, doi: 10.1145/3526064.3534111
Julian Bellavita, Alex Sim (advisor), John Wu (advisor), "Predicting Scientific Dataset Popularity Using dCache Logs", ACM/IEEE The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’22), ACM Student Research Competition (SRC), Second place winner, 2022,
The dCache installation is a storage management system that acts as a disk cache for high-energy physics (HEP) data. Storage space on dCache is limited relative to persistent storage devices; therefore, a heuristic is needed to determine which data should be kept in the cache. A good cache policy keeps frequently accessed data in the cache, but this requires knowledge of future dataset popularity. We present methods for forecasting the number of times a dataset stored on dCache will be accessed in the future. First, we present a deep neural network that predicts future dataset accesses accurately, reporting a final normalized loss of 4.6e-8. We then present a set of algorithms that forecast future dataset accesses given an access sequence, including two novel algorithms, Backup Predictor and Last N Successors, that outperform other file prediction algorithms. Our findings suggest that it is possible to anticipate dataset popularity in advance.
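To give a flavor of sequence-based access prediction, here is a minimal successor-table predictor in Python. This is a generic illustrative sketch, not the Backup Predictor or Last N Successors algorithms from the paper (whose details are in the publication): it simply records which dataset tends to follow which, and predicts the most frequent observed successor of the most recent access.

```python
from collections import defaultdict, Counter

class SuccessorPredictor:
    """Predict the next dataset access from observed successor frequencies.

    Illustrative sketch of sequence-based prediction; not the
    algorithms presented in the paper.
    """
    def __init__(self):
        # succ[x][y] = number of times access to y directly followed x.
        self.succ = defaultdict(Counter)
        self.prev = None

    def record(self, dataset):
        """Observe one access in the sequence."""
        if self.prev is not None:
            self.succ[self.prev][dataset] += 1
        self.prev = dataset

    def predict(self):
        """Return the most frequent successor of the last access, or None."""
        if self.prev is None or not self.succ[self.prev]:
            return None
        return self.succ[self.prev].most_common(1)[0][0]

# Example: after the access sequence a, b, a, b, a the predictor
# expects "b" next, since "b" always followed "a" so far.
p = SuccessorPredictor()
for name in ["a", "b", "a", "b", "a"]:
    p.record(name)
```

A cache policy built on such a predictor would prioritize keeping (or prefetching) the datasets predicted to be accessed soon, which is the motivation for forecasting popularity in the first place.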