Careers | Phone Book | A - Z Index

Aydın Buluç

aydin front 2018 small
Aydın Buluç
Staff Scientist
Phone: +1 510 486 5197

Aydın Buluç is a Staff Scientist at the Computational Research Division of Berkeley Lab, working on high-performance graph analysis, libraries, and their applications in genomics and bioinformatics. His current research interests also include parallel sparse matrix computations and communication-avoiding algorithms. Previously, he was a Luis W. Alvarez Fellow. He earned his doctorate in computer science from the University of California at Santa Barbara in 2010. He is a recipient of the DOE Early Career Award in 2013 and the IEEE TCSC Award for Excellence for Early Career Researchers in 2015. 

» Visit Aydın Buluç’s personal web page for more information.

Current Research

Past Research

Journal Articles

Katherine Yelick, Aydın Buluç, Muaaz Awan, Ariful Azad, Benjamin Brock, Rob Egan, Saliya Ekanayake, Marquita Ellis, Evangelos Georganas, Giulia Guidi, Steven Hofmeyr, Oguz Selvitopi, Cristina Teodoropol, Leonid Oliker, "The parallelism motifs of genomic data analysis", Philosophical Transactions of The Royal Society A: Mathematical, Physical and Engineering Sciences, 2020,

Giulia Guidi, Marquita Ellis, Daniel Rokhsar, Katherine Yelick, Aydın Buluç, "BELLA: Berkeley Efficient Long-Read to Long-Read Aligner and Overlapper (Preprint)", Submitted, 2018, doi:

Ariful Azad, Georgios A. Pavlopoulos, Christos A. Ouzounis, Nikos C. Kyrpides, Aydin Buluç, "HipMCL: A high-performance parallel implementation of the Markov cluster algorithm for large scale networks", Nucleic Acids Research, April 2018,

Ariful Azad, Grey Ballard, Aydin Buluc, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams, "Exploiting multiple levels of parallelism in sparse matrix-matrix multiplication", SIAM Journal on Scientific Computing, 38(6), C624–C651, November 2016, doi: 10.1137/15M104253X

Hasan Metin Aktulga, Md. Afibuzzaman, Samuel Williams, Aydın Buluc, Meiyue Shao, Chao Yang, Esmond G. Ng, Pieter Maris, James P. Vary, "A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations", IEEE Transactions on Parallel and Distributed Systems (TPDS), November 2016, doi: 10.1109/TPDS.2016.2630699

Ariful Azad, Aydın Buluç, "A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs", Parallel Computing, June 2016,

Ariful Azad, Aydın Buluç, Alex Pothen, "Computing Maximum Cardinality Matchings in Parallel on Bipartite Graphs via Tree-Grafting", IEEE Transactions on Parallel and Distributed Systems (TPDS), May 2016,

cover copy parco

Aydin Buluc, John Gilbert, Leonid Oliker, "Special Issue: Graph Analysis for Scientific Discovery", Parallel Computing Journal Special Issue Editors, August 1, 2015,

J. Chapman, M. Mascher, A. Buluç, K. Barry, E. Georganas, A. Session, V. Strnadova, J. Jenkins, S. Sehgal, L. Oliker, J Schmutz, K. Yelick, U. Scholz, R. Waugh, J. Poland, G. Muehlbauer, N. Stein, D. Rokhsar, "A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome", Genome biology, 2015,

Adam Lugowski, Shoaib Kamil, Aydın Buluç, Samuel Williams, Erika Duriakova, Leonid Oliker, Armando Fox, John R. Gilbert,, "Parallel processing of filtered queries in attributed semantic graphs", Journal of Parallel and Distributed Computing (JPDC), September 2014, doi: 10.1016/j.jpdc.2014.08.010

A. Buluç, K. Madduri, "Graph partitioning for scalable distributed graph computations", AMS Contemporary Mathematics, Graph Partitioning and Graph Clustering (Proc. 10th DIMACS Implementation Challenge), 2013,

A. Buluç, J. Gilbert, "Parallel sparse matrix-matrix multiplication and indexing: Implementation and experiments", SIAM Journal on Scientific Computing (SISC), 2012,

A. Buluç, J. Gilbert, "The Combinatorial BLAS: Design, implementation, and applications", International Journal of High-Perormance Computing Applications (IJHPCA), 2011,

A. Buluç, J. R. Gilbert, C. Budak, "Solving path problems on the GPU", Parallel Computing, 36(5-6):241 - 253., 2010, doi:

Conference Papers

Oguz Selvitopi*, Saliya Ekanayake*, Giulia Guidi, Georgios Pavlopoulos, Ariful Azad, Aydın Buluç, "Distributed Many-to-Many Protein Sequence Alignment Using Sparse Matrices", Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’20)., 2020,

(*:joint first authors)

Alberto Zeni, Giulia Guidi, Marquita Ellis, Nan Ding, Marco D. Santambrogio, Steven Hofmeyr, Aydın Buluç, Leonid Oliker, Katherine Yelick, "LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment", 34th IEEE International Parallel and Distributed Processing Symposium (IPDPS20), 2020,

Benjamin A. Brock, Yuxin Chen, Jiakun Yan, John Owens, Aydın Buluç, Katherine Yelick, "RDMA vs. RPC for Implementing Distributed Data Structures", Proceedings of the 2019 IEEE/ACM 9th Workshop on Irregular Applications: Architectures and Algorithms (IA3), Denver, CO, USA, IEEE, November 18, 2019, doi: 10.1109/IA349570.2019.00009

Distributed data structures are key to implementing scalable applications for scientific simulations and data analysis. In this paper we look at two implementation styles for distributed data structures: remote direct memory access (RDMA) and remote procedure call (RPC). We focus on operations that require individual accesses to remote portions of a distributed data structure, e.g., accessing a hash table bucket or distributed queue, rather than global operations in which all processors collectively exchange information. We look at the trade-offs between the two styles through microbenchmarks and a performance model that approximates the cost of each. The RDMA operations have direct hardware support in the network and therefore lower latency and overhead, while the RPC operations are more expressive but higher cost and can suffer from lack of attentiveness from the remote side. We also run experiments to compare the real-world performance of RDMA- and RPC-based data structure operations with the predicted performance to evaluate the accuracy of our model, and show that while the model does not always precisely predict running time, it allows us to choose the best implementation in the examples shown. We believe this analysis will assist developers in designing data structures that will perform well on current network architectures, as well as network architects in providing better support for this class of distributed data structures.

Benjamin Brock, Aydın Buluç, Katherine Yelick, "BCL: A Cross-Platform Distributed Data Structures Library", ICPP 2019: Proceedings of the 48th International Conference on Parallel Processing, Kyoto, Japan, Association for Computing Machinery, August 2019, doi: 10.1145/3337821.3337912

One-sided communication is a useful paradigm for irregular parallel applications, but most one-sided programming environments, including MPI's one-sided interface and PGAS programming languages, lack application-level libraries to support these applications. We present the Berkeley Container Library, a set of generic, cross-platform, high-performance data structures for irregular applications, including queues, hash tables, Bloom filters and more. BCL is written in C++ using an internal DSL called the BCL Core that provides one-sided communication primitives such as remote get and remote put operations. The BCL Core has backends for MPI, OpenSHMEM, GASNet-EX, and UPC++, allowing BCL data structures to be used natively in programs written using any of these programming environments. Along with our internal DSL, we present the BCL ObjectContainer abstraction, which allows BCL data structures to transparently serialize complex data types while maintaining efficiency for primitive types. We also introduce the set of BCL data structures and evaluate their performance across a number of high-performance computing systems, demonstrating that BCL programs are competitive with hand-optimized code, even while hiding many of the underlying details of message aggregation, serialization, and synchronization.

Marquita Ellis, Giulia Guidi, Aydın Buluç, Leonid Oliker, Katherine Yelick, "diBELLA: Distributed Long Read to Long Read Alignment", 48th International Conference on Parallel Processing (ICPP), June 25, 2019,

P Koanantakool, A Ali, A Azad, A Buluç, D Morozov, L Oliker, KA Yelick, S-Y Oh, "Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation.", Proceedings of Machine Learning Research, PMLR, 2018, 84:1376--1386,

Yang You, Aydin Buluc, James Demmel, "Scaling deep learning on GPU and Knights Landing clusters", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'17), 2017,

Ariful Azad, Aydin Buluc, "Towards a GraphBLAS Library in Chapel", IPDPS Workshops, Orlando, FL, May 2017,

Aydin Buluc, Tim Mattson, Scott McMillan, Jose Moreira, Carl Yang, "Design of the GraphBLAS API for C", IEEE Workshop on Graph Algorithm Building Blocks, IPDPSW, 2017,

Ariful Azad, Aydin Buluc, "A work-efficient parallel sparse matrix-sparse vector multiplication algorithm", IEEE International Parallel & Distributed Processing Symposium (IPDPS), Orlando, FL, May 2017,

Ariful Azad, Mathias Jacquelin, Aydin Bulu\cc, Esmond G Ng, "The reverse Cuthill-McKee algorithm in distributed-memory", Parallel and Distributed Processing Symposium (IPDPS), 2017 IEEE International, January 2017, 22--31,

M Ellis, E Georganas, R Egan, S Hofmeyr, A Buluç, B Cook, L Oliker, K Yelick, "Performance characterization of de novo genome assembly on leading parallel systems", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, 10417 LN:79--91, doi: 10.1007/978-3-319-64203-1_6

E Georganas, M Ellis, R Egan, S Hofmeyr, A Buluç, B Cook, L Oliker, K Yelick, "MerBench: PGAS benchmarks for high performance genome assembly", Proceedings of PAW 2017: 2nd Annual PGAS Applications Workshop - Held in conjunction with SC 2017: The International Conference for High Performance Computing, Networking, Storage and Analysis, 2017, 2017-Jan:1--4, doi: 10.1145/3144779.3169109

Jeremy Kepner, Peter Aaltonen, David Bader, Aydin Buluç, Franz Franchetti, John Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, Henning Meyerhenke, Scott McMillan, José Moreira, John Owens, Carl Yang, Marcin Zalewski, Timothy Mattson., "Mathematical foundations of the GraphBLAS", IEEE High Performance Extreme Computing (HPEC), September 1, 2016,

Veronika Strnadova-Neeley, Aydin Buluc, John R. Gilbert, Leonid Oliker, Weimin Ouyang, "LiRa: A New Likelihood-Based Similarity Score for Collaborative Filtering", August 30, 2016,

Ariful Azad, Aydin Buluç, "Distributed-Memory Algorithms for Maximum Cardinality Matching in Bipartite Graphs", IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2016,

P Koanantakool, A Azad, A Buluc, D Morozov, SY Oh, L Oliker, K Yelick, "Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication", Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016, January 2016, 842--853, doi: 10.1109/IPDPS.2016.117

Veronika Strnadová-Neeley, Aydın Buluç, Jarrod Chapman, John R. Gilbert, Joseph Gonzalez, Leonid Oliker, "Efficient Data Reduction for Large-Scale Genetic Mapping", ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), September 10, 2015,

Ariful Azad, Aydin Buluc, "Distributed-Memory Algorithms for Maximal Cardinality Matching using Matrix Algebra", IEEE Cluster, Chicago, IL, September 2015,

Evangelos Georganas, Aydın Buluç, Jarrod Chapman, Leonid Oliker, Daniel Rokhsar, Katherine Yelick, "merAligner: A Fully Parallel Sequence Aligner", IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2015, doi: 10.1109/IPDPS.2015.96

Ariful Azad, Aydin Buluc, John Gilbert, "Parallel Triangle Counting and Enumeration using Matrix Algebra", Workshop on Graph Algorithms Building Blocks (GABB), in conjunction with IPDPS, IEEE, May 2015,

Ariful Azad, Aydin Buluç, Alex Pothen, "A Parallel Tree Grafting Algorithm for Maximum Cardinality Matching in Bipartite Graphs", International Parallel and Distributed Processing Symposium (IPDPS), May 2015,

E Georganas, A Buluç, J Chapman, S Hofmeyr, C Aluru, R Egan, L Oliker, D Rokhsar, K Yelick, "HipMer: An extreme-scale de novo genome assembler", International Conference for High Performance Computing, Networking, Storage and Analysis, SC, January 1, 2015, 15-20-No, doi: 10.1145/2807591.2807664

Evangelos Georganas, Aydin Buluç, Jarrod Chapman, Leonid Oliker, Daniel Rokhsar, Katherine Yelick, "Parallel de bruijn graph construction and traversal for de novo genome assembly", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'14), November 2014,

Veronika Strnadova, Aydın Buluç, Joseph Gonzalez, Stefanie Jegelka, Jarrod Chapman, John Gilbert, Daniel Rokhsar, Leonid Oliker, "Efficient and accurate clustering for large-scale genetic mapping", IEEE International Conference on Bioinformatics and Biomedicine (BIBM'14), November 1, 2014,

H. M. Aktulga, A. Buluc, S. Williams, C. Yang, "Optimizing Sparse Matrix-Multiple Vector Multiplication for Nuclear Configuration Interaction Calculations", International Parallel and Distributed Processing Symposium (IPDPS 2014), May 2014, doi: 10.1109/IPDPS.2014.125

Tim Mattson, David Bader, Jon Berry, Aydin Buluc, Jack Dongarra, Christos Faloutsos, John Feo, John Gilbert, Joseph Gonzalez, Bruce
Hendrickson, Jeremy Kepner, Charles Leiserson, Andrew Lumsdaine, David Padua, Stephen Poole, Steve Reinhardt, Mike Stonebraker, Steve Wallach,
Andrew Yoo,
"Standards for Graph Algorithm Primitives", HPEC, 2013,

Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Benjamin Lipshitz, Oded Schwartz, Sivan Toledo, "Communication optimal parallel multiplication of sparse random matrices", SPAA 2013: The 25th ACM Symposium on Parallelism in Algorithms and Architectures, Montreal, Canada, 2013, 222-231, doi: 10.1145/2486159.2486196

E. Solomonik, A. Buluç, J. Demmel, "Minimizing communication in all-pairs shortest paths", International Parallel and Distributed Processing Symposium (IPDPS), 2013,

Aydın Buluç, Erika Duriakova, Armando Fox, John Gilbert, Shoaib Kamil, Adam Lugowski, Leonid Oliker, Samuel Williams, "High-Productivity and High-Performance Analysis of Filtered Semantic Graphs", International Parallel and Distributed Processing Symposium (IPDPS), 2013, doi: 10.1145/2370816.2370897

Scott Beamer, Aydın Buluç, Krste Asanović, David A Patterson, "Distributed Memory Breadth-First Search Revisited: Enabling Bottom-Up Search", Proc. Workshop on Multithreaded Architectures and Applications (MTAAP), in conjunction with IPDPS, 2013,

K. Kandalla, A. Buluç, H. Subramoni, K. Tomko, J. Vienne, L. Oliker, D. K. Panda, "Can network-offload based non-blocking neighborhood MPI collectives improve communication overheads of irregular graph algorithms?", International Workshop on Parallel Algorithms and Parallel Software (IWPAPS 2012), 2012,

A. Lugowski, D. Alber, A. Buluç, J. Gilbert, S. Reinhardt, Y. Teng, A. Waranis, "A flexible open-source toolbox for scalable complex graph analysis", SIAM Conference on Data Mining (SDM), 2012,

A. Lugowski, A. Buluç, J. Gilbert, S. Reinhardt, "Scalable complex graph analysis with the knowledge discovery toolbox", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012,

A. Buluç, K. Madduri, "Parallel breadth-first search on distributed memory systems", Supercomputing (SC), November 2011,

Aydın Buluç, Samuel Williams, Leonid Oliker, James Demmel, "Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication", IPDPS, IEEE, 2011, doi:

A. Buluç, J. T. Fineman, M. Frigo, J. R. Gilbert, C. E. Leiserson, "Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks", SPAA '09 Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, 2009, doi:

A, Buluç, J. Gilbert, "Challenges and Advances in Parallel Sparse Matrix-Matrix Multiplication", Proceedings of the 37th International Conference on Parallel Processing (ICPP), 2008, doi: 10.1109/ICPP.2008.45

A. Buluç, J.R. Gilbert, "On the Representation and Multiplication of Hypersparse Matrices", IEEE International Symposium on Parallel and Distributed Processing (IPDPS), 2008, doi:

Book Chapters

Aydin Buluç, Scott Beamer, Kamesh Madduri, Krste Asanović, David Patterson., "Distributed-memory breadth-first search on massive graphs.", In D. Bader (editor), Parallel Graph Algorithms. CRC Press/Taylor-Francis, ( 2015)

Aydin Buluç, Henning Meyerhenke, Ilya Safro, Peter Sanders, Christian Schulz., "Recent advances in graph partitioning", ArXiv, ( 2015)

A. Buluç, J. R. Gilbert, V. B. Shah, "Implementing Sparse Matrices for Graph Algorithms", Graph Algorithms in the Language of Linear Algebra. SIAM Press, ( 2011)

A. Buluç, J. R. Gilbert, "New Ideas in Sparse Matrix-Matrix Multiplication", Graph Algorithms in the Language of Linear Algebra. SIAM Press, ( 2011)


Ariful Azad, Aydın Buluç, Distributed-memory algorithms for cardinality matching using matrix algebra, SIAM Conference on Parallel Processing for Scientific Computing (PP), Paris, France, April 2016,

A. Azad, G. Ballard, A. Buluc, J. Demmel, J. Gilbert, L. Grigori, O. Schwartz, S. Toledo, S. Williams, Parallel Sparse Matrix-Matrix Multiplication and Its Use in Triangle Counting and Enumeration, SIAM ALA, October 26, 2015,


A. Buluç, A. Fox, J. R. Gilbert, S. Kamil, A. Lugowski, L. Oliker, S. Williams, "High-performance analysis of filtered semantic graphs", PACT '12 Proceedings of the 21st international conference on Parallel architectures and compilation techniques (extended abstract), 2012, doi: 10.1145/2370816.2370897