Careers | Phone Book | A - Z Index

Mathias Jacquelin

photo2
Mathias Jacquelin
Research Scientist
Computational Research Division
Phone: +1 510 495 2605
Lawrence Berkeley National Laboratory
1 Cyclotron Rd, M/S 50A3111
Berkeley, CA 94720 US

Position


Research scientist,  Computational Research Division, November 2015 - present.

Education


  • Ph.D., Ecole Normale Superieure de Lyon, Juillet 2011.
  • M.Sc., Institut National des Sciences Appliquees de Lyon, Juillet 2008.

Short Bio


Mathias Jacquelin is a research scientist in the Scalable Solvers Group in the Computational Research Division at the Lawrence Berkeley National Laboratory since November 2015. He is currently working on highly scalable algorithms for solving large sparse symmetric linear systems.

He obtained a Master’s Degree in Computer Science from INSA of Lyon, France (National Institute of Applied Sciences). He went on to the Ecole Normale Superieure of Lyon, to complete his Ph.D. in Computer Science, focusing on algorithms and scheduling, and how hierarchical memory architectures should be handled. He joined Laura Grigori's team at INRIA of Saclay, France, for 18 months, working on communication avoiding algorithms for dense linear algebra. He then joined LBNL in 2013 as a postdoctoral fellow in the Scietific Computing Group and then the Scalable Solvers Group, focusing on various sparse linear algebra computations.

For more information, please visit his personal website:

http://graal.ens-lyon.fr/~mjacquel/index

Journal Articles

M. Jacquelin, L. Lin and C. Yang, "A Distributed Memory Parallel Algorithm for Selected Inversion: the non-symmetric case", PMAA, December 30, 2016,

M. Jacquelin, L. Lin, W. Jia, Y. Zhao and C. Yang, "A Left-looking selected inversion algorithm and task parallelism on shared memory systems", April 9, 2016,

Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight, Hong Diep Nguyen, "Reconstructing Householder vectors from Tall-Skinny QR", Journal of Parallel and Distributed Computing, November 1, 2015, 85:3-31, doi: 10.1016/j.jpdc.2015.06.003

Mathias Jacquelin, Lin Lin, Chao Yang, "A Distributed Memory Parallel Algorithm for Selected Inversion : the Symmetric Case", To appear in ACM Transactions on Mathematical Software (TOMS), May 28, 2015,

Jack Dongarra, Mathieu Faverge, Thomas Herault, Mathias Jacquelin, Julien Langou, Yves Robert, "Hierarchical QR factorization algorithms for multi-core clusters", Parallel Computing, 2013, 39:212--232,

Tudor David, Mathias Jacquelin, Loris Marchal, "Scheduling streaming applications on a complex multicore platform", Concurrency and Computation: Practice and Experience, 2011, n/a--n/a, doi: 10.1002/cpe.1874

Conference Papers

John Bachan, Dan Bonachea, Paul H Hargrove, Steven Hofmeyr, Mathias Jacquelin, Amir Kamil, Brian Van Straalen, Scott Baden, "The UPC++ PGAS library for exascale computing", PAW 2017: 2nd Annual PGAS Applications Workshop - Held in conjunction with SC 2017, November 12, 2017, doi: 10.1145/3144779.3169108

We describe UPC++ V1.0, a C++11 library that supports APGAS programming. UPC++ targets distributed data structures where communication is irregular or fine-grained. The key abstractions are global pointers, asynchronous programming via RPC, and futures. Global pointers incorporate ownership information useful in optimizing for locality. Futures capture data readiness state, are useful for scheduling and also enable the programmer to chain operations to execute asynchronously as high-latency dependencies become satisfied, via continuations. The interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and closely resemble those used in modern C++. Communication in UPC++ runs at close to hardware speeds by utilizing the low-overhead GASNet-EX communication library.

Ariful Azad, Mathias Jacquelin, Aydin Buluc, Esmond G. Ng, "The Reverse Cuthill-McKee Algorithm in Distributed-Memory", IEEE International Parallel & Distributed Processing Symposium (IPDPS), Orlando, FL, May 2017,

Mathias Jacquelin, Yili Zheng, Esmond Ng, Katherine Yelick, "An Asynchronous Task-based Fan-Both Sparse Cholesky Solver", Submitted to SuperComputing'16, May 10, 2016,

Mathias Jacquelin, Lin Lin, Weile Jia, Yonghua Zhao, Chao Yang, "A Left-Looking Selected Inversion Algorithm and Task Parallelism on Shared Memory Systems", Submitted to SuperComputing'16, May 10, 2016,

M. Jacquelin, L. Lin, N. Wichmann and C. Yang, "Enhancing the scalability tree-based asynchronous communication", accepted IPDPS16, November 25, 2015,

G. Ballard, J. Demmel, L. Grigori, M. Jacquelin, Hong Diep Nguyen, E. Solomonik, "Reconstructing Householder Vectors from Tall-Skinny QR", Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, 2014, 1159-1170, doi: 10.1109/IPDPS.2014.120

Laura Grigori, Mathias Jacquelin, Amal Khabou, "Performance predictions of multilevel communication optimal LU and QR factorizations on hierarchical platforms", 29th IEEE International Supercomputing Conference (ISC'2014), Springer, 2014, 76--92,

Mathias Jacquelin, Marchal, Robert, Bora U\ccar, "On Optimal Tree Traversals for Sparse Matrix Factorization", IPDPS, 2011, 556-567,

Henricus Bouwmeester, Jacquelin, Langou, Yves Robert, "Tiled QR factorization algorithms", SC, 2011, 7,

Franck Cappello, Jacquelin, Marchal, Robert, Marc Snir, "Comparing archival policies for Blue Waters", HiPC, 2011, 1-10,

Matthieu Gallet, Jacquelin, Loris Marchal, "Scheduling complex streaming applications on the Cell processor", IPDPS Workshops, 2010, 1-8,

Mathias Jacquelin, Loris Marchal, Yves Robert, "Complexity analysis and performance evaluation of matrix product on multicore architectures", Parallel Processing, 2009. ICPP 09. International Conference on, 2009, 196--203,

Presentation/Talks

Mathias Jacquelin, Lin Lin, Nathan Wichmann, Chao Yang, Enhancing scalability and load balancing of Parallel Selected Inversion via tree-based asynchronous communication, IPDPS 16, May 24, 2016,

Mathias Jacquelin, Scheduling Sparse Symmetric Fan-Both Cholesky Factorization, The 11th Scheduling for Large Scale Systems Workshop, May 18, 2016,

Mathias Jacquelin, Scheduling Sparse Symmetric Fan-Both Cholesky Factorization, SIAM PP'16, April 15, 2016,

Reports

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ Programmer’s Guide, v1.0-2018.3.0", Lawrence Berkeley National Laboratory Tech Report, March 31, 2018, LBNL 2001136, doi: 10.2172/1430693

UPC++ is a C++11 library that provides Partitioned Global Address Space (PGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The PGAS model is single program, multiple-data (SPMD), with each separate thread of execution (referred to as a rank, a term borrowed from MPI) having access to local memory as it would in C++. However, PGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the ranks. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Bryce Lelbach, Brian van Straalen,, "UPC++ Specification v1.0, Draft 6", Lawrence Berkeley National Laboratory Tech Report, March 26, 2018, LBNL 2001135, doi: 10.2172/1430689

UPC++ is a C++11 library providing classes and functions that support Partitioned Global Address Space (PGAS) programming. We are revising the library under the auspices of the DOE’s Exascale Computing Project, to meet the needs of applications requiring PGAS support. UPC++ is intended for implementing elaborate distributed data structures where communication is irregular or fine-grained. The UPC++ interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and similar to those used in conventional C++. The UPC++ programmer can expect communication to run at close to hardware speeds. The key facilities in UPC++ are global pointers, that enable the programmer to express ownership information for improving locality, one-sided communication, both put/get and RPC, futures and continuations. Futures capture data readiness state, which is useful in making scheduling decisions, and continuations provide for completion handling via callbacks. Together, these enable the programmer to chain together a DAG of operations to execute asynchronously as high-latency dependencies become satisfied.

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ Programmer’s Guide, v1.0-2017.9", Lawrence Berkeley National Laboratory Tech Report, September 29, 2017, LBNL 2001065, doi: 10.2172/1398522

This document has been superseded by: UPC++ Programmer’s Guide, v1.0-2018.3.0 (LBNL-2001136)

UPC++ is a C++11 library that provides Asynchronous Partitioned Global Address Space (APGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The APGAS model is single program, multiple-data (SPMD), with each separate thread of execution (referred to as a rank, a term borrowed from MPI) having access to local memory as it would in C++. However, APGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the ranks. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Bryce Lelbach, Brian van Straalen,, "UPC++ Specification v1.0, Draft 4", Lawrence Berkeley National Laboratory Tech Report, September 27, 2017, LBNL 2001066, doi: 10.2172/1398521

This document has been superseded by: UPC++ Specification v1.0, Draft 6 (LBNL-2001135)

UPC++ is a C++11 library providing classes and functions that support Asynchronous Partitioned Global Address Space (APGAS) programming. We are revising the library under the auspices of the DOE’s Exascale Computing Project, to meet the needs of applications requiring PGAS support. UPC++ is intended for implementing elaborate distributed data structures where communication is irregular or fine-grained. The UPC++ interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and similar to those used in conventional C++. The UPC++ programmer can expect communication to run at close to hardware speeds. The key facilities in UPC++ are global pointers, that enable the programmer to express ownership information for improving locality, one-sided communication, both put/get and RPC, futures and continuations. Futures capture data readiness state, which is useful in making scheduling decisions, and continuations provide for completion handling via callbacks. Together, these enable the programmer to chain together a DAG of operations to execute asynchronously as high-latency dependencies become satisfied.

Posters

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ and GASNet: PGAS Support for Exascale Apps and Runtimes", Poster at Exascale Computing Project (ECP) Annual Meeting 2018., February 2018,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian Van Straalen, "UPC++: a PGAS C++ Library", ACM/IEEE Conference on Supercomputing, SC'17, November 2017,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ and GASNet: PGAS Support for Exascale Apps and Runtimes", Poster at Exascale Computing Project (ECP) Annual Meeting 2017., January 2017,

Mathias Jacquelin, "Memory-Aware Algorithms and Scheduling Techniques: From Multicore Processors to Petascale Supercomputers", IPDPS Workshops, Pages: 2038-2041 2011,