SuperLU-DIST
Recently we have developed a communication-avoiding 3D sparse LU factorization in SuperLU_DIST, which reduces the latency by a factor of O(log n) for plannar graphs and O(n^(1/3)) for non-plannar graphs.
Our new 3D code achieves speedups up to 27x for planar graphs and up to 2.5x for non-planar graphs over the baseline 2D SuperLU_DIST when run on 24,000 cores of a Cray XC30, Edison at NERSC.
Contact: Sherry Li, xsli@lbl.gov