X-Tune: Auto-tuning for Exascale

Automatic performance tuning (auto-tuning) has emerged as an effective means of providing performance portability from one architecture to the next.  Rather than hoping a compiler can deliver optimal performance on ever more novel multicore architectures or, worse, tuning each code by hand, auto-tuned kernels and applications can tune themselves for the target CPU, network, and programming model.
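As a rough illustration of the idea, the sketch below runs an empirical search: it times one kernel under several candidate tile sizes on the machine at hand and keeps the fastest. This is a minimal sketch and not the CHiLL/ROSE workflow; the function names (`jacobi_sweep_tiled`, `autotune`), the 5-point stencil, the grid size, and the candidate list are all assumptions chosen for illustration. In pure Python the tile size has little effect on run time, so the example shows only the shape of the search loop, not realistic speedups.

```python
import time


def jacobi_sweep_tiled(grid, tile):
    """One 5-point Jacobi sweep over the interior of a square grid,
    visiting the interior in tile x tile blocks (hypothetical kernel)."""
    n = len(grid)
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for ii in range(1, n - 1, tile):
        for jj in range(1, n - 1, tile):
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, n - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out


def autotune(kernel, grid, candidates, trials=3):
    """Time each candidate tile size and return (best_tile, timings).
    The best of `trials` runs is kept to reduce timing noise."""
    timings = {}
    for tile in candidates:
        best = float("inf")
        for _ in range(trials):
            start = time.perf_counter()
            kernel(grid, tile)
            best = min(best, time.perf_counter() - start)
        timings[tile] = best
    best_tile = min(timings, key=timings.get)
    return best_tile, timings


# Illustrative problem size and search space (assumptions, not from X-Tune).
n = 64
grid = [[float((i * n + j) % 7) for j in range(n)] for i in range(n)]
candidates = [4, 8, 16, 32, 64]

best_tile, timings = autotune(jacobi_sweep_tiled, grid, candidates)
print("fastest tile size on this machine:", best_tile)
```

Every tile size computes the same result, so the search changes only how the loop nest is traversed. A production auto-tuner would typically search a much larger space of transformations and generate compiled code for each variant.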

Our work is one component of a larger DOE X-Stack2 project (X-Tune), a collaboration between the University of Utah, Lawrence Berkeley Lab, the University of Southern California, and Argonne National Lab.  Building on the algorithmic and pathfinding work of the CACHE institute in conjunction with the CHiLL/ROSE auto-tuning framework, we at LBL are researching and developing tools that automatically implement code transformations that minimize vertical (i.e., from DRAM) data movement and aggregate horizontal (i.e., MPI) data movement.  To that end, we are leveraging the CHiLL/ROSE compiler to automatically transform and auto-tune numerical methods including Multigrid, the Spectral Element Method, and block eigensolvers like LOBPCG.

Researchers

 

Software

  • HPGMG-FV: a scalable compact benchmark developed under the ExaCT project for understanding the challenges of Geometric Multigrid on petascale and exascale systems built from multicore processors and manycore accelerators.  X-Tune leverages this code for compiler research.
  • miniGMG: a compact Geometric Multigrid benchmark developed under the CACHE project for optimization, architecture, and algorithmic research at small scale.  X-Tune leverages this code for compiler research.

 

 

Exascale Research Conference Materials

  • Handout
  • Quad Chart
  • Highlights
  • Poster

 

Publications

 

Journal Article

2017

Protonu Basu, Samuel Williams, Brian Van Straalen, Leonid Oliker, Phillip Colella, Mary Hall, "Compiler-Based Code Generation and Autotuning for Geometric Multigrid on GPU-Accelerated Supercomputers", Parallel Computing (PARCO), April 2017, doi: 10.1016/j.parco.2017.04.002

Conference Paper

2015

Protonu Basu, Samuel Williams, Brian Van Straalen, Mary Hall, Leonid Oliker, Phillip Colella, "Compiler-Directed Transformation for Higher-Order Stencils", International Parallel and Distributed Processing Symposium (IPDPS), May 2015

2014

Protonu Basu, Samuel Williams, Brian Van Straalen, Leonid Oliker, Mary Hall, "Converting Stencils to Accumulations for Communication-Avoiding Optimization in Geometric Multigrid", Workshop on Stencil Computations (WOSC), October 2014

2013

Protonu Basu, Anand Venkat, Mary Hall, Samuel Williams, Brian Van Straalen, Leonid Oliker, "Compiler Generation and Autotuning of Communication-Avoiding Operators for Geometric Multigrid", 20th International Conference on High Performance Computing (HiPC), December 2013, pp. 452-461

Protonu Basu, Anand Venkat, Mary Hall, Samuel Williams, Brian Van Straalen, Leonid Oliker, "Compiler Generation and Autotuning of Communication-Avoiding Operators for Geometric Multigrid", Workshop on Stencil Computations (WOSC), 2013

Presentation/Talk

2015

Samuel Williams, "X-TUNE", X-Stack PI Meeting, December 2015

Report

2012

Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian Van Straalen, Mikhail Smelyanskiy, Ann Almgren, Pradeep Dubey, John Shalf, Leonid Oliker, "Implementation and Optimization of miniGMG - a Compact Geometric Multigrid Benchmark", December 2012, LBNL 6676E