Berkeley Lab Scientific Computing Seminar

Date:
Wednesday, March 21, 2007
Time:
2:00pm-3:00pm
Location:
Building 70-191
Seminar Speaker:
Felix Wolf
Title:
Scalable Trace-Based Performance Analysis of Parallel Applications
Abstract:
The complex architectures of high-end computing systems present difficult challenges for performance optimization of scientific applications. Tools are needed that collect and present relevant information on application performance so as to enable developers to easily identify and determine the causes of performance bottlenecks.

Event tracing is a widely used technique for the performance analysis of parallel applications. Time-stamped events, such as entering a function or sending a message, are recorded at runtime and analyzed afterwards. The KOJAK trace analyzer, a joint development of Forschungszentrum Jülich and the University of Tennessee, can automatically locate complex performance problems by searching event traces for patterns of inefficient behavior and quantifying their significance.

While KOJAK has proven useful in providing higher-level feedback on the performance of medium-scale parallel applications, its method of sequentially analyzing trace data does not scale to applications with thousands of processes. In the SCALASCA project, we are developing a scalable version of KOJAK that exploits both the distributed memory and the parallel processing capabilities available on the target platform. The talk will describe the new parallel analyzer architecture and discuss results for up to 16,384 processes obtained on an IBM Blue Gene/L system.

Sponsor of Seminar:
David Bailey
Scientific Computing

Contact: Esmond G. Ng, EGNg@lbl.gov