Berkeley Lab Scientific Computing Seminar

Date:
Thursday, September 6, 2007
Time:
11:00am-12:00pm
Location:
Building 50B-4205
Seminar Speaker:
Mark Silberstein
Technion, Israel
Title:
Superlink-online: a large-scale distributed system for genetic linkage analysis
Abstract:
Genetic linkage analysis is a statistical tool used by geneticists for mapping disease-susceptibility genes in the study of genetic diseases. However, such analysis is often beyond the capabilities of a single computer.

We present a distributed system for faster analysis of genetic data, called Superlink-online. The system achieves high performance through parallel execution of linkage analysis tasks over thousands of computational resources residing in multiple opportunistic computing environments, aka Grids.

Notably, the system is available online, which allows geneticists to perform computationally intensive analyses without installing software or maintaining a complicated distributed environment.

In this talk we will describe the scheduling system architecture that drives Superlink-online. The main challenges have been to split large tasks efficiently for distributed execution in a highly dynamic, non-dedicated environment, and to provide nearly interactive response times for shorter tasks while simultaneously serving massively parallel ones. The system utilizes resources in all the available grids, unifying thousands of CPUs across campus grids at the Technion and the University of Wisconsin in Madison, EGEE grids in Europe, and the Community Computing Grid Superlink@Technion.

The system is used extensively by medical centers worldwide. Since January 2006, over 12,000 interactive genetic analysis tasks have been performed, utilizing over 240 years of CPU time.

The talk is self-contained and does not require any prior knowledge of human genetics or distributed computing.

Mark Silberstein is a PhD student in the CS department at the Technion under the joint supervision of Prof. Assaf Schuster and Dan Geiger. His main research focus has been efficient serial and parallel algorithms for inference in Bayesian networks (in the context of genetic linkage analysis), and their execution in large-scale opportunistic computing environments, aka Grids. He is currently visiting UC Davis, working with Prof. John Owens on the parallelization of Bayesian inference on GPUs.

This work was done as part of Mark's PhD at the Technion under the joint supervision of Prof. Assaf Schuster and Dan Geiger. The initial version was published in the American Journal of Human Genetics and presented at the High Performance Distributed Computing conference in 2006.

Sponsor of Seminar:
Scientific Computing

Contact: Esmond G. Ng, EGNg@lbl.gov