Careers | Phone Book | A - Z Index

Costin Iancu

fata2
Costin Iancu
Senior Staff Scientist
Phone: +1 510 495 2122
Fax: +1 510 486 6900

I am performing research in the areas of programming models and code optimization for large scale parallel systems.  I tend to favor simple and practical designs. Over the years I've been involved in multiple projects.

 

  • Big Data Support on HPC Systems. (Intel Parallel Computing Center): There exists incentive in the HPC community to run commercial data analytics frameworks (Hadoop, Spark) on  capability systems. Adoption, so far has been hampered by design decisions that are commercial data-center specific, e.g. local disk available. We plan to explore R&D for technologies to enable  Spark to run on large scale HPC systems. The IPCC is part of our larger research agenda and focuses on improving the interaction between Spark and the Lustre parallel global file system on production systems at NERSC. First notable accomplishment is that we show how to scale Spark from 100 cores to 10.000 cores on production HPC systems.
  • CORVETTE (Correctness, Verification and Testing of Parallel Programs): I'm interested in tools that assist developers in bug finding or program state exploration for debugging or optimization purposes. The focus is on developing a scalable dynamic program analysis framework for the HPC programming models
  •   Berkeley UPCI did write a lot of the compiler code and moved on to code optimizations, where I've been successful when using a combination of static and dynamic program analyses to guide communication and task scheduling optimizations.  To guide optimizations, I'd rather use "qualitative" performance models over the more traditional approches: track derivatives and preserve ordering rather than predict absolute values. 
  • THOR (Throughput Oriented Runtime): In this project we are trying to build a software overlay  network to perform on-the-fly communication optimizations for large scale systems.  There are three dimensions of control: concurrency, order and granularity. We have developed mechanisms to regulate the concurrency of communication operations, avoid congestion, scheduling/reordering for improved throughput etc.

The individual project pages contain more information: slides, posters, software, fun ...

 

Journal Articles

Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Williams, Costin Iancu, "Reaching Bandwidth Saturation Using Transparent Injection Parallelization", International Journal of High Performance Computing Applications (IJHPCA), November 2016, doi: 10.1177/1094342016672720

Nicholas Chaimov, Khaled Ibrahim, Samuel Williams, Costin Iancu, "Exploiting Communication Concurrency on High Performance Computing Systems", IJHPCA, April 17, 2015,

Khaled Z. Ibrahim, Steven Hofmeyr, Costin Iancu, "The Case for Partitioning Virtual Machines on Manycore Architectures", IEEE TPDS, April 17, 2014,

S Hofmeyr, J Colmenares, J Kubiatowicz, C Iancu, "Juggle: Addressing Extrinsic Load Imbalances in SPMD Applications on Multicore Computer", Cluster Computing, 2012,

Katherine A. Yelick, Dan Bonachea, Wei-Yu Chen, Phillip Colella, Kaushik Datta, Jason Duell, Susan L. Graham, Paul Hargrove, Paul N. Hilfinger, Parry Husbands, Costin Iancu, Amir Kamil, Rajesh Nishtala, Jimmy Su, Michael L. Welcome, Tong Wen, "Productivity and performance using partitioned global address space languages", Parallel Symbolic Computation (PASCO), 2007,

Conference Papers

Wim Lavrijsen, Costin Iancu, Wibe Albert de Jong, Xin Chen, Karsten Schwan, "Exploiting Variability for Energy Optimization of Load Balanced Parallel Programs", EuroSys 2016, February 5, 2016,

Nicholas Chaimov, Allen Malony, Shane Canon, Costin Iancu, Khaled Ibrahim, Jay Srinivasan, "Scaling Spark on HPC Systems", High Performance and Distributed Computing (HPDC), February 5, 2016,

Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu, "SReplay: Deterministic Sub-Group Replay for One-Sided Communication", International Conference on Supercomputing (ICS), 2016, February 5, 2016,

Cuong Nguyen, Cindy Rubio-Gonzalez, Benjamin Mehne, Koushik Sen, Costin Iancu, James Demmel, William Kahan, Wim Lavrijsen, David H. Bailey, David Hough, "Floating-point precision tuning using blame analysis", 38th International Conference on Software Engineering (ICSE 2016), January 20, 2016,

Costin Iancu, Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Williams, "Exploiting Communication Concurrency on High Performance Computing Systems", Programming Models and Applications for Multicores and Manycores (PMAM), February 2015,

M. Chabbi, W. Lavrijsen, W.A. de Jong, K. Sen, J. Mellor-Crummey, C. Iancu, "Barrier elision for production parallel programs", PPoPP ’15: Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, New York, NY, USA, ACM, February 7, 2015, 109-119, doi: 10.1145/2688500.2688502

Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushik Sen, John Mellor Crummey, Costin Iancu, "Barrier Elision for Production Parallel Programs", PPOPP 2015, February 5, 2015,

Khaled Ibrahim, Paul Hargrove, Costin Iancu, Katherine Yelick, "A Performance Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect", IPDPS 2014, April 17, 2014,

Cindy Rubio-Gonzalez, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen, David H. Bailey, Costin Iancu, David Hough, "Precimonious: Tuning Assistant for Floating-Point Precision", Supercomputing 2013, July 19, 2013,

Chang-Seo Park, Koushik Sen, Costin Iancu, "Scaling Data Race Detection for Partitioned Global Address Space Programs", International Supercomputing Conference (ICS) 2013, 2013,

Miao Luo, Dhabaleswar K. Panda, Khaled Z. Ibrahim, Costin Iancu, "Congestion avoidance on manycore high performance computing systems", International Conference on Supercomputing (ICS), 2012,

Chang-Seo Park, Koushik Sen, Paul Hargrove, Costin Iancu, "Efficient data race detection for distributed memory parallel programs", Supercomputing (SC), 2011,

Steven A. Hofmeyr, Juan A. Colmenares, Costin Iancu, John Kubiatowicz, "Juggle: proactive load balancing on multicore computers", High-Performance Parallel and Distributed Computing (HPDC), 2011,

Seung-Jai Min, Costin Iancu and Katherine Yelick, "Hierarchical Work Stealing on Manycore Clusters", Fifth Conference on Partitioned Global Address Space Programming Models (PGAS11), 2011,

Khaled Z. Ibrahim, S. Hofmeyr, Eric Roman, "Optimized Pre-Copy Live Migration for Memory Intensive Applications", The International Conference for High Performance Computing, Networking, Storage, and Analysis, 2011,

Khaled Z. Ibrahim, S. Hofmeyr, C. Iancu, "Characterizing the Performance of Parallel Applications on Multi-socket Virtual Machines", Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on, 2011, 1 -12,

Filip Blagojevic, Paul Hargrove, Costin Iancu, and Katherine Yelick, "Hybrid PGAS Runtime Support for Multicore Nodes", Fourth Conference on Partitioned Global Address Space Programming Model (PGAS10), October 2010,

Steven A. Hofmeyr, Costin Iancu, Filip Blagojevic, "Load balancing on speed", Principles and Practice of Parallel Programming (PPoPP), June 6, 2010,

Costin Iancu, Steven A. Hofmeyr, Filip Blagojevic, Yili Zheng, "Oversubscription on multicore processors", International Parallel & Distributed Processing Symposium (IPDPS), 2010,

Filip Blagojevic, Costin Iancu, Katherine A. Yelick, Matthew Curtis-Maury, Dimitrios S. Nikolopoulos, Benjamin Rose, "Scheduling dynamic parallelism on accelerators.", Computing Frontiers (CF), June 6, 2009,

Costin Iancu, Steven A. Hofmeyr, "Runtime optimization of vector operations on large scale SMP clusters", Parallel Architectures and Compilation Techniques (PACT), 2008,

Costin Iancu, Wei Chen, Katherine A. Yelick, "Performance portable optimizations for loops containing communication operations", International Conference on Supercomputing (ICS), June 6, 2008,

Katherine Yelick, Dan Bonachea, Wei-Yu Chen, Phillip Colella, Kaushik Datta, Jason Duell, Susan L. Graham, Paul Hargrove, Paul Hilfinger, Parry Husbands, Costin Iancu, Amir Kamil, Rajesh Nishtala, Jimmy Su, Michael Welcome, Tong Wen, "Productivity and Performance Using Partitioned Global Address Space Languages", Parallel Symbolic Computation (PASCO'07), July 2007,

Partitioned Global Address Space (PGAS) languages combine the programming convenience of shared memory with the locality and performance control of message passing. One such language, Unified Parallel C (UPC) is an extension of ISO C defined by a consortium that boasts multiple proprietary and open source compilers. Another PGAS language, Titanium, is a dialect of Java T M designed for high performance scientific computation. In this paper we describe some of the highlights of two related projects, the Titanium project centered at U.C. Berkeley and the UPC project centered at Lawrence Berkeley National Laboratory. Both compilers use a source-to-source strategy that translates the parallel languages to C with calls to a communication layer called GASNet. The result is portable highperformance compilers that run on a large variety of shared and distributed memory multiprocessors. Both projects combine compiler, runtime, and application efforts to demonstrate some of the performance and productivity advantages to these languages.

Costin Iancu, Erich Strohmaier, "Optimizing communication overlap for high-speed networks", Principles and Practice of Parallel Programming (PPoPP), 2007,

Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine A. Yelick, "Automatic nonblocking communication for partitioned global address space programs", International Conference on Supercomputing (ICS), 2007,

Leonid Oliker, Canning, Carter, Iancu, Lijewski, Kamil, Shalf, Shan, Strohmaier, Ethier, Tom Goodale, "Scientific Application Performance on Candidate PetaScale Platforms", IEEE International Symposium on Parallel & Distributed Processing (IPDPS). BEST PAPER AWARD - application track., 2007, 1-12, doi: 10.1109/IPDPS.2007.370259

Costin Iancu, Parry Husbands, Paul Hargrove, "HUNTing the Overlap", IEEE Parallel Architectures and Compilation Techniques (PACT), 2005,

Wei-Yu Chen, Costin Iancu, Katherine A. Yelick, "Communication Optimizations for Fine-Grained UPC Applications", IEEE Parallel Architectures and Compilation Techniques, 2005,

Costin Iancu, Parry Husbands, Wei Chen, "Message Strip-Mining Heuristics for High Speed Networks", VECPAR, 2004,

Christian Bell, Dan Bonachea, Yannick Cote, Jason Duell, Paul Hargrove, Parry Husbands, Costin Iancu, Michael L. Welcome, Katherine A. Yelick, "An Evaluation of Current High-Performance Networks", IPDPS - IEEE International Parallel & Distributed Processing Symposium, 2003,

Parry Husbands, Costin Iancu, Katherine A. Yelick, "A performance analysis of the Berkeley UPC compiler", International Conference on Supercomputing (ICS), 2003,

Costin Iancu, Anurag Acharya, "A Comparison of Feedback Based and Fair Queuing Mechanisms for Handling Unresponsive Traffic", IEEE Symposium on Computers and Communications (ISCC), 2001,

Costin Iancu, Anurag Acharya, "An evaluation of search tree techniques in the presence of caches", International Symposium on Performance Analysis of Systems and Software, June 6, 2001,

Andrew Duncan, Bogdan Cocosel, Costin Iancu, Holger Kienle, Radu Rugina, Urs Holzle, "OSUIF: SUIF 2.0 With Objects", 2nd SUIF Compiler Workshop, June 6, 1997,

Book Chapters

L. Oliker, A. Canning, J. Carter, C. Iancu, M. Lijewski, S. Kamil, J. Shalf, H. Shan, E. Strohmaier, S. Ethier, T. Goodale, "Performance Characteristics of Potential Petascale Scientific Applications", Petascale Computing: Algorithms and Applications. Chapman & Hall/CRC Computational Science Series (Hardcover), edited by David A. Bader, ( 2007)

Chapter

Presentation/Talks

Yili Zheng, Filip Blagojevic, Dan Bonachea, Paul H. Hargrove, Steven Hofmeyr, Costin Iancu, Seung-Jai Min, Katherine Yelick, Getting Multicore Performance with UPC, SIAM Conference on Parallel Processing for Scientific Computing, February 2010,

Yili Zheng, Costin Iancu, Paul H. Hargrove, Seung-Jai Min, Katherine Yelick, Extending Unified Parallel C for GPU Computing, SIAM Conference on Parallel Processing for Scientific Computing, February 24, 2010,

Reports

Cindy Rubio-Gonz ́alez, Cuong Nguyen, James Demmel, William Kahan, Koushik Sen, Wim Lavrijsen, Costin Iancu, "Floating Point Precision Tuning Using Blame Analysis", LBNL TR, April 17, 2015,

Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu, "OPR: Partial Deterministic Record and Replay for One-Sided Communication", LBNL TR, April 17, 2015,

Michael Schmitt. Anurag Acharya, Max Ibel, Costin Iancu, "Service Sockets: A Uniform User-Level Interface for Networking Applications", 2001,

Posters

Chang-Seo Park, Koushik Sen, Costin Iancu, "Scaling Data Race Detection for Partitioned Global Address Space Programs", Principles and Practice of Parallel Programming (PPoPP 2013), March 4, 2013,

Costin Iancu, Wei Chen, Katherine A. Yelick, "Performance Portable Optimizations for Loops Containing Communication Operations", IEEE PACT, 2007,

Christian Bell, Dan Bonachea, Wei Chen, Jason Duell, Paul Hargrove, Parry Husbands, Costin Iancu, Wei Tu, Mike Welcome, Kathy Yelick, "GASNet 2 - An Alternative High-Performance Communication Interface", Poster Session at SuperComputing 2004, November 9, 2004,

Annual Reports

James Demmel, Costin Iancu, Kouhsik Sen, "Corvette Progress Report 2015", April 1, 2015,