Publications

2018

Tal Shachaf, Alexander Sim, Kesheng Wu, Wilko Kroeger, "Detecting Anomalies in the LCLS Workflow", 3rd workshop on Open Science in Big Data (OSBD 2018), in conjunction with IEEE International Conference on Big Data (Big Data 2018), 2018,

A. Lazar, K. Wu, A. Sim, "Predicting Network Traffic Using TCP Anomalies", IEEE International Conference on Big Data (Big Data 2018), 2018,

Kade Gibson, Dongeun Lee, Jaesik Choi, Alex Sim, "Dynamic Online Performance Optimization in Streaming Data Compression", IEEE International Conference on Big Data (Big Data 2018), 2018,

Bin Dong, Teng Wang, Houjun Tang, Quincey Koziol, Kesheng Wu, and Suren Byna, "ARCHIE: Data Analysis Acceleration with Array Caching in Hierarchical Storage", IEEE BigData 2018, December 10, 2018,

Tuowen Zhao, Samuel Williams, Mary Hall, Hans Johansen, "Delivering Performance Portable Stencil Computations on CPUs and GPUs Using Bricks", International Workshop on Performance, Portability and Productivity in HPC (P3HPC), November 2018,

Maximilian H Bremer, John D Bachan, Cy P Chan, "Semi-Static and Dynamic Load Balancing for Asynchronous Hurricane Storm Surge Simulations", 2018 Parallel Applications Workshop, Alternatives To MPI (PAW-ATM), November 16, 2018,

Charlene Yang, Rahulkumar Gayatri, Thorsten Kurth, Protonu Basu, Zahra Ronaghi, Adedoyin Adetokunbo, Brian Friesen, Brandon Cook, Douglas Doerfler, Leonid Oliker, Jack Deslippe, Samuel Williams, "An Empirical Roofline Methodology for Quantitatively Assessing Performance Portability", International Workshop on Performance, Portability and Productivity in HPC (P3HPC), November 2018,

Paul H. Hargrove, Dan Bonachea, "GASNet-EX Performance Improvements Due to Specialization for the Cray Aries Network", Parallel Applications Workshop, Alternatives To MPI (PAW-ATM), Dallas, Texas, USA, November 16, 2018, doi: 10.25344/S44S38

GASNet-EX is a portable, open-source, high-performance communication library designed to efficiently support the networking requirements of PGAS runtime systems and other alternative models on future exascale machines. This paper reports on the improvements in performance observed on Cray XC-series systems due to enhancements made to the GASNet-EX software. These enhancements, known as "specializations", primarily consist of replacing network-independent implementations of several recently added features with implementations tailored to the Cray Aries network. Performance gains from specialization include (1) Negotiated-Payload Active Messages improve bandwidth of a ping-pong test by up to 14%, (2) Immediate Operations reduce running time of a synthetic benchmark by up to 93%, (3) non-bulk RMA Put bandwidth is increased by up to 32%, (4) Remote Atomic performance is 70% faster than the reference on a point-to-point test and allows a hot-spot test to scale robustly, and (5) non-contiguous RMA interfaces see up to 8.6x speedups for an intra-node benchmark and 26% for inter-node. These improvements are all available in GASNet-EX version 2018.3.0 and later.

Karen Tu, Alex Sim (Advisor), John Wu (Advisor), "Identification of Network Data Transfer Bottlenecks in HPC Systems", International Conference for High Performance Computing, Networking, Storage and Analysis (SC’18), ACM Student Research Competition (SRC), 2018,

Scott B. Baden, Paul H. Hargrove, Hadia Ahmed, John Bachan, Dan Bonachea, Steve Hofmeyr, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ and GASNet-EX: PGAS Support for Exascale Applications and Runtimes", The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'18), November 13, 2018,

Lawrence Berkeley National Lab is developing a programming system to support HPC application development using the Partitioned Global Address Space (PGAS) model. This work is driven by the emerging need for adaptive, lightweight communication in irregular applications at exascale. We present an overview of UPC++ and GASNet-EX, including examples and performance results.

GASNet-EX is a portable, high-performance communication library, leveraging hardware support to efficiently implement Active Messages and Remote Memory Access (RMA). UPC++ provides higher-level abstractions appropriate for PGAS programming such as: one-sided communication (RMA), remote procedure call, locality-aware APIs for user-defined distributed objects, and robust support for asynchronous execution to hide latency. Both libraries have been redesigned relative to their predecessors to meet the needs of exascale computing. While both libraries continue to evolve, the system already demonstrates improvements in microbenchmarks and application proxies.

Hongzhang Shan, Samuel Williams, Calvin W. Johnson, "Improving MPI Reduction Performance for Manycore Architectures with OpenMP and Data Compression", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), November 2018,

Samuel Williams, Introduction to the Roofline Model, Supercomputing, November 2018,

I. Monga, C. Guok, J. MacAuley, A. Sim, H. Newman, J. Balcas, P. DeMar, L. Winkler, T. Lehman, X. Yang, "SDN for End-to-end Networked Science at the Exascale (SENSE)", Innovate the Network for Data-Intensive Science Workshop (INDIS 2018), in conjunction with the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'18), 2018, doi: 10.1109/INDIS.2018.00007

Anna Giannakou, Daniel Gunter, Sean Peisert, "Flowzilla: A Methodology for Detecting Data Transfer Anomalies in Research Networks", Workshop on Innovating the Network for Data-Intensive Science (INDIS), November 11, 2018,

Xin Xing, Bin Dong, Jonathan Ajo-Franklin, Kesheng Wu, "Automated Parallel Data Processing Engine with Application to Large-Scale Feature Extraction", HLHPC at SC 2018, November 10, 2018,

Sean Peisert, Usable Computer Security and Privacy to Enable and Encourage Data Sharing for Scientific Research, National Academies of Sciences, Engineering, and Medicine Committee on Science, Engineering, Medicine, and Public Policy (COSEMPUP) Meeting, November 8, 2018,

Cy P Chan, Bin Wang, John D Bachan, Jane Macfarlane, "Mobiliti: Scalable Transportation Simulation Using High-Performance Parallel Computing", 2018 IEEE International Conference on Intelligent Transportation Systems (ITSC), November 6, 2018,

Cheah You-Wei, Drew Paine, Devarshi Ghoshal, Lavanya Ramakrishnan, Bringing Data Science to Qualitative Analysis, 2018 IEEE 14th International Conference on e-Science, Pages: 325-326 2018, doi: 10.1109/eScience.2018.00076

Gonzalo P. Rodrigo, Matt Henderson, Gunther H. Weber, Colin Ophus, Katie Antypas, Lavanya Ramakrishnan, "ScienceSearch: Enabling Search through Automatic Metadata Generation", 2018 IEEE 14th International Conference on e-Science, IEEE Computer Society, 2018, 93-104, doi: 10.1109/eScience.2018.00025

Mahdi Jamei, Anna Scaglione, Sean Peisert, "Cyber-Physical Relaying Reliability Enhancement through Hybrid Network Intrusion Detection Systems", Proceedings of the 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), Aalborg, Denmark, IEEE, October 29, 2018,

Cy Chan, Vesselin Drensky, Alan Edelman, Raymond Kan, Plamen Koev, "On Computing Schur Functions and Series Thereof", Journal of Algebraic Combinatorics, October 20, 2018,

Hang Deng, Sergi Molins, David Trebotich, Carl Steefel, Donald DePaolo, "Pore-scale numerical investigation of the impacts of surface roughness: Up-scaling of reaction rates in rough fractures", Geochimica et Cosmochimica Acta, October 15, 2018, 239:374-389, doi: 10.1016/j.gca.2018.08.005

Dan Bonachea, Paul H. Hargrove, "GASNet-EX: A High-Performance, Portable Communication Library for Exascale", Languages and Compilers for Parallel Computing (LCPC'18), Salt Lake City, Utah, USA, October 11, 2018, LBNL 2001174, doi: 10.25344/S4QP4W

Partitioned Global Address Space (PGAS) models, typified by such languages as Unified Parallel C (UPC) and Co-Array Fortran, expose one-sided communication as a key building block for High Performance Computing (HPC) applications. Architectural trends in supercomputing make such programming models increasingly attractive, and newer, more sophisticated models such as UPC++, Legion and Chapel that rely upon similar communication paradigms are gaining popularity.

GASNet-EX is a portable, open-source, high-performance communication library designed to efficiently support the networking requirements of PGAS runtime systems and other alternative models in future exascale machines. The library is an evolution of the popular GASNet communication system, building upon over 15 years of lessons learned. We describe and evaluate several features and enhancements that have been introduced to address the needs of modern client systems. Microbenchmark results demonstrate the RMA performance of GASNet-EX is competitive with several MPI-3 implementations on current HPC systems.

George Michelogiannakis, How Open Source Hardware Will Drive the Next Generation of HPC Systems, CROSS Symposium at UCSC, October 2018,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ Programmer's Guide, v1.0-2018.9.0", Lawrence Berkeley National Laboratory Tech Report, September 26, 2018, LBNL 2001180, doi: 10.25344/S49G6V

UPC++ is a C++11 library that provides Partitioned Global Address Space (PGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The PGAS model is single program, multiple-data (SPMD), with each separate constituent process having access to local memory as it would in C++. However, PGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the processes. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.
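The PGAS model summarized in this abstract can be illustrated with a toy, single-process sketch. It is not the UPC++ API: `GlobalPtr`, `ToyPGAS`, `put`, `get`, and `remote_ops` are hypothetical names chosen for illustration, and the accesses here are synchronous for simplicity, whereas the abstract states that UPC++ remote-memory operations are asynchronous by default. The sketch only shows the idea of a global address space built from per-process shared segments, with a counter that makes the cost of remote accesses visible.

```python
from collections import namedtuple

# Hypothetical toy model of a PGAS address space; not the UPC++ API.
# A global pointer names the owning process (rank) and an offset in its segment.
GlobalPtr = namedtuple("GlobalPtr", ["rank", "offset"])


class ToyPGAS:
    def __init__(self, num_ranks, segment_size):
        # One shared segment per process; together they form the global space.
        self.segments = [[0] * segment_size for _ in range(num_ranks)]
        # Counts accesses that would cross the network in a real system.
        self.remote_ops = 0

    def is_local(self, me, gptr):
        # Ownership information carried by the pointer tells us about locality.
        return gptr.rank == me

    def put(self, me, gptr, value):
        if not self.is_local(me, gptr):
            self.remote_ops += 1
        self.segments[gptr.rank][gptr.offset] = value

    def get(self, me, gptr):
        if not self.is_local(me, gptr):
            self.remote_ops += 1
        return self.segments[gptr.rank][gptr.offset]


space = ToyPGAS(num_ranks=4, segment_size=8)
p = GlobalPtr(rank=2, offset=3)
space.put(0, p, 42)   # process 0 writes into process 2's segment (remote)
v = space.get(2, p)   # process 2 reads its own segment (local)
print(v, space.remote_ops)
```

Because every access goes through an explicit `put` or `get` that takes the caller's rank, the sketch mirrors the abstract's point that remote accesses are explicit and therefore easy to account for.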

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ Specification v1.0, Draft 8", Lawrence Berkeley National Laboratory Tech Report, September 26, 2018, LBNL 2001179, doi: 10.25344/S45P4X

UPC++ is a C++11 library providing classes and functions that support Partitioned Global Address Space (PGAS) programming. We are revising the library under the auspices of the DOE’s Exascale Computing Project, to meet the needs of applications requiring PGAS support. UPC++ is intended for implementing elaborate distributed data structures where communication is irregular or fine-grained. The UPC++ interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and similar to those used in conventional C++. The UPC++ programmer can expect communication to run at close to hardware speeds. The key facilities in UPC++ are global pointers, that enable the programmer to express ownership information for improving locality, one-sided communication, both put/get and RPC, futures and continuations. Futures capture data readiness state, which is useful in making scheduling decisions, and continuations provide for completion handling via callbacks. Together, these enable the programmer to chain together a DAG of operations to execute asynchronously as high-latency dependencies become satisfied.
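The chaining of futures and continuations into a DAG, as described in this abstract, can be sketched in a few lines. This is a minimal single-threaded illustration under stated assumptions, not the UPC++ interface: the `Future` class, its `fulfill`, `then`, `ready`, and `result` methods, and the `when_all` helper are hypothetical stand-ins written for this example.

```python
class Future:
    """Hypothetical single-threaded future; a stand-in, not the UPC++ API."""

    def __init__(self):
        self._ready = False
        self._value = None
        self._callbacks = []

    def ready(self):
        # Data readiness state, usable for scheduling decisions.
        return self._ready

    def result(self):
        if not self._ready:
            raise RuntimeError("value not ready yet")
        return self._value

    def fulfill(self, value):
        # Mark the value as available and run any continuations waiting on it.
        self._value = value
        self._ready = True
        callbacks, self._callbacks = self._callbacks, []
        for cb in callbacks:
            cb(value)

    def then(self, fn):
        # Attach a continuation; returns a future for the continuation's result.
        nxt = Future()

        def run(value):
            nxt.fulfill(fn(value))

        if self._ready:
            run(self._value)
        else:
            self._callbacks.append(run)
        return nxt


def when_all(*futures):
    # Join node of the DAG: ready once every input future is ready.
    out = Future()
    if not futures:
        out.fulfill(())
        return out
    values = [None] * len(futures)
    remaining = [len(futures)]

    def make_cb(i):
        def cb(v):
            values[i] = v
            remaining[0] -= 1
            if remaining[0] == 0:
                out.fulfill(tuple(values))
        return cb

    for i, f in enumerate(futures):
        f.then(make_cb(i))
    return out


# Two pending operations (stand-ins for high-latency remote reads).
a, b = Future(), Future()
doubled = a.then(lambda x: 2 * x)
total = when_all(doubled, b).then(lambda pair: pair[0] + pair[1])

b.fulfill(5)    # dependencies may complete in any order
a.fulfill(10)
print(total.result())
```

Nothing downstream runs until its inputs are fulfilled, so the final sum is computed only after both `a` and `b` complete, regardless of the order in which they finish.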

George Michelogiannakis, John Shalf, Benjamin Aivazi, Yiwen Shen, Keren Bergman, Madeleine Glick, Larry Dennison, Architectural Opportunities and Challenges from Emerging Photonics in Future Systems, IEEE conference on Photonics in Switching and Computing (PSC), September 2018,

S. Balasubramanian, D. Ghosal, K. N. Balasubramanian, E. Pouyoul, A. Sim, K. Wu, B. Tierney, "Auto-tuned Publisher in a Pub/Sub System: Design and Performance Evaluation", the 15th IEEE International Conference on Autonomic Computing (ICAC 2018), 2018,

Teng Wang, Suren Byna, Bin Dong, and Houjun Tang, "UniviStor: Integrated Hierarchical and Distributed Storage for HPC", IEEE Cluster 2018, September 1, 2018,

Samuel Williams, Roofline on Manycore and Accelerated Systems, ModSim, August 2018,

Sean Peisert, Security Concerns of an NRP, Second National Research Platform (NRP) Workshop, August 6, 2018,

Samuel Williams, Parallelism and Performance, MolSSI Summer School, August 2018,

Ciaran Roberts, Anna Scaglione, Sean Peisert, "A Holistic Approach to Distribution Grid Intrusion Detection Systems", EnergyCentral, July 18, 2018,

Khaled Ibrahim, Samuel Williams, Leonid Oliker, "Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis", HPCS Special Session on High Performance Computing Benchmarking and Optimization (HPBench), July 2018,

J. Wang, K. Wu, A. Sim, S. Hwangbo, "Feature Engineering and Classification Models for Partial Discharge in Power Transformers", Joint Workshop on Deep Learning for Safety-Critical in Engineering Systems (DISE1), in conjunction with ICML, AAMAS, IJCAI, and ECAI 2018, 2018,

Weijie Zhao, Florin Rusu, Bin Dong, Kesheng Wu, Anna Ho, and Peter Nugent, "Distributed Caching for Processing Raw Arrays", SSDBM, 2018,

C. Dao, X. Liu, J. Jiang, A. Sim, C. E. Tull, K. Wu, "Modeling Data Transfers: Change Point and Anomaly Detection", International Workshop on Scalable Network Traffic Analytics (SNTA 2018), 2018, in conjunction with the 38th IEEE International Conference on Distributed Computing Systems (ICDCS 2018), 2018,

J. Kim, J. Choi, A. Sim, "Spatio-temporal Analysis of HPC I/O and Connection Data", International Workshop on Scalable Network Traffic Analytics (SNTA 2018), 2018, in conjunction with the 38th IEEE International Conference on Distributed Computing Systems (ICDCS 2018), 2018,

Sean Peisert, Cyber Security Challenges and Opportunities in High-Performance Computing Environments, International Supercomputing Conference, June 26, 2018,

Tuomas Koskela, Zakhar Matveev, Charlene Yang, Adetokunbo Adedoyin, Roman Belenov, Philippe Thierry, Zhengji Zhao, Rahulkumar Gayatri, Hongzhang Shan, Leonid Oliker, Jack Deslippe, Ron Green, and Samuel Williams, "A Novel Multi-Level Integrated Roofline Model Approach for Performance Characterization", ISC, June 2018,

Joseph P. Kenny, Khachik Sargsyan, Samuel Knight, George Michelogiannakis, Jeremiah J. Wilke, "The Pitfalls of Provisioning Exascale Networks: A Trace Replay Analysis for Understanding Communication Performance", ISC High Performance 2018, June 2018, 10876,

R. Kettimuthu, Z. Liu, I. Foster, P. Beckman, A. Sim, K. Wu, W. Liao, Q. Kang, A. Agrawal, A. Choudhary, "Towards Autonomic Science Infrastructure: Architecture, Limitations, and Open Issues", Workshop in Autonomous Infrastructure for Science (AI-Science 2018), 2018, in conjunction with the 27th International Symposium on High-Performance Parallel and Distributed Computing (ACM HPDC 2018), 2018, doi: 10.1145/3217197.3217205

M. Yang, X. Liu, W. Kroeger, A. Sim, K. Wu, "Identifying Anomalous File Transfer Events in LCLS Workflow", Workshop in Autonomous Infrastructure for Science (AI-Science 2018), 2018, in conjunction with the 27th International Symposium on High-Performance Parallel and Distributed Computing (ACM HPDC 2018), 2018, doi: 10.1145/3217197.3217203

Keren Bergman, John Shalf, George Michelogiannakis, Sebastien Rumley, Larry Dennison, Monia Ghobadi, "PINE: An Energy Efficient Flexibly Interconnected Photonic Data Center Architecture for Extreme Scalability", 31st annual conference of the IEEE Photonics Society, IEEE, June 2018,

Charlene Yang, Brian Friesen, Thorsten Kurth, Brandon Cook, Samuel Williams, "Toward Automated Application Profiling on Cray Systems", Cray User Group (CUG), May 2018,

J. Kim, A. Sim, B. Tierney, S. Suh, I. Kim, "Multivariate Network Traffic Analysis using Clustered Patterns", Journal of Computing, 2018, doi: 10.1007/s00607-018-0619-4

Houjun Tang, Suren Byna, Francois Tessier, Teng Wang, Bin Dong, Jingqing Mu, Quincey Koziol, Jerome Soumagne, Venkatram Vishwanath, Jialin Liu, and Richard Warren, "Toward Scalable and Asynchronous Object-centric Data Management for HPC", 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2018, May 1, 2018,

Meiyue Shao, Felipe H. da Jornada, Lin Lin, Chao Yang, Jack Deslippe, Steven G. Louie, "A structure preserving Lanczos algorithm for computing the optical absorption spectrum", SIAM Journal on Matrix Analysis and Applications, 2018, 39:683--711, doi: 10.1137/16M1102641

Haoyuan Xing, Sofoklis Floratos, Spyros Blanas, Suren Byna, Prabhat, Kesheng Wu, and Paul Brown, "ArrayBridge: Interweaving declarative array processing with imperative high-performance computing", 34th IEEE International Conference on Data Engineering (ICDE) 2018, April 17, 2018,

Bharti Wadhwa, Suren Byna, Ali R. Butt, "Toward Transparent Data Management in Multi-layer Storage Hierarchy for HPC Systems", IEEE International Conference on Cloud Engineering 2018 (IC2E 2018), April 17, 2018,

Ariful Azad, Georgios A. Pavlopoulos, Christos A. Ouzounis, Nikos C. Kyrpides, Aydin Buluç, "HipMCL: A high-performance parallel implementation of the Markov cluster algorithm for large scale networks", Nucleic Acids Research, April 2018,

Junmin Gu, Scott Klasky, Norbert Podhorszki, Ji Qiang, Kesheng Wu, "Querying Large Scientific Data Sets with Adaptable IO System ADIOS", Supercomputing Frontiers (Best Paper Award), Springer International Publishing, 2018, 51-69,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ Programmer’s Guide, v1.0-2018.3.0", Lawrence Berkeley National Laboratory Tech Report, March 31, 2018, LBNL 2001136, doi: 10.2172/1430693

UPC++ is a C++11 library that provides Partitioned Global Address Space (PGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The PGAS model is single program, multiple-data (SPMD), with each separate thread of execution (referred to as a rank, a term borrowed from MPI) having access to local memory as it would in C++. However, PGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the ranks. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.

Sean Peisert, Keynote: Cybersecurity for HPC Systems: State of the Art and Looking to the Future, High-Performance Computing Security Workshop, National Institute of Standards and Technology (NIST), March 28, 2018,

Dan Bonachea, Paul Hargrove, "GASNet-EX Performance Improvements Due to Specialization for the Cray Aries Network", Lawrence Berkeley National Laboratory Tech Report, March 27, 2018, LBNL 2001134, doi: 10.2172/1430690

This document is a deliverable for milestone STPM17-6 of the Exascale Computing Project, delivered by WBS 2.3.1.14. It reports on the improvements in performance observed on Cray XC-series systems due to enhancements made to the GASNet-EX software. These enhancements, known as “specializations”, primarily consist of replacing network-independent implementations of several recently added features with implementations tailored to the Cray Aries network. Performance gains from specialization include (1) Negotiated-Payload Active Messages improve bandwidth of a ping-pong test by up to 14%, (2) Immediate Operations reduce running time of a synthetic benchmark by up to 93%, (3) non-bulk RMA Put bandwidth is increased by up to 32%, (4) Remote Atomic performance is 70% faster than the reference on a point-to-point test and allows a hot-spot test to scale robustly, and (5) non-contiguous RMA interfaces see up to 8.6x speedups for an intra-node benchmark and 26% for inter-node. These improvements are available in the GASNet-EX 2018.3.0 release.

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Bryce Lelbach, Brian van Straalen, "UPC++ Specification v1.0, Draft 6", Lawrence Berkeley National Laboratory Tech Report, March 26, 2018, LBNL 2001135, doi: 10.2172/1430689

UPC++ is a C++11 library providing classes and functions that support Partitioned Global Address Space (PGAS) programming. We are revising the library under the auspices of the DOE’s Exascale Computing Project, to meet the needs of applications requiring PGAS support. UPC++ is intended for implementing elaborate distributed data structures where communication is irregular or fine-grained. The UPC++ interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and similar to those used in conventional C++. The UPC++ programmer can expect communication to run at close to hardware speeds. The key facilities in UPC++ are global pointers, that enable the programmer to express ownership information for improving locality, one-sided communication, both put/get and RPC, futures and continuations. Futures capture data readiness state, which is useful in making scheduling decisions, and continuations provide for completion handling via callbacks. Together, these enable the programmer to chain together a DAG of operations to execute asynchronously as high-latency dependencies become satisfied.

H. Zhan, G. Gomes, X. S. Li, K. Madduri, A. Sim, K. Wu, "Consensus Ensemble System for Traffic Flow Prediction", IEEE Transactions on Intelligent Transportation Systems, 2018, doi: 10.1109/TITS.2018.2791505

Sean Peisert, Eli Dart, William K. Barnett, James Cuff, Robert L. Grossman, Edward Balas, Ari Berman, Anurag Shankar, Brian Tierney, "The Medical Science DMZ: A Network Design Pattern for Data-Intensive Medical Science", Journal of the American Medical Informatics Association (JAMIA), March 2018, 25(3):267-274, doi: 10.1093/jamia/ocx104

Tuowen Zhao, Mary Hall, Protonu Basu, Samuel Williams, Hans Johansen, "SIMD code generation for stencils on brick decompositions", Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), February 2018,

Bin Wang, John D Bachan, Cy P Chan, "ExaGridPF: A parallel power flow solver for transmission and unbalanced distribution systems", 2018 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), February 22, 2018,

M. S. Waibel, C. L. Hulbe, C. S. Jackson, D. F. Martin, "Rate of Mass Loss Across the Instability Threshold for Thwaites Glacier Determines Rate of Mass Loss for Entire Basin", Geophysical Research Letters, February 19, 2018, 45:809-816, doi: 10.1002/2017GL076470

Daniel F Martin, Xylar Asay-Davis, Jan De Rydt, "Sensitivity of Ice-Ocean coupling to interactions with subglacial hydrology", AGU 2018 Ocean Sciences Meeting, February 14, 2018,

Samuel Williams, Introduction to the Roofline Model, ECP Annual Meeting, February 8, 2018,

Protonu Basu, Using Empirical Roofline Toolkit and Nvidia nvprof, ECP Annual Meeting, February 8, 2018,

Samuel Williams, Advisor Hand-On: Stencil Example, ECP Annual Meeting, February 8, 2018,

Sean Peisert, Ciaran Roberts, Cyber Security of Power Distribution Systems Using Micro-Synchrophasor Measurements and Cyber-Reported SCADA, EPRI Power Delivery & Utilization Winter 2018 Program Advisory & Sector Council Meeting, February 7, 2018,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ and GASNet: PGAS Support for Exascale Apps and Runtimes", Poster at Exascale Computing Project (ECP) Annual Meeting 2018, February 2018,

Scott Baden, Dan Bonachea, Paul Hargrove, "GASNet-EX: PGAS Support for Exascale Apps and Runtimes", ECP Annual Meeting 2018, February 2018,

Samuel Williams, Performance Modeling and Analysis, CS267 lecture, University of California at Berkeley, January 30, 2018,

George Michelogiannakis, Open-Source Hardware in the Post Moore Era, NovelHPC: Beyond Exascale: Workshop on Novel HPC Architectures (HiPEAC 2018), January 2018,

George Michelogiannakis, An Architect’s Point of View of the Post Moore Era, 3rd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems (AISTECS with HiPEAC 2018), January 2018,

Kesheng Wu, Horst D Simon, "High-Performance Computational Intelligence and Forecasting Technologies", Lawrence Berkeley National Laboratory, 2018,

T. Kim, J. Choi, D. Lee, A. Sim, C. A. Spurlock, A. Todd, K. Wu, "Predicting Baseline for Analysis of Electricity Pricing", International Journal of Big Data Intelligence. Special issue on Data to Decision, 2018, 5:3-20, doi: 10.1504/IJBDI.2018.10008133

Meiyue Shao, Hasan Metin Aktulga, Chao Yang, Esmond G. Ng, Pieter Maris, James P. Vary, "Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver", Computer Physics Communications, 2018, 222:1--13, doi: 10.1016/j.cpc.2017.09.004

JL Vay, A Almgren, J Bell, L Ge, DP Grote, M Hogan, O Kononenko, R Lehe, A Myers, C Ng, J Park, R Ryne, O Shapoval, M Thévenet, W Zhang, "Warp-X: A new exascale computing platform for beam-plasma simulations", Nuclear Instruments and Methods in Physics Research, Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 2018, doi: 10.1016/j.nima.2018.01.035

M Zingale, AS Almgren, MG Barrios Sazo, VE Beckner, JB Bell, B Friesen, AM Jacobs, MP Katz, CM Malone, AJ Nonaka, DE Willcox, W Zhang, "Meeting the Challenges of Modeling Astrophysical Thermonuclear Explosions: Castro, Maestro, and the AMReX Astrophysics Suite", Journal of Physics: Conference Series, 2018, 1031, doi: 10.1088/1742-6596/1031/1/012024

Terry Benzel and Sean Peisert, Selected Papers from the 2017 IEEE Symposium on Security and Privacy [Guest editors' introduction], IEEE Security and Privacy, Pages: 10-11 January 2018,

C. E. Harris, P. E. Nugent, A. Horesh, J. S. Bright, R. P., M. L. Graham, K. Maguire, M. Smith, N., S. Valenti, A. V. Filippenko, O. Fox, A. Goobar, P. L. Kelly, K. J. Shen, Don't Blink: Constraining the Circumstellar Environment of the Interacting Type Ia Supernova 2015cp, Astrophysical Journal, Pages: 21 2018, doi: 10.3847/1538-4357/aae521

2017

George Michelogiannakis, John Shalf, "Last Level Collective Hardware Prefetching For Data-Parallel Applications", IEEE 24th International Conference on High Performance Computing, IEEE, December 2017,

George Michelogiannakis, John Shalf, Last Level Collective Hardware Prefetching For Data-Parallel Applications, IEEE 24th International Conference on High Performance Computing, December 18, 2017,

J. Wang, A. Sim, K. Wu, S. Hwangbo, "Accurate Signal Timing from High Frequency Streaming Data", 2017 IEEE International Conference on Big Data (Big Data 2017), 2017,

A. Lazar, L. Jin, A. Spurlock, A. Todd, K. Wu, A. Sim, "Data Quality Challenges with Missing Values and Mixed Types in Joint Sequence Analysis", Workshop in Data Quality Issues in Big Data and Machine Learning Applications: Going Beyond Data Cleaning and Transformations, in conjunction with the 2017 IEEE International Conference on Big Data (Big Data 2017), 2017, doi: 10.1109/BigData.2017.8258222

Shashanka Ubaru, Kesheng Wu, Kristofer E. Bouchard, "UoI-NMF Cluster: A Robust Nonnegative Matrix Factorization Algorithm for Improved Parts-Based Decomposition and Reconstruction of Noisy Data", the 16th IEEE International Conference on Machine Learning and Applications (ICMLA 2017), 2017, 241-248, doi: 10.1109/ICMLA.2017.0-152

J. Wang, K. Wu, A. Sim, S. Hwangbo, "Feature Engineering and Classification Models for Partial Discharge Events in Power Transformers", 10th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2017), 2017,

P. Harrington, W. Yoo (Advisor), A. Sim (Advisor), K. Wu (Advisor), "Diagnosing Parallel I/O Bottlenecks in HPC Applications", International Conference for High Performance Computing, Networking, Storage and Analysis (SC’17), ACM Student Research Competition (SRC), First place winner, 2017,

Glenn Lockwood, Shane Snyder, Wucherl Yoo, Kevin Harms, Zachary Nault, Suren Byna, Philip Carns, Nicholas Wright, "UMAMI: A Recipe for Generating Meaningful Metrics through Holistic I/O Performance Analysis", 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS), 2017 (Held in conjunction with SC17), November 14, 2017,

Tzu-Hsien Wu, Jerry Chou, Shyng Hao, Bin Dong, Kesheng Wu, Scott Klasky, "Optimizing the Query Performance of Block Index Through Data Analysis and I/O Modeling", The International Conference for High Performance Computing, Networking, Storage and Analysis (SC'17), November 13, 2017,

Yang You, Aydin Buluç, James Demmel, "Scaling deep learning on GPU and Knights Landing clusters", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'17), 2017,

John Bachan, Dan Bonachea, Paul H Hargrove, Steven Hofmeyr, Mathias Jacquelin, Amir Kamil, Brian Van Straalen, Scott Baden, "The UPC++ PGAS library for exascale computing", PAW 2017: 2nd Annual PGAS Applications Workshop - Held in conjunction with SC 2017, November 12, 2017, doi: 10.1145/3144779.3169108

We describe UPC++ V1.0, a C++11 library that supports APGAS programming. UPC++ targets distributed data structures where communication is irregular or fine-grained. The key abstractions are global pointers, asynchronous programming via RPC, and futures. Global pointers incorporate ownership information useful in optimizing for locality. Futures capture data readiness state, are useful for scheduling and also enable the programmer to chain operations to execute asynchronously as high-latency dependencies become satisfied, via continuations. The interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and closely resemble those used in modern C++. Communication in UPC++ runs at close to hardware speeds by utilizing the low-overhead GASNet-EX communication library.

Samuel Williams, Introduction to the Roofline Model, Roofline Training, November 2017,

B. Van Straalen, D. Trebotich, A. Ovsyannikov and D.T. Graves, "Scalable Structured Adaptive Mesh Refinement with Complex Geometry", Exascale Scientific Applications: Programming Approaches for Scalability, Performance, and Portability, edited by Straatsma, T., Antypas, K., Williams, T., (Chapman and Hall/CRC: November 9, 2017)

J. Wang, K. Wu, A. Sim, S. Hwangbo, "Convolutional Filtering for Accurate Signal Timing from Noisy Streaming Data", 3rd IEEE International Conference on Big Data Intelligence and Computing (DataCom2017), 2017, doi: 10.1109/DASC-PICom-DataCom-CyberSciTec.2017.157

David Hatchell, Patrick Miller, Michael Coleman, Sean Peisert, Cybersecurity for the Electricity Grid, Bits & Watts Annual Conference, November 6, 2017,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian Van Straalen, "UPC++: a PGAS C++ Library", ACM/IEEE Conference on Supercomputing, SC'17, November 2017,

Erik Paulson, Dan Bonachea, Paul Hargrove, GASNet ofi-conduit, Presentation at the Open Fabrics Interface BoF at Supercomputing 2017, November 2017,

Philip C. Roth, Hongzhang Shan, David Riegner, Nikolas Antolin, Sarat Sreepathi, Leonid Oliker, Samuel Williams, Shirley Moore, Wolfgang Windl, "Performance Analysis and Optimization of the RAMPAGE Metal Alloy Potential Generation Software", SIGPLAN International Workshop on Software Engineering for Parallel Systems (SEPS), October 2017,

Sean Peisert, Security in High Performance Computing Environments, Computing Sciences/NERSC Security Seminar, October 5, 2017,

Meiyue Shao and Chao Yang, "Properties of Definite Bethe-Salpeter Eigenvalue Problems", Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing. EPASA 2015. Lecture Notes in Computational Science and Engineering, vol. 117, 2017, 91-105, doi: 10.1007/978-3-319-62426-6_7

Sean Peisert, Matt Bishop, Ed Talbot, "A Model of Owner Controlled, Full-Provenance, Non-Persistent, High-Availability Information Sharing", Proceedings of the 2017 New Security Paradigms Workshop (NSPW), Santa Cruz, CA, October 2017, 80-89, doi: 10.1145/3171533.3171536

Sean Peisert, Security and Privacy in Data-Intensive, High-Performance Computing Contexts, Berkeley Institute for Data Science (BIDS), October 2, 2017,

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ Programmer’s Guide, v1.0-2017.9", Lawrence Berkeley National Laboratory Tech Report, September 29, 2017, LBNL 2001065, doi: 10.2172/1398522

This document has been superseded by: UPC++ Programmer’s Guide, v1.0-2018.3.0 (LBNL-2001136)

UPC++ is a C++11 library that provides Asynchronous Partitioned Global Address Space (APGAS) programming. It is designed for writing parallel programs that run efficiently and scale well on distributed-memory parallel computers. The APGAS model is single program, multiple-data (SPMD), with each separate thread of execution (referred to as a rank, a term borrowed from MPI) having access to local memory as it would in C++. However, APGAS also provides access to a global address space, which is allocated in shared segments that are distributed over the ranks. UPC++ provides numerous methods for accessing and using global memory. In UPC++, all operations that access remote memory are explicit, which encourages programmers to be aware of the cost of communication and data movement. Moreover, all remote-memory access operations are by default asynchronous, to enable programmers to write code that scales well even on hundreds of thousands of cores.

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Bryce Lelbach, Brian van Straalen, "UPC++ Specification v1.0, Draft 4", Lawrence Berkeley National Laboratory Tech Report, September 27, 2017, LBNL 2001066, doi: 10.2172/1398521

This document has been superseded by: UPC++ Specification v1.0, Draft 6 (LBNL-2001135)

UPC++ is a C++11 library providing classes and functions that support Asynchronous Partitioned Global Address Space (APGAS) programming. We are revising the library under the auspices of the DOE’s Exascale Computing Project, to meet the needs of applications requiring PGAS support. UPC++ is intended for implementing elaborate distributed data structures where communication is irregular or fine-grained. The UPC++ interfaces for moving non-contiguous data and handling memories with different optimal access methods are composable and similar to those used in conventional C++. The UPC++ programmer can expect communication to run at close to hardware speeds. The key facilities in UPC++ are global pointers, which enable the programmer to express ownership information for improving locality; one-sided communication, both put/get and RPC; and futures and continuations. Futures capture data readiness state, which is useful in making scheduling decisions, and continuations provide for completion handling via callbacks. Together, these enable the programmer to chain together a DAG of operations to execute asynchronously as high-latency dependencies become satisfied.

Jonathan Ganz, Sean Peisert, "ASLR: How Robust is the Randomness", Proceedings of the IEEE Secure Development Conference (SecDev), Cambridge, MA, IEEE Computer Society, September 24, 2017, doi: 10.1109/SecDev.2017.19

Mahdi Jamei, Anna Scaglione, Ciaran Roberts, Emma Stewart, Sean Peisert, Chuck McParland, Alex McEachern, "Anomaly Detection Using μPMU Measurements in Distribution Grids", IEEE Transactions on Power Systems, 2017, doi: 10.1109/TPWRS.2017.2764882

Marquita Ellis, Evangelos Georganas, Rob Egan, Steven Hofmeyr, Aydin Buluc, Brandon Cook, Leonid Oliker, Katherine Yelick, "Performance characterization of de novo genome assembly on leading parallel systems", Europar - International European Conference on Parallel and Distributed Computing, 2017,

Houjun Tang, Suren Byna, Bin Dong, Jialin Liu, and Quincey Koziol, "SoMeta: Scalable Object-centric Metadata Management for High Performance Computing", IEEE Cluster 2017, September 5, 2017,

Sean Peisert, "Security in High-Performance Computing Environments", Communications of the ACM (CACM), September 2017, 60(9):72-80, doi: 10.1145/3096742

Jose Oñorbe, Joseph F. Hennawi, Zarija Lukić, and Michael Walther, "Constraining Reionization with the z ~ 5-6 Lyman-alpha Forest Power Spectrum: the Outlook after Planck", The Astrophysical Journal, 2017,

Frederick B. Davies, Joseph F. Hennawi, Anna-Christina Eilers, and Zarija Lukić, "A New Method to Measure the Post-Reionization Ionizing Background from the Joint Distribution of Lyman-alpha and Lyman-beta Forest Transmission", The Astrophysical Journal, 2017,

Jack Deslippe, Doug Doerfler, Brandon Cook, Tareq Malas, Samuel Williams, Sudip Dosanjh, "Optimizing science applications for the Cori, Knights Landing, System at NERSC", Advances in Parallel Computing, New Frontiers in High Performance Computing and Big Data, August 2017, 30, doi: 10.3233/978-1-61499-816-7-235

Dan Bonachea, Paul Hargrove, "GASNet Specification, v1.8.1", Lawrence Berkeley National Laboratory Tech Report, August 31, 2017, LBNL 2001064, doi: 10.2172/1398512

GASNet is a language-independent, low-level networking layer that provides network-independent, high-performance communication primitives tailored for implementing parallel global address space SPMD languages and libraries such as UPC, UPC++, Co-Array Fortran, Legion, Chapel, and many others. The interface is primarily intended as a compilation target and for use by runtime library writers (as opposed to end users), and the primary goals are high performance, interface portability, and expressiveness. GASNet stands for "Global-Address Space Networking".

Daniel Martin, Stephen Cornford, Antony Payne, Millennial-Scale Vulnerability of the Antarctic Ice Sheet to localized subshelf warm-water forcing, International Symposium on Polar Ice, Polar Climate, Polar Change, August 18, 2017,

K. Wu, D. Lee, A. Sim, J. Choi, "Statistical Data Reduction for Streaming Data", 2017 New York Scientific Data Summit (NYSDS), Data-Driven Discovery in Science and Industry, 2017, doi: 10.1109/NYSDS.2017.8085035

Jinoh Kim, Alex Sim, "A New Approach to Online, Multivariate Network Traffic Analysis", 2nd Workshop on Network Security Analytics and Automation (NSAA), in conjunction with the 26th International Conference on Computer Communications and Networks (ICCCN 2017), 2017, doi: 10.1109/ICCCN.2017.8038520

Dilip Vasudevan, George Michelogiannakis, John Shalf, "CASPER - Configurable Design Space Exploration of Programmable Architectures for Machine Learning using Beyond Moore Devices", IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), July 2017,

Mahdi Jamei, Anna Scaglione, Ciaran Roberts, Alex McEachern, Emma Stewart, Sean Peisert, Chuck McParland, "Online Thevenin Parameter Tracking Using Synchrophasor Data", Proceedings of the 2017 IEEE Power Engineering Society (PES) General Meeting (GM), Chicago, IL, IEEE, July 2017,

Gunther H. Weber, Mark S. Bandstra, Daniel Chivers, Hamdy H. Elgammal, Hendrix, John Kua, Jonathan Maltz, Krishna Muriki, Yeongshnn Ong, Song, Michael Quinlan, Lavanya Ramakrishnan, Brian J. Quiter, "Web-based Visual Data Exploration for Improved Radiological Source Detection", Concurrency and Computation: Practice and Experience, 2017, 29 (18):e4203, doi: 10.1002/cpe.4203

Sugeerth Murugesan, Kristofer Bouchard, Jesse A. Brown, Bernd Hamann, William W. Seeley, Andrew Trujillo, Gunther H. Weber, "Brain Modulyzer: Interactive Visual Analysis of Functional Brain Connectivity", IEEE Transactions on Computational Biology and Bioinformatics, 2017, 14(4):805-818, LBNL 1005732, doi: 10.1109/TCBB.2016.2564970

Nishant Nangia, Hans Johansen, Neelesh A. Patankar, Amneet Pal Singh Bhalla, "A moving control volume approach to computing hydrodynamic forces and torques on immersed bodies", Journal of Computational Physics, June 29, 2017, doi: 10.1016/j.jcp.2017.06.047

Dongeun Lee, Alex Sim, Jaesik Choi, Kesheng Wu, "Improving Statistical Similarity Based Data Reduction for Non-Stationary Data", 29th International Conference on Scientific and Statistical Database Management (SSDBM2017), 2017, doi: 10.1145/3085504.3085583

Updated experiment version: https://sdm.lbl.gov/oapapers/ssdbm17-lee-upd.pdf
Original version: http://dl.acm.org/citation.cfm?doid=3085504.3085583

Bin Dong, Kesheng Wu, Surendra Byna, Jialin Liu, Weijie Zhao, Florin Rusu, "ArrayUDF: User-Defined Scientific Data Analysis on Arrays", The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) 2017 (Acceptance rate: 19%), June 26, 2017,

Alberto Gonzalez, Jason Leigh, Sean Peisert, Brian Tierney, Andrew Lee, Jennifer M. Schopf, "Big Data and Analysis of Data Transfers for International Research Networks Using NetSage", Proceedings of IEEE BigData Congress 2017, Honolulu, Hawaii, June 2017, doi: 10.1109/BigDataCongress.2017.51

Dilip Vasudevan, Anastasiia Butko, George Michelogiannakis, David Donofrio, John Shalf, "Towards an Integrated Strategy to Preserve Digital Computing Performance Scaling Using Emerging Technologies", Workshop on HPC computing in a Post Moore’s law world (HCPM), June 22, 2017,

With the decline and eventual end of historical rates of lithographic scaling, we arrive at a crossroad where synergistic and holistic decisions are required to preserve Moore's law technology scaling. Numerous emerging technologies aim to extend digital electronics scaling of performance, energy efficiency, and computational power/density, ranging from devices (transistors), memories, 3D integration capabilities, specialized architectures, photonics, and others. The wide range of technology options creates the need for an integrated strategy to understand the impact of these emerging technologies on future large-scale digital systems for diverse application requirements and optimization metrics. In this paper, we argue for a comprehensive methodology that spans the different levels of abstraction -- from materials, to devices, to complex digital systems and applications. Our approach integrates compact models of low-level characteristics of the emerging technologies to inform higher-level simulation models to evaluate their responsiveness to application requirements. The integrated framework can then automate the search for an optimal architecture using available emerging technologies to maximize a targeted optimization metric.

Sean Peisert, Mike Corn, Dewight Kramer, David Rusting, Tye Stallard, The Role of the WAN and the Community to Improve Security, 2017 UC Information Security Symposium, June 21, 2017,

Galen Rasche, Jenna Goodward, Sheeraz Haji, Gabriel Paun, Sean Peisert, Managing Energy: Role of Data and Security, Prospect Silicon Valley 2017 Innovation and Impact Symposium, June 14, 2017,

Sugeerth Murugesan, Kristofer Bouchard, Edward Chang, Dougherty, Bernd Hamann, Gunther H. Weber, "Multi-scale Visual Analysis of Time-varying Electrocorticography Data Clustering of Brain Regions", BMC Bioinformatics, 2017, 18:236, doi: 10.1186/s12859-017-1633-9

Bryce Adelstein Lelbach, Hans Johansen, Samuel Williams, "Simultaneously Solving Swarms of Small Sparse Systems on SIMD Silicon", Parallel and Distributed Scientific and Engineering Computing (PDSEC), June 2017,

Hongzhang Shan, Samuel Williams, Calvin Johnson, Kenneth McElvain, "A Locality-based Threading Algorithm for the Configuration-Interaction Method", Parallel and Distributed Scientific and Engineering Computing (PDSEC), June 2017,

Brandon Cook, Thorsten Kurth, Brian Austin, Samuel Williams, Jack Deslippe, "Performance Variability on Xeon Phi", Intel Xeon Phi Users Group (IXPUG), June 2017,

Thorsten Kurth, William Arndt, Taylor Barnes, Brandon Cook, Jack Deslippe, Doug Doerfler, Brian Friesen, Yun (Helen) He, Tuomas Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, Andrey Ovsyannikov, Samuel Williams, Woo-Sun Yang, and Zhengji Zhao, "Analyzing Performance on Selected NESAP Applications on the Cori HPC System", Intel Xeon Phi Users Group (IXPUG), June 2017,

Jonas Lukasczyk, Ross Maciejewski, Gunther H. Weber, Garth, Heike Leitte, "Nested Tracking Graphs", Computer Graphics Forum (Special Issue, Proceedings Symposium on Visualization), 2017, 36(3):12-22, doi: 10.1111/cgf.13164

Ariful Azad, Aydin Buluc, "Towards a GraphBLAS Library in Chapel", IPDPS Workshops, Orlando, FL, May 2017,

Aydin Buluc, Tim Mattson, Scott McMillan, Jose Moreira, Carl Yang, "Design of the GraphBLAS API for C", IEEE Workshop on Graph Algorithm Building Blocks, IPDPSW, 2017,

Ariful Azad, Mathias Jacquelin, Aydin Buluc, Esmond G. Ng, "The Reverse Cuthill-McKee Algorithm in Distributed-Memory", IEEE International Parallel & Distributed Processing Symposium (IPDPS), Orlando, FL, May 2017,

Ariful Azad, Aydin Buluc, "A work-efficient parallel sparse matrix-sparse vector multiplication algorithm", IEEE International Parallel & Distributed Processing Symposium (IPDPS), Orlando, FL, May 2017,

Nathan Zhang, Michael Driscoll, Armando Fox, Charles Markley, Samuel Williams, Protonu Basu, "Snowflake: A Lightweight Portable Stencil DSL", High-level Parallel Programming Models and Supportive Environments (HIPS), May 2017,

Jonathan Wang, Wucherl Yoo, Alex Sim, Peter Nugent, K. John Wu, "Parallel Variable Selection for Effective Performance Prediction", the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid2017), 2017, doi: 10.1109/CCGRID.2017.47

George Michelogiannakis, Khaled Z. Ibrahim, John Shalf, Jeremiah J. Wilke, Samuel Knight, Joseph P. Kenny, "APHiD: Hierarchical Task Placement to Enable a Tapered Fat Tree Topology for Lower Power and Cost in HPC Networks", 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, IEEE, May 2017, LBNL 1007126,

Weijie Zhao, Florin Rusu, Bin Dong, Kesheng Wu, and Peter Nugent, "Incremental View Maintenance over Array Data", In Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD '17) (Acceptance rate: 20%). ACM, New York, NY, USA, May 14, 2017,

Dharshi Devendran, Daniel T. Graves, Hans Johansen, Terry Ligocki, "A Fourth Order Cartesian Grid Embedded Boundary Method for Poisson's Equation", Communications in Applied Mathematics and Computational Science, edited by Silvio Levy, May 12, 2017, 12:51-79, doi: 10.2140/camcos.2017.12.51

Anna-Christina Eilers, Frederick B. Davies, Joseph F. Hennawi, J. Xavier Prochaska, Zarija Lukić, and Chiara Mazzucchelli, "Implications of z ~ 6 Quasar Proximity Zones for the Epoch of Reionization and Quasar Lifetimes", The Astrophysical Journal, 2017, 840:24,

Bei Wang, Stephane Ethier, William Tang, Khaled Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker, "Modern Gyrokinetic Particle-in-cell Simulation of Fusion Plasmas on Top Supercomputers", International Journal of High-Performance Computing Applications (IJHPCA), May 2017, doi: 10.1177/1094342017712059

Sean Peisert, Reinhard Gentz, Joshua Boverhof, Chuck McParland, Sophie Engle, Abdelrahman Elbashandy, and Dan Gunter, "LBNL Open Power Data", LBNL Technical Report, May 2017, doi: 10.21990/C21599

Alberto Rorai, Joseph F. Hennawi, Jose Oñorbe, Martin White, J. Xavier Prochaska, Girish Kulkarni, Michael Walther, Zarija Lukić, and Khee-Gan Lee, "Measurement of the small-scale structure of the intergalactic medium using close quasar pairs", Science, 2017, 356:418,

Gunther H. Weber, Sheelagh Carpendale, David Ebert, Brian Fisher, Hans Hagen, Ben Shneiderman, Anders Ynnerman, "Apply or Die: On the Role and Assessment of Application Papers in", IEEE Computer Graphics & Applications, 2017, 37(3):96-104, doi: 10.1109/MCG.2017.51

Sergi Molins, David Trebotich, Gregory H. Miller, Carl I. Steefel, "Mineralogical and transport controls on the evolution of porous media texture using direct numerical simulation", Water Resources Research, April 7, 2017, doi: 10.1002/2016WR020323

Protonu Basu, Samuel Williams, Brian Van Straalen, Leonid Oliker, Phillip Colella, Mary Hall, "Compiler-Based Code Generation and Autotuning for Geometric Multigrid on GPU-Accelerated Supercomputers", Parallel Computing (PARCO), April 2017, doi: 10.1016/j.parco.2017.04.002

Dongeun Lee, Alex Sim, Jaesik Choi, Kesheng Wu, "Expanding Statistical Similarity Based Data Reduction to Capture Diverse Patterns", Data Compression Conference (DCC 2017), 2017,

Stone, D. A., H. Krishnan, R. Lance, S. Sippel, and M. F. Wehner, "The First and Second Hackathons of the International CLIVAR C20C+ Detection and Attribution Project", CLIVAR Exchanges, 2017,

Sean Peisert, Greg Bell, Anita Nikolich, Von Welch, Cybersecurity: New Directions for Research and Education - Your own safety is at stake when your neighbor's wall is ablaze. (—Horace), CENIC Annual Conference — The Right Connection ¦ CENIC 2.0, March 22, 2017,

Jose Oñorbe, Joseph F. Hennawi, and Zarija Lukić, "Self-consistent Modeling of Reionization in Cosmological Hydrodynamical Simulations", The Astrophysical Journal, 2017, 837:106,

Leon J. Osterweil, Matt Bishop, Heather M. Conboy, Huong Phan, Borislava I. Simidchieva, George S. Avrunin, Lori A. Clarke, Sean Peisert, "A Comprehensive Framework for Using Iterative Analysis to Improve Human-Intensive Process Security: An Election Example", ACM Transactions on Privacy and Security (TOPS), 2017, 20(2), doi: 10.1145/3041041

Alex Krolewski, Khee-Gan Lee, Zarija Lukić, and Martin White, "Measuring Alignments between Galaxies and the Cosmic Web at z ~ 2-3 Using IGM Tomography", The Astrophysical Journal, 2017, 837:31,

Khaled Z. Ibrahim, Evgeny Epifanovsky, Samuel Williams, Anna I. Krylov, "Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends", Journal of Parallel and Distributed Computing (JPDC), February 2017, doi: 10.1016/j.jpdc.2017.02.010

Ma, S., T. Zhou, D. A. Stone, D. Polson, A. Dai, P. A. Stott, H. von Storch, Y. Qian, C. Burke, P. Wu, L. Zou, and A. Ciavarella, "Detectable anthropogenic shift toward heavy precipitation over eastern China", Journal of Climate, 2017, 30:1381-1396, doi: 10.1175/JCLI-D-16-0311.1

Timmermans, B., D. Stone, M. Wehner, and H. Krishnan, "Impact of tropical cyclones on modeled wind-wave climate", Geophysical Research Letters, 2017, 44:1393-1401, doi: 10.1002/2016GL071681

Sean Peisert, Von Welch, Andrew Adams, Michael Dopheide, Susan Sons, RuthAnne Bevier, Rich LeDuc, Pascal Meunier, Stephen Schwab, Karen Stocks, Ilkay Altintas, James Cuff, Reagan Moore, Warren Raquel, "Open Science Cyber Risk Profile", February 10, 2017, doi: 2022/21259

Mitchell, D., K. AchutaRao, M. Allen, I. Bethke, U. Beyerle, A. Ciavarella, P. M. Forster, J. Fuglestvedt, N. Gillett, K. Haustein, W. Ingram, T. Iverson, V. Kharin, N. Klingaman, N. Massey, E. Fischer, C.-F. Schleussner, J. Scinocca, O. Seland, H. Shiogama, E. Shuckburgh, S. Sparrow, D. Stone, P. Uhe, D. Wallom, M. Wehner, and R. Zaaboul, "Half a degree additional warming, prognosis and projected impacts (HAPPI): background and experimental design", Geoscientific Model Development, 2017, 10:571-583, doi: 10.5194/gmd-10-571-2017

Ling Jin, Doris Lee, Alex Sim, Sam Borgeson, John Wu, Anna Spurlock, Annika Todd, "Comparison of Clustering Techniques for Residential Energy Behavior using Smart Meter Data", 2nd International Workshop on Artificial Intelligence for Smart Grids and Smart Buildings, In conjunction with AAAI 2017, 2017,

J. Kim, A. Sim, S.C. Suh, I. Kim, "An Approach to Online Network Monitoring Using Clustered Patterns", International Conference on Computing, Networking and Communications (ICNC 2017), 2017, doi: 10.1109/ICCNC.2017.7876207

J. Kim, W. Yoo, A. Sim, S.C. Suh, I. Kim, "A Lightweight Network Anomaly Detection Technique", International Workshop on Computing, Networking and Communications (CNC 2017), 2017, doi: 10.1109/ICCNC.2017.7876251

Richard LeDuc, Sean Peisert, Karen Stocks, Von Welch, Open Science Cyber Risk Profile (OSCRP), National Science Foundation Cybersecurity Center of Excellence (CCoE) Webinar Series, January 23, 2017,

Mahdi Jamei, Anna Scaglione, Ciaran Roberts, Emma Stewart, Sean Peisert, Chuck McParland, Alex McEachern, "Automated Anomaly Detection in Distribution Grids Using µPMU Measurements", Proceedings of the 50th Hawaii International Conference on System Sciences (HICSS), Electric Energy Systems Track, Resilient Networks Minitrack, IEEE, January 2017, doi: http://hdl.handle.net/10125/41543

E. Wes Bethel, In Situ Processing Overview and Relevance to the HPC Community, SIAM Conference on Computational Science and Engineering, MS74: In Situ Methods and Infrastructures: Faster Insight Through Smarter Computing, 2017,

E. Vecharynski and C. Yang, "Preconditioned iterative methods for eigenvalue counts", Lecture Notes in Computational Science, January 1, 2017,

Angélil, O., D. Stone, M. Wehner, C. J. Paciorek, H. Krishnan, W. Collins, "An independent assessment of anthropogenic attribution statements for recent extreme temperature and rainfall events", Journal of Climate, 2017, 30:5-16, doi: 10.1175/JCLI-D-16-0077.1

MN Farooqi, D Unat, T Nguyen, W Zhang, A Almgren, J Shalf, "Nonintrusive AMR asynchrony for communication optimization", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), January 1, 2017, 10417 LN:682-694, doi: 10.1007/978-3-319-64203-1_49

Esmond Ng, Katherine J. Evans, Peter Caldwell, Forrest M. Hoffman, Charles Jackson, Kerstin Van Dam, Ruby Leung, Daniel F. Martin, George Ostrouchov, Raymond Tuminaro, Paul Ullrich, Stefan Wild, Samuel Williams, "Advances in Cross-Cutting Ideas for Computational Climate Science (AXICCS)", January 2017, doi: 10.2172/1341564

T Nguyen, D Unat, W Zhang, A Almgren, N Farooqi, J Shalf, "Perilla: Metadata-Based Optimizations of an Asynchronous Runtime for Adaptive Mesh Refinement", International Conference for High Performance Computing, Networking, Storage and Analysis, SC, January 1, 2017, 945-956, doi: 10.1109/SC.2016.80

John Bachan, Scott Baden, Dan Bonachea, Paul Hargrove, Steven Hofmeyr, Khaled Ibrahim, Mathias Jacquelin, Amir Kamil, Brian van Straalen, "UPC++ and GASNet: PGAS Support for Exascale Apps and Runtimes", Poster at Exascale Computing Project (ECP) Annual Meeting 2017, January 2017,

2016

M. Jacquelin, L. Lin and C. Yang, "A Distributed Memory Parallel Algorithm for Selected Inversion: the non-symmetric case", PMAA, December 30, 2016,

Bin Dong, Suren Byna, Kesheng Wu, Prabhat, Hans Johansen, Jeffrey N. Johnson, and Noel Keen, "Data Elevator: Low-contention Data Movement in Hierarchical Storage System", The 23rd annual IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC) (Acceptance rate: 25%), December 19, 2016,

E. Vecharynski, A. Knyazev, "Preconditioned steepest descent-like methods for symmetric indefinite systems", Linear Algebra and its Applications, Vol. 511, pp. 274–295, 2016,

We construct preconditioned steepest descent (PSD)-like methods for iterative solution of symmetric indefinite linear systems using symmetric and positive definite (SPD) preconditioners. Our construction is based on a locally optimal residual minimization over two-dimensional subspaces, mathematically equivalent in exact arithmetic to preconditioned MINRES (PMINRES) restarted after every two steps. A convergence bound is derived. If certain information on the spectrum of the preconditioned system is available, we present a simpler PSD-like algorithm that performs only one-dimensional residual minimization. Search direction randomization for accelerating this algorithm is discussed. Our primary goal is to bridge the theoretical gap between the optimal (PMINRES) and PSD-like methods for solving symmetric indefinite systems. We also demonstrate situations where the suggested PSD-like schemes can be preferable to the optimal PMINRES iteration. 

M. Wehner, D. Stone, H. Krishnan, K. AchutaRao, F. Castillo, "The deadly combination of heat and humidity in India and Pakistan in summer 2015", Bulletin of the American Meteorological Society, 2016, 97:S81-S86, doi: 10.1175/BAMS-D-16-0145.2

K. A. Lawal, A. A. Abatan, O. Angélil, E. Olaniyan, V. H. Olusoji, P. G. Oguntunde, B. Lamptey, B. J. Abiodun, H. Shiogama, M. F. Wehner, D. A. Stone, "The late onset of the 2015 wet season in Nigeria", Bulletin of the American Meteorological Society, 2016, 97:S63-S69, doi: 10.1175/BAMS-D-16-0131.2

Sean Peisert, Overcoming Security and Privacy Challenges in Computing and Networking in Medical Research Environments, Department of Public Health Sciences, University of California, Davis School of Medicine, December 14, 2016,

Wenzhao Zhang, Houjun Tang, Stephen Ranshous, Surendra Byna, Daniel F Martín, Kesheng Wu, Bin Dong, Scott Klasky, Nagiza F Samatova, "Exploring memory hierarchy and network topology for runtime AMR data sharing across scientific applications", 2016 IEEE International Conference on Big Data (Big Data) (Acceptance rate: 19.39% as short papers.), December 5, 2016,

R. Atta-Fynn, E.J. Bylaska, W.A. de Jong, "Strengthening of the Coordination Shell by Counter Ions in Aqueous Th4+ Solutions", J. Phys. Chem. A, December 1, 2016, 120:10216–1022, doi: 10.1021/acs.jpca.6b09878

S.V. Venkatakrishnan, Jeffrey Donatelli, Dinesh Kumar, Abhinav Sarje, Sunil K. Sinha, Xiaoye S. Li, Alexander Hexemer, "A Multi-slice Simulation Algorithm for Grazing-Incidence Small-Angle X-ray Scattering", Journal of Applied Crystallography, December 2016, 49-6, doi: 10.1107/S1600576716013273

Grazing-incidence small-angle X-ray scattering (GISAXS) is an important technique in the characterization of samples at the nanometre scale. A key aspect of GISAXS data analysis is the accurate simulation of samples to match the measurement. The distorted-wave Born approximation (DWBA) is a widely used model for the simulation of GISAXS patterns. For certain classes of sample such as nanostructures embedded in thin films, where the electric field intensity variation is significant relative to the size of the structures, a multi-slice DWBA theory is more accurate than the conventional DWBA method. However, simulating complex structures in the multi-slice setting is challenging and the algorithms typically used are designed on a case-by-case basis depending on the structure to be simulated. In this paper, an accurate algorithm for GISAXS simulations based on the multi-slice DWBA theory is presented. In particular, fundamental properties of the Fourier transform have been utilized to develop an algorithm that accurately computes the average refractive index profile as a function of depth and the Fourier transform of the portion of the sample within a given slice, which are key quantities required for the multi-slice DWBA simulation. The results from this method are compared with the traditionally used approximations, demonstrating that the proposed algorithm can produce more accurate results. Furthermore, this algorithm is general with respect to the sample structure, and does not require any sample-specific approximations to perform the simulations.

Sam Fries, Sasha Ames, Alex Sim, Dean Williams, "HPSS Connections to ESGF: BASEJumper", 2016 Earth System Grid Federation (ESGF) Conference, 2016,

Cy Chan, John Bachan, Joseph Kenny, Jeremiah Wilke, Vincent Beckner, Ann Almgren, John Bell, "Topology-Aware Performance Optimization and Modeling of Adaptive Mesh Refinement Codes for Exascale", (BEST PAPER AWARD) COMHPC 2016 - SC16 Workshop on Communication Optimization in High Performance Computing, Salt Lake City, UT, November 18, 2016,

Utkarsh Ayachit, Andrew Bauer, Earl P. N. Duque, Greg Eisenhauer, Nicola Ferrier, Junmin Gu, Kenneth E. Jansen, Burlen Loring, Zarija Lukić, Suresh Menon, Dmitriy Morozov, Patrick O'Leary, Reetesh Ranjan, Michel Rasquin, Christopher P. Stone, Venkat Vishwanath, Gunther H. Weber, Brad Whitlock, Matthew Wolf, K. John Wu, E. Wes Bethel, "Performance analysis, design considerations, and applications of extreme-scale in situ infrastructures", Supercomputing, 2016, 921-932, LBNL 1007264, doi: 10.1109/SC.2016.78

Mark Adams, Samuel Williams, HPGMG BoF - Introduction, HPGMG BoF, Supercomputing, November 2016,

Samuel Williams, HPGMG on the Knights Landing Processor, HPGMG BoF, Supercomputing, November 2016,

J. Wang, W. Yoo (Advisor), A. Sim (Advisor), K. Wu (Advisor), "Analysis of Variable Selection Methods on Scientific Cluster Measurement Data", International Conference for High Performance Computing, Networking, Storage and Analysis (SC’16), ACM Student Research Competition (SRC), Second place winner, 2016,

M. Bae, W. Yoo (Advisor), A. Sim (Advisor), K. Wu (Advisor), "Discovering Energy Resource Usage Patterns on Scientific Clusters", International Conference for High Performance Computing, Networking, Storage and Analysis (SC’16), ACM Student Research Competition (SRC), Third place winner, 2016,

M. Bryson, S. Byna (Advisor), A. Sim (Advisor), K. Wu (Advisor), "The Search for Missing Parallel IO Performance on the Cori Supercomputer", International Conference for High Performance Computing, Networking, Storage and Analysis (SC’16), ACM Student Research Competition (SRC), 2016,

Samuel Williams, HPGMG Benchmark, Top500 BoF, Supercomputing, November 2016,

William Tang, Bei Wang, Stephane Ethier, Grzegorz Kwasniewski, Torsten Hoefler, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker, Carlos Rosales-Fernandez, Tim Williams, "Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide", Supercomputing, November 2016,

Taylor Barnes, Brandon Cook, Jack Deslippe, Douglas Doerfler, Brian Friesen, Yun (Helen) He, Thorsten Kurth, Tuomas Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, Andrey Ovsyannikov, Abhinav Sarje, Jean-Luc Vay, Henri Vincenti, Samuel Williams, Pierre Carrier, Nathan Wichmann, Marcus Wagner, Paul Kent, Christopher Kerr, John Dennis, "Evaluating and Optimizing the NERSC Workload on Knights Landing", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), November 2016,

Utkarsh Ayachit, Andrew Bauer, Earl P. N. Duque, Greg Eisenhauer, Nicola Ferrier, Junmin Gu, Kenneth Jansen, Burlen Loring, Zarija Lukić, Suresh Menon, Dmitriy Morozov, Patrick O'Leary, Michel Rasquin, Christopher P. Stone, Venkat Vishwanath, Gunther H. Weber, Brad Whitlock, Matthew Wolf, K. John Wu, E. Wes Bethel, "Performance Analysis, Design Considerations, and Applications of Extreme-scale In Situ Infrastructures", ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), Salt Lake City, UT, USA, 2016, doi: 10.1109/SC.2016.78

George Michelogiannakis, David Donofrio, John Shalf, Modeling of Novel Transistors, Manufacturing Technologies, and Architectures to Preserve Digital Computing Performance Scaling, Post-Moore's Era Supercomputing (PMES) Workshop, November 2016,

George Michelogiannakis, Dave Donofrio, John Shalf, "Modeling of Novel Transistors, Manufacturing Technologies, and Architectures to Preserve Digital Computing Performance Scaling", 1st International Workshop on Post-Moore’s Era Supercomputing (PMES), November 2016,

Andrew C. Bauer, Kenneth E. Jansen, E. Wes Bethel, Utkarsh Ayachit, Michel Rasquin, Benjamin Matthews, Steve Jordan, "In Situ Analysis and Visualization at Scale with PHASTA and ParaView Catalyst on Mira and Theta", SC16 Scientific Visualization Showcase, 2016,

Utkarsh Ayachit, Brad Whitlock, Matthew Wolf, Burlen Loring, Berk Geveci, David Lonie, E. Wes Bethel, "The SENSEI Generic In Situ Interface", Proceedings of In Situ Infrastructures for Enabling Extreme-scale Analysis and Visualization (ISAV 2016), Salt Lake City, UT, USA, 2016,

Jared O. Ferguson, Christiane Jablonowski, Hans Johansen, Peter McCorquodale, Phillip Colella, Paul A. Ullrich, "Analyzing the adaptive mesh refinement (AMR) characteristics of a high-order 2D cubed-sphere shallow-water model", Mon. Wea. Rev., November 9, 2016, 144:4641–4666, doi: 10.1175/MWR-D-16-0197.1

Ariful Azad, Grey Ballard, Aydin Buluc, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams, "Exploiting multiple levels of parallelism in sparse matrix-matrix multiplication", SIAM Journal on Scientific Computing, 38(6), C624–C651, November 2016, doi: 10.1137/15M104253X

Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Williams, Costin Iancu, "Reaching Bandwidth Saturation Using Transparent Injection Parallelization", International Journal of High Performance Computing Applications (IJHPCA), November 2016, doi: 10.1177/1094342016672720

Hasan Metin Aktulga, Md. Afibuzzaman, Samuel Williams, Aydın Buluc, Meiyue Shao, Chao Yang, Esmond G. Ng, Pieter Maris, James P. Vary, "A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations", IEEE Transactions on Parallel and Distributed Systems (TPDS), November 2016, doi: 10.1109/TPDS.2016.2630699

A. Aghamousa et al., "The DESI Experiment Part I: Science,Targeting, and Survey Design", arXiv:1611.00036, 2016,

A. Aghamousa et al., "The DESI Experiment Part II: Instrument Design", arXiv:1611.00037, 2016,

E. Wes Bethel, Martin Greenwald, Kerstin Kleese van Dam, Manish Parashar, Stefan M. Wild, H. Steven Wiley, "Management, Analysis, and Visualization of Experimental and Observational Data -- The Convergence of Data and Computing", Proceedings of the 2016 IEEE 12th International Conference on eScience, Baltimore, MD, USA, 2016,

E. Wes Bethel (organizer and moderator), Hank Childs, Ken Moreland, Dave Pugmire, Matt Larsen, Matthieu Dorier, In Situ Efforts and Challenges in Large Data Analysis and Visualization, IEEE Symposium on Large Data Analysis and Visualization (LDAV), 2016,

Hoa Nguyen, Dáithí Stone, E. Wes Bethel, "Statistical Projections for Multi-dimensional Visual Data Exploration", 6th IEEE Symposium on Large Data Analysis and Visualization, 2016,

Sean Peisert (moderator), Jill Gemmill, Michael Sinatra, Von Welch, National Cybersecurity Panel, NSF Campus Cyberinfrastructure/ESCC/The Quilt Colocated Meeting, October 20, 2016,

Gillett, N. P., H. Shiogama, B. Funke, G. Hegerl, R. Knutti, K. Matthes, B. D. Santer, D. Stone, C. Tebaldi, "The Detection and Attribution Model Intercomparison Project (DAMIP v1.0) contribution to CMIP6", Geoscientific Model Development, 2016, 9:3685-3697, doi: 10.5194/gmd-9-3685-2016

Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, and Andrea L. Bertozzi, "OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms", 12th International Workshop on OpenMP (iWOMP), October 2016, doi: 10.1007/978-3-319-45550-1_2

A. S. Banerjee, L. Lin, W. Hu, C. Yang and J. E. Pask, "Chebyshev polynomial filtered subspace iteration in the discontinuous Galerkin method for large-scale electronic structure calculations", Journal of Chemical Physics, October 1, 2016,

Pieter Ghysels, Xiaoye S. Li, François-Henry Rouet, Samuel Williams, Artem Napov, "An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling", SIAM J. Sci. Comput. 38-5, pp. S358-S384, October 2016, doi: 10.1137/15M1010117

Lee Beausoleil, David Lombard, Angelos Keromytis, Sean Peisert, Panel: HPC Monitoring, NSCI: High-Performance Computing Security Workshop, September 30, 2016,

Sean Peisert, Security Expert on Why HPC Matters - Cybersecurity for HPC Systems: Challenges and Opportunities, NSCI: High-Performance Computing Security Workshop, September 29, 2016,

C. Huggel, I. Wallimann-Helmer, D. Stone, W. Cramer, "Reconciling justice and attribution research to advance climate policy", Nature Climate Change, 2016, 6:901-908, doi: 10.1038/NCLIMATE3104

Talita Perciano, Daniela Ushizima, E. Wes Bethel, Yariv Mizrahi, James Sethian, "Reduced-complexity Image Segmentation Under Parallel Markov Random Field Formulation Using Graph Partitioning", IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 2016,

S.L. Cornford, D.F.Martin, V. Lee, A.J. Payne, E.G. Ng, "Adaptive mesh refinement versus subgrid friction interpolation in simulations of Antarctic ice dynamics", Annals of Glaciology, September 2016, 57 (73), doi: 10.1017/aog.2016.13

Jeremy Kepner, Peter Aaltonen, David Bader, Aydin Buluç, Franz Franchetti, John Gilbert, Dylan Hutchison, Manoj Kumar, Andrew Lumsdaine, Henning Meyerhenke, Scott McMillan, José Moreira, John Owens, Carl Yang, Marcin Zalewski, Timothy Mattson, "Mathematical foundations of the GraphBLAS", IEEE High Performance Extreme Computing (HPEC), September 1, 2016,

Mahdi Jamei, Emma Stewart, Sean Peisert, Anna Scaglione, Chuck McParland, Ciaran Roberts, Alex McEachern, "Micro Synchrophasor-Based Intrusion Detection in Automated Distribution Systems: Towards Critical Infrastructure Security", IEEE Internet Computing, September 2016, 20(5):18-27, doi: 10.1109/MIC.2016.102

Lingfei Wu, Kesheng Wu, Alex Sim, Michael Churchill, Jong Choi, Andreas Stathopoulos, Choong-Seock Chang, Scott Klasky, "Towards Real-Time Detection and Tracking of Spatio-Temporal Features: Blob-Filaments in Fusion Plasma", IEEE Transactions on Big Data (TBD), 2016, 2:3:262-275, doi: 10.1109/TBDATA.2016.2599929

W. Yoo, M. Koo, Y. Cao, A. Sim, P. Nugent, K. Wu, "Performance Analysis Tool for HPC and Big Data Applications on Scientific Clusters", Conquering Big Data with High Performance Computing, edited by R. Arora, (Springer International: 2016) Pages: 139-161 doi: 10.1007/978-3-319-33742-5

Ariful Azad, Bartek Rajwa, Alex Pothen, "Immunophenotype Discovery, Hierarchical Organization, and Template-based Classification of Flow Cytometry Samples", Frontiers in Oncology, August 31, 2016,

Veronika Strnadova-Neeley, Aydin Buluc, John R. Gilbert, Leonid Oliker, Weimin Ouyang, "LiRa: A New Likelihood-Based Similarity Score for Collaborative Filtering", August 30, 2016,

David H. Bailey, Jonathan M. Borwein, "Computation and experimental evaluation of Mordell-Tornheim-Witten sum derivatives", Experimental Mathematics, August 29, 2016,

David H. Bailey, Jonathan M. Borwein, Jason Kimberley, Watson Ladd, "Computer discovery and analysis of large Poisson polynomials", Experimental Mathematics, August 27, 2016, doi: 10.1080/10586458.2016.1180565

David H. Bailey, Jonathan M. Borwein, Richard Brent, Mohsen Reisi Ardali, "Reproducibility in computational science: a case study: Randomness of the digits of Pi", Experimental Mathematics, August 24, 2016, doi: 10.1080/10586458.2016.1163755

Brian Friesen, Ann Almgren, Zarija Lukić, Gunther Weber, Dmitriy Morozov, Vincent Beckner, Marcus Day, "In situ and in-transit analysis of cosmological simulations", Computational Astrophysics and Cosmology, 2016, 3 (4):1-18,

Abhinav Sarje, Xiaoye S Li, Nicholas Wright, "Achieving High Parallel Efficiency on Modern Processors for X-ray Scattering Data Analysis", International Workshop on Multicore Software Engineering at EuroPar, 2016,

Abhinav Sarje, Achieving High Parallel Efficiency on Modern Processors for X-ray Scattering Data Analysis, EuroPar 2016, August 22, 2016,

R. Li, Y. Xi, E. Vecharynski, C. Yang, and Y. Saad, "A Thick-Restart Lanczos algorithm with polynomial filtering for Hermitian eigenvalue problems", SIAM Journal on Scientific Computing, Vol. 38, Issue 4, pp. A2512–A2534, 2016, doi: 10.1137/15M1054493

Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a Thick-Restart version of the Lanczos algorithm with deflation ('locking') and a new type of polynomial filters obtained from a least-squares technique. The resulting algorithm can be utilized in a 'spectrum-slicing' approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different sub-intervals independently from one another.

Houjun Tang, Suren Byna, Steve Harenberg, Wenzhao Zhang, Xiaocheng Zou, Daniel F Martin, Bin Dong, Dharshi Devendran, Kesheng Wu, David Trebotich, et al., "In Situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses", 2016 45th International Conference on Parallel Processing (ICPP) (Acceptance rate: 21.1%), August 16, 2016, 406--415,

Anshu Dubey, Hajime Fujita, Daniel T. Graves, Andrew Chien, Devesh Tiwari, "Granularity and the Cost of Error Recovery in Resilient AMR Scientific Applications", SuperComputing 2016, August 10, 2016,

H. Shiogama, Y. Imada, M. Mori, R. Mizuta, D. Stone, K. Yoshida, O. Arakawa, M. Ikeda, C. Takahashi, M. Arai, M. Ishii, M. Watanabe, and M. Kimoto, "Attributing historical changes in probabilities of record-breaking daily temperature and precipitation extreme events", Scientific Online Letters on the Atmosphere, 2016, 12:225-231, doi: 10.2151/sola.2016-045

David H. Bailey and Jonathan M. Borwein, "A computational mathematics view of space, time and complexity", Space, Time and the Limits of Human Understanding, edited by Shyam Wuppuluri, Giancarlo Ghirardi, (Springer: August 1, 2016)

Daniele Sorini, José Oñorbe, Zarija Lukić, Joseph Hennawi, "Modeling the Lyman-alpha Forest in Collisionless Simulations", The Astrophysical Journal, 2016, 827:97,

Zhangpeng Guo, Nicolas Zweibaum, Meiyue Shao, Lakshana R. Huddar, Per F. Peterson, and Suizheng Qiu, "Development of the FHR advanced natural circulation analysis code and application to FHR safety analysis", Progress in Nuclear Energy, 2016, 91:56–67, doi: 10.1016/j.pnucene.20

Xylar S. Asay-Davis, Stephen L. Cornford, Gaël Durand, Benjamin K. Galton-Fenzi, Rupert M. Gladstone, G. Hilmar Gudmundsson, Tore Hattermann, David M. Holland, Denise Holland, Paul R. Holland, Daniel F. Martin, Pierre Mathiot, Frank Pattyn, Hélène Seroussi, "Experimental design for three interrelated marine ice sheet and ocean model intercomparison projects: MISMIP v. 3 (MISMIP +), ISOMIP v. 2 (ISOMIP +) and MISOMIP v. 1 (MISOMIP1)", Geoscientific Model Development, July 2016, 9(7), doi: 10.5194/gmd-9-2471-2016

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, "Backtest overfitting in financial markets", Journal of Portfolio Management, July 19, 2016,

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, "Stock portfolio design and backtest overfitting", Journal of Investment Management, July 19, 2016,

W. Yoo, B. Foster, A. Sim, K. Wu, "Machine Learning Based Job Status Prediction in Scientific Clusters", IEEE SAI Computing Conference, 2016, 44-53, doi: 10.1109/SAI.2016.7555961

Bin Dong, Suren Byna, and Kesheng Wu, "SDS-Sort: Scalable Dynamic Skew-aware Parallel Sorting", The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) 2016, July 1, 2016,

Patrick Oesterling, Christian Heine, Gunther H. Weber, Dmitriy Morozov, Gerik Scheuermann, Computing and Visualizing Time-Varying Merge Trees for High-Dimensional Data, Mathematics and Visualization, 2016,

Khaled Z. Ibrahim, Evgeny Epifanovsky, Samuel Williams, Anna I. Krylov, "Cross-scale Efficient Tensor Contractions for Coupled Cluster Computations Through Multiple Programming Model Backends", LBNL. - Report Number: LBNL-1005853, July 1, 2016,

Dongeun Lee, Alex Sim, Jaesik Choi, Kesheng Wu, "Novel Data Reduction Based on Statistical Similarity", International Conference on Scientific and Statistical Database Management (SSDBM'16), New York, NY, USA, ACM, 2016, 21:1--21:1, doi: 10.1145/2949689.2949708

Brandon Cook, Pieter Maris, Meiyue Shao, Nathan Wichmann, Marcus Wagner, John O'Neill, Thanh Phung, Gaurav Bansal, "High performance optimizations for nuclear physics code MFDn on KNL", High Performance Computing, 2016, 366--377, doi: 10.1007/978-3-319-46079-6_26

Osni Marques, Paulo B. Vasconcelos, "Computing the Bidiagonal SVD through an Associated Tridiagonal Eigenproblem", VECPAR 2016, Porto, Portugal, Springer, June 2016,

Naoya Nomura, Akihiro Fujii, Teruo Tanaka, Kengo Nakajima, Osni Marques, "Performance Analysis of SA-AMG Method by Setting Extracted Near-kernel Vectors", VECPAR 2016, Porto, Portugal, Springer, June 2016,

Jonathan Ganz, Matt Bishop, and Sean Peisert, "Security Analysis of Scantegrity, an Electronic Voting System", University of California, Davis, Department of Computer Science Technical Report, June 2016,

Fabien Bruneval, Tonatiuh Rangel, Samia M. Hamed, Meiyue Shao, Chao Yang, Jeffrey B. Neaton, "MOLGW 1: many-body perturbation theory software for atoms, molecules, and clusters", Computer Physics Communications, 2016, 208:149–161, doi: 10.1016/j.cpc.2016.06.019

Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq Malas, Jean-Luc Vay, and Henri Vincenti, "Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor", Intel Xeon Phi User Group Workshop (IXPUG), June 2016,

Ariful Azad, Bartek Rajwa, Alex Pothen, "flowVS: Channel-Specific Variance Stabilization in Flow Cytometry", BMC Bioinformatics, June 2016,

Weiqun Zhang, Ann Almgren, Marcus Day, Tan Nguyen, John Shalf, Didem Unat, "BoxLib with Tiling: An AMR Software Framework", SIAM Journal on Scientific Computing, 2016,

Robert Saye, James Sethian, "Multiscale modelling of evolving foams", Journal of Computational Physics, June 15, 2016, doi: 10.1016/j.jcp.2016.02.077

Alberto Gonzalez, Jason Leigh, Sean Peisert, Brian Tierney, Andrew Lee, Jennifer M. Schopf, "NetSage: Open Privacy-Aware Network Measurement, Analysis, And Visualization Service", Proceedings of TNC16 Networking Conference, Prague, Czech Republic, June 2016,

Robert Saye, "Interfacial gauge methods for incompressible fluid dynamics", Science Advances, June 10, 2016,

Osni Marques, Alex Druinsky, Xiaoye S. Li, Andrew T. Barker, Panayot Vassilevski, Delyan Kalchev, "Tuning the Coarse Space Construction in a Spectral AMG Solver", ICCS 2016 (The International Conference on Computational Science), San Diego, CA, Elsevier, June 2016,

Meiyue Shao, Lin Lin, Chao Yang, Fang Liu, Felipe H. da Jornada, Jack Deslippe and Steven G. Louie, "Low rank approximation in G0W0 calculations", Science China Mathematics, June 4, 2016, 59:1593–1612, doi: 10.1007/s11425-016-0296-x

Dmitriy Morozov and Zarija Lukić, "Master of Puppets: Cooperative Multitasking for In Situ Processing", HPDC, 2016, 285,

Ariful Azad, Aydın Buluç, "A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs", Parallel Computing, June 2016,

David H. Bailey, Jonathan M. Borwein, Victoria Stodden, "Facilitating reproducibility in scientific computing: Principles and practice", Reproducibility: Principles, Problems, Practices, edited by Harald Atmanspacher, Sabine Maasen, (John Wiley and Sons: June 1, 2016)

Mathias Jacquelin, Lin Lin, Nathan Wichmann, Chao Yang, Enhancing scalability and load balancing of Parallel Selected Inversion via tree-based asynchronous communication, IPDPS 16, May 24, 2016,

D. Ozog, A. Kamil, Y. Zheng, P. Hargrove, J. Hammond, A. Malony, W.A. de Jong, K. Yelick, "A Hartree-Fock Application using UPC++ and the New DArray Library", 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 23, 2016, doi: 10.1109/IPDPS.2016.108

Md. Mostofa Ali Patwary, Nadathur Rajagopalan Satish, Narayanan Sundaram, Jialin Liu, Peter Sadowski, Evan Racah, Suren Byna, Craig Tull, Wahid Bhimji, Prabhat, and Pradeep Dubey, "PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures", 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS) 2016, Chicago, May 23, 2016,

Penporn Koanantakool, Ariful Azad, Aydın Buluç, Dmitriy Morozov, Sang-Yun Oh, Leonid Oliker, Katherine Yelick, "Communication-Avoiding Parallel Sparse-Dense Matrix-Matrix Multiplication", IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2016,

Ariful Azad, Aydin Buluç, "Distributed-Memory Algorithms for Maximum Cardinality Matching in Bipartite Graphs", IEEE International Parallel & Distributed Processing Symposium (IPDPS), May 2016,

D. Pugmire, J. Kress, H. Childs, M. Wolf, G. Eisenhauer, J. Low, R. M. Churchill, T. Kurc, K. Wu, A. Sim, J. Gu, J. Choi, S. Klasky, "Visualization and Analysis for Near-Real-Time Decision Making in Distributed Workflows", High Performance Data Analysis and Visualization Workshop (HPDAV2016) in conjunction with the 30th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2016), 2016, doi: 10.1109/IPDPSW.2016.175

Mark R. Krumholz, Andrew T. Myers, Richard I. Klein, Christopher F. McKee, "What Physics Determines the Peak of the IMF? Insights from the Structure of Cores in Radiation-Magnetohydrodynamic Simulations", accepted by MNRAS, May 19, 2016,

Mathias Jacquelin, Scheduling Sparse Symmetric Fan-Both Cholesky Factorization, The 11th Scheduling for Large Scale Systems Workshop, May 18, 2016,

Tzuhsien Wu, Shyng Hao, Jerry Chou, Bin Dong and Kesheng Wu, "Indexing Blocks to Reduce Space and Time Requirements for Searching Large Data Files", 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2016, May 16, 2016,

Ariful Azad, Aydın Buluç, Alex Pothen, "Computing Maximum Cardinality Matchings in Parallel on Bipartite Graphs via Tree-Grafting", IEEE Transactions on Parallel and Distributed Systems (TPDS), May 2016,

Abhinav Sarje, Douglas W. Jacobsen, Samuel W. Williams, Todd Ringler, Leonid Oliker, "Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers", Cray User Group (CUG), London, UK, May 2016,

Abhinav Sarje, Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers, Cray Users Group (CUG), May 12, 2016,

Mathias Jacquelin, Lin Lin, Weile Jia, Yonghua Zhao, Chao Yang, "A Left-Looking Selected Inversion Algorithm and Task Parallelism on Shared Memory Systems", Submitted to SuperComputing'16, May 10, 2016,

Wahid Bhimji, Debbie Bard, Melissa Romanus, David Paul, Andrey Ovsyannikov, Brian Friesen, Matt Bryson, Joaquin Correa, Glenn K. Lockwood, Vakho Tsulaia, Suren Byna, Steve Farrell, Doga Gursoy, Chris Daley, Vince Beckner, Brian Van Straalen, Nicholas Wright, Katie Antypas, Prabhat, "Accelerating Science with the NERSC Burst Buffer Early User Program", Cray User Group (CUG) 2016, May 10, 2016,

Cong Xu, Suren Byna, Vishwanath Venkatesan, Robert Sisneros, Omkar Kulkarni, Mohamad Chaarawi, and Kalyana Chadalavada, "LIOProf: Exposing Lustre File System Behavior for I/O Middleware", Cray User Group (CUG) 2016, May 10, 2016,

Dharshi Devendran, Suren Byna, Bin Dong, Brian van Straalen, Hans Johansen, Noel Keen, and Nagiza Samatova, "Collective I/O Optimizations for Adaptive Mesh Refinement Data Writes on Lustre File System", Cray User Group (CUG) 2016, May 10, 2016,

Mathias Jacquelin, Yili Zheng, Esmond Ng, Katherine Yelick, "An Asynchronous Task-based Fan-Both Sparse Cholesky Solver", Submitted to SuperComputing'16, May 10, 2016,

Sean Peisert, William K. Barnett, Eli Dart, James Cuff, Robert L. Grossman, Edward Balas, Ari Berman, Anurag Shankar, Brian Tierney, "The Medical Science DMZ", Journal of the American Medical Informatics Association (JAMIA), May 2, 2016, 23(6):1199-1201, doi: 10.1093/jamia/ocw032

Nils E. R. Zimmermann, Maciej Haranczyk, "History and Utility of Zeolite Framework-Type Discovery from a Data-Science Perspective", Crystal Growth & Design, May 2, 2016,

Mature applications such as fluid catalytic cracking and hydrocracking rely critically on early zeolite structures. With a data-driven approach, we find that the discovery of exceptional zeolite framework types around the new millennium was spurred by exciting new utilization routes. The promising processes have not yet been successfully implemented (“valley of death” effect), mainly because of the lack of thermal stability of the crystals. This foreshadows limited deployability of recent zeolite discoveries that were achieved by novel crystal synthesis routes.

Hansen, G., and D. Stone, "Assessing the observed impact of anthropogenic climate change", Nature Climate Change, 2016, 6:532-537, doi: 10.1038/nclimate2896

George Michelogiannakis, John Shalf, David Donofrio, John Bachan, "Continuing the Scaling of Digital Computing Post Moore’s Law", LBNL report, April 2016, LBNL 1005126,

Traditional CMOS technology scaling, which up until now has followed Moore's law, is approaching its end in the next decade. However, the DOE has come to depend on the rapid, predictable, and cheap scaling of computing performance to meet mission needs for scientific theory, large scale experiments, and national security. Moving forward, performance scaling of digital computing will need to originate from energy and cost reductions that are a result of novel architectures, devices, manufacturing technologies, and programming models. The deeper issue presented by these changes is the threat to DOE’s mission and to the future economic growth of the U.S. computing industry and to society as a whole. With the impending end of Moore’s law, it is imperative for the Office of Advanced Scientific Computing Research (ASCR) to develop a balanced research agenda to assess the viability of novel semiconductor technologies and navigate the ensuing challenges. This report identifies four areas and research directions for ASCR and how each can be used to preserve performance scaling of digital computing beyond exascale and after Moore's law ends.

Farzad Fatollahi-Fard, David Donofrio, George Michelogiannakis, John Shalf, "OpenSoC Fabric: On-Chip Network Generator", ISPASS 2016: International Symposium on Performance Analysis of Systems and Software, IEEE, April 2016,

Mathias Jacquelin, Scheduling Sparse Symmetric Fan-Both Cholesky Factorization, SIAM PP'16, April 15, 2016,

Ariful Azad, Aydın Buluç, Distributed-memory algorithms for cardinality matching using matrix algebra, SIAM Conference on Parallel Processing for Scientific Computing (PP), Paris, France, April 2016,

M. Jacquelin, L. Lin, W. Jia, Y. Zhao and C. Yang, "A Left-looking selected inversion algorithm and task parallelism on shared memory systems", April 9, 2016,

J. R. Jones, F.-H. Rouet, K. V. Lawler, E. Vecharynski, K. Z. Ibrahim, S. Williams, B. Abeln, C. Yang, C. W. McCurdy, D. J. Haxton, X. S. Li, T. N. Rescigno, "An efficient basis set representation for calculating electrons in molecules", Journal of Molecular Physics, 2016, doi: 10.1080/00268976.2016.1176262

The method of McCurdy, Baertschy, and Rescigno, J. Phys. B, 37, R137 (2004) is generalized to obtain a straightforward, surprisingly accurate, and scalable numerical representation for calculating the electronic wave functions of molecules. It uses a basis set of product sinc functions arrayed on a Cartesian grid, and yields 1 kcal/mol precision for valence transition energies with a grid resolution of approximately 0.1 bohr. The Coulomb matrix elements are replaced with matrix elements obtained from the kinetic energy operator. A resolution-of-the-identity approximation renders the primitive one- and two-electron matrix elements diagonal; in other words, the Coulomb operator is local with respect to the grid indices. The calculation of contracted two-electron matrix elements among orbitals requires only O(N log(N)) multiplication operations, not O(N^4), where N is the number of basis functions; N = n^3 on cubic grids. The representation not only is numerically expedient, but also produces energies and properties superior to those calculated variationally. Absolute energies, absorption cross sections, transition energies, and ionization potentials are reported for one- (He^+, H_2^+ ), two- (H_2, He), ten- (CH_4) and 56-electron (C_8H_8) systems.

S. Habib, R. Roser, R. Gerber, K. Antypas, K. Riley, T. Williams, J. Wells, T. Straatsma, A. Almgren, J. Amundson, S. Bailey, D. Bard, K. Bloom, B. Bockelman, A. Borgland, J. Borrill, R. Boughezal, R. Brower, B. Cowan, H. Finkel, N. Frontiere, S. Fuess, L. Ge, N. Gnedin, S. Gottlieb, O. Gutsche, T. Han, K. Heitmann, S. Hoeche, K. Ko, O. Kononenko, T. LeCompte, Z. Li, Z. Lukic, W. Mori, P. Nugent, C.-K. Ng, G. Oleynik, B. O'Shea, N. Padmanabhan, D. Petravick, F.J. Petriello, J. Power, J. Qiang, L. Reina, T.J. Rizzo, R. Ryne, M. Schram, P. Spentzouris, D. Toussaint, J.-L. Vay, B. Viren, F. Wurthwein, L. Xiao, "ASCR/HEP Exascale Requirements Review Report", arXiv:1603.09303, 2016,

Samuel Williams, Mark Adams, Brian Van Straalen, Performance Portability in Hybrid and Heterogeneous Multigrid Solvers, Copper Mountain, March 2016,

Sean Peisert, CENIC 2016 Conference Panel: Security in R&E Networks and Campus Environments, 2016 CENIC Annual Conference, March 22, 2016,

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, "Backtest overfitting in financial markets", Automated Trader, March 16, 2016, to appear,

Z. Wen, C. Yang, X. Liu and Y. Zhang, "A Penalty-based Trace Minimization Method for Large-scale Eigenspace Computation", J. Sci. Comp., March 1, 2016, 66:1175-1203, doi: 10.1007/s10915-015-0061-0

Sean Peisert, Computer Security & the Electric Power Grid, 15th Annual ON*VECTOR Photonics Workshop, March 1, 2016,

J. Mueller, "MISO: Mixed-Integer Surrogate Optimization Framework", Optimization and Engineering, March 2016, 17:177-203,

E. Vecharynski, C. Yang, and F. Xue, "Generalized preconditioned locally harmonic residual method for non-Hermitian eigenproblems", SIAM Journal on Scientific Computing, Vol. 38, No. 1, pp. A500–A527, 2016, doi: 10.1137/15M1027413

We introduce the Generalized Preconditioned Locally Harmonic Residual (GPLHR) method for solving standard and generalized non-Hermitian eigenproblems. The method is particularly useful for computing a subset of eigenvalues, and their eigen- or Schur vectors, closest to a given shift. The proposed method is based on block iterations and can take advantage of a preconditioner if it is available. It does not need to perform exact shift-and-invert transformation. Standard and generalized eigenproblems are handled in a unified framework. Our numerical experiments demonstrate that GPLHR is generally more robust and efficient than existing methods, especially if the available memory is limited.

E. Vecharynski and C. Yang, "Preconditioned iterative methods for eigenvalue counts", to appear in Proceedings of International Workshop on Eigenvalue Problems: Algorithms, Software and Applications in Petascale Computing, in Lecture Notes in Computational Science and Engineering, Springer, 2016,

We describe preconditioned iterative methods for estimating the number of eigenvalues of a Hermitian matrix within a given interval. Such estimation is useful in a number of applications. In particular, it can be used to develop an efficient spectrum-slicing strategy to compute many eigenpairs of a Hermitian matrix. Our method is based on the Lanczos- and Arnoldi-type of iterations. We show that with a properly defined preconditioner, only a few iterations may be needed to obtain a good estimate of the number of eigenvalues within a prescribed interval. We also demonstrate that the number of iterations required by the proposed preconditioned schemes is independent of the size and condition number of the matrix. The efficiency of the methods is illustrated on several problems arising from density functional theory based electronic structure calculations.

Wim Lavrijsen, Costin Iancu, Wibe Albert de Jong, Xin Chen, Karsten Schwan, "Exploiting Variability for Energy Optimization of Load Balanced Parallel Programs", EuroSys 2016, February 5, 2016,

Wei Hu, Lin Lin, Chao Yang, Jun Dai and Jinlong Yang, "Edge-Modified Phosphorene Nanoflake Heterojunctions as Highly Efficient Solar Cells", Nano Lett, February 5, 2016, 16:1675–1682, doi: 10.1021/acs.nanolett.5b04593

Nicholas Chaimov, Allen Malony, Shane Canon, Costin Iancu, Khaled Ibrahim, Jay Srinivasan, "Scaling Spark on HPC Systems", High Performance and Distributed Computing (HPDC), February 5, 2016,

Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu, "SReplay: Deterministic Sub-Group Replay for One-Sided Communication", International Conference on Supercomputing (ICS), 2016, February 5, 2016,

P. Li, X. Liu, M. Chen, P. Lin, X. Ren, L. Lin, C. Yang, L. He, "Large-scale ab initio simulations based on systematically improvable atomic basis", Computational Materials Science, February 1, 2016, 112:503–517, doi: 10.1016/j.commatsci.2015.07.004

Andrew Myers, Phillip Colella, Brian Van Straalen, "A 4th-Order Particle-in-Cell Method with Phase-Space Remapping for the Vlasov-Poisson Equation", submitted to SISC, February 1, 2016,

L. Lin, Y. Saad and C. Yang, "Approximating spectral densities of large matrices", SIAM Review, February 1, 2016, 58:34–65, doi: 10.1137/130934283

R.M. Cox, P.B. Armentrout, W.A. de Jong, "Reactions of Th+ + H2, D2, and HD Studied by Guided Ion Beam Tandem Mass Spectrometry and Quantum Chemical Calculations", J. Phys. Chem. B – Bruce Garrett Festschrift, February 1, 2016, 120:1601, doi: 10.1021/acs.jpcb.5b08008

J. Brabec, C. Yang, E. Epifanovsky, A.I. Krylov, and E. Ng, "Reduced-cost sparsity-exploiting algorithm for solving coupled-cluster equations", Journal of Computational Chemistry, January 24, 2016, 37:1059–1067, doi: 10.1002/jcc.24293

Cuong Nguyen, Cindy Rubio-Gonzalez, Benjamin Mehne, Koushik Sen, Costin Iancu, James Demmel, William Kahan, Wim Lavrijsen, David H. Bailey, David Hough, "Floating-point precision tuning using blame analysis", 38th International Conference on Software Engineering (ICSE 2016), January 20, 2016,

Shiogama, H., D. Stone, S. Emori, K. Takahashi, S. Mori, A. Maeda, Y. Ishizaki, and M. R. Allen, "Predicting future uncertainty constraints on global warming projections", Scientific Reports, 2016, doi: 10.1038/srep18903

E. Vecharynski, "A generalization of Saad's bound on harmonic Ritz vectors of Hermitian matrices", Linear Algebra and its Applications, Vol. 494, pp. 219-235, 2016, doi: 10.1016/j.laa.2016.01.013

We prove a Saad's type bound for harmonic Ritz vectors of a Hermitian matrix. The new bound reveals a dependence of the harmonic Rayleigh-Ritz procedure on the condition number of a shifted problem operator. Several practical implications are discussed. In particular, the bound motivates incorporation of preconditioning into the harmonic Rayleigh-Ritz scheme.

Harinarayan Krishnan, Burlen Loring, Suren Byna, Michael F. Wehner, Travis A. O'Brien, Prabhat, Chris Paciorek, and Daithi Stone, "Enabling End-to-End Climate Science Workflows in High Performance Computing Environments", The AMS (American Meteorological Society) 96th Annual Meeting, January 6, 2016,

Burlen Loring, Suren Byna, Prabhat, Junmin Gu, Hari Krishnan, Michael Wehner, and Oliver Ruebel, "TECA an Extreme Event Detection and Climate Analysis Package for High Performance Computing", The AMS (American Meteorological Society) 96th Annual Meeting, January 6, 2016,

Andrew Myers, Phillip Colella, Brian Van Straalen, "The Convergence of Particle-in-Cell Schemes for Cosmological Dark Matter Simulations", The Astrophysical Journal, Volume 816, Issue 2, article id. 56, 2016,

Deborah A Agarwal, Boris Faybishenko, Vicky L, Harinarayan Krishnan, Carina Lansing, Gary Kushner, Ellen Porter, Alexandru Romosan, Arie Shoshani, Haruko Wainwright, Arthur, Kesheng Wu, "A Science Data Gateway for Environmental Management", Concurrency and Computation: Practice and Experience, 2016, 28:1994--2004, doi: 10.1002/cpe.3697

David H. Bailey, Jonathan M. Borwein, "Ancient Indian square roots", Encyclopedia of the History of Science, Technology and Medicine in Non-Western Cultures, edited by Helaine Selin, (Springer: January 1, 2016)

Salman Habib, Adrian Pope, Hal Finkel, Nicholas Frontiere, Katrin Heitmann, David Daniel, Patricia Fasel, Vitali Morozov, George Zagaris, Tom Peterka, Venkatram Vishwanath, Zarija Lukić, Saba Sehrish, Wei-keng Liao, "HACC: Simulating sky surveys on state-of-the-art supercomputing architectures", New Astronomy, 2016, 42:49,

Xiaocheng Zou, David Boyuka, Dhara Desai, Martin, Suren Byna, Kesheng Wu, Kushal, Bin Dong, Wenzhao Zhang, Houjun Tang, Dharshi Devendran, David Trebotich, Scott, Hans Johansen, Nagiza Samatova, "AMR-aware In Situ Indexing and Scalable Querying", The 24th High Performance Computing Symposium (HPC, January 1, 2016,

H Shan, S Williams, Y Zheng, W Zhang, B Wang, S Ethier, Z Zhao, "Experiences of Applying One-Sided Communication to Nearest-Neighbor Communication", Proceedings of PAW 2016: 1st PGAS Applications Workshop (PAW), January 2016, 17--24, doi: 10.1109/PAW.2016.8

Houjun Tang, Suren Byna, Steve Harenberg, Xiaocheng Zou, Wenzhao Zhang, Kesheng Wu, Bin Dong, Oliver Rubel, Kristofer Bouchard, Scott Klasky, others, "Usage Pattern-Driven Dynamic Data Layout Reorganization", Cluster, Cloud and Grid Computing (CCGrid), 2016 16th IEEE/ACM International Symposium on, January 1, 2016, 356--365,

Meiyue Shao, Felipe H. da Jornada, Chao Yang, Jack Deslippe, Steven G. Louie, "Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem", Linear Algebra and its Applications, 2016, 488:148–167, doi: 10.1016/j.laa.2015.09.036

Wenzhao Zhang, Houjun Tang, Steve Harenberg, Surendra Byna, Xiaocheng Zou, Dharshi Devendran, Daniel F Martin, Kesheng Wu, Bin Dong, Scott Klasky, others, "AMRZone: A Runtime AMR Data Sharing Framework for Scientific Applications", Cluster, Cloud and Grid Computing (CCGrid), 2016 16th IEEE/ACM International Symposium on, January 1, 2016, 116--125,

D Unat, T Nguyen, W Zhang, MN Farooqi, B Bastem, G Michelogiannakis, A Almgren, J Shalf, "TiDA: High-level programming abstractions for data locality management", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), January 2016, 9697:116--135, doi: 10.1007/978-3-319-41321-1_7

Weijie Zhao, Florin Rusu, Bin Dong, Kesheng, "Similarity Join over Array Data", SIGMOD 16, New York, NY, USA, ACM, January 1, 2016, 2007--2022, doi: 10.1145/2882903.2915247

David H. Bailey, Jonathan M. Borwein, "Computation and structure of character polylogarithms with applications to character Mordell-Tornheim-Witten sums", Mathematics of Computation, January 1, 2016, 85:295-324,

DML Brown, H Cho, WA De Jong, "Bridging experiment and theory: A template for unifying NMR data and electronic structure calculations", Journal of Cheminformatics, January 1, 2016, 8, doi: 10.1186/s13321-016-0120-z

2015

M. Day, S. Tachibana, J. Bell, M. Lijewski, V. Beckner and R. Cheng, "A Combined Computational and Experimental Characterization of Lean Premixed Turbulent Low Swirl Laboratory Flames. II. Hydrogen Flames", Combustion and Flame, 2015,

Stephen M. Guzik, Xinfeng Gao, Landon D. Owen, Peter McCorquodale, Phillip Colella, "A freestream-preserving fourth-order finite-volume method in mapped coordinates with adaptive-mesh refinement", Computers & Fluids, December 21, 2015, 123:202–217, doi: 10.1016/j.compfluid.2015.10.001

T. Kim, D. Lee, J. Choi, A. Spurlock, A. Sim, A. Todd, K. Wu, "Extracting Baseline Electricity Usage Using Gradient Tree Boosting", International Conference on Big Data Intelligence and Computing (DataCom 2015), Best Paper Award, 2015,

Stone, D., H. Shiogama, P. Wolski, O. Angélil, S. Cholias, N. Christidis, A. Dittus, C. Folland, A. King, J. Kinter, H. Krishnan, S.-K. Min, M. Wehner, "The C20C+ Detection and Attribution Project", Fall Meeting of the American Geophysical Union, 2015,

D. B. Szyld, E. Vecharynski, and F. Xue, "Preconditioned eigensolvers for large-scale nonlinear Hermitian eigenproblems with variational characterizations. II. Interior eigenvalues.", SIAM Journal on Scientific Computing, Vol. 37, Issue 6, pp. A2969-A2997, 2015,

We consider the solution of large-scale nonlinear algebraic Hermitian eigenproblems of the form $T(\lambda)v=0$ that admit a variational characterization of eigenvalues. These problems arise in a variety of applications and are generalizations of linear Hermitian eigenproblems $Av\!=\!\lambda Bv$. In this paper, we propose a Preconditioned Locally Minimal Residual (PLMR) method for efficiently computing interior eigenvalues of problems of this type. We discuss the development of search subspaces, preconditioning, and eigenpair extraction procedure based on the refined Rayleigh-Ritz projection. Extension to the block methods is presented, and a moving-window style soft deflation is described. Numerical experiments demonstrate that PLMR methods provide a rapid and robust convergence towards interior eigenvalues. The approach is also shown to be efficient and reliable for computing a large number of extreme eigenvalues, dramatically outperforming standard preconditioned conjugate gradient methods.

W. Yoo, M. Koo, Y. Cao, A. Sim, P. Nugent, K. Wu, "PATHA: Performance Analysis Tool for HPC Applications", the 34th IEEE International Performance Computing and Communications Conference (IPCCC 2015), 2015,

Hari Krishnan, Suren Byna, Michael Wehner, Junmin Gu, Travis O'Brien, Burlen Loring, Daithi Stone, William Collins, Prabhat, Yunjie Liu, Jeffrey Johnson, and Christopher Paciorek, "Enabling Efficient Climate Science Workflows in High Performance Computing Environments", AGU Fall Meeting, 2015, December 13, 2015,

Abhinav Sarje, Particle Swarm Optimization, DUNE Wire-Cell Reconstruction Summit, December 2015,

Samuel Williams, X-TUNE, X-Stack PI Meeting, December 2015,

S. Fries, A. Sim, "HPSS connections to ESGF", Earth System Grid Federation Conference, (ESGF 2015), 2015,

George Michelogiannakis, Xiaoye S. Li, David H. Bailey, John Shalf, "Extending Summation Precision for Network Reduction Operations", Springer International Journal of Parallel Programming, December 2015, 43:6:1218-1243, doi: 10.1007/s10766-014-0326-5

David H. Bailey, Jonathan M. Borwein, "Experimental computation as an ontological game changer: The impact of modern mathematical computation tools on the ontology of mathematics", Mathematics, Substance and Surmise: Views on the Meaning and Ontology of Mathematics, edited by Ernest Davis, Philip J. Davis, (Springer: December 1, 2015) Pages: 25-67

M. Jacquelin, L. Lin, N. Wichmann and C. Yang, "Enhancing the scalability tree-based asynchronous communication", accepted IPDPS16, November 25, 2015,

David H. Bailey and Marcos Lopez de Prado, "Stop-outs under serial correlation and the 'triple penance' rule", Journal of Risk, November 23, 2015, 18(2):61-93,

Soyoung Jeon, Prabhat, Suren Byna, Junmin Gu, William Collins, and Michael Wehner, "Characterization of extreme precipitation within atmospheric river events over California", Advances in Statistical Climatology, Meteorology and Oceanography (ASCMO), November 21, 2015, 1:45-57, doi: 10.5194/ascmo-1-45-2015

Evangelos Georganas, Aydın Buluç, Jarrod Chapman, Steven Hofmeyr, Chaitanya Aluru, Rob Egan, Leonid Oliker, Daniel Rokhsar, Katherine Yelick, "HipMer: An Extreme-Scale De Novo Genome Assembler", Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), November 19, 2015,

Abhinav Sarje, Xiaoye S Li, Slim Chourou, Dinesh Kumar, Singanallur Venkatakrishnan, Alexander Hexemer, "Inverse Modeling Nanostructures from X-Ray Scattering Data through Massive Parallelism", Supercomputing (SC'15), November 2015,

We consider the problem of reconstructing material nanostructures from grazing-incidence small-angle X-ray scattering (GISAXS) data obtained through experiments at synchrotron light-sources. This is an important tool for characterization of macromolecules and nano-particle systems applicable to applications such as design of energy-relevant nano-devices. Computational analysis of experimentally collected scattering data has been the primary bottleneck in this process.
We exploit the availability of massive parallelism in leadership-class supercomputers with multi-core and graphics processors to realize the compute-intensive reconstruction process. To develop a solution, we employ various optimization algorithms including gradient-based LMVM, derivative-free trust region-based POUNDerS, and particle swarm optimization, and apply these in a massively parallel fashion.
We compare their performance in terms of both quality of solution and computational speed. We demonstrate the effective utilization of up to 8,000 GPU nodes of the Titan supercomputer for inverse modeling of organic-photovoltaics (OPVs) in less than 15 minutes.

Samuel Williams, 4th Order HPGMG-FV Implementation, HPGMG BoF, Supercomputing, November 2015,

Md. Mostofa Ali Patwary, Suren Byna, Nadathur Rajagopalan Satish, Narayanan Sundaram, Zarija Lukic, Vadim Roytershteyn, Michael J. Anderson, Yushu Yao, Mr Prabhat, and Pradeep Dubey, "BD-CATS: Big Data Clustering at Trillion Particle Scale", Supercomputing 2015 (SC15), November 17, 2015,

Hongzhang Shan, Kenneth McElvain, Calvin Johnson, Samuel Williams, W. Erich Ormand, "Parallel Implementation and Performance Optimization of the Configuration-Interaction Method", Supercomputing (SC), November 2015, doi: 10.1145/2807591.2807618

M. Koo, W. Yoo (advisor), A. Sim (advisor), "I/O Performance Analysis Framework on Measurement Data from Scientific Clusters", International Conference for High Performance Computing, Networking, Storage and Analysis (SC’15), ACM Student Research Competition (SRC), 2015,

Babak Behzad, Suren Byna, Prabhat and Marc Snir, "Pattern-driven Parallel I/O Tuning", 10th Parallel Data Storage Workshop (PDSW) 2015, held in conjunction with SC15, November 16, 2015,

Bin Dong, Suren Byna, and Kesheng Wu, "Heavy-tailed Distribution of Parallel I/O System Response Time", 10th Parallel Data Storage Workshop (PDSW) 2015, to be held in conjunction with SC15, 2015,

Shane Snyder, Philip Carns, Robert Latham, Misbah Mubarak, Chris Carothers, Babak Behzad, Huong Vu Thanh Luu, Suren Byna, and Prabhat, "Techniques for Modeling Large-scale HPC I/O Workloads", the 6th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS15), in conjunction with SC15, November 15, 2015,

M. A. Patwary, S. Byna, N. Satish, N. Sundaram, Z. Lukić, V. Roytershteyn, M. Anderson, Y. Yao, Prabhat, P. Dubey, "BD–CATS: Big Data Clustering at Trillion Particles Scale", Supercomputing, 2015, 6,

E. Vecharynski, J. Brabec, M. Shao, N. Govind, C. Yang, "Efficient Block Preconditioned Eigensolvers for Linear Response Time-dependent Density Functional Theory", submitted to JCC, 2015,

We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into a product eigenvalue problem that is self-adjoint with respect to a K-inner product. This product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. However, the other component of the eigenvector can be easily recovered in a postprocessing procedure. Therefore, the algorithms we present here are more efficient than existing algorithms that try to approximate both components of the eigenvectors simultaneously. The efficiency of the new algorithms is demonstrated by numerical examples.

David H. Bailey, Jonathan M. Borwein, Amir Salehipour, Marcos Lopez de Prado, Qiji Zhu, "Online tools for demonstration of backtest overfitting", November 1, 2015,

Grey Ballard, James Demmel, Laura Grigori, Mathias Jacquelin, Nicholas Knight, Hong Diep Nguyen, "Reconstructing Householder vectors from Tall-Skinny QR", Journal of Parallel and Distributed Computing, November 1, 2015, 85:3-31, doi: 10.1016/j.jpdc.2015.06.003

David H. Bailey, Jonathan M. Borwein, "Crandall's computation of the incomplete gamma function and the Hurwitz zeta function with applications to Dirichlet L-series", Applied Mathematics and Computation, November 1, 2015, 268C:462-477,

Andrew Myers, Christopher McKee, Pak Shing Li, "The CH+ abundance in turbulent, diffuse molecular clouds", Monthly Notices of the Royal Astronomical Society, Volume 453, Issue 3, p.2747-2758, November 1, 2015,

Jinoh Kim, Bin Dong, Suren Byna, and Kesheng Wu, "Security for the Scientific Data Service Framework", 2nd International Workshop on Privacy and Security of Big Data (PSBD 2015), in conjunction with IEEE BigData 2015, 2015,

J. Kim, A. Sim, "Peeking Network States with Clustered Patterns", 2015, LBNL 1003744,

Bin Dong, Suren Byna, and Kesheng Wu, "Spatially Clustered Join on Heterogeneous Scientific Data Sets", 2015 IEEE International Conference on Big Data (IEEE BigData 2015), IEEE, 2015,

E. Vecharynski, A. Knyazev, "Preconditioned Locally Harmonic Residual Method for computing interior eigenpairs of certain classes of Hermitian matrices", SIAM Journal on Scientific Computing, Vol. 37, Issue 5, pp. S3–S29, 2015,

We propose a Preconditioned Locally Harmonic Residual (PLHR) method for computing several interior eigenpairs of a generalized Hermitian eigenvalue problem, without traditional spectral transformations, matrix factorizations, or inversions. PLHR is based on a short-term recurrence, easily extended to a block form, computing eigenpairs simultaneously. PLHR can take advantage of Hermitian positive definite preconditioning, e.g., based on an approximate inverse of an absolute value of a shifted matrix, introduced in [SISC, 35 (2013), pp. A696–A718]. Our numerical experiments demonstrate that PLHR is efficient and robust for certain classes of large-scale interior eigenvalue problems, involving Laplacian and Hamiltonian operators, especially if memory requirements are tight.

Jason Adams, Monica Lieng, Brooks Kuhn, Edward Guo, Edik Simonian, Sean Peisert, JP Delplanque, Nick Anderson, "Automated Mechanical Ventilator Waveform Analysis of Patient-Ventilator Asynchrony", CHEST Journal, October 2015, Pages: 175A, doi: 10.1378/chest.2281731

PURPOSE: Mechanical ventilation is a life-saving intervention but is associated with adverse effects including ventilator-induced lung injury (VILI). Patient-ventilator asynchrony (PVA) is thought to contribute to VILI, but the study of PVA has been hampered by limited access to the high frequency, large volume data streams produced by modern ventilators and a lack of robust analytics. To address these limitations, we developed an automated pipeline for breath-by-breath analysis of ventilator waveform data.

METHODS: Simulated pressure and flow time series data representing normal breaths and common forms of PVA were generated on PB840 ventilators, collected unobtrusively using small, customized wireless peripheral devices, and transmitted to a networked server for storage and analysis. Two critical care physicians reviewed all waveforms to generate gold standards. Rule-based algorithms were developed to quantify inspiratory and expiratory tidal volumes (TV) and identify PVA subtypes including double trigger and delayed termination asynchrony. Data were split randomly into derivation and validation sets. Algorithm performance was compared to ventilator reported values and clinician annotation.

RESULTS: The mean difference between algorithm-determined and ventilator-reported TVs was 3.1% (99% CI ± 1.36%). Algorithm agreement with clinician annotation was excellent for double trigger PVA and moderate for delayed termination PVA, with Kappa statistics of 0.85 and 0.58, respectively. In the validation data set (n = 492 breaths), double trigger asynchrony was detected with an overall accuracy of 94.1%, sensitivity of 100%, and specificity of 92.8%.

CONCLUSIONS: A pipeline combining wireless ventilator data acquisition and rule-based analytic algorithms informed by the principles of bedside ventilator waveform analysis allows for automated, quantitative breath-by-breath analysis of patient-ventilator interactions.

CLINICAL IMPLICATIONS: We have recently deployed this system in the medical intensive care unit of the UC Davis Medical Center, which will enable further development of mechanical ventilation analytics. We have begun to explore the use of supervised machine learning and dynamic time series modeling to improve the classification of other common types of PVA and of clinical phenotypes associated with respiratory failure. This system will help to better define the epidemiology and clinical impact of PVA and other forms of off-target mechanical ventilation, and may lead to improved decision support and patient outcomes.

A. Azad, G. Ballard, A. Buluc, J. Demmel, J. Gilbert, L. Grigori, O. Schwartz, S. Toledo, S. Williams, Parallel Sparse Matrix-Matrix Multiplication and Its Use in Triangle Counting and Enumeration, SIAM ALA, October 26, 2015,

M. van Setten, F. Caruso, S. Sharifzadeh, X. Ren, M. Scheffler, F. Liu, J. Lischner, L. Lin, J. Deslippe, S. Louie, C. Yang, F. Weigend, J. Neaton, F. Evers, P. Rinke, "GW100: Benchmarking G0W0 for molecular systems", Journal of Chemical Theory and Computation, October 22, 2015,

J. Mueller, R. Paudel, J. Woodbury, Y. Wang, C. Shoemaker, N. Mahowald, "CH4 Parameter Estimation in CLM4.5bgc Using Surrogate Global Optimization", Geoscientific Model Development, October 2015, 8:3285-3310,

Georgia Koutsandria, Reinhard Gentz, Mahdi Jamei, Anna Scaglione, Sean Peisert, and Chuck McParland, "A Real-Time Testbed Environment for Cyber-Physical Security on the Power Grid", Proceedings of the First ACM Workshop on Cyber-Physical Systems Security & Privacy (CPS-SPC), Denver, CO, ACM, October 16, 2015, doi: 10.1145/2808705.2808707

Daniel Chung, Matt Bishop, and Sean Peisert, "Distributed Helios - Mitigating Denial of Service Attacks in Online Voting", University of California, Davis, Department of Computer Science Technical Report, October 16, 2015,

Tobias Titze, Alexander Lauerer, Lars Heinke, Christian Chmelik, Nils E. R. Zimmermann, Frerich J. Keil, Douglas M. Ruthven, Jörg Kärger, "Transport in Nanoporous Materials Including MOFs: The Applicability of Fick’s Laws", Angew. Chem. Int. Ed., 2015, doi: 10.1002/anie.201506954

Diffusion in nanoporous host–guest systems is often considered to be too complicated to comply with such “simple” relationships as Fick’s first and second law of diffusion. However, it is shown herein that the microscopic techniques of diffusion measurement, notably the pulsed field gradient (PFG) technique of NMR spectroscopy and microimaging by interference microscopy (IFM) and IR microscopy (IRM), provide direct experimental evidence of the applicability of Fick’s laws to such systems. This remains true in many situations, even when the detailed mechanism is complex. The limitations of the diffusion model are also discussed with reference to the extensive literature on this subject.

Burke, Daniel R., "Neuromorphic Hardware for HPC", Conference, October 7, 2015,

Presented at Simons Institute Theory of Neural Computation Workshop.

Jiri Brabec, Lin Lin, Meiyue Shao, Niranjan Govind, Chao Yang, Yousef Saad, Esmond G. Ng, "Fast Algorithms for Estimating the Absorption Spectrum within Linear Response Time-dependent Density Functional Theory", Journal of Chemical Theory and Computation, 2015, 11:5197–5208, doi: 10.1021/acs.jctc.5b00887

Robert Granat, Bo Kågström, Daniel Kressner, Meiyue Shao, "Algorithm 953: Parallel library software for the multishift QR algorithm with aggressive early deflation", ACM Transactions on Mathematical Software, 2015, 41:29:1--29:2, doi: 10.1145/2699471

Abhinav Sarje, High-Performance X-Ray Scattering Data Analysis, Intel Xeon Phi Users Group (IXPUG) Annual Meeting 2015, September 30, 2015,

Adrian Chavez, William M.S. Stout, and Sean Peisert, "Techniques for the Dynamic Randomization of Network Attributes", Proceedings of the 49th Annual International Carnahan Conference on Security Technology, Taipei, Taiwan, Republic of China, IEEE Press, September 2015, doi: 10.1109/CCST.2015.7389661

Hongzhang Shan, Samuel Williams, Yili Zheng, Amir Kamil, Katherine Yelick, "Implementing High-Performance Geometric Multigrid Solver With Naturally Grained Messages", 9th International Conference on Partitioned Global Address Space Programming Models (PGAS), September 2015,

Nils E. R. Zimmermann, Bart Vorselaars, David Quigley, Baron Peters, "Nucleation of NaCl from Aqueous Solution: Critical Sizes, Ion-Attachment Kinetics, and Rates", J. Am. Chem. Soc., 2015, doi: 10.1021/jacs.5b08098

Nucleation and crystal growth are important in material synthesis, climate modeling, biomineralization, and pharmaceutical formulation. Despite tremendous efforts, the mechanisms and kinetics of nucleation remain elusive to both theory and experiment. Here we investigate sodium chloride (NaCl) nucleation from supersaturated brines using seeded atomistic simulations, polymorph-specific order parameters, and elements of classical nucleation theory. We find that NaCl nucleates via the common rock salt structure. Ion desolvation—not diffusion—is identified as the limiting resistance to attachment. Two different analyses give approximately consistent attachment kinetics: diffusion along the nucleus size coordinate and reaction-diffusion analysis of approach-to-coexistence simulation data from Aragones et al. (J. Chem. Phys. 2012, 136, 244508). Our simulations were performed at realistic supersaturations to enable the first direct comparison to experimental nucleation rates for this system. The computed and measured rates converge to a common upper limit at extremely high supersaturation. However, our rate predictions are between 15 and 30 orders of magnitude too fast. We comment on possible origins of the large discrepancy.

Veronika Strnadová-Neeley, Aydın Buluç, Jarrod Chapman, John R. Gilbert, Joseph Gonzalez, Leonid Oliker, "Efficient Data Reduction for Large-Scale Genetic Mapping", ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM BCB), September 10, 2015,

Ariful Azad, Aydin Buluc, "Distributed-Memory Algorithms for Maximal Cardinality Matching using Matrix Algebra", IEEE Cluster, Chicago, IL, September 2015,

Alex Druinsky, Pieter Ghysels, Xiaoye S. Li, Osni Marques, Samuel Williams, Andrew Barker, Delyan Kalchev, Panayot Vassilevski, "Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures", International Conference on Parallel Processing and Applied Mathematics (PPAM), September 6, 2015, doi: 10.1007/978-3-319-32149-3_12

Sisi Duan, Jingtao Sun, Sean Peisert, "Towards a Self-Adaptive Middleware for Building Reliable Publish/Subscribe Systems", Proceedings of the 8th International Conference on Internet and Distributed Computing Systems (IDCS), Berkshire, United Kingdom, Springer, September 2015, 157-168, doi: 10.1007/978-3-319-23237-9_14

Prabhat, Suren Byna, Venkat Vishwanath, Eli Dart, Michael Wehner, and William Collins, "TECA: Petascale Pattern Recognition for Climate Science", 16th International Conference on Computer Analysis of Images and Patterns (CAIP) 2015, 2015,

Zhangpeng Guo, Nicolas Zweibaum, Meiyue Shao, Lakshana R. Huddar, Per F. Peterson, and Suizheng Qiu, "Development of the FHR advanced natural circulation analysis (FANCY) code", 16th International Topical Meeting On Nuclear Reactor Thermal Hydraulics (NURETH-16), American Nuclear Society, 2015, 1629–1644,

Aydin Buluç, Scott Beamer, Kamesh Madduri, Krste Asanović, David Patterson, "Distributed-memory breadth-first search on massive graphs", In D. Bader (editor), Parallel Graph Algorithms. CRC Press/Taylor-Francis, (2015)

Sean Peisert, et al., "ASCR Cybersecurity for Scientific Computing Integrity - Research Pathways and Ideas", U.S. Department of Energy Office of Science report, September 2015, LBNL 191105, doi: 10.2172/1236181

Babak Behzad, Suren Byna, Stefan Wild, Prabhat and Marc Snir, "Dynamic Model-driven Parallel I/O Performance Tuning", IEEE Cluster 2015, 2015,

S. L. Cornford, D. F. Martin, A. J. Payne, E. G. Ng, A. M. Le Brocq, R. M. Gladstone, T. L. Edwards, S. R. Shannon, C. Agosta, M. R. van den Broeke, H. H. Hellmer, G. Krinner, S. R. M. Ligtenberg, R. Timmermann, D. G. Vaughan, "Century-scale simulations of the response of the West Antarctic Ice Sheet to a warming climate", The Cryosphere, August 18, 2015, doi: 10.5194/tc-9-1579-2015

Nathan Hanford, Vishal Ahuja, Mehmet Balman, Matthew Farrens, Dipak Ghosal, Eric Pouyoul, Brian Tierney, "Improving Network Performance on Multicore Systems: Impact of Core Affinities on High Throughput Flows", The International Journal of eScience, Elsevier, 2015, doi: 10.1016/j.future.2015.09.012

Network throughput is scaling-up to higher data rates while end-system processors are scaling-out to multiple cores. In order to optimize high speed data transfer into multicore end-systems, techniques such as network adaptor offloads and performance tuning have received a great deal of attention. Furthermore, several methods of multi-threading the network receive process have been proposed. However, thus far attention has been focused on how to set the tuning parameters and which offloads to select for higher performance, and little has been done to understand why the various parameter settings do (or do not) work. In this paper, we build on previous research to track down the sources of the end-system bottleneck for high-speed TCP flows. We define protocol processing efficiency to be the amount of system resources (such as CPU and cache) used per unit of achieved throughput (in Gbps). The amounts of various system resources consumed are measured using low-level system event counters. In a multicore end-system, affinitization, or core binding, is the decision regarding how the various tasks of the network receive process, including interrupt, network, and application processing, are assigned to the different processor cores. We conclude that affinitization has a significant impact on protocol processing efficiency, and that the performance bottleneck of the network receive process changes significantly with different affinitization.

Brian Tierney, Mehmet Balman, Cees de Laat, "Special section on high-performance networking for distributed data-intensive science", Future Generation Computer Systems, The International Journal of eScience, Elsevier, 2015, doi: 10.1016/j.future.2015.10.006

Mahantesh Halappanavar, Alex Pothen, Ariful Azad, Fredrik Manne, Johannes Langguth, Arif Khan, "Codesign Lessons Learned from Implementing Graph Matching on Multithreaded Architectures", IEEE Computer, August 2015,

Aydin Buluc, John Gilbert, Leonid Oliker, "Special Issue: Graph Analysis for Scientific Discovery", Parallel Computing Journal Special Issue Editors, August 1, 2015,

David H. Bailey, Jonathan M. Borwein, "Experimental applied mathematics", The Princeton Companion to Applied Mathematics, edited by N. J. Higham, M. R. Dennis, P. Glendinning, P. A. Martin, F. Santosa and J. Tanner, (Princeton University Press: August 1, 2015) Pages: 925-933

Anshu Dubey, Daniel T. Graves, "A Design Proposal for a Next Generation Scientific Software Framework", EuroPar 2015, July 31, 2015,

M. Dorf, M. Dorr, J. Hittinger, T. Rognlien, P. Colella, P. Schwartz, R. Cohen, W. Lee, "Modeling Edge Plasma with the Continuum Kinetic Code COGENT", 2015,

Xiaocheng (Chris) Zou, Suren Byna, Hans Johansen, Daniel Martin, Nagiza F. Samatova, Arie Shoshani, John Wu, "Six-fold Speedup of Ice Calving Detection Achieved by AMR-aware Parallel Connected Component Labeling", SciDAC PI Meeting, July 2015, 2015,

Štěpán Timr, Jiří Brabec, Alexey Bondar, Tomáš Ryba, Miloš Železný, Josef Lazar, Pavel Jungwirth, "Non-Linear Optical Properties of Fluorescent Dyes Allow for Accurate Determination of Their Molecular Orientations in Phospholipid Membranes", The Journal of Physical Chemistry, July 6, 2015,

Several methods based on single- and two-photon fluorescence detected linear dichroism have recently been used to determine the orientational distributions of fluorescent dyes in lipid membranes. However, these determinations relied on simplified descriptions of non-linear anisotropic properties of the dye molecules, using a transition dipole moment-like vector instead of an absorptivity tensor. To investigate the validity of the vector approximation, we have now carried out a combination of computer simulations and polarization microscopy experiments on two representative fluorescent dyes (DiI and F2N12S) embedded in aqueous phosphatidylcholine bilayers. Our results indicate that a simplified vector-like treatment of the two-photon transition tensor is applicable for molecular geometries sampled in the membrane at ambient conditions. Furthermore, our results allow evaluation of several distinct polarization microscopy techniques. In combination, our results point to a robust and accurate experimental and computational treatment of orientational distributions of DiI, F2N12S and related dyes (including Cy3, Cy5, and others), with implications to monitoring physiologically relevant processes in cellular membranes in a novel way.

Abhinav Sarje, Computing Nanostructures at Scale, OLCF User Meeting, June 2015,

The inverse modeling, or structural fitting, problem of recovering nanostructures from X-ray scattering data obtained through experiments at light-source synchrotrons is an ideal example of a Big Data and Big Compute application. X-ray scattering based extraction of structural information from material samples is an important tool for nanostructure prediction through characterization of macromolecules and nanoparticle systems, applicable to numerous applications such as design of energy-relevant nano-devices. At Berkeley Lab, we are developing high-performance solutions for analysis of such raw data. In our work we exploit the use of massive parallelism available in clusters of GPUs, such as the Titan supercomputer, to gain efficiency in the reconstruction process. We explore the application of various numerical optimization algorithms ranging from simple gradient-based quasi-Newton methods, derivative-free trust-region-based methods, to the stochastic algorithms of Particle Swarm Optimization in a massively parallel fashion. https://vimeo.com/133558018

Abhinav Sarje, Xiaoye Li, Dinesh Kumar, Singanallur Venkatakrishnan, Alexander Hexemer, "Reconstructing Nanostructures from X-Ray Scattering Data", OLCF User Meeting, June 2015,

M. Ulbrich, Z. Wen, C. Yang, D. Klockner, Z. Lu, "A proximal gradient method for ensemble density functional theory", SIAM J. Sci. Comp., June 20, 2015, 37:A1975--A20, doi: 10.1137/14098973X

Sean Peisert, Security Research Using Cyber-Physical Systems, IT Security Symposium, June 16, 2015,

H. Luu, M. Winslett, W. Gropp, R. Ross, P. Carns, K. Harms, Prabhat, S. Byna, Y. Yao, "A Multi-platform Study of I/O Behavior on Petascale Supercomputers", The 24th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) 2015, 2015,

P. McCorquodale, P.A. Ullrich, H. Johansen, P. Colella, "An adaptive multiblock high-order finite-volume method for solving the shallow-water equations on the sphere", Comm. App. Math. and Comp. Sci., 2015, 10:121-162, doi: 10.2140/camcos.2015.10.121

Abhinav Sarje, Parallel Performance Optimizations on Unstructured Mesh-Based Simulations, International Conference on Computational Science, June 2015,

Abhinav Sarje, Sukhyun Song, Douglas Jacobsen, Kevin Huck, Jeffrey Hollingsworth, Allen Malony, Samuel Williams, and Leonid Oliker, "Parallel Performance Optimizations on Unstructured Mesh-Based Simulations", Procedia Computer Science, 1877-0509, June 2015, 51:2016-2025, doi: 10.1016/j.procs.2015.05.466

This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. Our techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2×. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.

Mathias Jacquelin, Lin Lin, Chao Yang, "A Distributed Memory Parallel Algorithm for Selected Inversion: the Symmetric Case", To appear in ACM Transactions on Mathematical Software (TOMS), May 28, 2015,

Protonu Basu, Samuel Williams, Brian Van Straalen, Mary Hall, Leonid Oliker, Phillip Colella, "Compiler-Directed Transformation for Higher-Order Stencils", International Parallel and Distributed Processing Symposium (IPDPS), May 2015,

Evangelos Georganas, Aydın Buluç, Jarrod Chapman, Leonid Oliker, Daniel Rokhsar, Katherine Yelick, "merAligner: A Fully Parallel Sequence Aligner", IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2015,

Scott French, Yili Zheng, Barbara Romanowicz, Katherine Yelick, "Parallel Hessian Assembly for Seismic Waveform Inversion Using Global Updates", International Parallel and Distributed Processing Symposium (IPDPS), May 2015,

Ariful Azad, Aydin Buluc, John Gilbert, "Parallel Triangle Counting and Enumeration using Matrix Algebra", Workshop on Graph Algorithms Building Blocks (GABB), in conjunction with IPDPS, IEEE, May 2015,

Ariful Azad, Aydin Buluç, Alex Pothen, "A Parallel Tree Grafting Algorithm for Maximum Cardinality Matching in Bipartite Graphs", International Parallel and Distributed Processing Symposium (IPDPS), May 2015,

C. Yang, Absorption Spectrum Estimation via Linear Response TDDFT, Applied Math Seminar, Stanford University, May 13, 2015,

David H. Bailey, Jonathan M. Borwein, "High-precision arithmetic in mathematical physics", Mathematics, May 12, 2015, 3:337-367,

David H. Bailey, Jonathan M. Borwein, Amir Salehipour, Marcos Lopez de Prado, Qiji Jim Zhu, "Backtest overfitting demonstration tool: An online interface", May 12, 2015,

K. Hu, J. Choi, A. Sim, J. Jiang, "Best Predictive Generalized Linear Mixed Model with Predictive Lasso for High-Speed Network Data Analysis", International Journal of Statistics and Probability, 2015,

P. McCorquodale, M.R. Dorr, J.A.F. Hittinger, P. Colella, "High-order finite-volume methods for hyperbolic conservation laws on mapped multiblock grids", J. Comput. Phys., May 1, 2015, 288:181-195, doi: 10.1016/j.jcp.2015.01.006

Xiaocheng Zou, Kesheng Wu, David A. Boyuka, Daniel F. Martin, Suren Byna, Houjun, Kushal Bansal, Terry J. Ligocki, Hans Johansen, and Nagiza F. Samatova, "Parallel In Situ Detection of Connected Components in Adaptive Mesh Refinement Data", Proceedings of the Cluster, Cloud and Grid Computing (CCGrid) 2015, 2015,

Wehner, M., Prabhat, K. A. Reed, D. Stone, W. D. Collins, and J. Bacmeister, "Resolution dependence of future tropical cyclone projections of CAM5.1 in the US CLIVAR Hurricane Working Group idealized configurations", Journal of Climate, 2015, 28:3905-3925, doi: 10.1175/JCLI-D-14-00311.1

C. Yang, Fast Numerical Algorithms for Large-scale Electronic Structure Calculations, DOE BES Computational and Theoretical Chemistry PI Meeting, April 28, 2015,

Suren Byna, Robert Sisneros, Kalyana Chadalavada, Quincey Koziol, "Tuning Parallel I/O on Blue Waters for Writing 10 Trillion Particles", Cray User Group (CUG) meeting 2015, 2015,

Suren Byna, Brian Austin, "Evaluation of Parallel I/O Performance and Energy Consumption with Frequency Scaling on Cray XC30", Cray User Group (CUG) meeting 2015, 2015,

C. Yang, Fast Numerical Methods for Electronic Structure Calculations, Math Colloquium, Michigan Tech University, April 24, 2015,

Abhinav Sarje, Performance Profiling of Parallel Codes, April 2015,

Xuehai Qian, Koushik Sen, Paul Hargrove, Costin Iancu, "OPR: Partial Deterministic Record and Replay for One-Sided Communication", LBNL TR, April 17, 2015,

Nicholas Chaimov, Khaled Ibrahim, Samuel Williams, Costin Iancu, "Exploiting Communication Concurrency on High Performance Computing Systems", IJHPCA, April 17, 2015,

Cindy Rubio-González, Cuong Nguyen, James Demmel, William Kahan, Koushik Sen, Wim Lavrijsen, Costin Iancu, "Floating Point Precision Tuning Using Blame Analysis", LBNL TR, April 17, 2015,

Daniel Martin, Xylar Asay-Davis, Stephen Cornford, Stephen Price, Esmond Ng, William Collins, A Tale of Two Forcings: Present-Day Coupled Antarctic Ice-sheet/Southern Ocean dynamics using the POPSICLES model, European Geosciences Union General Assembly 2015, April 16, 2015,

D. Devendran, D. T. Graves, H. Johansen, "A higher-order finite-volume discretization method for Poisson's equation in cut cell geometries", submitted to SIAM Journal on Scientific Computing (preprint on arxiv), 2015,

C. Yang, Fast Numerical Methods for Electronic Structure Calculations, Applied math & PDE seminar, UC Davis, April 14, 2015,

S. Shannigrahi, A. J. Barczyk, C. Papadopoulos, A. Sim, I. Monga, H. Newman, K. Wu, E. Yeh, "Named Data Networking in Climate Research and HEP Applications", 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), 2015,

Sean Peisert, Models of Secure and Private Information Sharing, University of California, San Diego School of Medicine, Division of Biomedical Informatics Seminar Series, April 10, 2015,

Fang Liu, Lin Lin, Derek Vigil-Fowler, Johannes Lischner, Alexander F. Kemper, Sahar Sharifzadeh, Felipe H. da Jornada, Jack Deslippe, Chao Yang, Jeffrey B. Neaton, Steven G. Louie, "Numerical integration for ab initio many-electron self energy calculations within the GW approximation", Journal of Computational Physics, April 1, 2015,

James Demmel, Costin Iancu, Koushik Sen, "Corvette Progress Report 2015", April 1, 2015,

Elaheh Pourabbas, Arie Shoshani, "The Composite Data Model: A Unified Approach for Combining and Querying Multiple Data Models", IEEE Trans. Knowl. Data Eng, 2015, 27(5):1424-1437,

Peter Schwartz, Julie Percelay, Terry J. Ligocki, Hans Johansen, Daniel T. Graves, Dharshi Devendran, Phillip Colella, Eli Ateljevich, "High-accuracy embedded boundary grid generation using the divergence theorem", Communications in Applied Mathematics and Computational Science, March 31, 2015, 10-1:83-96, doi: 10.2140/camcos.2015.10.83

Subrata Banik, Lalitha Ravichandran, Jiri Brabec, Ivan Hubac, Karol Kowalski, Jiri Pittner, "Iterative universal state selective correction for the Brillouin-Wigner multireference coupled-cluster theory", J. Chem. Phys., March 21, 2015, 142:114106,

Abhinav Sarje, Recovering Structural Information about Nanoparticle Systems, Nvidia GPU Technology Conference, March 19, 2015,

The inverse modeling problem of recovering nanostructures from X-ray scattering data obtained through experiments at light-source synchrotrons is an ideal example of a Big Data and Big Compute application. This session will give an introduction and overview to this problem and its solutions as being developed at the Berkeley Lab. X-ray scattering based extraction of structural information from material samples is an important tool applicable to numerous applications such as design of energy-relevant nano-devices. We exploit the use of parallelism available in clusters of GPUs to gain efficiency in the reconstruction process. To develop a solution, we apply Particle Swarm Optimization (PSO) in a massively parallel fashion, and develop high-performance codes and analyze the performance.

Abhinav Sarje, Xiaoye S. Li, Dinesh Kumar, Alexander Hexemer, "Recovering Nanostructures from X-Ray Scattering Data", Nvidia GPU Technology Conference (GTC), March 2015,

We consider the inverse modeling problem of recovering nanostructures from X-ray scattering data obtained through experiments at synchrotrons. This has been a primary bottleneck problem in such data analysis. X-ray scattering based extraction of structural information from material samples is an important tool for the characterization of macromolecules and nano-particle systems applicable to numerous applications such as design of energy-relevant nano-devices. We exploit massive parallelism available in clusters of graphics processors to gain efficiency in the reconstruction process. To solve this numerical optimization problem, here we show the application of the stochastic algorithms of Particle Swarm Optimization (PSO) in a massively parallel fashion. We develop high-performance codes for various flavors of the PSO class of algorithms and analyze their performance with respect to the application at hand. We also briefly show the use of two other optimization methods as solutions.

Daniel Martin, Peter O. Schwartz, Esmond G. Ng, Improving Grounding Line Discretization using an Embedded-Boundary Approach in BISICLES, 2015 SIAM Conference on Computational Science and Engineering, March 14, 2015,

R.M. Cox, P.B. Armentrout, W.A. de Jong, "Activation of CH4 by Th+ as Studied by Guided Ion Beam Mass Spectrometry and Quantum Chemistry", Inorganic Chemistry, March 13, 2015, 54:3584, doi: 10.1021/acs.inorgchem.5b00137

C. Yang, Fast Numerical Methods for Computational Materials Science and Chemistry, CRD All-hands meeting, March 4, 2015,

Marc Baboulin, Xiaoye S. Li, Francois-Henry Rouet, "Using random butterfly transformations to avoid pivoting in sparse direct methods", High Performance Computing for Computational Science - VECPAR 2014, Lecture Notes in Computer Science, Springer. Preprint, 2015,

Patrick R. Amestoy, Jean-Yves L'Excellent, François-Henry Rouet, M. Wissam Sid-Lakhdar, "Modeling 1D distributed-memory dense kernels for an asynchronous multifrontal sparse solver", High Performance Computing for Computational Science - VECPAR 2014, Lecture Notes in Computer Science, Springer. Preprint, 2015,

E. Vecharynski, C. Yang, J. E. Pask, "A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix", Journal of Computational Physics, Vol. 290, pp. 73–89, 2015,

We present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.

Sean Peisert, et al., "ASCR Cybersecurity for Scientific Computing Integrity", U.S. Department of Energy Office of Science report, February 27, 2015, LBNL 6953E, doi: 10.2172/1223021

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, Qiji Jim Zhu, "The probability of backtest overfitting", Journal of Computational Finance, February 27, 2015,

Hansen, G., D. Stone, M. Auffhammer, C. Huggel, W. Cramer, "Linking local impacts to changes in climate: a guide to attribution", Regional Environmental Change, 2015, doi: 10.1007/s10113-015-0760-y

W. Yoo, A. Sim, "Network Bandwidth Utilization Forecast Model on High Bandwidth Networks", IEEE International Conference on Computing, Networking and Communications (ICNC’15), 2015,

Wei Hu, Lin Lin and Chao Yang, "Edge reconstruction in armchair phosphorene nanoribbons revealed by discontinuous Galerkin density functional theory", Phys. Chem. Chem. Phys., 2015, Advance Article, February 11, 2015, doi: 10.1039/C5CP00333D

With the help of our recently developed massively parallel DGDFT (Discontinuous Galerkin Density Functional Theory) methodology, we perform large-scale Kohn–Sham density functional theory calculations on phosphorene nanoribbons with armchair edges (ACPNRs) containing a few thousand to ten thousand atoms. The use of DGDFT allows us to systematically achieve a conventional plane wave basis set type of accuracy, but with a much smaller number (about 15) of adaptive local basis (ALB) functions per atom for this system. The relatively small number of degrees of freedom required to represent the Kohn–Sham Hamiltonian, together with the use of the pole expansion and selected inversion (PEXSI) technique that circumvents the need to diagonalize the Hamiltonian, results in a highly efficient and scalable computational scheme for analyzing the electronic structures of ACPNRs as well as their dynamics. The total wall clock time for calculating the electronic structures of large-scale ACPNRs containing 1080–10 800 atoms is only 10–25 s per self-consistent field (SCF) iteration, with accuracy fully comparable to that obtained from conventional planewave DFT calculations. For the ACPNR system, we observe that the DGDFT methodology can scale to 5000–50 000 processors. We use DGDFT based ab initio molecular dynamics (AIMD) calculations to study the thermodynamic stability of ACPNRs. Our calculations reveal that a 2 × 1 edge reconstruction appears in ACPNRs at room temperature.

C. Yang, Fast Numerical Methods for Electronic Structure Calculations, Workshop on High Performance and Parallel Computing Methods and Algorithms for Materials Defects, Singapore, February 9, 2015,

Hongzhang Shan, Samuel Williams, Wibe de Jong, Leonid Oliker, "Thread-Level Parallelization and Optimization of NWChem for the Intel MIC Architecture", Programming Models and Applications for Multicores and Manycores (PMAM), February 2015,

Costin Iancu, Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Williams, "Exploiting Communication Concurrency on High Performance Computing Systems", Programming Models and Applications for Multicores and Manycores (PMAM), February 2015,

Milind Chabbi, Wim Lavrijsen, Wibe de Jong, Koushik Sen, John Mellor Crummey, Costin Iancu, "Barrier Elision for Production Parallel Programs", PPOPP 2015, February 5, 2015,

Daniel F. Martin, Response of the Antarctic Ice Sheet to Ocean Forcing using the POPSICLES Coupled Ice sheet-ocean model, Joint Land Ice Working Group/Polar Climate Working Group Meeting, Boulder, CO, February 3, 2015,

Sang-Yun Oh, Bala Rajaratnam, Joong-Ho Won, "On the Solution Path of Regularized Covariance Estimators", (submitted), 2015,

Aydin Buluç, Henning Meyerhenke, Ilya Safro, Peter Sanders, Christian Schulz, "Recent advances in graph partitioning", arXiv, 2015,

Zarija Lukić, Casey Stark, Peter Nugent, Martin White, Avery Meiksin, Ann Almgren, "The Lyman α forest in optically thin hydrodynamical simulations", Monthly Notices of the Royal Astronomical Society, 2015, 446:3697,

Lawal, K., D. Stone, T. Aina, C. Rye, B. Abiodun, "Trends in the potential spread of seasonal climate simulations over South Africa", International Journal of Climatology, 2015, doi: 10.1002/joc.4234

Thorsten Kurth, Andrew Pochinsky, Abhinav Sarje, Sergey Syritsyn, Andre Walker-Loud, "High-Performance I/O: HDF5 for Lattice QCD", arXiv:1501.06992, January 2015,

Practitioners of lattice QCD/QFT have been some of the primary pioneer users of the state-of-the-art high-performance-computing systems, and contribute towards the stress tests of such new machines as soon as they become available. As with all aspects of high-performance-computing, I/O is becoming an increasingly specialized component of these systems. In order to take advantage of the latest available high-performance I/O infrastructure, to ensure reliability and backwards compatibility of data files, and to help unify the data structures used in lattice codes, we have incorporated parallel HDF5 I/O into the SciDAC supported USQCD software stack. Here we present the design and implementation of this I/O framework. Our HDF5 implementation outperforms optimized QIO at the 10-20% level and leaves room for further improvement by utilizing appropriate dataset chunking.

David Trebotich, Daniel T. Graves, "An Adaptive Finite Volume Method for the Incompressible Navier-Stokes Equations in Complex Geometries", Communications in Applied Mathematics and Computational Science, January 15, 2015, 10-1:43-82, doi: 10.2140/camcos.2015.10.43

David H. Bailey, Jonathan M. Borwein, "Experimental mathematics in the society of the future", January 11, 2015,

M. Adams, P. Colella, D. T. Graves, J.N. Johnson, N.D. Keen, T. J. Ligocki, D. F. Martin, P.W. McCorquodale, D. Modiano, P.O. Schwartz, T.D. Sternberg, B. Van Straalen, "Chombo Software Package for AMR Applications - Design Document", Lawrence Berkeley National Laboratory Technical Report LBNL-6616E, January 9, 2015,

P. Colella, D. T. Graves, T. J. Ligocki, G.H. Miller, D. Modiano, P.O. Schwartz, B. Van Straalen, J. Pillod, D. Trebotich, M. Barad, "EBChombo Software Package for Cartesian Grid, Embedded Boundary Applications", Lawrence Berkeley National Laboratory Technical Report LBNL-6615E, January 9, 2015,

J. Chapman, M. Mascher, A. Buluç, K. Barry, E. Georganas, A. Session, V. Strnadova, J. Jenkins, S. Sehgal, L. Oliker, J. Schmutz, K. Yelick, U. Scholz, R. Waugh, J. Poland, G. Muehlbauer, N. Stein, D. Rokhsar, "A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome", Genome Biology, 2015,

D. Zuev, E. Vecharynski, C. Yang, N. Orms, and A.I. Krylov, "New algorithms for iterative matrix-free eigensolvers in quantum chemistry", Journal of Computational Chemistry, Vol. 36, Issue 5, pp. 273–284, 2015,

New algorithms for iterative diagonalization procedures that solve for a small set of eigen-states of a large matrix are described. The performance of the algorithms is illustrated by calculations of low and high-lying ionized and electronically excited states using equation-of-motion coupled-cluster methods with single and double substitutions (EOM-IP-CCSD and EOM-EE-CCSD). We present two algorithms suitable for calculating excited states that are close to a specified energy shift (interior eigenvalues). One solver is based on the Davidson algorithm, a diagonalization procedure commonly used in quantum-chemical calculations. The second is a recently developed solver, called the “Generalized Preconditioned Locally Harmonic Residual (GPLHR) method.” We also present a modification of the Davidson procedure that allows one to solve for a specific transition. The details of the algorithms, their computational scaling, and memory requirements are described. The new algorithms are implemented within the EOM-CC suite of methods in the Q-Chem electronic structure program.

M. Minion, R. Speck, M. Bolten, M. Emmett, and D. Ruprecht, "Interweaving PFASST and parallel multigrid", submitted for publication, 2015,

A. Amato, M. Day, R. K. Cheng, J. Bell, T. Lieuwen, "Topology and Burning Rates of Turbulent, Lean, H2-Air Flames", submitted for publication, 2015,

A. G. Kim, N. Padmanabhan, G. Aldering, S. W. Allen, C., R. N. Cahn, C. B. D Andrea, N. Dalal, K. S., K. D. Denney, D. J. Eisenstein, D. A., W. L. Freedman, S. Ho, D. E. Holz, D., S. M. Kent, R. Kessler, S. Kuhlmann, E. V., P. Martini, P. E. Nugent, S. Perlmutter, B. M., A. G. Riess, D. Rubin, M. Sako, N. V., N. Suzuki, R. C. Thomas, W. M. Wood-Vasey, S. E. Woosley, "Distance probes of dark energy", Astroparticle Physics, 2015, 63:2-22, doi: 10.1016/j.astropartphys.2014.05.007

P.A.R. Ade, N. Aghanim, D. Alina, M.I.R. Alves, G. Aniano, C. Armitage-Caplan, M. Arnaud, D. Arzoumanian, M. Ashdown, F. Atrio-Barandela, J. Aumont, C. Baccigalupi, A.J. Banday, R.B. Barreiro, E. Battaner, K. Benabed, A. Benoit-Lévy, J.-P. Bernard, M. Bersanelli, P. Bielewicz, J.R. Bond, J. Borrill, F.R. Bouchet, F. Boulanger, A. Bracco, C. Burigana, J.-F. Cardoso, A. Catalano, A. Chamballu, H.C. Chiang, P.R. Christensen, S. Colombi, L.P.L. Colombo, C. Combet, F. Couchot, A. Coulais, B.P. Crill, A. Curto, F. Cuttaia, L. Danese, R.D. Davies, R.J. Davis, P. De Bernardis, A. De Rosa, G. De Zotti, J. Delabrouille, C. Dickinson, J.M. Diego, S. Donzelli, O. Doré, M. Douspis, X. Dupac, G. Efstathiou, T.A. Enßlin, H.K. Eriksen, E. Falgarone, L. Fanciullo, K. Ferrière, F. Finelli, O. Forni, M. Frailis, A.A. Fraisse, E. Franceschi, S. Galeotta, K. Ganga, T. Ghosh, M. Giard, Y. Giraud-Héraud, J. González-Nuevo, K.M. Górski, A. Gregorio, A. Gruppuso, V. Guillet, F.K. Hansen, D.L. Harrison, G. Helou, C. Hernández-Monteagudo, S.R. Hildebrandt, E. Hivon, M. Hobson, W.A. Holmes, A. Hornstrup, K.M. Huffenberger, A.H. Jaffe, T.R. Jaffe, W.C. Jones, M. Juvela, E. Keihänen, R. Keskitalo, T.S. Kisner, R. Kneissl, J. Knoche, M. Kunz, H. Kurki-Suonio, G. Lagache, J.-M. Lamarre, A. Lasenby, C.R. Lawrence, R. Leonardi, F. Levrier, M. Liguori, P.B. Lilje, M. Linden-Vørnle, M. López-Caniego, P.M. Lubin, J.F. Macías-Pérez, D. Maino, N. Mandolesi, M. Maris, D.J. Marshall, P.G. Martin, E. Martínez-González, S. Masi, S. Matarrese, P. Mazzotta, A. Melchiorri, L. Mendes, A. Mennella, M. Migliaccio, M.-A. Miville-Deschênes, A. Moneti, L. Montier, G. Morgante, D. Mortlock, D. Munshi, J.A. Murphy, P. Naselsky, F. Nati, P. Natoli, C.B. Netterfield, F. Noviello, D. Novikov, I. Novikov, C.A. Oxborrow, L. Pagano, F. Pajot, D. Paoletti, F. Pasian, V.-M. Pelkonen, O. Perdereau, L. Perotto, F. Perrotta, F. Piacentini, M. Piat, D. Pietrobon, S. Plaszczynski, E. Pointecouteau, G. Polenta, L. Popa, G.W. Pratt, S. Prunet, J.-L. Puget, J.P. Rachen, M. Reinecke, M. Remazeilles, C. Renault, S. Ricciardi, T. Riller, I. Ristorcelli, G. Rocha, C. Rosset, G. Roudier, B. Rusholme, M. Sandri, D. Scott, J.D. Soler, L.D. Spencer, V. Stolyarov, R. Stompor, R. Sudiwala, D. Sutton, A.-S. Suur-Uski, J.-F. Sygnet, J.A. Tauber, L. Terenzi, L. Toffolatti, M. Tomasi, M. Tristram, M. Tucci, G. Umana, L. Valenziano, J. Valiviita, B. Van Tent, P. Vielva, F. Villa, L.A. Wade, B.D. Wandelt, A. Zonca, "Planck intermediate results. XX. Comparison of polarized thermal emission from Galactic dust with simulations of MHD turbulence", Astronomy and Astrophysics, 2015, 576, doi: 10.1051/0004-6361/201424086

Benjamin Edwards, Steven Hofmeyr, Stephanie Forrest, "Hype and heavy tails: A closer look at data breaches", The Workshop on the Economics of Information Security (WEIS), 2015,

S. Shannigrahi, A. Barczyk, C. Papadopoulos, A. Sim, I. Monga, H. Newman, K. Wu, E., Named Data Networking in Climate Research and HEP, 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP2015), Okinawa, Japan, 2015,

R. Speck, D. Ruprecht, M. Emmett, M. Minion, and R. Krause, "Inexact spectral deferred corrections using single-cycle multigrid", submitted for publication, 2015,

A Chien, P Balaji, P Beckman, N Dun, A Fang, H Fujita, K Iskra, Z Rubenstein, Z Zheng, R Schreiber, others, "Versioned Distributed Arrays for Resilience in Scientific Applications: Global View Resilience", Journal of Computational Science, 2015,

P.A.R. Ade, N. Aghanim, D. Alina, G. Aniano, C. Armitage-Caplan, M. Arnaud, M. Ashdown, F. Atrio-Barandela, J. Aumont, C. Baccigalupi, A.J. Banday, R.B. Barreiro, E. Battaner, C. Beichman, K. Benabed, A. Benoit-Lévy, J.-P. Bernard, M. Bersanelli, P. Bielewicz, J.J. Bock, J.R. Bond, J. Borrill, F.R. Bouchet, F. Boulanger, C. Burigana, J.-F. Cardoso, A. Catalano, A. Chamballu, R.-R. Chary, H.C. Chiang, P.R. Christensen, S. Colombi, L.P.L. Colombo, C. Combet, F. Couchot, A. Coulais, B.P. Crill, A. Curto, F. Cuttaia, L. Danese, R.D. Davies, R.J. Davis, P. De Bernardis, A. De Rosa, G. De Zotti, J. Delabrouille, F.-X. Désert, C. Dickinson, J.M. Diego, S. Donzelli, O. Doré, M. Douspis, J. Dunkley, X. Dupac, G. Efstathiou, T.A. Enßlin, H.K. Eriksen, E. Falgarone, L. Fanciullo, F. Finelli, O. Forni, M. Frailis, A.A. Fraisse, E. Franceschi, S. Galeotta, K. Ganga, T. Ghosh, M. Giard, Y. Giraud-Héraud, J. González-Nuevo, K.M. Górski, A. Gregorio, A. Gruppuso, V. Guillet, F.K. Hansen, D.L. Harrison, G. Helou, C. Hernández-Monteagudo, S.R. Hildebrandt, E. Hivon, M. Hobson, W.A. Holmes, A. Hornstrup, K.M. Huffenberger, A.H. Jaffe, T.R. Jaffe, W.C. Jones, M. Juvela, E. Keihänen, R. Keskitalo, T.S. Kisner, R. Kneissl, J. Knoche, M. Kunz, H. Kurki-Suonio, G. Lagache, A. Lähteenmäki, J.-M. Lamarre, A. Lasenby, C.R. Lawrence, R. Leonardi, F. Levrier, M. Liguori, P.B. Lilje, M. Linden-Vørnle, M. López-Caniego, P.M. Lubin, J.F. Macías-Pérez, B. Maffei, A.M. Magalhães, D. Maino, N. Mandolesi, M. Maris, D.J. Marshall, P.G. Martin, E. Martínez-González, S. Masi, S. Matarrese, P. Mazzotta, A. Melchiorri, L. Mendes, A. Mennella, M. Migliaccio, M.-A. Miville-Deschênes, A. Moneti, L. Montier, G. Morgante, D. Mortlock, D. Munshi, J.A. Murphy, P. Naselsky, F. Nati, P. Natoli, C.B. Netterfield, F. Noviello, D. Novikov, I. Novikov, C.A. Oxborrow, L. Pagano, F. Pajot, R. Paladini, D. Paoletti, F. Pasian, O. Perdereau, L. Perotto, F. Perrotta, F. Piacentini, M. Piat, D. Pietrobon, S. Plaszczynski, F. Poidevin, E. Pointecouteau, G. Polenta, L. Popa, G.W. Pratt, S. Prunet, J.-L. Puget, J.P. Rachen, W.T. Reach, R. Rebolo, M. Reinecke, M. Remazeilles, C. Renault, S. Ricciardi, T. Riller, I. Ristorcelli, G. Rocha, C. Rosset, G. Roudier, B. Rusholme, M. Sandri, G. Savini, D. Scott, L.D. Spencer, V. Stolyarov, R. Stompor, R. Sudiwala, D. Sutton, A.-S. Suur-Uski, J.-F. Sygnet, J.A. Tauber, L. Terenzi, L. Toffolatti, M. Tomasi, M. Tristram, M. Tucci, G. Umana, L. Valenziano, J. Valiviita, B. Van Tent, P. Vielva, F. Villa, L.A. Wade, B.D. Wandelt, A. Zonca, "Planck intermediate results. XXI. Comparison of polarized thermal emission from Galactic dust at 353 GHz with interstellar polarization in the visible", Astronomy and Astrophysics, 2015, 576, doi: 10.1051/0004-6361/201424087

Benjamin Edwards, Steven Hofmeyr, Stephanie Forrest, Michel Van Eeten, "Analyzing and modeling longitudinal security data: Promise and pitfalls", Proceedings of the 31st Annual Computer Security Applications Conference, 2015, 391--400,

R. Speck, D. Ruprecht, M. Emmett, M. Bolten, and R. Krause, "A space-time parallel solver for the three-dimensional heat equation", submitted for publication, 2015,

K.N. Abazajian, K. Arnold, J. Austermann, B.A. Benson, C. Bischoff, J. Bock, J.R. Bond, J. Borrill, E. Calabrese, J.E. Carlstrom, C.S. Carvalho, C.L. Chang, H.C. Chiang, S. Church, A. Cooray, T.M. Crawford, K.S. Dawson, S. Das, M.J. Devlin, M. Dobbs, S. Dodelson, O. Doré, J. Dunkley, J. Errard, A. Fraisse, J. Gallicchio, N.W. Halverson, S. Hanany, S.R. Hildebrandt, A. Hincks, R. Hlozek, G. Holder, W.L. Holzapfel, K. Honscheid, W. Hu, J. Hubmayr, K. Irwin, W.C. Jones, M. Kamionkowski, B. Keating, R. Keisler, L. Knox, E. Komatsu, J. Kovac, C.-L. Kuo, C. Lawrence, A.T. Lee, E. Leitch, E. Linder, P. Lubin, J. McMahon, A. Miller, L. Newburgh, M.D. Niemack, H. Nguyen, H.T. Nguyen, L. Page, C. Pryke, C.L. Reichardt, J.E. Ruhl, N. Sehgal, U. Seljak, J. Sievers, E. Silverstein, A. Slosar, K.M. Smith, D. Spergel, S.T. Staggs, A. Stark, R. Stompor, A.G. Vieregg, G. Wang, S. Watson, E.J. Wollack, W.L.K. Wu, K.W. Yoon, O. Zahn, "Neutrino physics from the cosmic microwave background and large scale structure", Astroparticle Physics, 2015, 63:66-80, doi: 10.1016/j.astropartphys.2014.05.014

Jarrod R McClean, Alan Aspuru-Guzik, "Clock quantum Monte Carlo technique: An imaginary-time method for real-time quantum dynamics", Physical Review A, 2015, 91:012311,

David H. Bailey, David Borwein, Jonathan M. Borwein, "On Eulerian log-gamma integrals and Tornheim-Witten zeta functions", Ramanujan Journal, January 1, 2015, 36:43-68,

C. Saunders, G. Aldering, P. Antilogus, C. Aragon, S., C. Baltay, S. Bongard, C. Buton, A., F. Cellier-Holzem, M. Childress, N., Y. Copin, H. K. Fakhouri, U. Feindt, E., J. Guy, M. Kerschhaggl, A. G. Kim, M., J. Nordin, P. Nugent, K. Paech, R., E. Pecontal, R. Pereira, S. Perlmutter, D., M. Rigault, D. Rubin, K. Runge, R., G. Smadja, C. Tao, R. C. Thomas, B. A. Weaver, C. Wu, Nearby Supernova Factory, "Type Ia Supernova Distance Modulus Bias and Dispersion from K-correction Errors: A Direct Measurement Using Light Curve Fits to Observed Spectral Time Series", Astrophysical Journal, 2015, 800:57, doi: 10.1088/0004-637X/800/1/57

K.N. Abazajian, K. Arnold, J. Austermann, B.A. Benson, C. Bischoff, J. Bock, J.R. Bond, J. Borrill, I. Buder, D.L. Burke, E. Calabrese, J.E. Carlstrom, C.S. Carvalho, C.L. Chang, H.C. Chiang, S. Church, A. Cooray, T.M. Crawford, B.P. Crill, K.S. Dawson, S. Das, M.J. Devlin, M. Dobbs, S. Dodelson, O. Doré, J. Dunkley, J.L. Feng, A. Fraisse, J. Gallicchio, S.B. Giddings, D. Green, N.W. Halverson, S. Hanany, D. Hanson, S.R. Hildebrandt, A. Hincks, R. Hlozek, G. Holder, W.L. Holzapfel, K. Honscheid, G. Horowitz, W. Hu, J. Hubmayr, K. Irwin, M. Jackson, W.C. Jones, R. Kallosh, M. Kamionkowski, B. Keating, R. Keisler, W. Kinney, L. Knox, E. Komatsu, J. Kovac, C.-L. Kuo, A. Kusaka, C. Lawrence, A.T. Lee, E. Leitch, A. Linde, E. Linder, P. Lubin, J. Maldacena, E. Martinec, J. McMahon, A. Miller, V. Mukhanov, L. Newburgh, M.D. Niemack, H. Nguyen, H.T. Nguyen, L. Page, C. Pryke, C.L. Reichardt, J.E. Ruhl, N. Sehgal, U. Seljak, L. Senatore, J. Sievers, E. Silverstein, A. Slosar, K.M. Smith, D. Spergel, S.T. Staggs, A. Stark, R. Stompor, A.G. Vieregg, G. Wang, S. Watson, E.J. Wollack, W.L.K. Wu, K.W. Yoon, O. Zahn, M. Zaldarriaga, "Inflation physics from the cosmic microwave background and large scale structure", Astroparticle Physics, 2015, 63:55-65, doi: 10.1016/j.astropartphys.2014.05.013

Ryan Babbush, Jarrod McClean, Dave Wecker, Alan Aspuru-Guzik, Nathan Wiebe, "Chemical basis of Trotter-Suzuki errors in quantum chemistry simulation", Physical Review A, 2015, 91:022311,

Y.-C. Pan, M. Sullivan, K. Maguire, A. Gal-Yam, I. M., D. A. Howell, P. E. Nugent, P. A. Mazzali, "Type Ia supernova spectral features in the context of their host galaxy properties", Monthly Notices of the RAS, 2015, 446:354-368, doi: 10.1093/mnras/stu2121

Patrick R. Amestoy, Iain S. Duff, Jean-Yves L'Excellent, François-Henry Rouet, "Parallel computation of entries of A^{-1}", SIAM Journal on Scientific Computing, to appear, 2015,

Max Duarte, Ann Almgren, John Bell, "A Low Mach Number Model for Moist Atmospheric Flows", Journal of Atmospheric Sciences, to appear, 2015,

C. J. White, M. M. Kasliwal, P. E. Nugent, A., D. A. Howell, M. Sullivan, A. Goobar, A. L., J. S. Bloom, S. R. Kulkarni, R. R. Laher, F., E. O. Ofek, J. Surace, S. Ben-Ami, Y., S. B. Cenko, I. M. Hook, J. Jönsson, T., A. Sternberg, R. M. Quimby, O. Yaron, Slow-speed Supernovae from the Palomar Transient Factory: Two Channels, Astrophysical Journal, Pages: 52 2015, doi: 10.1088/0004-637X/799/1/52

Andrew Tranter, Sarah Sofia, Jake Seeley, Michael Kaicher, Jarrod McClean, Ryan Babbush, Peter V Coveney, Florian Mintert, Frank Wilhelm, Peter J Love, "The Bravyi--Kitaev transformation: Properties and applications", International Journal of Quantum Chemistry, 2015, 115:1431--1441,

Max Duarte, Matthew Emmett, "High order schemes based on operator splitting and deferred corrections for stiff time-dependent PDEs", submitted for publication, 2015,

JC Dolence, A Burrows, W Zhang, "Two-Dimensional core-collapse supernova models with multi-dimensional transport", Astrophysical Journal Letters, 2015, 800, doi: 10.1088/0004-637X/800/1/10

Jarrod R. McClean, Jonathan Romero, Ryan Babbush, Alan Aspuru-Guzik, "The theory of variational hybrid quantum-classical algorithms", arXiv:1509.04279 [quant-ph], 2015,

D Unat, C Chan, W Zhang, S Williams, J Bachan, J Bell, J Shalf, "ExaSAT: An exascale co-design tool for performance modeling", International Journal of High Performance Computing Applications, January 2015, 29:209--232, doi: 10.1177/1094342014568690

L. Wu, K. Wu, A. Sim, M. Churchill, J. Y. Choi, A. Stathopoulos, C.S. Chang, S. Klasky, "Towards Real-Time Detection and Tracking of Blob-Filaments in Fusion Plasma Big Data", WM-CS-2015-01, Department of Computer Science, College of William and Mary, 2015,

M. Zingale, C. M. Malone, A. Nonaka, A. S. Almgren, and J. B. Bell, "Comparisons of Two- and Three-Dimensional Convection in Type I X-ray Bursts", submitted for publication, 2015,

A. Nonaka, Y. Sun, J. B. Bell, and A. Donev, "Low Mach Number Fluctuating Hydrodynamics of Binary Liquid Mixtures", submitted for publication, 2015,

A. Donev, A. Nonaka, A. K. Bhattacharjee, A. L. Garcia, J. B. Bell, "Low Mach Number Fluctuating Hydrodynamics of Multispecies Liquid Mixtures", submitted for publication, 2015,

P.A.R. Ade, N. Aghanim, C. Armitage-Caplan, M. Arnaud, M. Ashdown, F. Atrio-Barandela, J. Aumont, H. Aussel, C. Baccigalupi, A.J. Banday, R.B. Barreiro, R. Barrena, M. Bartelmann, J.G. Bartlett, E. Battaner, K. Benabed, A. Benoît, A. Benoit-Lévy, J.-P. Bernard, M. Bersanelli, P. Bielewicz, I. Bikmaev, J. Bobin, J.J. Bock, H. Böhringer, A. Bonaldi, J.R. Bond, J. Borrill, F.R. Bouchet, M. Bridges, M. Bucher, R. Burenin, C. Burigana, R.C. Butler, J.-F. Cardoso, P. Carvalho, A. Catalano, A. Challinor, A. Chamballu, R.-R. Chary, X. Chen, H.C. Chiang, L.-Y. Chiang, G. Chon, P.R. Christensen, E. Churazov, S. Church, D.L. Clements, S. Colombi, L.P.L. Colombo, B. Comis, F. Couchot, A. Coulais, B.P. Crill, A. Curto, F. Cuttaia, A. Da Silva, H. Dahle, L. Danese, R.D. Davies, R.J. Davis, P. De Bernardis, A. De Rosa, G. De Zotti, J. Delabrouille, J.-M. Delouis, J. Démoclès, F.-X. Désert, C. Dickinson, J.M. Diego, K. Dolag, H. Dole, S. Donzelli, O. Doré, M. Douspis, X. Dupac, G. Efstathiou, T.A. Enßlin, H.K. Eriksen, F. Feroz, A. Ferragamo, F. Finelli, I. Flores-Cacho, O. Forni, M. Frailis, E. Franceschi, S. Fromenteau, S. Galeotta, K. Ganga, R.T. Génova-Santos, M. Giard, G. Giardino, M. Gilfanov, Y. Giraud-Héraud, J. González-Nuevo, K.M. Górski, K.J.B. Grainge, S. Gratton, A. Gregorio, N.E. Groeneboom, A. Gruppuso, F.K. Hansen, D. Hanson, D. Harrison, A. Hempel, S. Henrot-Versillé, C. Hernández-Monteagudo, D. Herranz, S.R. Hildebrandt, E. Hivon, M. Hobson, W.A. Holmes, A. Hornstrup, W. Hovest, K.M. Huffenberger, G. Hurier, N. Hurley-Walker, A.H. Jaffe, T.R. Jaffe, W.C. Jones, M. Juvela, E. Keihänen, R. Keskitalo, I. Khamitov, T.S. Kisner, R. Kneissl, J. Knoche, L. Knox, M. Kunz, H. Kurki-Suonio, G. Lagache, A. Lähteenmäki, J.-M. Lamarre, A. Lasenby, R.J. Laureijs, C.R. Lawrence, J.P. Leahy, R. Leonardi, J. León-Tavares, J. Lesgourgues, C. Li, A. Liddle, M. Liguori, P.B. Lilje, M. Linden-Vørnle, M. López-Caniego, P.M. Lubin, J.F. Macías-Pérez, C.J. Mactavish, B. Maffei, D. 
Maino, N. Mandolesi, M. Maris, D.J. Marshall, P.G. Martin, E. Martínez-González, S. Masi, M. Massardi, S. Matarrese, F. Matthai, P. Mazzotta, S. Mei, P.R. Meinhold, A. Melchiorri, J.-B. Melin, L. Mendes, A. Mennella, M. Migliaccio, K. Mikkelsen, S. Mitra, M.-A. Miville-Deschênes, A. Moneti, L. Montier, G. Morgante, D. Mortlock, D. Munshi, J.A. Murphy, P. Naselsky, A. Nastasi, F. Nati, P. Natoli, N.P.H. Nesvadba, C.B. Netterfield, H.U. Nørgaard-Nielsen, F. Noviello, D. Novikov, I. Novikov, I.J. O'Dwyer, M. Olamaie, S. Osborne, C.A. Oxborrow, F. Paci, L. Pagano, F. Pajot, D. Paoletti, F. Pasian, G. Patanchon, T.J. Pearson, O. Perdereau, L. Perotto, Y.C. Perrott, F. Perrotta, F. Piacentini, M. Piat, E. Pierpaoli, D. Pietrobon, S. Plaszczynski, E. Pointecouteau, G. Polenta, N. Ponthieu, L. Popa, T. Poutanen, G.W. Pratt, G. Prézeau, S. Prunet, J.-L. Puget, J.P. Rachen, W.T. Reach, R. Rebolo, M. Reinecke, M. Remazeilles, C. Renault, S. Ricciardi, T. Riller, I. Ristorcelli, G. Rocha, C. Rosset, G. Roudier, M. Rowan-Robinson, J.A. Rubiño-Martín, C. Rumsey, B. Rusholme, M. Sandri, D. Santos, R.D.E. Saunders, G. Savini, M.P. Schammel, D. Scott, M.D. Seiffert, E.P.S. Shellard, T.W. Shimwell, L.D. Spencer, J.-L. Starck, V. Stolyarov, R. Stompor, A. Streblyanska, R. Sudiwala, R. Sunyaev, F. Sureau, D. Sutton, A.-S. Suur-Uski, J.-F. Sygnet, J.A. Tauber, D. Tavagnacco, L. Terenzi, L. Toffolatti, M. Tomasi, D. Tramonte, M. Tristram, M. Tucci, J. Tuovinen, M. Türler, G. Umana, L. Valenziano, J. Valiviita, B. Van Tent, L. Vibert, P. Vielva, F. Villa, N. Vittorio, L.A. Wade, B.D. Wandelt, M. White, S.D.M. White, D. Yvon, A. Zacchei, A. Zonca, "Planck 2013 results. XXXII. The updated Planck catalogue of Sunyaev-Zeldovich sources", Astronomy and Astrophysics, 2015, 581, doi: 10.1051/0004-6361/201525787

P.A.R. Ade, N. Aghanim, M. Arnaud, M. Ashdown, J. Aumont, C. Baccigalupi, A.J. Banday, R.B. Barreiro, E. Battaner, K. Benabed, A. Benoit-Lévy, J.-P. Bernard, M. Bersanelli, P. Bielewicz, J.R. Bond, J. Borrill, F.R. Bouchet, C. Burigana, R.C. Butler, E. Calabrese, A. Chamballu, H.C. Chiang, P.R. Christensen, D.L. Clements, L.P.L. Colombo, F. Couchot, A. Curto, F. Cuttaia, L. Danese, R.D. Davies, R.J. Davis, P. De Bernardis, A. De Rosa, G. De Zotti, J. Delabrouille, J.M. Diego, H. Dole, O. Doré, X. Dupac, T.A. Enßlin, H.K. Eriksen, O. Fabre, F. Finelli, O. Forni, M. Frailis, E. Franceschi, S. Galeotta, S. Galli, K. Ganga, M. Giard, J. González-Nuevo, K.M. Górski, A. Gregorio, A. Gruppuso, F.K. Hansen, D. Hanson, D.L. Harrison, S. Henrot-Versillé, C. Hernández-Monteagudo, D. Herranz, S.R. Hildebrandt, E. Hivon, M. Hobson, W.A. Holmes, A. Hornstrup, W. Hovest, K.M. Huffenberger, A.H. Jaffe, W.C. Jones, E. Keihänen, R. Keskitalo, R. Kneissl, J. Knoche, M. Kunz, H. Kurki-Suonio, J.-M. Lamarre, A. Lasenby, C.R. Lawrence, R. Leonardi, J. Lesgourgues, M. Liguori, P.B. Lilje, M. Linden-Vørnle, M. López-Caniego, P.M. Lubin, J.F. Macías-Pérez, N. Mandolesi, M. Maris, P.G. Martin, E. Martínez-González, S. Masi, S. Matarrese, P. Mazzotta, P.R. Meinhold, A. Melchiorri, L. Mendes, E. Menegoni, A. Mennella, M. Migliaccio, M.-A. Miville-Deschênes, A. Moneti, L. Montier, G. Morgante, A. Moss, D. Munshi, J.A. Murphy, P. Naselsky, F. Nati, P. Natoli, H.U. Nørgaard-Nielsen, F. Noviello, D. Novikov, I. Novikov, C.A. Oxborrow, L. Pagano, F. Pajot, D. Paoletti, F. Pasian, G. Patanchon, O. Perdereau, L. Perotto, F. Perrotta, F. Piacentini, M. Piat, E. Pierpaoli, D. Pietrobon, S. Plaszczynski, E. Pointecouteau, G. Polenta, N. Ponthieu, L. Popa, G.W. Pratt, S. Prunet, J.P. Rachen, R. Rebolo, M. Reinecke, M. Remazeilles, C. Renault, S. Ricciardi, I. Ristorcelli, G. Rocha, G. Roudier, B. Rusholme, M. Sandri, G. Savini, D. Scott, L.D. Spencer, V. Stolyarov, R. Sudiwala, D. Sutton, A.-S. 
Suur-Uski, J.-F. Sygnet, J.A. Tauber, D. Tavagnacco, L. Terenzi, L. Toffolatti, M. Tomasi, M. Tristram, M. Tucci, J.-P. Uzan, L. Valenziano, J. Valiviita, B. Van Tent, P. Vielva, F. Villa, L.A. Wade, D. Yvon, A. Zacchei, A. Zonca, "Planck intermediate results: XXIV. Constraints on variations in fundamental constants", Astronomy and Astrophysics, 2015, 580, doi: 10.1051/0004-6361/201424496

A. Amato, M. Day, R. K. Cheng, J. Bell, T. Lieuwen, "Leading Edge Statistics of Turbulent, Lean, H2-Air Flames", submitted for publication, 2015,

J. Borrill, R. Keskitalo, T. Kisner, "Big bang, big data, big iron: Fifteen years of cosmic microwave background data analysis at NERSC", Computing in Science and Engineering, 2015, 17:22-29, doi: 10.1109/MCSE.2015.1

Y Gong, WA De Jong, JK Gibson, "Gas Phase Uranyl Activation: Formation of a Uranium Nitrosyl Complex from Uranyl Azide", Journal of the American Chemical Society, January 1, 2015, 137:5911--5915, doi: 10.1021/jacs.5b02420

P.A.R. Ade, M.I.R. Alves, G. Aniano, C. Armitage-Caplan, M. Arnaud, F. Atrio-Barandela, J. Aumont, C. Baccigalupi, A.J. Banday, R.B. Barreiro, E. Battaner, K. Benabed, A. Benoit-Lévy, J.-P. Bernard, M. Bersanelli, P. Bielewicz, J.J. Bock, J.R. Bond, J. Borrill, F.R. Bouchet, F. Boulanger, C. Burigana, J.-F. Cardoso, A. Catalano, A. Chamballu, H.C. Chiang, L.P.L. Colombo, C. Combet, F. Couchot, A. Coulais, B.P. Crill, A. Curto, F. Cuttaia, L. Danese, R.D. Davies, R.J. Davis, P. De Bernardis, G. De Zotti, J. Delabrouille, F.-X. Désert, C. Dickinson, J.M. Diego, S. Donzelli, O. Doré, M. Douspis, J. Dunkley, X. Dupac, T.A. Enßlin, H.K. Eriksen, E. Falgarone, F. Finelli, O. Forni, M. Frailis, A.A. Fraisse, E. Franceschi, S. Galeotta, K. Ganga, T. Ghosh, M. Giard, J. González-Nuevo, K.M. Górski, A. Gregorio, A. Gruppuso, V. Guillet, F.K. Hansen, D.L. Harrison, G. Helou, C. Hernández-Monteagudo, S.R. Hildebrandt, E. Hivon, M. Hobson, W.A. Holmes, A. Hornstrup, A.H. Jaffe, T.R. Jaffe, W.C. Jones, E. Keihänen, R. Keskitalo, T.S. Kisner, R. Kneissl, J. Knoche, M. Kunz, H. Kurki-Suonio, G. Lagache, J.-M. Lamarre, A. Lasenby, C.R. Lawrence, J.P. Leahy, R. Leonardi, F. Levrier, M. Liguori, P.B. Lilje, M. Linden-Vørnle, M. López-Caniego, P.M. Lubin, J.F. Macías-Pérez, B. Maffei, A.M. Magalhães, D. Maino, N. Mandolesi, M. Maris, D.J. Marshall, P.G. Martin, E. Martínez-González, S. Masi, S. Matarrese, P. Mazzotta, A. Melchiorri, L. Mendes, A. Mennella, M. Migliaccio, M.-A. Miville-Deschênes, A. Moneti, L. Montier, G. Morgante, D. Mortlock, D. Munshi, J.A. Murphy, P. Naselsky, F. Nati, P. Natoli, C.B. Netterfield, F. Noviello, D. Novikov, I. Novikov, N. Oppermann, C.A. Oxborrow, L. Pagano, F. Pajot, D. Paoletti, F. Pasian, O. Perdereau, L. Perotto, F. Perrotta, F. Piacentini, D. Pietrobon, S. Plaszczynski, E. Pointecouteau, G. Polenta, L. Popa, G.W. Pratt, J.P. Rachen, W.T. Reach, M. Reinecke, M. Remazeilles, C. Renault, S. Ricciardi, T. Riller, I. Ristorcelli, G. Rocha, C. 
Rosset, G. Roudier, J.A. Rubiño-Martín, B. Rusholme, E. Salerno, M. Sandri, G. Savini, D. Scott, L.D. Spencer, V. Stolyarov, R. Stompor, R. Sudiwala, D. Sutton, A.-S. Suur-Uski, J.-F. Sygnet, J.A. Tauber, L. Terenzi, L. Toffolatti, M. Tomasi, M. Tristram, M. Tucci, L. Valenziano, J. Valiviita, B. Van Tent, P. Vielva, F. Villa, B.D. Wandelt, A. Zacchei, A. Zonca, "Planck intermediate results. XXII. Frequency dependence of thermal emission from Galactic dust in intensity and polarization", Astronomy and Astrophysics, 2015, 576, doi: 10.1051/0004-6361/201424088

David H. Bailey, Stephanie Ger, Marcos Lopez de, Alexander Sim, Kesheng Wu, "Statistical Overfitting and Backtest Performance", Quantitative Finance, 2015,

http://ssrn.com/abstract=2507040

R. E. Firth, M. Sullivan, A. Gal-Yam, D. A. Howell, K., P. Nugent, A. L. Piro, C. Baltay, U., E. Hadjiyksta, R. McKinnon, E. Ofek, D. Rabinowitz, E. S. Walker, "The rising light curves of Type Ia supernovae", Monthly Notices of the RAS, 2015, 446:3895-3910, doi: 10.1093/mnras/stu2314

M Chabbi, W Lavrijsen, W De Jong, K Sen, J Mellor-Crummey, C Iancu, "Barrier elision for production parallel programs", Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP, January 1, 2015, 2015-Jan:109--119, doi: 10.1145/2688500.2688502

P.A.R. Ade, N. Aghanim, D. Alina, M.I.R. Alves, C. Armitage-Caplan, M. Arnaud, D. Arzoumanian, M. Ashdown, F. Atrio-Barandela, J. Aumont, C. Baccigalupi, A.J. Banday, R.B. Barreiro, E. Battaner, K. Benabed, A. Benoit-Lévy, J.-P. Bernard, M. Bersanelli, P. Bielewicz, J.J. Bock, J.R. Bond, J. Borrill, F.R. Bouchet, F. Boulanger, A. Bracco, C. Burigana, R.C. Butler, J.-F. Cardoso, A. Catalano, A. Chamballu, R.-R. Chary, H.C. Chiang, P.R. Christensen, S. Colombi, L.P.L. Colombo, C. Combet, F. Couchot, A. Coulais, B.P. Crill, A. Curto, F. Cuttaia, L. Danese, R.D. Davies, R.J. Davis, P. De Bernardis, E.M. De Gouveia Dal Pino, A. De Rosa, G. De Zotti, J. Delabrouille, F.-X. Désert, C. Dickinson, J.M. Diego, S. Donzelli, O. Doré, M. Douspis, J. Dunkley, X. Dupac, G. Efstathiou, T.A. Enßlin, H.K. Eriksen, E. Falgarone, K. Ferrière, F. Finelli, O. Forni, M. Frailis, A.A. Fraisse, E. Franceschi, S. Galeotta, K. Ganga, T. Ghosh, M. Giard, Y. Giraud-Héraud, J. González-Nuevo, K.M. Górski, A. Gregorio, A. Gruppuso, V. Guillet, F.K. Hansen, D.L. Harrison, G. Helou, C. Hernández-Monteagudo, S.R. Hildebrandt, E. Hivon, M. Hobson, W.A. Holmes, A. Hornstrup, K.M. Huffenberger, A.H. Jaffe, T.R. Jaffe, W.C. Jones, M. Juvela, E. Keihänen, R. Keskitalo, T.S. Kisner, R. Kneissl, J. Knoche, M. Kunz, H. Kurki-Suonio, G. Lagache, A. Lähteenmäki, J.-M. Lamarre, A. Lasenby, C.R. Lawrence, J.P. Leahy, R. Leonardi, F. Levrier, M. Liguori, P.B. Lilje, M. Linden-Vørnle, M. López-Caniego, P.M. Lubin, J.F. Macías-Pérez, B. Maffei, A.M. Magalhães, D. Maino, N. Mandolesi, M. Maris, D.J. Marshall, P.G. Martin, E. Martínez-González, S. Masi, S. Matarrese, P. Mazzotta, A. Melchiorri, L. Mendes, A. Mennella, M. Migliaccio, M.-A. Miville-Deschênes, A. Moneti, L. Montier, G. Morgante, D. Mortlock, D. Munshi, J.A. Murphy, P. Naselsky, F. Nati, P. Natoli, C.B. Netterfield, F. Noviello, D. Novikov, I. Novikov, C.A. Oxborrow, L. Pagano, F. Pajot, R. Paladini, D. Paoletti, F. Pasian, T.J. Pearson, O. 
Perdereau, L. Perotto, F. Perrotta, F. Piacentini, M. Piat, D. Pietrobon, S. Plaszczynski, F. Poidevin, E. Pointecouteau, G. Polenta, L. Popa, G.W. Pratt, S. Prunet, J.-L. Puget, J.P. Rachen, W.T. Reach, R. Rebolo, M. Reinecke, M. Remazeilles, C. Renault, S. Ricciardi, T. Riller, I. Ristorcelli, G. Rocha, C. Rosset, G. Roudier, J.A. Rubiño-Martín, B. Rusholme, M. Sandri, G. Savini, D. Scott, L.D. Spencer, V. Stolyarov, R. Stompor, R. Sudiwala, D. Sutton, A.-S. Suur-Uski, J.-F. Sygnet, J.A. Tauber, L. Terenzi, L. Toffolatti, M. Tomasi, M. Tristram, M. Tucci, G. Umana, L. Valenziano, J. Valiviita, B. Van Tent, P. Vielva, F. Villa, L.A. Wade, B.D. Wandelt, A. Zacchei, A. Zonca, "Planck intermediate results. XIX. An overview of the polarized thermal emission from Galactic dust", Astronomy and Astrophysics, 2015, 576, doi: 10.1051/0004-6361/201424082

Paul H. Hargrove, "Global Address Space Networking", Programming Models for Parallel Computing, edited by Pavan Balaji, (MIT Press: 2015)

2014

Siegfried Cools, Pieter Ghysels, Wim van Aarle, Wim Vanroose, "A multi-level preconditioned Krylov method for the efficient solution of algebraic tomographic reconstruction problems", To appear in Journal of Computational and Applied Mathematics, December 28, 2014,

François-Henry Rouet, Xiaoye S. Li, Pieter Ghysels, Artem Napov, "A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization", Submitted to ACM Transactions on Mathematical Software, December 2014,

E.G. Ng, D.F. Martin, X.S. Asay-Davis, S.F. Price, W.D. Collins, "High-resolution coupled ice sheet-ocean modeling using the POPSICLES model", American Geophysical Union Fall Meeting, December 17, 2014,

D.F. Martin, X.S. Asay-Davis, S.F. Price, S.L. Cornford, M. Maltrud, E.G. Ng, W.D. Collins, "Response of the Antarctic ice sheet to ocean forcing using the POPSICLES coupled ice sheet-ocean model", American Geophysical Union Fall Meeting, December 17, 2014,

Khaled Z. Ibrahim, Samuel W. Williams, Evgeny Epifanovsky, Anna I. Krylov, "Analysis and Tuning of Libtensor Framework on Multicore Architectures", High Performance Computing Conference (HIPC), December 2014,

J. Ferguson, C. Jablonowski, H. Johansen, R. English, P. McCorquodale, P. Colella, J. Benedict, W. Collins, J. Johnson, P. Ullrich, "Assessing Grid Refinement Strategies in the Chombo Adaptive Mesh Refinement Model", AGU Fall Meeting, San Francisco, CA, December 15, 2014,

Sisi Duan, Hein Meling, Sean Peisert, Haibin Zhang, "BChain: Byzantine Replication with High Throughput and Embedded Reconfiguration", Proceedings of the 18th International Conference on Principles of Distributed Systems (OPODIS), Cortina, Italy, Springer, December 2014, 91-106, doi: 10.1007/978-3-319-14472-6_7

Farzad Fatollahi-Fard, David Donofrio, George Michelogiannakis, John Shalf, "OpenSoC Fabric: On-Chip Network Generator", Proceedings of the Workshop on Network on Chip Architectures, ACM, December 2014, 45-50, LBNL LBNL-1005675, doi: 10.1145/2685342.2685351

Samuel Williams, HPGMG-FV, FastForward2 Proxy App Presentation, December 2014,

Sang-Yun Oh, Onkar Dalal, Kshitij Khare, Bala Rajaratnam, "Optimization Methods for Sparse Pseudo-Likelihood Graphical Model Selection", Neural Information Processing Systems, 2014,

Jinlong Yang, Hongjun Xiang, Honghui Shang, Jun Dai, Wei Hu, Zi Xiong, and Xinming Qin, ONPAS: Order-N quantum chemistry package for large scale ab initio simulation, Quantum Chem., 2014, December 2, 2014, doi: 10.1002/qua.24837

Wei Hu, Lin Lin, Chao Yang and Jinlong Yang, "Electronic structure and aromaticity of large-scale hexagonal graphene nanoflakes", J. Chem. Phys. 141, 214704 (2014), December 2, 2014, 141:214704, doi: 10.1063/1.4902806

With the help of the recently developed SIESTA-PEXSI method [L. Lin, A. García, G. Huhs, and C. Yang, J. Phys.: Condens. Matter 26, 305503 (2014)], we perform Kohn-Sham density functional theory calculations to study the stability and electronic structure of hydrogen passivated hexagonal graphene nanoflakes (GNFs) with up to 11 700 atoms. We find the electronic properties of GNFs, including their cohesive energy, edge formation energy, highest occupied molecular orbital-lowest unoccupied molecular orbital energy gap, edge states, and aromaticity, depend sensitively on the type of edges (armchair graphene nanoflakes (ACGNFs) and zigzag graphene nanoflakes (ZZGNFs)), size and the number of electrons. We observe that, due to the edge-induced strain effect in ACGNFs, large-scale ACGNFs’ edge formation energy decreases as their size increases. This trend does not hold for ZZGNFs due to the presence of many edge states in ZZGNFs. We find that the energy gaps E_g of GNFs all decay with respect to 1/L, where L is the size of the GNF, in a linear fashion. But as their size increases, ZZGNFs exhibit more localized edge states. We believe the presence of these states makes their gap decrease more rapidly. In particular, when L is larger than 6.40 nm, we find that ZZGNFs exhibit metallic characteristics. Furthermore, we find that the aromatic structures of GNFs appear to depend only on whether the system has 4N or 4N + 2 electrons, where N is an integer.

David Trebotich, Mark F. Adams, Sergi Molins, Carl I. Steefel, Chaopeng Shen, "High-Resolution Simulation of Pore-Scale Reactive Transport Processes Associated with Carbon Sequestration", Computing in Science and Engineering, December 2014, 16:22-31, doi: 10.1109/MCSE.2014.77

W. Yoo, A. Sim, "Efficient Changing Pattern Detection on High Bandwidth Network Measurements", 7th International Conference on Grid and Distributed Computing, 2014,

Soyoung Jeon, Christopher Paciorek, Prabhat, Surendra Byna, William Collins, Michael Wehner, "Uncertainty Quantification for Characterizing Spatial Tail Dependence under Statistical Framework", AGU, Fall Meeting 2014, 2014,

J. Choi, A. Sim, Data reduction methods, systems, and devices, U.S. Patent Pending serial no. 14/555,365, 2014,

U.S. Patent pending serial no. 14/555,365, “DATA REDUCTION METHODS, SYSTEMS, AND DEVICES”, filed on 11/26/2014. Provisional application no. 61/909,518. “An Efficient Data Reduction Method with Locally Exchangeable Measures”, J. Choi and A. Sim, filed on 11/27/2013, LBNL IB2013-133.

Mark Adams, Samuel Williams, Jed Brown, HPGMG, Birds of a Feather (BoF), Supercomputing, November 2014,

J.A. Ang, R.F. Barrett, R.E. Benner, D. Burke, C. Chan, D. Donofrio, S.D. Hammond, K.S. Hemmert, S.M. Kelly, H. Le, V.J. Leung, D.R. Resnick, A.F. Rodrigues, J. Shalf, D. Stark, D. Unat, N.J. Wright, "Abstract Machine Models and Proxy Architectures for Exascale Computing", Co-HPC 2014 (to appear), New Orleans, LA, USA, IEEE Computer Society, November 17, 2014,

To achieve Exascale computing, fundamental hardware architectures must change. The most significant consequence of this assertion is the impact on the scientific applications that run on current High Performance Computing (HPC) systems, many of which codify years of scientific domain knowledge and refinements for contemporary computer systems. In order to adapt to Exascale architectures, developers must be able to reason about new hardware and determine what programming models and algorithms will provide the best blend of performance and energy efficiency into the future. While many details of the Exascale architectures are undefined, an abstract machine model is designed to allow application developers to focus on the aspects of the machine that are important or relevant to performance and code structure. These models are intended as communication aids between application developers and hardware architects during the co-design process. We use the term proxy architecture to describe a parameterized version of an abstract machine model, with the parameters added to elucidate potential speeds and capacities of key hardware components. These more detailed architectural models are formulated to enable discussion between the developers of analytic models and simulators and computer hardware architects. They allow for application performance analysis and hardware optimization opportunities. In this report our goal is to provide the application development community with a set of models that can help software developers prepare for Exascale; through the use of proxy architectures, we can enable a more concrete exploration of how well application codes map onto the future architectures.

J.A. Ang, R.F. Barrett, R.E. Benner, D. Burke, C. Chan, D. Donofrio, S.D. Hammond, K.S. Hemmert, S.M. Kelly, H. Le, V.J. Leung, D.R. Resnick, A.F. Rodrigues, J. Shalf, D. Stark, D. Unat, N.J. Wright, "Abstract Machine Models and Proxy Architectures for Exascale Computing", 2014 Hardware-Software Co-Design for High Performance Computing, November 17, 2014,

Evangelos Georganas, Aydin Buluç, Jarrod Chapman, Leonid Oliker, Daniel Rokhsar, Katherine Yelick, "Parallel de Bruijn graph construction and traversal for de novo genome assembly", Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'14), November 2014,

Alex Druinsky, Brian Austin, Sherry Li, Osni Marques, Eric Roman, Samuel Williams, "A Roofline Performance Analysis of an Algebraic Multigrid Solver", Supercomputing (SC), November 2014,

Yu Jung Lo, Samuel Williams, Brian Van Straalen, Terry J. Ligocki, Matthew J. Cordery, Leonid Oliker, Mary W. Hall, "Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), November 2014, doi: 10.1007/978-3-319-17248-4_7

Qian Sun, Fan Zhang, Tong Jin, Hoang Bui, Kesheng Wu, Arie Shoshani, Hemanth Kolla, Scott Klasky, Jacqueline Chen and Manish Parashar, "Scalable Run-time Data Indexing and Querying for Scientific Simulations", Proceedings of the Fifth International Workshop on Big Data Analytics: Challenges, and Opportunities (BDAC’14), 2014,

Dharshi Devendran, Daniel T. Graves, Hans Johansen, "A Hybrid Multigrid Algorithm for Poisson's equation using an Adaptive, Fourth Order Treatment of Cut Cells", LBNL Report Number: LBNL-1004329, November 11, 2014,

Georgia Koutsandria, Vishak Muthukumar, Masood Parvania, Sean Peisert, Chuck McParland, Anna Scaglione, "A Hybrid Network IDS for Protective Digital Relays in the Power Transmission Grid", Proceedings of the 5th IEEE International Conference on Smart Grid Communications (SmartGridComm), Venice, Italy, IEEE, November 2014, 908-913, doi: 10.1109/SmartGridComm.2014.7007764

L. Wu, K. Wu, A. Sim, A. Stathopoulos, "Real-Time Outlier Detection Algorithm for Finding Blob-Filaments in Plasma", Supercomputing 2014, ACM SRC, 2014,

L. Wu, K. Wu, A. Sim, M. Churchill, J. Y. Choi, A. Stathopoulos, CS Chang, S. Klasky, "High-Performance Outlier Detection Algorithm for Finding Blob-Filaments in Plasma", 5th International Workshop on Big Data Analytics: Challenges, and Opportunities (BDAC’14), 2014,

Veronika Strnadova, Aydın Buluç, Joseph Gonzalez, Stefanie Jegelka, Jarrod Chapman, John Gilbert, Daniel Rokhsar, Leonid Oliker, "Efficient and accurate clustering for large-scale genetic mapping", IEEE International Conference on Bioinformatics and Biomedicine (BIBM'14), November 1, 2014,

A. L. Chervenak, A. Sim, J. Gu, R. Schuler, N. Hirpathak, "Adaptation and Policy-Based Resource Allocation for Efficient Bulk Data Transfers in High Performance Computing Environments", 4th International Workshop on Network-aware Data Management (NDM'14), 2014,

Sean Peisert, Jonathan Margulies, Closing the Gap on Securing Energy Sector Control Systems [Guest editors' introduction], IEEE Security and Privacy, Pages: 13-14 November 2014, doi: 10.1109/MSP.2014.110

Sean Peisert, Jonathan Margulies, Eric Byres, Paul Dorey, Dale Peterson, Zach Tudor, Control System Security from the Front Lines (Roundtable), IEEE Security and Privacy, Pages: 55-58 November 2014, doi: 10.1109/MSP.2014.112

Chuck McParland, Sean Peisert, Anna Scaglione, "Monitoring Security of Networked Control Systems: It's the Physics", IEEE Security and Privacy, November 2014, 12(6):32-39, doi: 10.1109/MSP.2014.122

David H. Bailey, Jonathan M. Borwein, "Computation and theory of extended Mordell-Tornheim-Witten sums II", Journal of Approximation Theory, October 30, 2014,

R.H. Cohen, M. Dorf, M. Dorr, D.D. Ryutov, P. Schwartz, Plans for Extending COGENT to Model Snowflake Divertors, APS-DPP Meeting, New Orleans LA, October 27, 2014,

Protonu Basu, Samuel Williams, Brian Van Straalen, Leonid Oliker, Mary Hall, "Converting Stencils to Accumulations for Communication-Avoiding Optimization in Geometric Multigrid", Workshop on Stencil Computations (WOSC), October 2014,

Field, C. B., V. R. Barros, M. D. Mastrandrea, K. J. Mach, et alii, "Summary for Policymakers", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by C. B. Field, V. R. Barros, et alii, (Cambridge University Press: 2014) Pages: 1-32

Field, C. B., V. R. Barros, K. J. Mach, M. D. Mastrandrea, et alii, "Technical Summary", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by C. B. Field, V. R. Barros, et alii, (Cambridge University Press: 2014) Pages: 35-94

Cramer, W., G. W. Yohe, M. Auffhammer, C. Huggel, U. Molau, M. A. F. da Silva Dias, A. Solow, D. A. Stone, L. Tibig, et alii, "Detection and attribution of observed impacts", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by C. B. Field, V. R. Barros, et alii, (Cambridge University Press: 2014) Pages: 979-1037

Niang, I., O. C. Ruppel, M. A. Abdrabo, A. Essel, C. Lennard, J. Padgham, P. Urquhart, et alii, "Africa", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part B: Regional Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by V. R. Barros, C. B. Field et alii, (Cambridge University Press: 2014) Pages: 1199-1265

Hijioka, Y., E. Lin, J. J. Pereira, R. T. Corlett, X. Cui, G. E. Insarov, R. D. Lasco, E. Lindgren, A. Surjan, et alii, "Asia", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part B: Regional Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by V. R. Barros, C. B. Field et alii, (Cambridge University Press: 2014) Pages: 1327-1370

Hoegh-Guldberg, O., R. Cai, E. S. Poloczanska, P. G. Brewer, S. Sundby, K. Hilmi, V. J. Fabry, S. Jung, et alii, "The Ocean", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part B: Regional Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by V. R. Barros, C. B. Field, et alii, (Cambridge University Press: 2014) Pages: 1655-1731

Diffenbaugh, N. S., D. A. Stone, P. Thorne, F. Giorgi, B. C. Hewitson, R. G. Jones, and G. J. van Oldenborgh, "Regional Climate Summary Figures", Climate Change 2014: Impacts, Adaptation, and Vulnerability. Part A: Global and Sectoral Aspects. Contribution of Working Group II to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by C. B. Field, V. R. Barros, (Cambridge University Press: 2014) Pages: 137-141

J.A. Sobota, S.-L. Yang, D. Leuenberger, A.F. Kemper, J.G. Analytis, I.R. Fisher, P.S. Kirchmann, T.P. Devereaux, Z.-X. Shen, "Distinguishing Bulk and Surface Electron-Phonon Coupling in a Photoexcited Topological Insulator", Phys. Rev. Lett. 113, 157401 (2014), October 10, 2014,

Hongzhang Shan, Amir Kamil, Samuel Williams, Yili Zheng, Katherine Yelick, "Evaluation of PGAS Communication Paradigms with Geometric Multigrid", 8th International Conference on Partitioned Global Address Space Programming Models (PGAS), October 2014, doi: 10.1145/2676870.2676874

Vivek Kumar, Yili Zheng, Vincent Cavé, Zoran Budimlic, Vivek Sarkar, "HabaneroUPC++: a Compiler-free PGAS Library", 8th International Conference on Partitioned Global Address Space Programming Models (PGAS), October 2014,

Sisi Duan, Karl Levitt, Hein Meling, Sean Peisert, Haibin Zhang, "Byzantine Fault Tolerance from Intrusion Detection", Proceedings of the 33rd IEEE International Symposium on Reliable Distributed Systems (SRDS), Nara, Japan, October 2014, 253-264, doi: 10.1109/SRDS.2014.28

David H. Bailey and Jonathan M. Borwein, "Opportunities and challenges in experimental mathematics", SIAM News, October 1, 2014, to appear,

J. Autschbach, N. Govind, R. Atta-Fynn, E.J. Bylaska, J.H. Weare, W.A. de Jong, "Computational tools for predictive modeling of properties in complex actinide systems", Computational Methods in Lanthanide and Actinide Chemistry, pp. 299-342, ed. M. Dolg, Wiley-Blackwell, (October 1, 2014)

Abhinav Sarje, Xiaoye S Li, Alexander Hexemer, "Tuning HipGISAXS on Multi and Many Core Supercomputers", High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, Denver, CO, Springer International Publishing, 2014, 8551:217-238, doi: 10.1007/978-3-319-10214-6_11

With the continual development of multi and many-core architectures, there is a constant need for architecture-specific tuning of application-codes in order to realize high computational performance and energy efficiency, closer to the theoretical peaks of these architectures. In this paper, we present optimization and tuning of HipGISAXS, a parallel X-ray scattering simulation code [9], on various massively-parallel state-of-the-art supercomputers based on multi and many-core processors. In particular, we target clusters of general-purpose multi-cores such as Intel Sandy Bridge and AMD Magny Cours, and many-core accelerators like Nvidia Kepler GPUs and Intel Xeon Phi coprocessors. We present both high-level algorithmic and low-level architecture-aware optimization and tuning methodologies on these platforms. We cover a detailed performance study of our codes on single and multiple nodes of several current top-ranking supercomputers. Additionally, we implement autotuning of many of the algorithmic and optimization parameters for dynamic selection of their optimal values to ensure high-performance and high-efficiency.

Hongzhang Shan, Samuel Williams, Wibe de Jong, Leonid Oliker, "Thread-Level Parallelization and Optimization of NWChem for the Intel MIC Architecture", LBNL Technical Report, October 2014, LBNL 6806E,

David H. Bailey and Marcos Lopez de Prado, "The deflated Sharpe ratio: Correcting for selection bias, backtest overfitting and non-normality", Journal of Portfolio Management, October 1, 2014, 40:94-107,

R.H. Cohen, M. Dorf, M. Dorr, D.D. Ryutov, P. Schwartz, "Plans for Extending COGENT to Model Snowflake Divertors", ESL Team Meeting, GA, September 30, 2014,

Sean Peisert, Security for Computational Infrastructure for Financial Technology, DataLead 2014: Leading the Way in Big Data, Haas School of Business, UC Berkeley, September 30, 2014,

Funk, C., A. Hoell, D. Stone, "Examining the contribution of the observed global warming trend to the California droughts of 2012/2013 and 2013/2014", Bulletin of the American Meteorological Society, 2014, 95:S11-S15,

Kshitij Khare, Sang-Yun Oh, Bala Rajaratnam, "A convex pseudo-likelihood framework for high dimensional partial correlation estimation with convergence guarantees", Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2014,

Daniel Martin, Xylar Asay-Davis, Stephen Price, Stephen Cornford, Esmond Ng, William Collins, Response of the Antarctic ice sheet to ocean forcing using the POPSICLES coupled ice sheet - ocean model, Twenty-first Annual WAIS Workshop, September 25, 2014,

George Michelogiannakis, John Shalf, "Variable-Width Datapath for On-Chip Network Static Power Reduction", 8th International Symposium on Networks-on-Chip (NOCS), September 2014,

  • Download File: abn.pdf (pdf: 277 KB)

George Michelogiannakis, John Shalf, Variable-Width Datapath for On-Chip Network Static Power Reduction, 8th International Symposium on Networks-on-Chip, September 2014,

John Wu, Alex Sim, Lingfei Wu, Abraham Frankl, Scott Klasky, Jong Y Choi, CS Chang, Michael Churchill, "Exercising ICEE Framework with Fusion Blob Detection", DOE/ASCR NGNS PI meeting, 2014,

Abhinav Sarje, Xiaoye Li, Alexander Hexemer, High-Performance Inverse Modeling with Reverse Monte Carlo Simulations, International Conference on Parallel Processing (ICPP), September 2014,

Abhinav Sarje, Xiaoye S Li, Alexander Hexemer, "High-Performance Inverse Modeling with Reverse Monte Carlo Simulations", 43rd International Conference on Parallel Processing, Minneapolis, MN, IEEE, September 2014, 201-210, doi: 10.1109/ICPP.2014.29

In the field of nanoparticle material science, X-ray scattering techniques are widely used for characterization of macromolecules and particle systems (ordered, partially-ordered or custom) based on their structural properties at the micro- and nano-scales. Numerous applications utilize these, including design and fabrication of energy-relevant nanodevices such as photovoltaic and energy storage devices. Due to its size, analysis of raw data obtained through present ultra-fast light beamlines and X-ray scattering detectors has been a primary bottleneck in such characterization processes. To address this hurdle, we are developing high-performance parallel algorithms and codes for analysis of X-ray scattering data for several of the scattering methods, such as the Small Angle X-ray Scattering (SAXS), which we talk about in this paper. As an inverse modeling problem, structural fitting of the raw data obtained through SAXS experiments is a method used for extracting meaningful information on the structural properties of materials. Such fitting processes involve a large number of variable parameters and, hence, require a large amount of computational power. In this paper, we focus on this problem and present a high-performance and scalable parallel solution based on the Reverse Monte Carlo simulation algorithm, on highly-parallel systems such as clusters of multicore CPUs and graphics processors. We have implemented and optimized our algorithm on generic multi-core CPUs as well as the Nvidia GPU architectures with C++ and CUDA. We also present detailed performance results and computational analysis of our code.

Wen Shen, A.F. Kemper, T.P. Devereaux, J.K. Freericks, "Exact solution for high harmonic generation and the response to an AC driving field for a charge density wave insulator", Phys. Rev. B 90, 115113 (2014), September 5, 2014,

Adam Lugowski, Shoaib Kamil, Aydın Buluç, Samuel Williams, Erika Duriakova, Leonid Oliker, Armando Fox, John R. Gilbert, "Parallel processing of filtered queries in attributed semantic graphs", Journal of Parallel and Distributed Computing (JPDC), September 2014, doi: 10.1016/j.jpdc.2014.08.010

Sean Peisert, Jonathan Margulies, David M. Nicol, Himanshu Khurana, Chris Sawall, Designed-in Security for Cyber-Physical Systems (Roundtable), IEEE Security and Privacy, Pages: 9-12 September 2014, doi: 10.1109/MSP.2014.90

David H. Bailey and Jonathan M. Borwein and Olga Caprotti and Ursula Martin and Bruno Salvy and Michela Taufer, "Opportunities and challenges in 21st century mathematical computation: ICERM workshop report", September 1, 2014,

Wenqi Xia, Wei Hu, Zhenyu Li and Jinlong Yang, "A first-principles study of gas adsorption on germanene", Phys. Chem. Chem. Phys., 2014, 16, 22495-22498, August 29, 2014, doi: 10.1039/C4CP03292F

The adsorption of common gas molecules (N2, CO, CO2, H2O, NH3, NO, NO2, and O2) on germanene is studied with density functional theory. The results show that N2, CO, CO2, and H2O are physisorbed on germanene via van der Waals interactions, while NH3, NO, NO2, and O2 are chemisorbed on germanene via strong covalent (Ge–N or Ge–O) bonds. The chemisorption of gas molecules on germanene opens a band gap at the Dirac point of germanene. NO2 chemisorption on germanene shows strong hole doping in germanene. O2 is easily dissociated on germanene at room temperature. Different adsorption behaviors of common gas molecules on germanene provide a feasible way to exploit chemically modified germanene.

Didem Unat, George Michelogiannakis, John Shalf, The Role of Modeling in Locality Optimizations, Modeling and simulation workshop (MODSIM), August 2014,

N. Hanford, V. Ahuja, M. Farrens, D. Ghosal, M. Balman, E. Pouyoul, B. Tierney, "Analysis of the effect of core affinity on high-throughput flows", NDM'14, ACM, 2014, doi: 10.1109/NDM.2014.10

Network throughput is scaling-up to higher data rates while end-system processors are scaling-out to multiple cores. In order to optimize high speed data transfer into multicore end-systems, techniques such as network adapter offloads and performance tuning have received a great deal of attention. Furthermore, several methods of multithreading the network receive process have been proposed. However, thus far attention has been focused on how to set the tuning parameters and which offloads to select for higher performance, and little has been done to understand why the settings do (or do not) work. In this paper we build on previous research to track down the source(s) of the end-system bottleneck for high-speed TCP flows. For the purposes of this paper, we consider protocol processing efficiency to be the amount of system resources used (such as CPU and cache) per unit of achieved throughput (in Gbps). The amounts of various system resources consumed are measured using low-level system event counters. Affinitization, or core binding, is the decision about which processor cores on an end system are responsible for interrupt, network, and application processing. We conclude that affinitization has a significant impact on protocol processing efficiency, and that the performance bottleneck of the network receive process changes drastically with three distinct affinitization scenarios.
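
The core binding described in this abstract can be illustrated at the application level. The following is a minimal sketch, not the authors' method: it assumes a Linux system where Python exposes `os.sched_setaffinity`, and the helper name `run_pinned` is hypothetical. It pins only the calling process; interrupt and network (softirq) affinity are configured elsewhere in the operating system and are not touched here.

```python
import os


def run_pinned(func, core=None):
    """Run func() with the calling process bound to a single CPU core.

    Hypothetical helper for illustration only. On platforms without the
    Linux affinity API, func() simply runs unpinned.
    """
    if not hasattr(os, "sched_setaffinity"):
        return func()  # no affinity API available; run as-is

    original = os.sched_getaffinity(0)  # cores this process may use now
    target = min(original) if core is None else core
    os.sched_setaffinity(0, {target})  # bind the process to one core
    try:
        return func()
    finally:
        os.sched_setaffinity(0, original)  # restore the previous mask


if __name__ == "__main__":
    # Example workload: a small CPU-bound sum executed while pinned.
    print(run_pinned(lambda: sum(range(1_000_000))))
```

Comparing the same workload under different `core` choices (for example, the core that handles the network interrupts versus a different one) is one way to explore the kind of scenarios the abstract refers to, under the assumptions stated above.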

N. Hanford, V. Ahuja, M. Farrens, D. Ghosal, M. Balman, E. Pouyoul, B. Tierney, "Impact of the end-system and affinities on the throughput of high-speed flows", ANCS '14: Proceedings of the tenth ACM/IEEE symposium on Architectures for networking and communications systems, ACM, 2014, doi: 10.1145/2658260.2661772

Network throughput is scaling-up to higher data rates while processors are scaling-out to multiple cores. In order to optimize high speed data transfer into multicore end-systems, network adapter offloads and performance tuning have received a great deal of attention. However, much of this attention is focused on how to set the tuning parameters and which offloads to select for higher performance and not why they do (or do not) work. In this study we have attempted to address two issues that impact the data transfer performance. The first is the impact of the processor core affinity (or core binding), which determines the choice of which processor core or cores handle certain tasks in a network- or I/O-heavy application running on a multicore end-system. The second is the impact of Ethernet pause frames, which provide link layer flow control in addition to the end-to-end flow control provided by TCP. The goal of our research is to delve deeper into why these tuning suggestions and this offload exist, and how they affect the end-to-end performance and efficiency of a single, large TCP flow.

A.F. Kemper, M.A. Sentef, B. Moritz, J.K. Freericks, T.P. Devereaux, "Effect of dynamical spectral weight distribution on effective interactions in time-resolved spectroscopy", Phys. Rev. B 90, 075126 (2014), August 1, 2014,

Schizophrenia Working Group of the Psychiatric Genomics Consortium, "Biological insights from 108 schizophrenia-associated genetic loci", July 24, 2014, 511:421-427, doi: 10.1038/nature13595

Folland, C., D. Stone, C. Frederiksen, D. Karoly, and J. Kinter, "The International CLIVAR Climate of the 20th Century Plus (C20C+) Project: Report of the Sixth Workshop", CLIVAR Exchanges, 2014, 19:57-59,

W.A. de Jong, L. Lin, H. Shan, C. Yang and L. Oliker, "Towards modelling complex mesoscale molecular environments", International Conference on Computational and Mathematical Methods in Science and Engineering (CMMSE), 2014,

Bin Dong, Xiuqiao Li, Limin Xiao, Li Ruan, "Towards minimizing disk I/O contention: A partitioned file assignment approach", Future Generation Computer Systems, Volume 37, July 2014, Pages 178-190, 2014,

Mark Adams, Jed Brown, Matt Knepley, Ravi Samtaney, "Segmental Refinement: A Multigrid Technique for Data Locality", Submitted to SISC, June 30, 2014,

Babak Behzad, Surendra Byna, Stefan M. Wild, Mr. Prabhat, Marc Snir, "Improving Parallel I/O Autotuning with Performance Modeling", ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2014), New York, NY, USA, ACM, 2014, 253-256, doi: 10.1145/2600212.2600708

Masood Parvania, Georgia Koutsandria, Vishak Muthukumar, Sean Peisert, Chuck McParland, Anna Scaglione, "Hybrid Control Network Intrusion Detection Systems for Automated Power Distribution Systems", Proceedings of the 1st International Workshop on Trustworthiness of Smart Grids (ToSG), Atlanta, GA, IEEE Computer Society, June 23, 2014, 774-779, doi: 10.1109/DSN.2014.81

Spyros Blanas, Kesheng Wu, Surendra Byna, Bin Dong, Arie Shoshani, "Parallel data analysis directly on scientific file formats", Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD '14), June 23, 2014, doi: 10.1145/2588555.2612185

George Michelogiannakis, Collective Memory Transfers for Multi-Core Chips, International Conference on Supercomputing (ICS), June 2014,

Amir Kamil, Yili Zheng, Katherine Yelick, "A Local-View Array Library for Partitioned Global Address Space C++ Programs", ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming, June 2014,

Multidimensional arrays are an important data structure in many scientific applications. Unfortunately, built-in support for such arrays is inadequate in C++, particularly in the distributed setting where bulk communication operations are required for good performance. In this paper, we present a multidimensional library for partitioned global address space (PGAS) programs, supporting the one-sided remote access and bulk operations of the PGAS model. The library is based on Titanium arrays, which have proven to provide good productivity and performance. These arrays provide a local view of data, where each rank constructs its own portion of a global data structure, matching the local view of execution common to PGAS programs and providing maximum flexibility in structuring global data. Unlike Titanium, which has its own compiler with array-specific analyses, optimizations, and code generation, we implement multidimensional arrays solely through a C++ library. The main goal of this effort is to provide a library-based implementation that can match the productivity and performance of a compiler-based approach. We implement the array library as an extension to UPC++, a C++ library for PGAS programs, and we extend Titanium arrays with specializations to improve performance. We evaluate the array library by porting four Titanium benchmarks to UPC++, demonstrating that it can achieve up to 25% better performance than Titanium without a significant increase in programmer effort.

George Michelogiannakis, Alexander Williams, Samuel Williams, John Shalf, "Collective Memory Transfers for Multi-Core Chips", International Conference on Supercomputing (ICS), June 2014, doi: 10.1145/2597652.2597654

Gunther H. Weber, Hans Johansen, Daniel T. Graves, Terry J. Ligocki, "Simulating Urban Environments for Energy Analysis", Proceedings Visualization in Environmental Sciences (EnvirVis), 2014, LBNL 6652E,

Pieter Ghysels, Xiaoye S. Li, Artem Napov, François-Henry Rouet, Jianlin Xia, Hierarchically Low-Rank Structured Sparse Factorization with Reduced Communication and Synchronization, Householder Symposium XIX, June 2014,

Pieter Ghysels, Wim Vanroose, Karl Meerbergen, High Performance Implementation of Deflated Preconditioned Conjugate Gradients with Approximate Eigenvectors, Householder Symposium XIX, June 8-13, Spa, Belgium, Pages: 84, June 2014,

Sergi Molins, David Trebotich, Li Yang, Jonathan B. Ajo-Franklin, Terry J. Ligocki, Chaopeng Shen and Carl Steefel, "Pore-Scale Controls on Calcite Dissolution Rates from Flow-through Laboratory and Numerical Experiments", Environmental Science and Technology, May 27, 2014, 48:7453-7460, doi: 10.1021/es5013438

Tiancheng Chang, Sisi Duan, Hein Meling, Sean Peisert, Haibin Zhang, "P2S: A Fault-Tolerant Publish/Subscribe Infrastructure", Proceedings of the 8th ACM International Conference on Distributed Event Based Systems (DEBS), Mumbai, India, ACM Press, May 2014, 189-197, doi: 10.1145/2611286.2611305

Yili Zheng, Amir Kamil, Michael B. Driscoll, Hongzhang Shan, Katherine Yelick, "UPC++: A PGAS Extension for C++", International Parallel and Distributed Processing Symposium (IPDPS), May 2014,

Matt Bishop, Heather Conboy, Huong Phan, Borislava I. Simidchieva, George Avrunin, Lori Clarke, Lee Osterweil, Sean Peisert, "Insider Detection by Process Analysis", Proceedings of the 2014 Workshop on Research for Insider Threat (WRIT), IEEE Computer Society Security and Privacy Workshops, San Jose, CA, IEEE Computer Society, May 18, 2014, doi: 10.1109/SPW.2014.40

Sean Peisert, Challenges in Insider Threat Research, Workshop on Research for Insider Threat (WRIT), IEEE Security and Privacy Workshops (SPW), May 18, 2014,

Robert Saye, "High-order methods for computing distances to implicitly defined surfaces", Communications in Applied Mathematics and Computational Science, May 17, 2014, doi: 10.2140/camcos.2014.9.107

Dáithí Stone, Michael Wehner, Shreyas Cholia, Harinarayan Krishnan, Piotr Wolski, Mark Tadross, Chris Folland, Nikos Christidis, Hideo Shiogama, "The C20C+ Detection and Attribution Project", Integrated Climate Modeling Principal Investigator Meeting 2014, 2014,

Mark F. Adams, Jed Brown, John Shalf, Brian Van Straalen, Erich Strohmaier, Samuel Williams, "HPGMG 1.0: A Benchmark for Ranking High Performance Computing Systems", LBNL Technical Report, 2014, LBNL 6630E,

Peter G. Neumann, Sean Peisert, Marvin Schaefer, "The IEEE Symposium on Security and Privacy, in Retrospect", IEEE Security and Privacy, May 2014, 12(3):15-17, doi: 10.1109/MSP.2014.59

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, Qiji Jim Zhu, "Pseudo-mathematics and financial charlatanism: The effects of backtest overfitting on out-of-sample performance", Notices of the American Mathematical Society, May 1, 2014, 458-471,

Recent computational advances allow investment managers to search for profitable investment strategies. In many instances, that search involves a pseudo-mathematical argument, which is spuriously validated through a simulation of its historical performance (also called backtest).

We prove that high performance is easily achievable after backtesting a relatively small number of alternative strategy configurations, a practice we denote “backtest overfitting”. The higher the number of configurations tried, the greater is the probability that the backtest is overfit. Because financial analysts rarely report the number of configurations tried for a given backtest, investors cannot evaluate the degree of overfitting in most investment proposals.

The implication is that investors can be easily misled into allocating capital to strategies that appear to be mathematically sound and empirically supported by an outstanding backtest. This practice is particularly pernicious, because due to the nature of financial time series, backtest overfitting has a detrimental effect on the future strategy’s performance.

Massively-Parallel Simulations Verify Carbon Dioxide Sequestration Experiments, FY15 DOE ASCR Budget Request to Congress, May 1, 2014,

H. M. Aktulga, A. Buluc, S. Williams, C. Yang, "Optimizing Sparse Matrix-Multiple Vector Multiplication for Nuclear Configuration Interaction Calculations", International Parallel and Distributed Processing Symposium (IPDPS 2014), May 2014, doi: 10.1109/IPDPS.2014.125

Adrian Tate, Amir Kamil, Anshu Dubey, Armin Größlinger, Brad Chamberlain, Brice Goglin, Carter Edwards, Chris J. Newburn, David Padua, Didem Unat, Emmanuel Jeannot, Frank Hannig, Gysi Tobias, Hatem Ltaief, James Sexton, Jesus Labarta, John Shalf, Karl Fuerlinger, Kathryn O’Brien, Leonidas Linardakis, Maciej Besta, Marie-Christine Sawley, Mark Abraham, Mauro Bianco, Miquel Pericàs, Naoya Maruyama, Paul Kelly, Peter Messmer, Robert B. Ross, Romain Cledat, Satoshi Matsuoka, Thomas Schulthess, Torsten Hoefler, Vitus Leung, "Programming Abstractions for Data Locality", 2014 Workshop on Programming Abstractions for Data Locality, April 29, 2014,

The goal of the workshop and this report is to identify common themes and standardize concepts for locality-preserving abstractions for exascale programming models. Current software tools are built on the premise that computing is the most expensive component, but we are rapidly moving to an era in which computing is cheap and massively parallel while data movement dominates energy and performance costs. In order to respond to exascale systems (the next generation of high performance computing systems), the scientific computing community needs to refactor its applications to align with the emerging data-centric paradigm. Our applications must evolve to express information about data locality. Unfortunately, current programming environments offer few ways to do so. They ignore the incurred cost of communication and simply rely on hardware cache coherency to virtualize data movement. With the increasing importance of task-level parallelism on future systems, task models have to support constructs that express data locality and affinity. At the system level, communication libraries implicitly assume all the processing elements are equidistant from each other. In order to take advantage of emerging technologies, application developers need a set of programming abstractions to describe data locality for the new computing ecosystem. The new programming paradigm should be more data centric and should allow programmers to describe how to decompose data and how to lay it out in memory.

Fortunately, there are many emerging concepts such as constructs for tiling, data layout, array views, task and thread affinity, and topology aware communication libraries for managing data locality. There is an opportunity to identify commonalities in strategy to enable us to combine the best of these concepts to develop a comprehensive approach to expressing and managing data locality on exascale programming systems. These programming model abstractions can expose crucial information about data locality to the compiler and runtime system to enable performance-portable code. The research question is to identify the right level of abstraction, which includes techniques that range from template libraries all the way to completely new languages to achieve this goal.

Amir Kamil, Managing Hierarchy with Teams in the SPMD Programming Model, Workshop on Programming Abstractions for Data Locality (PADAL'14), April 28, 2014,

The single program, multiple data (SPMD) model of parallelism is the dominant programming model for large-scale distributed-memory machines. Its simple structure maps well to such machines: it exposes the actual degree of available parallelism, leads to good locality, and can be implemented by efficient runtime systems. However, its simplicity also makes it difficult to manage hierarchy, both at the algorithmic level (e.g. divide-and-conquer algorithms) and in addressing the communication characteristics of hierarchical machines. In this talk, we present a hierarchical team mechanism that allows SPMD programs to manage hierarchy. We show that it allows divide-and-conquer algorithms such as sorting to be expressed in SPMD and that it enables optimizations for hierarchical machines, increasing the scalability and/or performance of multiple benchmarks. We also explore how hierarchical teams may prove useful in other programming abstractions, such as expressing hierarchical distribution of data.

US Patent 8,705,342 B2. “Co-scheduling of network resource provisioning and host-to-host bandwidth reservation on high-performance network and storage systems”, D. Yu, D. Katramatos, A. Sim, and A. Shoshani, Apr. 22, 2014, prior publication No. US 2012/0268053 A1 issued on Oct. 25, 2012, provisional application No. 61/393,750, filed on Oct. 15, 2010, LBNL IB-3152, BNL BSA 11-02.

Khaled Ibrahim, Paul Hargrove, Costin Iancu, Katherine Yelick, "A Performance Evaluation of One-Sided and Two-Sided Communication Paradigms on Relaxed-Ordering Interconnect", IPDPS 2014, April 17, 2014,

Khaled Z. Ibrahim, Steven Hofmeyr, Costin Iancu, "The Case for Partitioning Virtual Machines on Manycore Architectures", IEEE TPDS, April 17, 2014,

Andrew Myers, Richard Klein, Mark Krumholz, Christopher McKee, "Star cluster formation in turbulent, magnetized dense clumps with radiative and outflow feedback", Monthly Notices of the Royal Astronomical Society, Volume 439, Issue 4, p.3420-3438, April 1, 2014,

E. Vecharynski and Y. Saad, "Fast updating algorithms for latent semantic indexing", SIAM Journal on Matrix Analysis and Applications, Vol. 35, Issue 3, pp. 1105–1131, 2014,

This paper discusses a few algorithms for updating the approximate singular value decomposition (SVD) in the context of information retrieval by latent semantic indexing (LSI) methods. A unifying framework is considered which is based on Rayleigh–Ritz projection methods. First, a Rayleigh–Ritz approach for the SVD is discussed and it is then used to interpret the Zha and Simon algorithms [SIAM J. Sci. Comput., 21 (1999), pp. 782–791]. This viewpoint leads to a few alternatives whose goal is to reduce computational cost and storage requirement by projection techniques that utilize subspaces of much smaller dimension. Numerical experiments show that the proposed algorithms yield accuracies comparable to those obtained from standard ones at a much lower computational cost. 

Abhinav Sarje, Towards Real-Time Nanostructure Prediction with GPUs, GPU Technology Conference, March 2014,

In the field of nanoparticle materials science, synchrotron light sources play a crucial role: X-ray scattering techniques are used for nanostructure prediction through characterization of macromolecules and nanoparticle systems based on their structural properties. Applications of these are widespread, including artificial photosynthesis, solar cell membranes, photovoltaics and energy storage devices, smart windows, high-density data storage media and drug discovery. Current state-of-the-art high-throughput beamlines at light sources worldwide are capable of generating terabytes of raw scattering data per week, and this volume is continually growing. This has created a big gap between data generation and data analysis. Consequently, beamline scientists and users have been faced with an extremely inefficient utilization of the light sources, and they are expressing a growing need for real-time data analysis tools to bridge this gap.

X-ray scattering comes in many flavors, such as the widely used small angle X-ray scattering (SAXS) and grazing incidence SAXS (GISAXS), which will be the case studies in this session. Efforts are underway at Berkeley Lab to bring scattering data analysis up to the speed of data generation through high-performance and parallel computing. Such analysis is generally composed of two steps: 1) Forward Simulation, and 2) Structural Fitting, which uses forward simulation as a building block. Forward simulation of X-ray scattering experiments is an embarrassingly parallel computational problem, making it an ideal candidate for implementation on many-core architectures such as graphics processors and massively parallel computing. An example of such a simulation code developed under these efforts is HipGISAXS, which is a high-performance and massively parallel code capable of harnessing the computational power offered by clusters of GPUs. HipGISAXS is a step towards real-time scattering data analysis as it has already brought simulation times down to the order of milliseconds and seconds from hours and days through the power of GPUs. The second component, structural fitting, can be described as an inverse modeling and computational optimization problem, involving a large number of variable parameters, making it highly compute-intensive. An example of inverse modeling code, also developed at Berkeley Lab, is HipRMC. It is an implementation of Reverse Monte Carlo (RMC), a popular method to extract information from SAXS data, and it utilizes GPU computing to provide fast results.

Although GPUs are able to deliver high computational power through naive implementations, they require intensive architecture-aware code tuning in order to attain performance rates closer to their theoretical peaks. Such optimizations involve mapping computations and data transfers precisely onto the architecture. HipGISAXS and HipRMC include optimizations which enable them to perform significantly better on GPUs than on other processor architectures.

O. Angélil, D. A. Stone, M. Tadross, F. Tummon, M. Wehner, R. Knutti, "Attribution of extreme weather to anthropogenic greenhouse gas emissions: sensitivity to spatial and temporal scales", Geophysical Research Letters, 2014, 41:2150-2155, doi: 10.1002/2014GL059234

Abhinav Sarje, Xiaoye Li, Slim Chourou, Alexander Hexemer, "Petascale X-Ray Scattering Simulations With GPUs", GPU Technology Conference, March 2014,

Abhinav Sarje, Xiaoye Li, Alexander Hexemer, "Inverse Modeling of X-Ray Scattering Data With Reverse Monte Carlo Simulations", GPU Technology Conference, March 2014,

Sisi Duan, Sean Peisert, and Karl Levitt, "hBFT: Speculative Byzantine Fault Tolerance With Minimum Cost", IEEE Transactions on Dependable and Secure Computing (TDSC), March 19, 2014, 12(1):58-70, doi: 10.1109/TDSC.2014.2312331

Richard L. Martin, Cory M. Simon, Berend Smit, Maciej Haranczyk, "In-silico design of porous polymer networks: high-throughput screening for methane storage materials", Journal of the American Chemical Society, March 10, 2014,

Porous polymer networks (PPNs) are a class of advanced porous materials that combine the advantages of cheap and stable polymers with the high surface areas and tunable chemistry of metal-organic frameworks. They are of particular interest for gas separation or storage applications, for instance as methane adsorbents for a vehicular natural gas tank or other portable applications.

D. Oryspayev, P. Maris, J. P. Vary, M. Sosonkina, H. M. Aktulga, "A Memory-efficient Scalable Symmetric Sparse Matrix Vector Multiplication Algorithm for Distributed Memory Architectures", Concurrency and Computation: Practice and Experience, in preparation, March 8, 2014,

S. B. Kylasa, H. M. Aktulga, A. Y. Grama, "PG-PuReMD: A Parallel-GPU Reactive Molecular Dynamics Package", Computer Physics Communications, in preparation, March 7, 2014,

Richard L. Martin, Maciej Haranczyk, "Construction and Characterization of Structure Models of Crystalline Porous Polymers", Crystal Growth & Design, March 6, 2014,

Metal-organic frameworks (MOFs) and covalent organic frameworks (COFs) are examples of advanced porous polymeric materials that have emerged in recent years. Their crystalline structure and modular synthesis offer unmatched versatility in their design. By exchanging chemical building blocks, one can both explore the unlimited space of possible structural chemistry within an isoreticular (same crystal topology) series, as well as achieve a wide range of alternative topologies.

S. B. Kylasa, H. M. Aktulga, A. Y. Grama, "PuReMD-GPU: A Reactive Molecular Dynamics Simulation Package for GPUs", Journal of Computational Physics, under revision, March 5, 2014,

David H. Bailey, Jonathan M. Borwein, "Pi day is upon us again, and we still do not know if pi is normal", American Mathematical Monthly, March 1, 2014, 191-206,

David H. Bailey, Jonathan M. Borwein, "High-precision arithmetic: Progress and challenges", March 1, 2014,

Xiaoye S. Li, Artem Napov, Francois-Henry Rouet, Designing multifrontal solvers using hierarchically semiseparable structures, SIAM Conference on Parallel Processing for Scientific Computing (PP14), Portland, OR, USA, February 2014,

A. L. Chervenak, A. Sim, J. Gu, R. Schuler, N. Hirpathak, "Efficient Data Staging Using Performance-Based Adaptation and Policy-Based Resource Allocation", 22nd Euromicro International Conference on Parallel, Distributed and Network-based Processing, 2014,

E. Saule, H. M. Aktulga, C. Yang, E. G. Ng, U. V. Catalyurek, "An Out-of-core Task-based Middleware for Data Intensive Scientific Computing", Handbook on Data Centers, in press, (Springer: February 1, 2014)

Lev Sarkisov, Richard L. Martin, Maciej Haranczyk, Berend Smit, "On the Flexibility of Metal-Organic Frameworks", Journal of the American Chemical Society, January 24, 2014,

Occasional, large amplitude flexibility in metal-organic frameworks (MOFs) is one of the most intriguing recent discoveries in chemistry and material science. Yet, there is at present no theoretical framework that permits the identification of flexible structures in the rapidly expanding universe of MOFs. Here, we propose a simple method to predict whether a MOF is flexible, based on treating it as a system of rigid elements, connected by hinges. This proposition is correct in application to MOFs based on rigid carboxylate linkers.

Wei Hu, Nan Xia, Xiaojun Wu, Zhenyu Li and Jinlong Yang, "Silicene as a highly sensitive molecule sensor for NH3, NO and NO2", Phys. Chem. Chem. Phys., 2014, 16, 6957-6962, January 23, 2014, doi: 10.1039/C3CP55250K

On the basis of first-principles calculations, we demonstrate the potential application of silicene as a highly sensitive molecule sensor for NH3, NO, and NO2 molecules. NH3, NO and NO2 molecules chemically adsorb on silicene via strong chemical bonds. With distinct charge transfer from silicene to molecules, silicene and chemisorbed molecules form charge-transfer complexes. The adsorption energy and charge transfer in NO2-adsorbed silicene are larger than those of NH3- and NO-adsorbed silicene. Depending on the adsorbate types and concentrations, the silicene-based charge-transfer complexes exhibit versatile electronic properties with tunable band gap opening at the Dirac point of silicene. The calculated charge carrier concentrations of NO2-chemisorbed silicene are 3 orders of magnitude larger than the intrinsic charge carrier concentration of graphene at room temperature. The results present a great potential of silicene for application as a highly sensitive molecule sensor.

J.A. Sobota, S.-L. Yang, D. Leuenberger, A.F. Kemper, J.G. Analytis, I.R. Fisher, P.S. Kirchmann, T.P. Devereaux, Z.-X. Shen, "Ultrafast electron dynamics in the topological insulator Bi2Se3 studied by time-resolved photoemission spectroscopy", Journal of Electron Spectroscopy and Related Phenomena, January 22, 2014,

We characterize the topological insulator Bi2Se3 using time- and angle-resolved photoemission spectroscopy. By employing two-photon photoemission, a complete picture of the unoccupied electronic structure from the Fermi level up to the vacuum level is obtained. We demonstrate that the unoccupied states host a second Dirac surface state which can be resonantly excited by 1.5 eV photons. We then study the ultrafast relaxation processes following optical excitation. We find that they culminate in a persistent non-equilibrium population of the first Dirac surface state, which is maintained by a meta-stable population of the bulk conduction band. Finally, we perform a temperature-dependent study of the electron–phonon scattering processes in the conduction band, and find the unexpected result that their rates decrease with increasing sample temperature. We develop a model of phonon emission and absorption from a population of electrons, and show that this counter-intuitive trend is the natural consequence of fundamental electron–phonon scattering processes. This analysis serves as an important reminder that the decay rates extracted by time-resolved photoemission are not in general equal to single electron scattering rates, but include contributions from filling and emptying processes from a continuum of states.

M.A. Sentef, M. Claassen, A.F. Kemper, B. Moritz, T. Oka, J.K. Freericks, T.P. Devereaux, "Theory of pump-probe photoemission in graphene and the generation of light-induced Haldane multilayers", arXiv pre-print, January 20, 2014,

The combination of time-reversal and inversion symmetry protects massless Dirac fermions in graphene and on the surface of topological insulators. In a milestone paper, Haldane envisioned that breaking either or both of these symmetries would open a gap at the Dirac points, allowing one to tune between a trivial insulator and a Chern insulator. While equilibrium band gap engineering has become a major theme since the first synthesis of monolayer graphene, it was only recently proposed that circularly polarized laser light could turn trivial equilibrium bands into topological nonequilibrium bands. Here we observe ultrafast band gap openings and paradoxical gap closings at a critical field strength. Importantly, the gap openings are accompanied by nontrivial changes of the band topology, realizing a photo-induced Haldane multilayer system. We show that pump-probe photoemission spectroscopy can track these transitions in real time via energy gaps exceeding 100 meV. The analogy with Haldane multilayers is revealed by nontrivial pseudospin textures, going from a monolayer p-wave to a bilayer d-wave symmetry at the critical field strength. We thus predict a nonequilibrium realization of a tunable Haldane multilayer model with a Berry curvature that can be tipped optically by small changes in external fields on femtosecond time scales. Since we are focused on the physics of chiral Dirac fermions, these results apply equally to all systems possessing Dirac points, such as surface states of topological insulators.

Marek Pederzoli, Lukáš Sobek, Jiří Brabec, Karol Kowalski, Lukasz Cwiklik, Jiří Pittner, "Fluorescence of PRODAN in water: A computational QM/MM MD study", Chemical Physics Letters, 2014, 597:57-62, doi: 10.1016/j.cplett.2014.02.031

Elif Dede, Zacharia Fadika, Madhusudhan Govindaraju, Lavanya Ramakrishnan, "Benchmarking MapReduce Implementations Under Different Application Scenarios", Future Generation Computer Systems, 2014,

Daniel Kressner, Marija Miloloža Pandur, Meiyue Shao, "An indefinite variant of LOBPCG for definite matrix pencils", Numerical Algorithms, 2014, 66:681-703, doi: 10.1007/s11075-013-9754-3

A. G. Kim, G. Aldering, P. Antilogus, C. Aragon, S., C. Baltay, S. Bongard, C. Buton, A., F. Cellier-Holzem, M. Childress, N., Y. Copin, H. K. Fakhouri, U. Feindt, M., E. Gangler, P. Greskovic, J. Guy, M., S. Lombardo, J. Nordin, P. Nugent, R., E. Pecontal, R. Pereira, S. Perlmutter, D., M. Rigault, K. Runge, C. Saunders, R., G. Smadja, C. Tao, R. C. Thomas, B. A. Weaver, "Type Ia Supernova Hubble Residuals and Host-galaxy Properties", Astrophysical Journal, 2014, 784:51, doi: 10.1088/0004-637X/784/1/51

Jarrod R McClean, Alan Aspuru-Guzik, "Compact wavefunctions from compressed imaginary time evolution", arXiv preprint arXiv:1409.7358, 2014,

Elif Dede, Zacharia Fadika, Madhusudhan Govindaraju, Lavanya Ramakrishnan, "MARIANE: Using MApReduce In HPC Environments", Future Generation Computer Systems, 2014,

Stephane Descombes, Max Duarte, Thierry Dumont, Frederique Laurent, Violaine Louvet, Marc Massot, "Analysis of operator splitting in the non-asymptotic regime for nonlinear reaction-diffusion equations. Application to the dynamics of premixed flames", SIAM J. Num. Anal., 2014, 52:1311-1334,

Meiyue Shao, "On the finite section method for computing exponentials of doubly-infinite skew-Hermitian matrices", Linear Algebra and its Applications, 2014, 451:65-92, doi: 10.1016/j.laa.2014.03.021

Joonsuk Huh, Gian Giacomo Guerreschi, Borja Peropadre, Jarrod R McClean, Alán Aspuru-Guzik, "Boson Sampling for Molecular Vibronic Spectra", arXiv preprint arXiv:1412.8427, 2014,

E. Vecharynski, Y. Saad, and M. Sosonkina, "Graph partitioning using matrix values for preconditioning symmetric positive definite systems", SIAM Journal on Scientific Computing Vol. 36, Issue 1, pp. A63-A87, 2014,

Prior to the parallel solution of a large linear system, it is required to perform a partitioning of its equations/unknowns. Standard partitioning algorithms are designed using the considerations of the efficiency of the parallel matrix-vector multiplication, and typically disregard the information on the coefficients of the matrix. This information, however, may have a significant impact on the quality of the preconditioning procedure used within the chosen iterative scheme. In the present paper, we suggest a spectral partitioning algorithm, which takes into account the information on the matrix coefficients and constructs partitions with respect to the objective of enhancing the quality of the nonoverlapping additive Schwarz (block Jacobi) preconditioning for symmetric positive definite linear systems. For a set of test problems with large variations in magnitudes of matrix coefficients, our numerical experiments demonstrate a noticeable improvement in the convergence of the resulting solution scheme when using the new partitioning approach. 

Meiyue Shao, Weiguo Gao, Jungong Xue, "Aggressively truncated Taylor series method for accurate computation of exponentials of essentially nonnegative matrices", SIAM Journal on Matrix Analysis and Applications, 2014, 35:317-338, doi: 10.1137/120894294

M Scot Breitenfeld, Kalyana Chadalavada, Robert Sisneros, Surendra Byna, Quincey Koziol, Neil Fortner, Prabhat, Venkat Vishwanath, "Recent Progress in Tuning Performance of Large-scale I/O with Parallel HDF5", The 9th Parallel Data Storage Workshop (PDSW) held in conjunction with SC14, 2014,

Anshu Dubey, Ann Almgren, John Bell, Martin Berzins, Steve Brandt, Greg Bryan, Phillip Colella, Daniel Graves, Michael Lijewski, Frank Löffler, et al., "A survey of high level frameworks in block-structured adaptive mesh refinement packages", Journal of Parallel and Distributed Computing, 2014, 74:3217-3227, doi: 10.1016/j.jpdc.2014.07.001

Elif Dede, Bedri Sendir, Pinar Kuzlu, Madhusudhan Govindaraju, Lavanya Ramakrishnan, "A Processing Pipeline for Cassandra Datasets Based on Hadoop Streaming", IEEE International Congress on Big Data, 2014,

H. Braun, W. Schmidt, J.C. Niemeyer, and A.S. Almgren, "Large-eddy simulations of isolated disk galaxies with thermal and turbulent feedback", Monthly Notices of the Royal Astronomical Society, January 2014, 442:3407-3426,

Hsuan-Te Chiu, Jerry Chou, Venkat Vishwanath, Surendra Byna, Kesheng Wu, "Simplifying index file structure to improve I/O performance of parallel indexing", 2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), 2014, 576-583, doi: 10.1109/PADSW.2014.7097856

Tonglin Hawk, Ioan Raicu, Lavanya Ramakrishnan, "Scalable State Management for Scientific Applications in the Cloud", IEEE International Congress on Big Data, 2014,

A. Donev, A. Nonaka, Y. Sun, T. Fai, A. Garcia and J. Bell, "Low Mach Number Fluctuating Hydrodynamics of Diffusively Mixing Fluids", Comm. App. Math. and Comp. Sci., 2014, 9(1),

Ted Habermann, Andrew Collette, Steve Vincena, Jay Jay Billings, Matt Gerring, Konrad Hinsen, Werner Benger, Filipe RNC Maia, Suren Byna, Pierre de Buyl, "The Hierarchical Data Format (HDF): A Foundation for Sustainable Data and Software", 2nd Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), in conjunction with Supercomputing 2014 (SC14), 2014,

Spyros Blanas, Kesheng Wu, Surendra Byna, Bin Dong, Arie Shoshani, "Parallel Data Analysis Directly on Scientific File Formats", SIGMOD 14, 2014, 385-396, doi: 10.1145/2588555.2612185

Lavanya Ramakrishnan, Sarah Poon, Gilberto Z. Pastorello, Daniel Gunter, Valerie Hendrix, Deborah Agarwal, "Scientist-Centered Design for eScience: A Tigres Case Study", IEEE eScience, 2014,

C. M. Malone, M. Zingale, A. Nonaka, A. S. Almgren, and J. B. Bell, "Multidimensional Modeling of Type I X-ray Bursts. II. Two-Dimensional Convection in a Mixed H/He Accretor", Astrophysical Journal, 2014, 788:115,

Surendra Byna, Jialin Liu, Yong Chen, "Model-driven Data Layout Selection for Improving Read Performance", In The Proceedings of The 2014 International Workshop on High Performance Data Intensive Computing (HPDIC2014), in conjunction with the 28th IEEE International Parallel & Distributed Processing Symposium (IPDPS 14), 2014,

P. M. Vreeswijk, S. Savaglio, A. Gal-Yam, A. Cia, R. M. Quimby, M. Sullivan, S. B. Cenko, D. A., A. V. Filippenko, K. I. Clubb, F., J. Sollerman, G. Leloudas, I. Arcavi, A., M. M. Kasliwal, Y. Cao, O. Yaron, D., E. O. Ofek, J. Capone, A. S. Kutyrev, V., P. E. Nugent, R. Laher, J. Surace, S. R. Kulkarni, "The Hydrogen-poor Superluminous Supernova iPTF 13ajg and its Host Galaxy in Absorption and Emission", Astrophysical Journal, 2014, 797:24, doi: 10.1088/0004-637X/797/1/24

Max Duarte, Ann S Almgren, Kaushik Balakrishnan, John B. Bell, David M. Romps, "A Numerical Study of Methods for Moist Atmospheric Flows: Compressible Equations", Monthly Weather Review, 2014, 142:4269-4283,

W. Schmidt, A.S. Almgren, H. Braun, J.F. Engels, J.C. Niemeyer, R.R. Mekuria, A.J. Aspden, J.B. Bell, "Cosmological Fluid Mechanics with Adaptively Refined Large Eddy Simulations", Monthly Notices of the Royal Astronomical Society, 2014, 440:3051-3077,

U Lourderaj, R Sun, SC Kohale, GL Barnes, WA De Jong, TL Windus, WL Hase, "The VENUS/NWChem software package. Tight coupling between chemical dynamics simulations and electronic structure theory", Computer Physics Communications, 2014, 185:1074-1080, doi: 10.1016/j.cpc.2013.11.011

R. A. Scalzo, M. Childress, B. Tucker, F. Yuan, B., P. J. Brown, C. Contreras, N. Morrell, E., C. Burns, M. M. Phillips, A. Campillay, C., K. Krisciunas, M. Stritzinger, M. L., J. Parrent, S. Valenti, C. Lidman, B., N. Scott, M. Fraser, A. Gal-Yam, C., K. Maguire, S. J. Smartt, J. Sollerman, M., F. Taddia, O. Yaron, D. R. Young, S., C. Baltay, N. Ellman, U. Feindt, E., R. McKinnon, P. E. Nugent, D. Rabinowitz, E. S. Walker, "Early ultraviolet emission in the Type Ia supernova LSQ12gdj: No evidence for ongoing shock interaction", Monthly Notices of the RAS, 2014, 445:30-48, doi: 10.1093/mnras/stu1723

J. González-Domínguez, O. Marques, M. J. Martín and J. Touriño, "A 2D Algorithm with Asymmetric Workload for the UPC Conjugate Gradient Method", The Journal of Supercomputing, 2014, 70:816-829,

David H. Bailey, Jonathan M. Borwein, Alexander D. Kaiser, "Automated simplification of large symbolic expressions", Journal of Symbolic Computation, January 1, 2014, 60:120-136,

DP Schissel, Gheni Abla, SM Flanagan, M Greenwald, X Lee, A Romosan, A Shoshani, J Stillerman, J Wright, "Automated metadata, provenance cataloging and navigable interfaces: Ensuring the usefulness of extreme-scale data", Fusion Engineering and Design, North-Holland, 2014,

Anuj Chaudhri, John Bell, Alejandro Garcia, Aleksandar Donev, "Modeling Multi-Phase Flow using Fluctuating Hydrodynamics", Phys. Rev. E, 2014, 90(3):033014,

K. Balakrishnan, A. Garcia, A. Donev, and J. Bell, "Fluctuating hydrodynamics of multispecies nonreactive mixtures", Physical Review E, 2014, 89(1),

K. Maguire, M. Sullivan, Y.-C. Pan, A. Gal-Yam, I. M., D. A. Howell, P. E. Nugent, P. Mazzali, N., K. I. Clubb, A. V. Filippenko, M. M., M. T. Kandrashoff, D. Poznanski, C. M., J. M. Silverman, E. Walker, D. Xu, "Exploring the spectral diversity of low-redshift Type Ia supernovae using the Palomar Transient Factory", Monthly Notices of the RAS, 2014, 444:3258-3274, doi: 10.1093/mnras/stu1607

A. Fujii, O. Marques, "Axis Communication Method for Algebraic Multigrid Solver", IEICE Transactions on Information and Systems, 2014, E97-D:2955-2958,

John C Wright, Martin Greenwald, Joshua Stillerman, Gheni Abla, Bobby Chanthavong, Sean Flanagan, David Schissel, Xia Lee, Alex Romosan, Arie Shoshani, The MPO API: A tool for recording scientific workflows, Fusion Engineering and Design, 2014,

W. Schmidt, J. Schulz, L. Iapichino, A.S. Almgren, "Influence of adaptive mesh refinement and the hydro solver on shear-induced mass stripping in a minor merger scenario", Astronomy and Computing, 2014,

C. M. Malone, A. Nonaka, S. E. Woosley, A. S. Almgren, J. B. Bell, S. Dong, and M. Zingale, "The Deflagration Stage of Chandrasekhar Mass Models for Type Ia Supernovae: I. Early Evolution", Astrophysical Journal, 2014, 782:11,

M. Nicholl, S. J. Smartt, A. Jerkstrand, C. Inserra, J. P., C. Baltay, S. Benetti, T.-W. Chen, N., U. Feindt, M. Fraser, A. Gal-Yam, E., D. A. Howell, R. Kotak, A. Lawrence, G., S. Margheim, S. Mattila, M. McCrum, R., A. Mead, P. Nugent, D. Rabinowitz, A., K. W. Smith, J. Sollerman, M. Sullivan, F., S. Valenti, E. S. Walker, D. R. Young, "Superluminous supernovae from PESSTO", Monthly Notices of the RAS, 2014, 444:2096-2113, doi: 10.1093/mnras/stu1579

J Langguth, A Azad, M Halappanavar, F Manne, "On parallel push–relabel based algorithms for bipartite maximum matching", Parallel Computing, January 2014,

Gunther H. Weber, Helwig Hauser, "Interactive Visual Exploration and Analysis", Mathematics and Visualization, (Springer-Verlag: 2014) Pages: 161--174, LBNL 6655E,

Ke-Jung Chen, Alexander Heger, Stan Woosley, Ann Almgren, and Daniel Whalen, "Pair Instability Supernovae of Very Massive Population III Stars", Astrophysical Journal, January 2014, 792:44,

K.-G. Lee, J. F. Hennawi, C. Stark, J. X. Prochaska, M., D. J. Schlegel, A.-C. Eilers, A. Arinyo-i-Prats, N., R. A. C. Croft, K. I. Caputi, P. Cassata, O., B. Garilli, A. M. Koekemoer, V. Le Brun, O. Fevre, D. Maccagni, P. Nugent, Y. Taniguchi, L. A. M., L. Tresse, G. Zamorani, E. Zucca, Lyman-alpha Forest Tomography from Background Galaxies: The First Megaparsec-resolution Large-scale Structure Map at z > 2, Astrophysical Journal, Pages: L12 2014, doi: 10.1088/2041-8205/795/1/L12

Patrick Oesterling, Christian Heine, Gunther H. Weber, Gerik Scheuermann, "A Topology-Based Approach to Visualize the Thematic Composition of Document Collections", Theory and Applications of Natural Language Processing, (Springer International Publishing: 2014) Pages: 63-85 doi: 10.1007/978-3-319-12655-5_4

Antonio Valles, Weiqun Zhang, "Optimizing For Reacting Navier-Stokes Equations", High Performance Parallelism Pearls: Multicore And Many-Core Programming Approaches, edited by James Reinders, Jim Jeffers, 2014,

Ke-Jung Chen, Alexander Heger, S.E. Woosley, Ann S. Almgren, and Daniel J. Whalen, "Two-Dimensional Simulations of Pulsational Pair-Instability Supernova", Astrophysical Journal, 2014, 792:28,

G. Ballard, J. Demmel, L. Grigori, M. Jacquelin, Hong Diep Nguyen, E. Solomonik, "Reconstructing Householder Vectors from Tall-Skinny QR", Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, 2014, 1159-1170, doi: 10.1109/IPDPS.2014.120

E. S. Walker, P. A. Mazzali, E. Pian, K. Hurley, I., S. B. Cenko, A. Gal-Yam, A. Horesh, M., D. Poznanski, J. M. Silverman, M., J. S. Bloom, A. V. Filippenko, S. R., P. E. Nugent, E. Ofek, S. Barthelmy, W., J. Goldsten, S. Golenetskii, M. Ohno, M. S. Tashiro, K. Yamaoka, X. L.-. Zhang, "Optical follow-up observations of PTF10qts, a luminous broad-lined Type Ic supernova found by the Palomar Transient Factory", Monthly Notices of the RAS, 2014, 442:2768-2779, doi: 10.1093/mnras/stu1017

Kenes Beketayev, Damir Yeliussizov, Dmitriy Morozov, Gunther H. Weber, Bernd Hamann, "Measuring the Distance Between Merge Trees", Mathematics and Visualization, (Springer-Verlag: 2014) Pages: 151--166

David H. Bailey, Stephanie Ger, Marcos López de Prado, Alexander Sim, Kesheng Wu, "Statistical Overfitting and Backtest Performance", http://ssrn.com/abstract2507040, (January 1, 2014)

ISBN 978-1-78548-008-9

Samuel Williams, Mike Lijewski, Ann Almgren, Brian Van Straalen, Erin Carson, Nicholas Knight, James Demmel, "s-step Krylov subspace methods as bottom solvers for geometric multigrid", Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, January 2014, 1149--1158, doi: 10.1109/IPDPS.2014.119

A.J. Aspden, M. S. Day, and J. B. Bell, "Turbulence-Chemistry Interaction in Lean Premixed Hydrogen Combustion", Proc. Comb. Inst., 2014,

Ke-Jung Chen, Alexander Heger, S.E. Woosley, Ann Almgren, Daniel J. Whalen, and Jarrett L. Johnson, "The General Relativistic Instability Supernova of a Supermassive Population III Star", Astrophysical Journal, 2014, 790:162,

Laura Grigori, Mathias Jacquelin, Amal Khabou, "Performance predictions of multilevel communication optimal LU and QR factorizations on hierarchical platforms", 29th IEEE International Supercomputing Conference (ISC'2014), Springer, 2014, 76--92,

R. Amanullah, A. Goobar, J. Johansson, D. P. K. Banerjee, V., V. Joshi, N. M. Ashok, Y. Cao, M. M., S. R. Kulkarni, P. E. Nugent, T. Petrushevska, V. Stanishev, "The Peculiar Extinction Law of SN 2014J Measured with the Hubble Space Telescope", Astrophysical Journal Letters, 2014, 788:L21, doi: 10.1088/2041-8205/788/2/L21

F. Rusu, P. Nugent, K. Wu, "Implementing the Palomar Transient Factory Real-Time Pipeline in GLADE: Results and", Lecture Notes in Computer Science, (2014) Pages: 53--66

M-H Yung, J Casanova, Antonio Mezzacapo, J McClean, L Lamata, A Aspuru-Guzik, E Solano, "From transistor to trapped-ion computers for quantum chemistry", Scientific reports, 2014, 4,

Dmitriy Morozov, Gunther H. Weber, "Distributed Contour Trees", Mathematics and Visualization, (Springer-Verlag: 2014) Pages: 89-102

P. A. Mazzali, M. Sullivan, S. Hachinger, R. S., P. E. Nugent, D. A. Howell, A. Gal-Yam, K., J. Cooke, R. Thomas, K. Nomoto, E. S. Walker, "Hubble Space Telescope spectra of the Type Ia supernova SN 2011fe: a tail of low-density, high-velocity material with Z < Z⊙", Monthly Notices of the RAS, 2014, 439:1959-1979, doi: 10.1093/mnras/stu077

M. Cai, A. Nonaka, B. E. Griffith, J. B. Bell, and A. Donev, "Efficient Variable-Coefficient Finite-Volume Stokes Solvers", Commun. Comput. Phys., 2014, 16:1263-1297,

E. O. Ofek, I. Arcavi, D. Tal, M. Sullivan, A., S. R. Kulkarni, P. E. Nugent, S., D. Bersier, Y. Cao, S. B. Cenko, A. Cia, A. V. Filippenko, C. Fransson, M. M., R. Laher, J. Surace, R. Quimby, O. Yaron, "Interaction-powered Supernovae: Rise-time versus Peak-luminosity Correlation and the Shock-breakout Velocity", Astrophysical Journal, 2014, 788:154, doi: 10.1088/0004-637X/788/2/154

Jung Heon Song, Kesheng Wu, Horst D Simon, "Parameter Analysis of the VPIN (Volume synchronized Probability of Informed Trading) Metric", Quantitative Financial Risk Management: Theory and, 2014,

A. Corsi, E. O. Ofek, A. Gal-Yam, D. A. Frail, S. R., D. B. Fox, M. M. Kasliwal, M., A. Horesh, J. Carpenter, K. Maguire, I., S. B. Cenko, Y. Cao, K. Mooley, Y.-C., B. Sesar, A. Sternberg, D. Xu, D., P. James, J. S. Bloom, P. E. Nugent, "A Multi-wavelength Investigation of the Radio-loud Supernova PTF11qcj and its Circumstellar Environment", Astrophysical Journal, 2014, 782:42, doi: 10.1088/0004-637X/782/1/42

Ann Almgren, John Bell, Andy Nonaka and Michael Zingale, "Low Mach Number Modeling of Stratified Flows", Finite Volumes for Complex Applications VII -- Methods and Theoretical Aspects, Springer Proceedings in Mathematics and Statistics, edited by J. Fuhrmann, M. Ohlberger, C. Rohde, (2014)

M Emmett, W Zhang, JB Bell, "High-order algorithms for compressible reacting flow with complex chemistry", Combustion Theory and Modelling, January 2014, 18:361--387, doi: 10.1080/13647830.2014.919410

A. Gal-Yam, I. Arcavi, E. O. Ofek, S. Ben-Ami, S. B., M. M. Kasliwal, Y. Cao, O. Yaron, D., J. M. Silverman, A. Horesh, A. De Cia, F., J. Sollerman, D. Perley, P. M. Vreeswijk, S. R., P. E. Nugent, A. V. Filippenko, J. C. Wheeler, "A Wolf-Rayet-like progenitor of SN 2013cu from spectral observations of a stellar wind", Nature, 2014, 509:471-474, doi: 10.1038/nature13304

Jung Heon Song, Marcos López de Prado, Horst Simon, Kesheng Wu, "Exploring Irregular Time Series Through Non-uniform Fourier Transform", WHPCF 14, Piscataway, NJ, USA, IEEE Press, 2014, 37--44, doi: 10.1109/WHPCF.2014.8

Alberto Peruzzo, Jarrod McClean, Peter Shadbolt, Man-Hong Yung, Xiao-Qi Zhou, Peter J Love, Alan Aspuru-Guzik, Jeremy L O’Brien, "A variational eigenvalue solver on a photonic quantum processor", Nature communications, 2014, 5,

Bin Dong, S. Byna, Kesheng Wu, "Parallel query evaluation as a Scientific Data Service", Cluster Computing (CLUSTER), 2014 IEEE International Conference on, January 1, 2014, 194-202, doi: 10.1109/CLUSTER.2014.6968765

R. Speck, D. Ruprecht, M. Emmett, M. Minion, M. Bolten, and R. Krause, "A multi-level spectral deferred correction method", BIT Numerical Mathematics, 2014,

S. Tang, L. Bildsten, W. M. Wolf, K. L. Li, A. K. H., Y. Cao, S. B. Cenko, A. De Cia, M. M., S. R. Kulkarni, R. R. Laher, F., P. E. Nugent, D. A. Perley, T. A. Prince, J. Surace, "An Accreting White Dwarf near the Chandrasekhar Limit in the Andromeda Galaxy", Astrophysical Journal, 2014, 786:61, doi: 10.1088/0004-637X/786/1/61

Jialin Liu, S. Byna, Bin Dong, Kesheng Wu, Chen, "Model-Driven Data Layout Selection for Improving Read", Parallel Distributed Processing Symposium Workshops 2014 IEEE International, 2014, 1708--1716, doi: 10.1109/IPDPSW.2014.190

Jarrod R McClean, Ryan Babbush, Peter J Love, Alan Aspuru-Guzik, "Exploiting locality in quantum computation for quantum chemistry", The Journal of Physical Chemistry Letters, 2014, 5:4368--4380,

Gage Eads, Juan Colmenares, Steven Hofmeyr, Sarah Bird, Davide Bartolini, David Chou, Brian Glutzman, Krste Asanovic, John D Kubiatowicz, "Building an Adaptive Operating System for Predictability and Efficiency", 2014,

2013

Michael Sentef, Alexander F. Kemper, Brian Moritz, James K. Freericks, Zhi-Xun Shen, and Thomas P. Devereaux, "Examining Electron-Boson Coupling Using Time-Resolved Spectroscopy", Phys. Rev. X 3, 041033 (2013), December 26, 2013,

Nonequilibrium pump-probe time-domain spectroscopies can become an important tool to disentangle degrees of freedom whose coupling leads to broad structures in the frequency domain. Here, using the time-resolved solution of a model photoexcited electron-phonon system, we show that the relaxational dynamics are directly governed by the equilibrium self-energy so that the phonon frequency sets a window for “slow” versus “fast” recovery. The overall temporal structure of this relaxation spectroscopy allows for a reliable and quantitative extraction of the electron-phonon coupling strength without requiring an effective temperature model or making strong assumptions about the underlying bare electronic band dispersion.

Daniel T. Graves, Phillip Colella, David Modiano, Jeffrey Johnson, Bjorn Sjogreen, Xinfeng Gao, "A Cartesian Grid Embedded Boundary Method for the Compressible Navier Stokes Equations", Communications in Applied Mathematics and Computational Science, December 23, 2013,

In this paper, we present an unsplit method for the time-dependent
  compressible Navier-Stokes equations in two and three dimensions.
  We use a conservative, second-order Godunov algorithm.
  We use a Cartesian grid, embedded boundary method to resolve complex
  boundaries.  We solve for viscous and conductive terms with a
  second-order semi-implicit algorithm.  We demonstrate second-order
  accuracy in solutions of smooth problems in smooth geometries and
  demonstrate robust behavior for strongly discontinuous initial
  conditions in complex geometries.

H. M. Aktulga, L. Lin, C. Haine, E. G. Ng, C. Yang, "Parallel Eigenvalue Calculation based on Multiple Shift-invert Lanczos and Contour Integral based Spectral Projection Method", Parallel Computing, December 6, 2013, in press,

Cory M. Simon, Jihan Kim, Li-Chiang Lin, Richard L. Martin, Maciej Haranczyk, Berend Smit, "Optimizing nanoporous materials for gas storage", Physical Chemistry Chemical Physics, December 4, 2013,

Natural gas, mostly methane, is an attractive replacement of petroleum fuels for automotive vehicles because of its economic and environmental advantages. The technological obstacle to using methane as a vehicular fuel is its comparatively low volumetric energy density, necessitating densification strategies to yield reasonable driving ranges from a reasonably sized tank.

Protonu Basu, Anand Venkat, Mary Hall, Samuel Williams, Brian Van Straalen, Leonid Oliker, "Compiler generation and autotuning of communication-avoiding operators for geometric multigrid", 20th International Conference on High Performance Computing (HiPC), December 2013, 452--461,

N. Plonka, A. F. Kemper, S. Graser, A. P. Kampf, T. P. Devereaux, "Tunneling spectroscopy for probing orbital anisotropy in iron pnictides", Phys. Rev. B 88, 174518 (2013), November 27, 2013,

Using realistic multiorbital tight-binding Hamiltonians and the T-matrix formalism, we explore the effects of a nonmagnetic impurity on the local density of states in Fe-based compounds. We show that scanning tunneling spectroscopy (STS) has very specific anisotropic signatures that track the evolution of orbital splitting (OS) and antiferromagnetic gaps. Both anisotropies exhibit two patterns that split in energy with decreasing temperature, but for OS these two patterns map onto each other under 90° rotation. STS experiments that observe these signatures should expose the underlying magnetic and orbital order as a function of temperature across various phase transitions.

George Michelogiannakis, Channel Reservation Protocol for Over-Subscribed Channels and Destinations, Conference on High Performance Computing Networking, Storage and Analysis, 2013,

George Michelogiannakis, Nan Jiang, Daniel U. Becker, William J. Dally, "Channel Reservation Protocol for Over-Subscribed Channels and Destinations", Conference on High Performance Computing Networking, Storage and Analysis, ACM, 2013,

Abhinav Sarje, Xiaoye S Li, Alexander Hexemer, Tuning HipGISAXS on Multi and Many Core Supercomputers, Performance Modeling, Benchmarking and Simulations of High Performance Computer Systems at Supercomputing (SC'13), November 18, 2013,

M. Jung, E. H. Wilson III, W. Choi, J. Shalf, H. M. Aktulga, C. Yang, E. Saule, U. V. Catalyurek, M. Kandemir, "Exploring the Future of Out-of-core Computing with Compute-Local Non-Volatile Memory", International Conference for High Performance Computing, Networking, Storage and Analysis 2013 (SC13), NY, USA, ACM New York, November 17, 2013, doi: 10.1145/2503210.2503261

Babak Behzad, Huong Vu Thanh Luu, Joseph Huchette, Surendra Byna, Prabhat, Ruth Aydt, Quincey Koziol, and Marc Snir, "Taming parallel I/O complexity with auto-tuning", In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC '13), 2013,

Nathan Hanford, Vishal Ahuja, Mehmet Balman, Matthew Farrens, Dipak Ghosal, Eric Pouyoul, Brian Tierney, "Characterizing the Impact of End-System Affinities On the End-to-End Performance of High-Speed Flows", SC13 workshop, ACM, 2013, doi: 10.1145/2534695.2534697

Multi-core end-systems use Receive Side Scaling (RSS) to parallelize protocol processing. RSS uses a hash function on the standard flow descriptors and an indirection table to assign incoming packets to receive queues which are pinned to specific cores. This ensures flow affinity in that the interrupt processing of all packets belonging to a specific flow is processed by the same core. A key limitation of standard RSS is that it does not consider the application process that consumes the incoming data in determining the flow affinity. In this paper, we carry out a detailed experimental analysis of the performance impact of the application affinity in a 40 Gbps testbed network with a dual hexa-core end-system. We show, contrary to conventional wisdom, that when the application process and the flow are affinitized to the same core, the performance (measured in terms of end-to-end TCP throughput) is significantly lower than the line rate. Near line rate performance is observed when the flow and the application process are affinitized to different cores belonging to the same socket. Furthermore, affinitizing the application and the flow to cores on different sockets results in significantly lower throughput than the line rate. These results arise due to the memory bottleneck, which is demonstrated using preliminary correlational data on the cache hit rate in the core that services the application process.

Hongzhang Shan, Brian Austin, Wibe de Jong, Leonid Oliker, Nick Wright, Edoardo Apra, "Performance Tuning of Fock Matrix and Two Electron Integral Calculations for NWChem on Leading HPC Platforms", Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), November 2013, doi: 10.1007/978-3-319-10214-6_13

Lizzie Coles-Kemp, Carrie Gates, Dieter Gollmann, Sean Peisert, Christian Probst, Organizational Processes for Supporting Sustainable Security, Report from Dagstuhl Seminar 120501, Pages: 37-48 November 4, 2013, doi: 10.4230/DagRep.2.12.37

L. Oliker and R. Vuduc, "Introduction for Special Issue on Autotuning", International Journal of High Performance Computing Applications (IJHPCA), 2013,

Salman Habib, Vitali Morozov, Nicholas Frontiere, Hal Finkel, Adrian Pope, Katrin Heitmann, Kalyan Kumaran, Venkat Vishwanath, Tom Peterka, Joe Insley, David Daniel, Patricia Fasel, Zarija Lukić, "HACC: Extreme Scaling and Performance Across Diverse Architectures", Supercomputing, 2013, 6,

Bei Wang, Stephane Ethier, William Tang, Timothy Williams, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker, "Kinetic Turbulence Simulations at Extreme Scale on Leadership-Class Systems", Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), November 2013, doi: 10.1145/2503210.2503258

Jong Y. Choi, Kesheng Wu, Jacky C. Wu, Alex Sim, Qing G. Liu, Matthew Wolf, CS Chang, Scott Klasky, "ICEE: Wide-area In Transit Data Processing Framework For Near Real-Time Scientific Applications", The 4th International Workshop on Big Data Analytics: Challenges and Opportunities (BDAC-13), 2013,

C. Steefel, S. Molins, D. Trebotich, "Pore scale processes associated with subsurface CO2 injection and sequestration", Reviews in Mineralogy and Geochemistry, November 1, 2013,

Slim T. Chourou, Abhinav Sarje, Xiaoye Li, Elaine Chan and Alexander Hexemer, "HipGISAXS: a high-performance computing code for simulating grazing-incidence X-ray scattering data", Journal of Applied Crystallography, 2013, 46:1781-1795, doi: 10.1107/S0021889813025843

We have implemented a flexible Grazing Incidence Small-Angle Scattering (GISAXS) simulation code in the framework of the Distorted Wave Born Approximation (DWBA) that effectively utilizes the parallel processing power provided by graphics processors and multicore processors. This constitutes a handy tool for experimentalists facing a massive flux of data, allowing them to accurately simulate the GISAXS process and analyze the produced data. The software computes the diffraction image for any given superposition of custom shapes or morphologies in a user-defined region of the reciprocal space for all possible grazing incidence angles and sample orientations. This flexibility makes it easy to tackle a wide range of possible sample structures such as nanoparticles on top of or embedded in a substrate or a multilayered structure. In cases where the sample displays regions of significant refractive index contrast, an algorithm has been implemented to perform a slicing of the sample and compute the averaged refractive index profile to be used as the reference geometry of the unperturbed system. Preliminary tests show good agreement with experimental data for a variety of commonly encountered nanostructures.

Maciej Haranczyk, Li-Chiang Lin, Kyuho Lee, Richard L. Martin, Jeffrey B. Neaton, Berend Smit, "Methane storage capabilities of diamond analogues", Physical Chemistry Chemical Physics, October 31, 2013,

Methane can be an alternative fuel for vehicular usage provided that new porous materials are developed for its efficient adsorption-based storage. Herein, we search for materials for this application within the family of diamond analogues. We used density functional theory to investigate structures in which tetrahedral C atoms of diamond are separated by -CC- or -BN- groups, as well as ones involving substitution of tetrahedral C atoms with Si and Ge atoms.

George Michelogiannakis, Hardware Support for Collective Memory Transfers in Stencil Computations, Workshop on Optimizing Stencil Computations, October 2013,

George Michelogiannakis, Extending Summation Precision for Distributed Network Operations, 25th International Symposium on Computer Architecture and High Performance Computing, October 2013,

George Michelogiannakis, Xiaoye S. Li, David H. Bailey, John Shalf, "Extending Summation Precision for Network Reduction Operations", 25th International Symposium on Computer Architecture and High Performance Computing, IEEE Computer Society, October 2013,

Double precision summation is at the core of numerous important algorithms such as Newton-Krylov methods and other operations involving inner products, but the effectiveness of summation is limited by the accumulation of rounding errors, which are an increasing problem with the scaling of modern HPC systems and data sets. To reduce the impact of precision loss, researchers have proposed increased- and arbitrary-precision libraries that provide reproducible error or even bounded error accumulation for large sums, but do not guarantee an exact result. Such libraries can also increase computation time significantly. We propose big integer (BigInt) expansions of double precision variables that enable arbitrarily large summations without error and provide exact and reproducible results. This is feasible with performance comparable to that of double-precision floating point summation, by the inclusion of simple and inexpensive logic into modern NICs to accelerate performance on large-scale systems. 

Wei Hu, Zhenyu Li and Jinlong Yang, "Structural, electronic, and optical properties of hybrid silicene and graphene nanocomposite", J. Chem. Phys. 139, 154704 (2013), October 16, 2013, doi: 10.1063/1.4824887

Structural, electronic, and optical properties of hybrid silicene and graphene (S/G) nanocomposite are examined with density functional theory calculations. It turns out that weak van der Waals interactions dominate between silicene and graphene with their intrinsic electronic properties preserved. Interestingly, interlayer interactions in hybrid S/G nanocomposite induce tunable p-type and n-type doping of silicene and graphene, respectively, showing their doping carrier concentrations can be modulated by their interfacial spacing.

A. Buluç, K. Madduri, "Graph partitioning for scalable distributed graph computations", AMS Contemporary Mathematics, Graph Partitioning and Graph Clustering (Proc. 10th DIMACS Implementation Challenge), 2013,

Samuel O. Odoh, Niranjan Govind, Georg Schreckenbach, Wibe A. de Jong, "Cation-Cation Interactions in [(UO2)(2)(OH)(n)](4-n) Complexes", Inorganic Chemistry, 2013, 52:11269-1127, doi: 10.1021/ic4015338

William Gu, Jaesik Choi, Ming Gu, Horst Simon, Kesheng Wu, "Fast Change Point Detection for Electricity Market Analysis", October 6, 2013, LBNL 6388E,

Wei Hu, Zhenyu Li and Jinlong Yang, "Surface and size effects on the charge state of NV center in nanodiamonds", Computational and Theoretical Chemistry, 2013, 1021, 49-53, October 1, 2013, doi: 10.1016/j.comptc.2013.06.015

Electronic structures and stability of nitrogen–vacancy (NV) centers doped in nanodiamonds (NDs) have been investigated with large-scale density functional theory (DFT) calculations. Spin polarized defect states are not affected by the particle sizes and surface decorations, while the band gap is sensitive to these effects. Induced by the spherical surface electric dipole layer, surface functionalization has a long-ranged impact on the stability of charged NV centers doped in NDs. NV− center doped in NDs is more favorable for n-type fluorinated diamond, while NV0 is preferred for p-type hydrogenated NDs. Therefore, surface decoration provides a useful way for defect state engineering.

David H. Bailey, Jonathan M. Borwein, "Normal numbers and pseudorandom generators", Computational and Analytical Mathematics in Honour of Jonathan Borwein's 60th Birthday, edited by David H. Bailey, Heinz H. Bauschke, Peter Borwein, Frank Garvan, Michel Thera, Jon D. Vanderwerff and Henry Wolkowicz, (Springer: October 2013)

Alexander Knebe, Frazer R. Pearce, Hanni Lux, Yago Ascasibar, Peter Behroozi, Javier Casado, Christine Corbett Moran, Juerg Diemand, Klaus Dolag, Rosa Dominguez-Tenreiro, Pascal Elahi, Bridget Falck, Stefan Gottlöber, Jiaxin Han, Anatoly Klypin, Zarija Lukić, Michal Maciejewski, Cameron K. McBride, Manuel E. Merchán, Stuart I. Muldrew, Mark Neyrinck, Julian Onions, Susana Planelles, Doug Potter, Vicent Quilis, Yann Rasera, Paul M. Ricker, Fabrice Roy, Andrés N. Ruiz, Mario A. Sgró, Volker Springel, Joachim Stadel, Paul M. Sutter, Dylan Tweed, Marcel Zemp, "Structure finding in cosmological simulations: the state of affairs", Monthly Notices of the Royal Astronomical Society, 2013, 435:1618,

J. Choi, K. Hu, A. Sim, "Relational Dynamic Bayesian Networks with Locally Exchangeable Measures", 2013, LBNL 6341E,

K. Hu, J. Choi, J. Jiang, A. Sim, "Best Predictive GLMM using LASSO with Application on High-Speed Network", 2013, LBNL 6327E,

Amir Kamil, Katherine Yelick, "Hierarchical Computation in the SPMD Programming Model", 26th International Workshop on Languages and Compilers for Parallel Computing, September 2013,

Large-scale parallel machines are programmed mainly with the single program, multiple data (SPMD) model of parallelism. While this model has advantages of scalability and simplicity, it does not fit well with divide-and-conquer parallelism or hierarchical machines that mix shared and distributed memory. In this paper, we define the recursive single program, multiple data model (RSPMD) that extends SPMD with a hierarchical team mechanism to support hierarchical algorithms and machines. We implement this model in the Titanium language and describe how to eliminate a class of deadlocks by ensuring alignment of collective operations. We present application case studies evaluating the RSPMD model, showing that it enables divide-and-conquer algorithms such as sorting to be elegantly expressed and that team collective operations increase performance of conjugate gradient by up to a factor of two. The model also facilitates optimizations for hierarchical machines, improving scalability of particle in cell by 8x and performance of sorting and a stencil code by up to 40% and 14%, respectively. 

Samuel Williams, At Exascale, Will Bandwidth Be Free?, DOE ModSim Workshop, 2013,

J. A. Sobota, S.-L. Yang, A. F. Kemper, J. J. Lee, F. T. Schmitt, W. Li, R. G. Moore, J. G. Analytis, I. R. Fisher, P. S. Kirchmann, T. P. Devereaux, and Z.-X. Shen, "Direct Optical Coupling to an Unoccupied Dirac Surface State in the Topological Insulator Bi2Se3", Phys. Rev. Lett. 111, 136802 (2013), September 24, 2013,

We characterize the occupied and unoccupied electronic structure of the topological insulator Bi2Se3 by one-photon and two-photon angle-resolved photoemission spectroscopy and slab band structure calculations. We reveal a second, unoccupied Dirac surface state with similar electronic structure and physical origin to the well-known topological surface state. This state is energetically located 1.5 eV above the conduction band, which permits it to be directly excited by the output of a Ti:sapphire laser. This discovery demonstrates the feasibility of direct ultrafast optical coupling to a topologically protected, spin-textured surface state.

Y. F. Kung, W.-S. Lee, C.-C. Chen, A. F. Kemper, A. P. Sorini, B. Moritz, and T. P. Devereaux, "Time-dependent charge-order and spin-order recovery in striped systems", Phys. Rev. B 88, 125114 (2013), September 24, 2013,

Using time-dependent Ginzburg-Landau theory, we study the role of amplitude and phase fluctuations in the recovery of charge-stripe and spin-stripe phases in response to a pump pulse that melts the orders. For parameters relevant to the case where charge order precedes spin order thermodynamically, amplitude recovery governs the initial time scales, while phase recovery controls behavior at longer times. In addition to these intrinsic effects, there is a longer spin reorientation time scale related to the scattering geometry that dominates the recovery of the spin phase. Coupling between the charge and spin orders locks the amplitude and similarly the phase recovery, reducing the number of distinct time scales. Our results well reproduce the major experimental features of pump-probe x-ray diffraction measurements on the striped nickelate La1.75Sr0.25NiO4. They highlight the main idea of this work, which is the use of time-dependent Ginzburg-Landau theory to study systems with multiple coexisting order parameters.

Richard L Martin, Mahdi Niknam Shahrak, Joseph A Swisher, Cory M Simon, Julian P Sculley, Hong-Cai Zhou, Berend Smit, Maciej Haranczyk, "Modeling Methane Adsorption in Interpenetrating Porous Polymer Networks", The Journal of Physical Chemistry C, September 19, 2013,

Porous polymer networks (PPNs) are a class of porous materials of particular interest in a variety of energy-related applications because of their stability, high surface areas, and gas uptake capacities. Computationally derived structures for five recently synthesized PPN frameworks, PPN-2, -3, -4, -5, and -6, were generated for various topologies, optimized using semiempirical electronic structure methods, and evaluated using classical grand-canonical Monte Carlo simulations.

Richard L. Martin, Maciej Haranczyk, "Insights into Multi-Objective Design of Metal–Organic Frameworks", Crystal Growth & Design, September 18, 2013,

Metal-organic framework (MOF) crystal topologies which permit the highest internal surface areas are identified by means of multiobjective optimization and abstract structure models. We demonstrate that MOF design efforts can be focused within five underlying nets to engineer distinct, Pareto-optimal compromises between high gravimetric and high volumetric surface area materials.

H. M. Aktulga, C. Yang, E. G. Ng, P. Maris, J. P. Vary, "Improving the Scalability of a Symmetric Iterative Eigensolver for Multi-core Platforms", Concurrency and Computation: Practice & Experience, September 12, 2013, online, doi: 10.1002/cpe.3129

Sean Peisert, Ed Talbot, Tom Kroeger, "Principles of Authentication", Proceedings of the 2013 New Security Paradigms Workshop (NSPW), Banff, Canada, ACM, September 2013, 47-56, doi: 10.1145/2535813.2535819

Marielle Pinheiro, Richard L. Martin, Chris H. Rycroft, Maciej Haranczyk, "High accuracy geometric analysis of crystalline porous materials", CrystEngComm, September 5, 2013,

A number of algorithms to analyze crystalline porous materials and their porosity employ the Voronoi tessellation, whereby the space in the material is divided into irregular polyhedral cells that can be analyzed to determine the pore topology and structure. However, the Voronoi tessellation is only appropriate when atoms all have equal radii, and the natural generalization to structures with unequal radii leads to cells with curved boundaries, which are computationally expensive to compute.

Sean Peisert and Steven Templeton, "The Hive Mind: Applying a Distributed Security Sensor Network to GENI - GENI Spiral 2 Final Project Report", UC Davis Technical Report, September 4, 2013,

Tim Mattson, David Bader, Jon Berry, Aydin Buluc, Jack Dongarra, Christos Faloutsos, John Feo, John Gilbert, Joseph Gonzalez, Bruce Hendrickson, Jeremy Kepner, Charles Leiserson, Andrew Lumsdaine, David Padua, Stephen Poole, Steve Reinhardt, Mike Stonebraker, Steve Wallach, Andrew Yoo, "Standards for Graph Algorithm Primitives", HPEC, 2013,

Bin Dong, S. Byna, Kesheng Wu, "Expediting scientific data analysis with reorganization of data", 2013 IEEE International Conference on Cluster Computing (CLUSTER), pp. 1-8, 23-27 Sept. 2013, September 1, 2013,

David H. Bailey, Jonathan M. Borwein, Richard E. Crandall, Michael G. Rose, "Expectations on fractal sets", Applied Mathematics and Computation, September 1, 2013, 220:695-721,

Alfredo Buttari, Serge Gratton, Xiaoye S. Li, Marième Ngom, François-Henry Rouet, David Titley-Peloquin, Clément Weisbecker, "Error Analysis of the Block Low-Rank LU factorization of dense matrices", IRIT-CERFACS, RT-APO-13-7, August 2013,

Emmanuel Agullo, Patrick R. Amestoy, Alfredo Buttari, Abdou Guermouche, Guillaume Joslin, Jean-Yves L'Excellent, Xiaoye S. Li, Artem Napov, François-Henry Rouet, Mohamed Sid-Lakhdar, Shen Wang, Clément Weisbecker, Ichitaro Yamazaki, "Recent Advances in Sparse Direct Solvers", 22nd Conference on Structural Mechanics in Reactor Technology, August 18, 2013,

Sean Peisert, Cyber Resilience Metrics, First International Symposium on Resilient Cyber Systems, Resilience Week 2013, August 13, 2013,

Shen Wang, Xiaoye S. Li, François-Henry Rouet, Jianlin Xia, Maarten V. de Hoop, "A parallel geometric multifrontal solver using hierarchically semiseparable structure", Submitted to ACM Transaction on Mathematical Software, 2013,

We examine electron-electron mediated relaxation following ultrafast electric field pump excitation of the fermionic degrees of freedom in the Falicov-Kimball model for correlated electrons. The results reveal a dichotomy in the temporal evolution of the system as one tunes through the Mott metal-to-insulator transition: in the metallic regime relaxation can be characterized by evolution toward a steady state well described by Fermi-Dirac statistics with an increased effective temperature; however, in the insulating regime this quasithermal paradigm breaks down with relaxation toward a nonthermal state with a complicated electronic distribution as a function of momentum. We characterize the behavior by studying changes in the energy, photoemission response, and electronic distribution as functions of time. This relaxation may be observable qualitatively on short enough time scales that the electrons behave like an isolated system not in contact with additional degrees of freedom which would act as a thermal bath, especially when using strong driving fields and studying materials whose physics may manifest the effects of correlations.

James Demmel, Samuel Williams, Katherine Yelick, "Automatic Performance Tuning (Autotuning)", The Berkeley Par Lab: Progress in the Parallel Computing Landscape, edited by David Patterson, Dennis Gannon, Michael Wrinn, (Microsoft Research: August 2013) Pages: 337-376

C. L. Morris, Jeffrey Bacon, Konstantin Borozdin, Haruo Miyadera, John Perry, Evan Rose, Scott Watson, Tim White, Derek Aberle, J. Andrew Green, George G. McDuff, Zarija Lukić, Edward C. Milner, "A new method for imaging nuclear threats using cosmic ray muons", AIP Advances, 2013, 3:082128,

P. Maris, H. M. Aktulga, S. Binder, A. Calci, U. V. Catalyurek, J. Langhammer, E. G. Ng, E. Saule, R. Roth, J. P. Vary, C. Yang, "No Core CI calculations for light nuclei with chiral 2- and 3-body forces", J. Phys. Conf. Ser., IOP Publishing, August 1, 2013, 454:012063, doi: 10.1088/1742-6596/454/1/012063

Marielle Pinheiro, Richard L. Martin, Chris H. Rycroft, Andrew Jones, Enrique Iglesia, Maciej Haranczyk, "Characterization and comparison of pore landscapes in crystalline porous materials", Journal of Molecular Graphics and Modelling, July 31, 2013,

Crystalline porous materials have many applications, including catalysis and separations. Identifying suitable materials for a given application can be achieved by screening material databases. Such a screening requires automated high-throughput analysis tools that characterize and represent pore landscapes with descriptors, which can be compared using similarity measures in order to select, group and classify materials. Here, we discuss algorithms for the calculation of two types of pore landscape descriptors.

First-Principles Study of Carbon Nanomaterials, Wei Hu and Jinlong Yang, University of Science and Technology of China, July 26, 2013, doi: 10.13140/2.1.4601.6965

Pawel Tecmer, Niranjan Govind, Karol Kowalski, Wibe A de Jong, Lucas Visscher, "Reliable modeling of the electronic spectra of realistic uranium complexes.", The Journal of Chemical Physics, 2013, 139:034301, doi: 10.1063/1.4812360

Cindy Rubio-Gonzalez, Cuong Nguyen, Hong Diep Nguyen, James Demmel, William Kahan, Koushik Sen, David H. Bailey, Costin Iancu, David Hough, "Precimonious: Tuning Assistant for Floating-Point Precision", Supercomputing 2013, July 19, 2013,

Khaled Z Ibrahim, Kamesh Madduri, Samuel Williams, Bei Wang, Stephane Ethier, Leonid Oliker, "Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms", International Journal of High Performance Computing Applications (IJHPCA), July 2013, doi: 10.1177/1094342013492446

P. Basu, A. Venkat, M. Hall, S. Williams, B. Van Straalen, L. Oliker, "Compiler Generation and Autotuning of Communication-Avoiding Operators for Geometric Multigrid", Workshop on Stencil Computations (WOSC), 2013,

Wei Hu, Xiaojun Wu, Zhenyu Li and Jinlong Yang, "Helium separation via porous silicene based ultimate membrane", Nanoscale, 2013, 5, 9062-9066, July 11, 2013, doi: 10.1039/C3NR02326E

Helium purification has become more important for increasing demands in scientific and industrial applications. In this work, we demonstrated that the porous silicene can be used as an effective ultimate membrane for helium purification on the basis of first-principles calculations. Pristine silicene monolayer is impermeable to helium gas with a high penetration energy barrier (1.66 eV). However, porous silicene with either Stone–Wales (SW) or divacancy (555 777 or 585) defect presents a surmountable barrier for helium (0.33 to 0.78 eV) but formidable for Ne, Ar, and other gas molecules. In particular, the porous silicene with divacancy defects shows high selectivity for He/Ne and He/Ar, superior to graphene, polyphenylene, and traditional membranes.

Sean Peisert, Matt Bishop, "Dynamic, Flexible, and Optimistic Access Control", UC Davis CS Technical Report CSE-2013-76, July 2013,

Laurent Bouchet, Patrick Amestoy, Alfredo Buttari, François-Henry Rouet, Maxime Chauvin, "INTEGRAL/SPI data segmentation to retrieve sources intensity variations.", Astronomy & Astrophysics, July 1, 2013, 555:A52, doi: 10.1051/0004-6361/201219605

K. Hu, A. Sim, D. Antoniades, C. Dovrolis, "Estimating and Forecasting Network Traffic Performance based on Statistical Patterns Observed in SNMP data", the 9th International Conference on Machine Learning and Data Mining (MLDM2013), 2013,

David H. Bailey, Jonathan M. Borwein, Victoria Stodden, "Set the default to 'open'", Notices of the American Mathematical Society, July 1, 2013, Jun/Jul:679,

A.F. Kemper, M. Sentef, B. Moritz, C.C. Kao, Z.X. Shen, J.K. Freericks, T.P. Devereaux, "Mapping of the unoccupied states and relevant bosonic modes via the time dependent momentum distribution", Phys. Rev. B 87, 235139 (2013), June 28, 2013,

The unoccupied states of complex materials are difficult to measure, yet they play a key role in determining their properties. We propose a technique that can measure the unoccupied states, called time-resolved Compton scattering, which measures the time-dependent momentum distribution (TDMD). Using a nonequilibrium Keldysh formalism, we study the TDMD for electrons coupled to a lattice in a pump-probe setup. We find a direct relation between temporal oscillations in the TDMD and the dispersion of the underlying unoccupied states, suggesting that both can be measured by time-resolved Compton scattering. We demonstrate the experimental feasibility by applying the method to a model of MgB2 with realistic material parameters.

Orianna DeMasi, Taghrid Samak, David H. Bailey, "Identifying HPC Codes via performance logs and machine learning", Proceedings of the First Workshop on Changing Landscapes in HPC Security, June 17, 2013,

Cy Chan, Didem Unat, Michael Lijewski, Weiqun Zhang, John Bell, John Shalf, "Software Design Space Exploration for Exascale Combustion Co-Design", International Supercomputing Conference (ISC), Leipzig, Germany, June 16, 2013,

Didem Unat, Xing Cai, Scott Baden, Optimizing the Aliev-Panfilov Model of Cardiac Excitation on Heterogeneous Systems, Para 2010: State of the Art in Scientific and Parallel Computing, June 6, 2013,

Chang-Seo Park, Koushik Sen, Costin Iancu, "Scaling Data Race Detection for Partitioned Global Address Space Programs", International Conference on Supercomputing (ICS) 2013, 2013,

Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Benjamin Lipshitz, Oded Schwartz, Sivan Toledo, "Communication optimal parallel multiplication of sparse random matrices", SPAA 2013: The 25th ACM Symposium on Parallelism in Algorithms and Architectures, Montreal, Canada, 2013, 222-231, doi: 10.1145/2486159.2486196

Victoria Stodden, Jonathan M. Borwein, David H. Bailey, "'Setting the default to reproducible' in computational science research", SIAM News, June 1, 2013, 46:4-6,

Babak Behzad, Joseph Huchette, Huong Vu Thanh Luu, Ruth Aydt, Surendra Byna, Yushu Yao, Quincey Koziol, and Prabhat, "A framework for auto-tuning HDF5 applications", Proceedings of the 22nd international symposium on High-performance parallel and distributed computing (HPDC), 2013,

Y. S. Lee, S. J. Moon, Scott C. Riggs, M. C. Shapiro, I. R. Fisher, Bradford W. Fulfer, Julia Y. Chan, A. F. Kemper, and D. N. Basov, "Infrared study of the electronic structure of the metallic pyrochlore iridate Bi2Ir2O7", Phys. Rev. B 87, 195143 (2013), May 30, 2013,

We investigated the electronic properties of a single crystal of metallic pyrochlore iridate Bi2Ir2O7 by means of infrared spectroscopy. Our optical conductivity data show the splitting of t2g bands into Jeff ones due to strong spin-orbit coupling. We observed a sizable midinfrared absorption near 0.2 eV which can be attributed to the optical transition within the Jeff,1/2 bands. More interestingly, we found an abrupt suppression of optical conductivity in the very far-infrared region. Our results suggest that the electronic structure of Bi2Ir2O7 is governed by the strong spin-orbit coupling and correlation effects, which are a prerequisite for theoretically proposed nontrivial topological phases in pyrochlore iridates.

Ichitaro Yamazaki, Xiaoye S. Li, François-Henry Rouet, Bora Uçar, "On partitioning and reordering problems in a hierarchically parallel hybrid linear solver", PDSEC Workshop of the IEEE International Parallel and Distributed Processing Symposium, May 24, 2013,

Christopher D. Krieger, Michelle Mills Strout, Catherine Olschanowsky, Andrew Stone, Stephen Guzik, Xinfeng Gao, Carlo Bertolli, Paul H.J. Kelly, Gihan Mudalige, Brian Van Straalen, Sam Williams, "Loop chaining: A programming abstraction for balancing locality and parallelism", Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International, May 2013, 375--384, doi: 10.1109/IPDPSW.2013.68

Richard L. Martin, Maciej Haranczyk, "Optimization-Based Design of Metal-Organic Framework Materials", Journal of Chemical Theory and Computation, May 16, 2013,

Metal–organic frameworks (MOFs) are a class of porous materials constructed from metal or metal oxide building blocks connected by organic linkers. MOFs are highly tunable structures that can in theory be custom designed to meet the specific pore geometry and chemistry required for a given application such as methane storage or carbon capture. However, due to the sheer number of potential materials, identification of optimal MOF structures is a significant challenge.

David Camp, Hari Krishnan, David Pugmire, Christoph Garth, Ian Johnson, E. Wes Bethel, Kenneth I. Joy, and Hank Childs, "GPU Acceleration of Particle Advection Workloads in a Parallel, Distributed Memory Setting", Proceedings of Eurographics Symposium on Parallel Graphics and Visualization (EGPGV), May 5, 2013,

Aydın Buluç, Erika Duriakova, Armando Fox, John Gilbert, Shoaib Kamil, Adam Lugowski, Leonid Oliker, Samuel Williams, "High-Productivity and High-Performance Analysis of Filtered Semantic Graphs", International Parallel and Distributed Processing Symposium (IPDPS), 2013, doi: 10.1145/2370816.2370897

Emmanuel Agullo, Patrick R. Amestoy, Alfredo Buttari, Abdou Guermouche, Jean-Yves L'Excellent, François-Henry Rouet, "Robust memory-aware mappings for parallel multifrontal factorizations", Submitted to SISC, 2013,

Haruo Miyadera, Konstantin N. Borozdin, Steve J. Greene, Zarija Lukić, Koji Masuda, Edward C. Milner, Christopher L. Morris, John O. Perry, "Imaging Fukushima Daiichi reactors with muons", AIP Advances, 2013, 3:052133,

E. Solomonik, A. Buluç, J. Demmel, "Minimizing communication in all-pairs shortest paths", International Parallel and Distributed Processing Symposium (IPDPS), 2013,

Scott Beamer, Aydın Buluç, Krste Asanović, David A Patterson, "Distributed Memory Breadth-First Search Revisited: Enabling Bottom-Up Search", Proc. Workshop on Multithreaded Architectures and Applications (MTAAP), in conjunction with IPDPS, 2013,

John Perry, Mara Azzouz, Jeffrey Bacon, Konstantin Borozdin, Elliott Chen, Joseph Fabritius II, Edward Milner, Haruo Miyadera, Christopher Morris, Jonathan Roybal, Zhehui Wang, Bob Busch, Ken Carpenter, Adam A. Hecht, Koji Masuda, Candace Spore, Nathan Toleman, Derek Aberle and Zarija Lukić, "Imaging a nuclear reactor using cosmic ray muons", Journal of Applied Physics, 2013, 113:184909,

Hank Childs, Berk Geveci, William J. Schroeder, Jeremy S. Meredith, Kenneth Moreland, Christopher Sewell, Torsten Kuhlen, E.Wes Bethel, "Research Challenges for Visualization Software", IEEE Computer, May 1, 2013, 46:34-43, LBNL 6239E,

A. Azad, A. Khan, B. Rajwa, S. Pyne, A. Pothen, "Classifying Immunophenotypes with Templates from Flow Cytometry", ACM BCB, 2013, doi: 10.1145/2506583.2506627

Nan Jiang, Daniel U. Becker, George Michelogiannakis, James Balfour, Brian Towles, John Kim, William J. Dally, "A Detailed and Flexible Cycle-Accurate Network-on-Chip Simulator", International Symposium on Performance Analysis of Systems and Software, IEEE Computer Society, April 2013,

Richard L. Martin, Li-Chiang Lin, Kuldeep Jariwala, Berend Smit, Maciej Haranczyk, "Mail-Order Metal–Organic Frameworks (MOFs): Designing Isoreticular MOF-5 Analogues Comprising Commercially Available Organic Molecules", The Journal of Physical Chemistry C, April 17, 2013,

Metal–organic frameworks (MOFs), a class of porous materials, are of particular interest in gas storage and separation applications due largely to their high internal surface areas and tunable structures. MOF-5 is perhaps the archetypal MOF; in particular, many isoreticular analogues of MOF-5 have been synthesized, comprising alternative dicarboxylic acid ligands. In this contribution we introduce a new set of hypothesized MOF-5 analogues, constructed from commercially available organic molecules.

David H. Bailey, Marcos M. Lopez de Prado and Eva del Pozo, "The strategy approval decision: A Sharpe ratio indifference curve approach", Algorithmic Finance, April 2, 2013, 2:99-109,

D. Antoniades, K. Hu, A. Sim, C. Dovrolis, "What SNMP data can tell us about Edge-to-Edge network performance", Passive and Active Measurement Conference (PAM2013), 2013,

Nils E. R. Zimmermann, Timm J. Zabel, Frerich J. Keil, "Transport into Nanosheets: Diffusion Equations Put to Test", J. Phys. Chem. C, 2013, 117:7384-7390, doi: 10.1021/jp400152q

Ultrathin porous materials, such as zeolite nanosheets, are prominent candidates for performing catalysis, drug supply, and separation processes in a highly efficient manner due to exceptionally short transport paths. Predictive design of such processes requires the application of diffusion equations that were derived for macroscopic, homogeneous surroundings to nanoscale, nanostructured host systems. Therefore, we tested different analytical solutions of Fick’s diffusion equations for their applicability to methane transport into two different zeolite nanosheets (AFI, LTA) under instationary conditions. Transient molecular dynamics simulations provided hereby concentration profiles and uptake curves to which the different solutions were fitted. Two central conclusions were deduced by comparing the fitted transport coefficients. First, the transport can be described correctly only if concentration profiles are used and the transport through the solid–gas interface is explicitly accounted for by the surface permeability. Second and most importantly, we have unraveled a size limitation to applying the diffusion equations to nanoscale objects. This is because transport-diffusion coefficients, DT, and surface permeabilities, α, of methane in AFI become dependent on nanosheet thickness. Deviations can amount to factors of 2.9 and 1.4 for DT and α, respectively, when, in the worst case, results from the thinnest AFI nanosheet are compared with data from the thickest sheet. We present a molecular explanation of the size limitation that is based on memory effects of entering molecules and therefore only observable for smooth pores such as AFI and carbon nanotubes. Hence, our work provides important tools to accurately predict and intuitively understand transport of guest molecules into porous host structures, a fact that will become the more valuable the more tiny nanotechnological objects get.

Andrew Myers, Christopher McKee, Andrew Cunningham, Richard Klein, Mark Krumholz, "The Fragmentation of Magnetized, Massive Star-forming Cores with Radiative Feedback", The Astrophysical Journal, Volume 766, Issue 2, article id. 97, April 1, 2013,

M. Montanari, E. Chan, K. Larson, W. Yoo, R. H. Campbell, "Distributed security policy conformance", Computers & Security, March 31, 2013,

Wei Hu, Zhenyu Li and Jinlong Yang, "Electronic and optical properties of graphene and graphitic ZnO nanocomposite structures", J. Chem. Phys. 138, 124706 (2013), March 28, 2013, doi: 10.1063/1.4796602

Electronic and optical properties of graphene and graphitic ZnO (G/g-ZnO) nanocomposites have been investigated with density functional theory. Graphene interacts overall weakly with g-ZnO monolayer via van der Waals interaction. There is no charge transfer between the graphene and g-ZnO monolayer, while a charge redistribution does happen within the graphene layer itself, forming well-defined electron-hole puddles. When Al or Li is doped in the g-ZnO monolayer, substantial electron (n-type) and hole (p-type) doping can be induced in graphene, leading to well-separated electron-hole pairs at their interfaces. Improved optical properties in graphene/g-ZnO nanocomposite systems are also observed, with potential photocatalytic and photovoltaic applications.

Mehmet Balman, "Advance Resource Provisioning in Bulk Data Scheduling", 27th IEEE International Conference on Advanced Information Networking and Applications (AINA), 2013, LBNL 6364E, doi: 10.1109/AINA.2013.5

Today's scientific and business applications generate massive data sets that need to be transferred to remote sites for sharing, processing, and long term storage. Because of increasing data volumes and enhancements in current network technology that provide on-demand high-speed data access between collaborating institutions, data handling and scheduling problems have reached a new scale. In this paper, we present a new data scheduling model with advance resource provisioning, in which data movement operations are defined with earliest start and latest completion times. We analyze the time-dependent resource assignment problem, and propose a new methodology to improve the current systems by allowing researchers and higher-level meta-schedulers to use data-placement as-a-service, so they can plan ahead and submit transfer requests in advance. In general, scheduling with time and resource conflicts is NP-hard. We introduce an efficient algorithm to organize multiple requests on the fly, while satisfying users' time and resource constraints. We successfully tested our algorithm in a simple benchmark simulator that we have developed, and demonstrated its performance with initial test results.

Keywords: scheduling with constraints, bulk data movement, time-dependent graphs, network reservation, Gale-Shapley algorithm

Sean Peisert, Health Informatics Minute: Aligning Organizational and Employee Computer Security Goals for Health Informatics, Seventh Annual Health Informatics Graduate Program Conference, March 22, 2013,

Abhinav Sarje, Synchrotron Light-source Data Analysis through Massively-parallel GPU Computing, NVIDIA GPU Technology Conference (GTC), March 2013,

Light scattering techniques are widely used for the characterization of macromolecules and particle systems (ordered, partially-ordered or custom) based on their properties, such as their structures and sizes, at the micro/nano-scales. One of the major applications of these is in the characterization of materials for the design and fabrication of energy-relevant nanodevices, such as photovoltaic and energy storage devices. Although current high-throughput synchrotron light-sources can generate tremendous amounts of raw data at a high rate, analysis of this data for the characterization processes remains the primary bottleneck, demanding large amounts of computational resources. In this work, we are developing high-performance parallel algorithms and codes on large-scale GPU clusters to address this bottleneck. Here, we will discuss our efforts and experiences in developing "architecture-aware" hybrid multi-GPU multi-CPU codes for the two most important of these analysis steps. First is simulation of X-ray scattering patterns for any given sample morphology, and second is structural fitting of such scattering patterns. Both steps involve a large number of variable parameters, and hence, require high computational power.
Our X-ray scattering pattern simulation code is based on the Distorted Wave Born Approximation (DWBA) theory, and involves a large number of compute-intensive form-factor calculations. A form-factor is computed as an integral over the shape functions of the nanoparticles in a sample. A simulated sample structure is taken as an input in the form of discretized shape-surfaces, such as a triangulated surface. Resolutions of such discretization, as well as of a spatial 3-D grid involved, also contribute toward the compute-intensity of the simulations. Our code uses hybrid GPU and multicore CPU acceleration for generation of high-resolution scattering patterns, for the given input, using various possible values of the input parameters. These parameters include a number of sample definition, experimental setup and computational parameters. The patterns obtained through the X-ray scattering simulations carry vital information about the structural properties of the materials in the sample. In order to extract meaningful structural information from the scattering patterns, structural fitting, as an inverse modeling problem, is used. Our codes implement a fast and scalable solution to this process through a Reverse Monte Carlo (RMC) simulation algorithm. This process computes structure-factors in each simulation step until a result fits the input image pattern within the allowed error range. These computations require a large number of fast Fourier transform (FFT) calculations, which are also accelerated on hybrid GPU and CPU systems in our codes for high performance.
Our codes are designed as GPU-architecture-aware implementations, and deliver high performance through dynamic selection of the best-performing computational parameters, such as the computation decomposition parameters and block sizes, for the system being used. We perform a detailed study of the effects of these parameters on the code performance, and use this information to guide the parameter value selection process. We also carry out performance analysis of the optimized codes and study their scaling, including scaling to large GPU clusters. Our codes obtain near linear scaling with respect to the cluster size in our experiments, and we believe that these are "future-ready".

Jochen Mikosch, Jiaxu Zhang, Sebastian Trippel, Christoph Eichhorn, Rico Otto, Rui Sun, Wibe A de Jong, Matthias Weidemuller, William L Hase, Roland Wester, "Indirect dynamics in a highly exoergic substitution reaction.", Journal of the American Chemical Society, 2013, 135:4250-9, doi: 10.1021/ja308042v

Francisco J. Aragon Artacho, David H. Bailey, Jonathan M. Borwein, Peter B. Borwein, "Walking on real numbers", Mathematical Intelligencer, March 15, 2013, 35:42-60,

Anubhav Jain, Ivano E. Castelli, Geoffroy Hautier, David H. Bailey, Karsten W. Jacobsen, "Performance of genetic algorithms in search for water splitting perovskites", Journal of Materials Science, March 14, 2013,

Chang-Seo Park, Koushik Sen, Costin Iancu, "Scaling Data Race Detection for Partitioned Global Address Space Programs", Principles and Practice of Parallel Programming (PPoPP 2013), March 4, 2013,

Patrick Oesterling, Christian Heine, Gunther H. Weber, Gerik Scheuermann, "Visualizing global structure of nD point clouds as topological 1D height to enable supervised local data analysis", IEEE Transactions on Visualization and Computer Graphics, 2013, 19:514-526, LBNL 5694E,

A. S. Almgren, J. B. Bell, M.J. Lijewski, Z. Lukić, E. Van Andel, "Nyx: A Massively Parallel AMR Code for Computational Cosmology", The Astrophysical Journal, 2013, 765:39,

E. Vecharynski and A. Knyazev, "Absolute value preconditioning for symmetric indefinite linear systems", SIAM Journal on Scientific Computing Vol. 35, Issue 2, pp. A696-A718, 2013,

We introduce a novel strategy for constructing symmetric positive definite (SPD) preconditioners for linear systems with symmetric indefinite matrices. The strategy, called absolute value preconditioning, is motivated by the observation that the preconditioned minimal residual method with the inverse of the absolute value of the matrix as a preconditioner converges to the exact solution of the system in at most two steps. Neither the exact absolute value of the matrix nor its exact inverse are computationally feasible to construct in general. However, we provide a practical example of an SPD preconditioner that is based on the suggested approach. In this example we consider a model problem with a shifted discrete negative Laplacian and suggest a geometric multigrid (MG) preconditioner, where the inverse of the matrix absolute value appears only on the coarse grid, while operations on finer grids are based on the Laplacian. Our numerical tests demonstrate practical effectiveness of the new MG preconditioner, which leads to a robust iterative scheme with minimalist memory requirements. 
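The two-step convergence claim in the abstract above follows from a short linear algebra fact: for a symmetric nonsingular matrix A = Q D Q^T, the preconditioned matrix |A|^{-1} A equals Q sign(D) Q^T, whose only eigenvalues are +1 and -1, so it squares to the identity and a minimal residual method needs at most two steps in exact arithmetic. The following is only a minimal numerical sketch of that fact, not the authors' multigrid preconditioner; the 2x2 matrix, the rotation angle, and the helper names (`matmul`, `transpose`, `diag`) are assumptions chosen for illustration.

```python
import math

def matmul(X, Y):
    """Multiply two small dense matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]

def diag(values):
    """Build a diagonal matrix from a list of diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical symmetric indefinite example: A = Q D Q^T with eigenvalues 3 and -2.
theta = 0.7  # arbitrary rotation angle (assumption)
Q = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]
eigenvalues = [3.0, -2.0]

A = matmul(matmul(Q, diag(eigenvalues)), transpose(Q))

# |A| = Q |D| Q^T, so its inverse is Q |D|^{-1} Q^T.
abs_A_inv = matmul(matmul(Q, diag([1.0 / abs(v) for v in eigenvalues])),
                   transpose(Q))

# Preconditioned matrix T = |A|^{-1} A = Q sign(D) Q^T.
T = matmul(abs_A_inv, A)
T_squared = matmul(T, T)  # equals the identity up to rounding
```

Because `T_squared` is the identity, the minimal polynomial of the preconditioned matrix has degree at most two, which is what bounds the iteration count. In practice |A| is not available exactly, which is why the paper proposes a practical approximation.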

N. Aghaeepour, G. Finak, D. Dougall, A. Hadj-Khodabakhshi, P. Mah, G. Obermoser, J. Spidlen, I. Taylor, S. A Wuensch, J. Bramson, C. Eaves, A. P. Weng, E. Fortuno III, K. Ho, T. Kollmann, W. Rogers, S. Rosa, B. Dalal, A. Azad, A. Pothen, A. Brandes, H. Bretschneider, R. Bruggner, R. Finck, R. Jia, N. Zimmerman, M. Linderman, D. Dill, G. Nolan, C. Chan, F. Khettabi, K. Neill, M. Chikina, A. Gupta, P. Shooshtari, H. Zare, P. Jager, M. Jiang, J. Keilwagen, J. M. Maisog, P. Majek, J. Vilcek, T. Manninen, H. Huttunen, P. Ruusuvuori, M. Nykter, G. J. McLachlan, K. Wang, I. Naim, G. Sharma, R. Nikolic, S. Pyne, Y. Qian, P. Qiu, J. Quinn, A. Roth, R. Norel, G. Stolovitzky, P. Meyer, J. Saez-Rodriguez, M. Bhattacharjee, M. Biehl, P. Bucher, K. Bunte, B. Camillo, S. Dimitrieva, J. Grau, I. Grosse, S. Posch, N. Guex, J. Keilwagen, M. Kursa, B. Liu, M. Maienschein-Cline, T. Manninen, G. J. McLachlan, P. Seifert, M. Strickert, J. Vilar, H. Hoos, T. Mosmann, R. Brinkman, R. Gottardo, and R. Scheuermann, "Critical Assessment of Automated Flow Cytometry Analysis Techniques", Nature Methods, 2013,

Han Suk Kim, Didem Unat, Scott Baden, Jurgen Schulze, "A new approach to interactive viewpoint selection for volume data sets", Information Visualization, February 25, 2013,

Wei Hu, Xiaojun Wu, Zhenyu Li and Jinlong Yang, "Porous silicene as a hydrogen purification membrane", Phys. Chem. Chem. Phys., 2013, 15, 5753-5757, February 22, 2013, doi: 10.1039/C3CP00066D

We investigated theoretically the hydrogen permeability and selectivity of a porous silicene membrane via first-principles calculations. The subnanometer pores of the silicene membrane are designed as divacancy defects with octagonal and pentagonal rings (585-divacancy). The porous silicene exhibits high selectivity comparable with graphene-based membranes for hydrogen over various gas molecules (N2, CO, CO2, CH4, and H2O). The divacancy defects in silicene are chemically inert to the considered gas molecules. Our results suggest that the porous silicene membrane is expected to find great potential in gas separation and filtering applications.

Abhinav Sarje, Srinivas Aluru, "All-pairs computations on many-core graphics processors", Parallel Computing, 2013, 39-2:79-93, doi: 10.1016/j.parco.2013.01.002

Developing high-performance applications on emerging multi- and many-core architectures requires efficient mapping techniques and architecture-specific tuning methodologies to realize performance closer to their peak compute capability and memory bandwidth. In this paper, we develop architecture-aware methods to accelerate all-pairs computations on many-core graphics processors. Pairwise computations occur frequently in numerous application areas in scientific computing. While they appear easy to parallelize due to the independence of computing each pairwise interaction from all others, development of techniques to address multi-layered memory hierarchies, mapping within the restrictions imposed by the small and low-latency on-chip memories, striking the right balance between concurrency, reuse and memory traffic etc., are crucial to obtain high performance. We present a hierarchical decomposition scheme for GPUs based on decomposition of the output matrix and input data. We demonstrate that a careful tuning of the involved set of decomposition parameters is essential to achieve high efficiency on the GPUs. We also compare the performance of our strategies with an implementation on the STI Cell processor as well as multi-core CPU parallelizations using OpenMP and Intel Threading Building Blocks.
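The decomposition of the output matrix and input data mentioned in the abstract above can be illustrated with a small tiled loop. This is only a CPU-side sketch in plain Python, not the paper's GPU implementation: the function names (`all_pairs_tiled`, `all_pairs_naive`, `pairwise_distance_sq`), the squared-distance kernel, and the `tile` parameter are all hypothetical, and the slices merely stand in for staging input blocks in fast on-chip memory.

```python
def pairwise_distance_sq(p, q):
    """Example pairwise kernel (assumption): squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def all_pairs_naive(items, f):
    """Reference version: evaluate f on every ordered pair directly."""
    n = len(items)
    return [[f(items[i], items[j]) for j in range(n)] for i in range(n)]

def all_pairs_tiled(items, f, tile=2):
    """Compute the same n x n output one tile x tile block at a time."""
    n = len(items)
    out = [[None] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        row_block = items[i0:i0 + tile]      # inputs reused across a row of tiles
        for j0 in range(0, n, tile):
            col_block = items[j0:j0 + tile]  # inputs for this output tile
            for di, p in enumerate(row_block):
                for dj, q in enumerate(col_block):
                    out[i0 + di][j0 + dj] = f(p, q)
    return out

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (1.0, 1.0)]
tiled = all_pairs_tiled(points, pairwise_distance_sq, tile=2)
```

Each output tile touches only two input blocks, so the tile size controls how much data must sit in fast memory at once versus how often each input is reloaded. That is the kind of decomposition parameter the abstract says must be tuned per architecture.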

Richard L. Martin, Maciej Haranczyk, "Exploring frontiers of high surface area metal-organic frameworks", Chemical Science, February 6, 2013, 4:1781-1785,

Metal–organic frameworks (MOFs) have enjoyed considerable interest due to their high internal surface areas as well as tunable pore geometry and chemistry. However, design of optimal MOFs is a great challenge due to the significant number of possible structures. In this work, we present a strategy to rapidly explore the frontiers of these high surface area materials. Here, organic ligands are abstracted by geometrical (alchemical) building blocks, and an optimization of their defining geometrical parameters is performed to identify shapes of ligands which maximize gravimetric surface area of the resulting MOFs. A strength of our approach is that the space of ligands to be explored can be rigorously bounded, allowing discovery of the optimum ligand shape within any criteria, conforming to synthetic requirements or arbitrary exploratory limits. By modifying these bounds, we can project to what extent achievable surface area increases when moving beyond the present limits of organic synthesis. Projecting optimal ligand shapes onto real chemical species, we achieve blueprints for MOFs of various topologies that are predicted to achieve up to 70% higher surface area than the current benchmark materials.

David H. Bailey, Jonathan M. Borwein, Richard E. Crandall, John Zucker, "Lattice sums arising from the Poisson equation", Journal of Physics A: Mathematical and Theoretical, February 5, 2013, to appear,

E. Wes Bethel, Prabhat, Suren Byna, Oliver Rübel, K. John Wu, and Michael Wehner, "Why High Performance Visual Data Analytics is both Relevant and Difficult", Proceedings of Visualization and Data Analysis 2013, IS&T/SPIE Electronic Imaging 2013, San Francisco, CA, USA, SPIE, February 2013, LBNL LBNL-6063E,

Victoria Stodden, David H. Bailey, Jonathan M. Borwein, Randall J. LeVeque, William Rider, William Stein, "Setting the default to reproducible: Reproducibility in computational and experimental mathematics", February 2, 2013,

Wei Hu, Zhenyu Li and Jinlong Yang, "Diamond as an inert substrate of graphene", J. Chem. Phys. 138, 054701 (2013), February 1, 2013, doi: 10.1063/1.4789420

Interaction between graphene and semiconducting diamond substrate has been examined with large-scale density functional theory calculations. Clean and hydrogenated diamond (100) and (111) surfaces have been studied. It turns out that weak van der Waals interactions dominate for graphene on all these surfaces. High carrier mobility of graphene is almost not affected, except for a negligible energy gap opening at the Dirac point. No charge transfer between graphene and diamond (100) surfaces is detected, while different charge-transfer complexes are formed between graphene and diamond (111) surfaces, inducing either p-type or n-type doping on graphene. Therefore, diamond can be used as an excellent substrate of graphene, which almost keeps its electronic structures at the same time providing the flexibility of charge doping.

Sean Whalen, Sean Peisert, Matt Bishop, "Multiclass Classification of Distributed Memory Parallel Computations", Pattern Recognition Letters (PRL), February 2013, 34(3):322-329, doi: 10.1016/j.patrec.2012.10.007

Abhinav Sarje, Samuel Williams, David H. Bailey, "MPQC: Performance analysis and optimization", LBNL Technical Report, February 2013, LBNL 6076E,

Laurent Bouchet, Patrick Amestoy, Alfredo Buttari, François-Henry Rouet, Maxime Chauvin, "Simultaneous analysis of large INTEGRAL/SPI datasets: optimizing the computation of the solution and its variance using sparse matrix algorithms", Astronomy and Computing, February 1, 2013, 1:59--69, doi: 10.1016/j.ascom.2013.03.002

Kumari Gaurav Rana, Takeaki Yajima, Subir Parui, Alexander F. Kemper, Thomas P. Devereaux, Yasuyuki Hikita, Harold Y. Hwang, Tamalika Banerjee, "Hot electron transport in a strongly correlated transition-metal oxide", Nature Scientific Reports, Volume 3, id. 1274 (2013), February 2013,

Oxide heterointerfaces are ideal for investigating strong correlation effects to electron transport, relevant for oxide-electronics. Using hot-electrons, we probe electron transport perpendicular to the La0.7Sr0.3MnO3 (LSMO)- Nb-doped SrTiO3 (Nb:STO) interface and find the characteristic hot-electron attenuation length in LSMO to be 1.48 +/- 0.10 unit cells (u.c.) at -1.9 V, increasing to 2.02 +/- 0.16 u.c. at -1.3 V at room temperature. Theoretical analysis of this energy dispersion reveals the dominance of electron-electron and polaron scattering. Direct visualization of the local electron transport shows different transmission at the terraces and at the step-edges.

K. Hu, A. Sim, D. Antoniades, C. Dovrolis, Statistical Prediction Models for Network Traffic Performance, the APAN 35 conference and the Winter 2013 ESCC/Internet2 Joint Techs meeting (TIP2013), 2013,

Taghrid Samak, Christine Morin, David H. Bailey, "Energy consumption models and predictions for large-scale systems", Proceedings of the Ninth Workshop on High-Performance, Power-Aware Computing, January 22, 2013, to appear,

M. Dandouna, N. Emad and L.A. Drummond, "A Proposed Programming Model for Writing Sustainable Numerical Libraries for Extreme Scale Computing", Conc. and Compt., January 16, 2013,

The promise of computer systems with very large orders of processing elements cannot be realized without an effective solution that targets the programming model with a suitable programming environment. Nowadays, it is necessary to identify and rapidly make available robust software technologies to enable high-end computer applications to run efficiently on these emerging systems, and to enable the development of more complex and capable simulation codes for scientific and engineering applications. We review some of the numerical libraries that have achieved modularity, scalability and extensibility thanks to their use of object-oriented programming approaches. However, only a few of these libraries have managed to effectively implement sequential and parallel code reusability.

Here, we discuss what is currently missing from existing library implementations and propose a programming model based on a modular and multi-level parallelism approach that has a strict separation between computational operations, data management and communication. We illustrate how this model makes it possible to design more scalable libraries by better exploiting their functionalities, and even enables the formulation of hybrid numerical schemes to be run efficiently on multi-level parallel systems with a large number of heterogeneous processing units without confining the parallelism to the programming model of the communication library. We use the multiple explicitly restarted Arnoldi method as our test case, and our implementations require full reuse of serial/parallel kernels. Our experiments include comparisons with state-of-the-art numerical libraries on high-end computing systems.

Wei Hu, Zhenyu Li, Jinlong Yang and Jianguo Hou, "Nondecaying long range effect of surface decoration on the charge state of NV center in diamond", J. Chem. Phys. 138, 034702 (2013), January 15, 2013, doi: 10.1063/1.4775364

P. Ghysels, T. J. Ashby, K. Meerbergen, W. Vanroose, "Hiding Global Communication Latency in the GMRES Algorithm on Massively Parallel Machines", SIAM Journal on Scientific Computing, January 8, 2013, 35:1, doi: 10.1137/12086563X

Kesheng Wu, Wes Bethel, Ming Gu, David, Oliver Rübel, "A Big Data Approach to Analyzing Market Volatility", Algorithmic Finance, 2013, 2:241--267, LBNL LBNL-6382E, doi: 10.3233/AF-13030

Understanding the microstructure of the financial market requires the processing of a vast amount of data related to individual trades, and sometimes even multiple levels of quotes. Analyzing such a large volume of data requires tremendous computing power that is not easily available to financial academics and regulators. Fortunately, publicly funded High Performance Computing (HPC) power is widely available at the National Laboratories in the US. In this paper we demonstrate that the HPC resource and the techniques for data-intensive sciences can be used to greatly accelerate the computation of an early warning indicator called Volume-synchronized Probability of Informed trading (VPIN). The test data used in this study contains five and a half years' worth of trading data for about 100 of the most liquid futures contracts, includes about 3 billion trades, and takes 140 GB as text files. By (1) using a more efficient file format for storing the trading records, (2) using more effective data structures and algorithms, and (3) parallelizing the computations, we are able to explore 16,000 different ways of computing VPIN in less than 20 hours on a 32-core IBM DataPlex machine. Our test demonstrates that a modest computer is sufficient to monitor a vast number of trading activities in real time -- an ability that could be valuable to regulators.

Our test results also confirm that VPIN is a strong predictor of liquidity-induced volatility. With appropriate parameter choices, the false positive rates are about 7% averaged over all the futures contracts in the test data set. More specifically, when VPIN values rise above a threshold (CDF > 0.99), the volatility in the subsequent time windows is higher than the average in 93% of the cases.

J. M. Silverman, J. Vinko, M. M. Kasliwal, O. D., Y. Cao, J. Johansson, D. A. Perley, D., J. C. Wheeler, R. Amanullah, I. Arcavi, J. S., A. Gal-Yam, A. Goobar, S. R. Kulkarni, R., W. H. Lee, G. H. Marion, P. E. Nugent, I. Shivvers, "SN 2000cx and SN 2013bh: extremely rare, nearly twin Type Ia supernovae", Monthly Notices of the RAS, 2013, 436:1225-1237, doi: 10.1093/mnras/stt1647

DF Johnson, K Bhaskaran-Nair, EJ Bylaska, WA De Jong, "Thermodynamics of tetravalent thorium and uranium complexes from first-principles calculations", Journal of Physical Chemistry A, 2013, 117:4988--4995, doi: 10.1021/jp404656y

S.L. Cornford, D.F. Martin, D.T. Graves, D.F. Ranken, A.M. Le Brocq, R.M. Gladstone, A.J. Payne, E.G. Ng, W.H. Lipscomb, "Adaptive mesh, finite volume modeling of marine ice sheets", Journal of Computational Physics, 232(1):529-549, 2013,

Kesheng Wu, Wes Bethel, Ming Gu, David, Oliver Rübel, Testing VPIN on Big Data, Available at SSRN 2318259, 2013,

U. Feindt, M. Kerschhaggl, M. Kowalski, G. Aldering, P., C. Aragon, S. Bailey, C. Baltay, S., C. Buton, A. Canto, F. Cellier-Holzem, M., N. Chotard, Y. Copin, H. K. Fakhouri, E., J. Guy, A. Kim, P. Nugent, J., K. Paech, R. Pain, E. Pecontal, R., S. Perlmutter, D. Rabinowitz, M., K. Runge, C. Saunders, R. Scalzo, G., C. Tao, R. C. Thomas, B. A. Weaver, C. Wu, "Measuring cosmic bulk flows with Type Ia supernovae from the Nearby Supernova Factory", Astronomy and Astrophysics, 2013, 560:A90, doi: 10.1051/0004-6361/201321880

Karol Kowalski, Kiran Bhaskaran-Nair, Jiří Brabec, Jiří Pittner, "Coupled Cluster Theories for Strongly Correlated Molecular Systems", Springer Series in Solid-State Sciences, (Springer Berlin Heidelberg: 2013) Pages: 237-271 doi: 10.1007/978-3-642-35106-8_9

B. Dong, S. Byna, K. Wu, "SDS: a framework for scientific data services", Proceedings of the 8th Parallel Data Storage, January 1, 2013, doi: 10.1145/2538542.2538563

S. Taubenberger, M. Kromer, S. Hachinger, P. A., S. Benetti, P. E. Nugent, R. A. Scalzo, R., V. Stanishev, J. Spyromilio, F. Bufano, S. A. Sim, B. Leibundgut, W. Hillebrandt, "Super-Chandrasekhar Type Ia Supernovae at nebular epochs", Monthly Notices of the RAS, 2013, 432:3117-3130, doi: 10.1093/mnras/stt668

D Unat, CP Chan, W Zhang, J Bell, J Shalf, Tiling as a Durable Abstraction for Parallelism and Data Locality, January 1, 2013,

M. Rigault, Y. Copin, G. Aldering, P. Antilogus, C., S. Bailey, C. Baltay, S. Bongard, C., A. Canto, F. Cellier-Holzem, M. Childress, N., H. K. Fakhouri, U. Feindt, M. Fleury, E., P. Greskovic, J. Guy, A. G. Kim, M., S. Lombardo, J. Nordin, P. Nugent, R., E. Pécontal, R. Pereira, S. Perlmutter, D., K. Runge, C. Saunders, R. Scalzo, G., C. Tao, R. C. Thomas, B. A. Weaver, "Evidence of environmental dependencies of Type Ia supernovae from the Nearby Supernova Factory indicated by local H$\alpha$", Astronomy and Astrophysics, 2013, 560:A66, doi: 10.1051/0004-6361/201322104

Eugen Feller, Lavanya Ramakrishnan, Christine Morin, "On the Performance and Energy Efficiency of Hadoop Deployment Models", The IEEE International Conference on Big Data 2013 (IEEE BigData 2013), Santa Clara, U.S.A, 2013,

W Zhang, L Howell, A Almgren, A Burrows, J Dolence, J Bell, "Castro: A new compressible astrophysical solver. III. Multigroup radiation hydrodynamics", Astrophysical Journal, Supplement Series, 2013, 204, doi: 10.1088/0067-0049/204/1/7

Jan Rajczak, Pardeep Pall, Christoph Schär, "Projections of extreme precipitation events in regional climate simulations for Europe and the Alpine Region", Journal of Geophysical Research: Atmospheres, 2013,

D. J. Frame, D. A. Stone, "Assessment of the first consensus prediction on climate change", Nature Climate Change, 2013, 3:357--359,

W. Zheng, J. M. Silverman, A. V. Filippenko, D., P. E. Nugent, M. Graham, X. Wang, S., F. Ciabattari, P. L. Kelly, O. D. Fox, I., K. I. Clubb, S. B. Cenko, D. Balam, D. A., E. Hsiao, W. Li, G. H. Marion, D., J. Vinko, J. C. Wheeler, J. Zhang, "The Very Young Type Ia Supernova 2013dy: Discovery, and Strong Carbon Absorption in Early-time Spectra", Astrophysical Journal Letters, 2013, 778:L15, doi: 10.1088/2041-8205/778/1/L15

M. Zingale, A. Nonaka, A. S. Almgren, J. B. Bell, C. Malone, and R. Orvedahl, "Low Mach Number Modeling of Convection in Helium Shells on Sub-Chandrasekhar White Dwarfs. I. Methodology", Astrophysical Journal, 2013, 764:97,

D. A. Stone, C. J. Paciorek, Prabhat, P. Pall, M. F. Wehner, "Inferring the anthropogenic contribution to local temperature extremes", Proceedings of the National Academy of Sciences, 2013, 110:E1543, doi: 10.1073/pnas.1221461110

L. P. Singer, S. B. Cenko, M. M. Kasliwal, D. A., E. O. Ofek, D. A. Brown, P. E. Nugent, S. R., A. Corsi, D. A. Frail, E. Bellm, J., I. Arcavi, T. Barlow, J. S. Bloom, Y., N. Gehrels, A. Horesh, F. J. Masci, J., A. Rau, J. A. Surace, O. Yaron, "Discovery and Redshift of an Optical Afterglow in 71 deg$^2$: iPTF13bxl and GRB 130702A", Astrophysical Journal Letters, 2013, 776:L34, doi: 10.1088/2041-8205/776/2/L34

Haitao Ma, Stan Woosley, Chris Malone, Ann Almgren, and J.B. Bell, "Carbon Deflagration in Type Ia Supernovae: I. Centrally Ignited Models", Astrophysical Journal, 2013, 771:58,

R. L. Barone, P. Nugent, "Erratum: Near-infrared observations of type Ia supernovae: The known standard candle for cosmology", Monthly Notices of the Royal Astronomical Society, 2013, L90, doi: 10.1093/mnrasl/slt038

Peter A Stott, Myles Allen, Nikolaos Christidis, Randall M Dole, Martin Hoerling, Chris Huntingford, Pardeep Pall, Judith Perlwitz, Dáithí Stone, "Attribution of weather and climate-related events", Climate Science for Serving Society, (Springer: 2013) Pages: 307--337

Y. Cao, M. M. Kasliwal, I. Arcavi, A. Horesh, P., S. Valenti, S. B. Cenko, S. R. Kulkarni, A., E. Gorbikov, E. O. Ofek, D. Sand, O., M. Graham, J. M. Silverman, J. C. Wheeler, G. H., E. S. Walker, P. Mazzali, D. A. Howell, K. L., A. K. H. Kong, J. S. Bloom, P. E. Nugent, J., F. Masci, J. Carpenter, N. Degenaar, C. R. Gelino, "Discovery, Progenitor and Early Evolution of a Stripped Envelope Supernova iPTF13bvn", Astrophysical Journal Letters, 2013, 775:L7, doi: 10.1088/2041-8205/775/1/L7

C. Gilet, A.S. Almgren, J.B. Bell, A. Nonaka, S.E. Woosley and M. Zingale, "Low-Mach Number Modeling of Core Convection in Massive Stars", Astrophysical Journal, 2013, 773:137,

J. M. Silverman, P. E. Nugent, A. Gal-Yam, M., D. A. Howell, A. V. Filippenko, Y.-C. Pan, S. B. Cenko, I. M. Hook, "Late-time Spectral Observations of the Strongly Interacting Type Ia Supernova PTF11kx", Astrophysical Journal, 2013, 772:125, doi: 10.1088/0004-637X/772/2/125

Elif Dede, Madhusudhan Govindaraju, Daniel Gunter, Richard Canon, Lavanya Ramakrishnan, "Semi-Structured Data Analysis using MongoDB and MapReduce: A Performance Evaluation", Proceedings of the 4th international workshop on Scientific cloud computing, 2013,

K. Chen, A. Heger, S. Woosley, A. Almgren, and W. Zhang, "The Most Powerful Stellar Explosion", Bulletin of the American Physical Society, 2013, 58(4),

R Atta-Fynn, EJ Bylaska, WA De Jong, "Importance of counteranions on the hydration structure of the curium ion", Journal of Physical Chemistry Letters, 2013, 4:2166--2170, doi: 10.1021/jz400887a

J. M. Silverman, P. E. Nugent, A. Gal-Yam, M., D. A. Howell, A. V. Filippenko, I., S. Ben-Ami, J. S. Bloom, S. B. Cenko, Y., R. Chornock, K. I. Clubb, A. L. Coil, R. J., M. L. Graham, C. V. Griffith, A., M. M. Kasliwal, S. R. Kulkarni, D. C., W. Li, T. Matheson, A. A. Miller, M., E. O. Ofek, Y.-C. Pan, D. A. Perley, D., R. M. Quimby, T. N. Steele, A. Sternberg, D. Xu, O. Yaron, "Type Ia Supernovae Strongly Interacting with Their Circumstellar Medium", Astrophysical Journal Supplement, 2013, 207:3, doi: 10.1088/0067-0049/207/1/3

E. Masanet, A. Shehabi, L. Ramakrishnan, J. Liang, X. Ma, B. Walker, V. Hendrix, P Mantha, "The Energy Efficiency Potential of Cloud-Based Software: A U.S. Case Study", 2013, LBNL 6298E,

Ke-Jung Chen, Alexander Heger, and Ann S. Almgren, "Numerical Approaches for Multidimensional Simulations of Stellar Explosions", Astronomy and Computing, January 2013, 3-4:70-78,

George Michelogiannakis, William J. Dally, "Elastic Buffer Flow Control for On-Chip Networks", Transactions on Computers, 2013,

Networks-on-chip (NoCs) were developed to meet the communication requirements of large-scale systems. The majority of current NoCs spend considerable area and power on router buffers. In our past work, we have developed elastic buffer (EB) flow control which adds simple control logic in the channels to use pipeline flip-flops (FFs) as EBs with two storage locations. This way, channels act as distributed FIFOs and input buffers are no longer required. Removing buffers and virtual channels (VCs) significantly simplifies router design. Compared to VC networks, EB networks provide up to 45% shorter cycle time, 16% more throughput per unit power or 22% more throughput per unit area. EB networks provide traffic classes using duplicate physical subnetworks. However, this approach negates the cost gains or becomes infeasible for a large number of traffic classes. Therefore, in this paper we propose a hybrid EB-VC router which provides an arbitrary number of traffic classes by using an input buffer to drain flits facing severe contention or deadlock. Thus, hybrid routers operate as EB routers in the common case, and as VC routers when necessary. For this reason, the hybrid EB-VC scheme offers 21% more throughput per unit power than VC networks and 12% more than EB networks.

S. B. Cenko, S. R. Kulkarni, A. Horesh, A. Corsi, D. B., J. Carpenter, D. A. Frail, P. E. Nugent, D. A., D. Gruber, A. Gal-Yam, P. J. Groot, G., E. O. Ofek, A. Rau, C. L. MacLeod, A. A., J. S. Bloom, A. V. Filippenko, M. M., N. M. Law, A. N. Morgan, D. Polishook, D., R. M. Quimby, B. Sesar, K. J. Shen, J. M. Silverman, A. Sternberg, "Discovery of a Cosmological, Relativistic Outburst via its Rapidly Fading Optical Emission", Astrophysical Journal, 2013, 769:130, doi: 10.1088/0004-637X/769/2/130

Lavanya Ramakrishnan, Iwona Sakrejda, Richard Shane Canon and Nicholas Wright, "CAMP", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2013)

Cy Chan, Joseph Kenny, Gilbert Hendry, Didem Unat, Vincent Beckner, John Bell and John Shalf, "An AMR Computation and Communication Dependency and Analysis Methodology", IA^3 2013 - SC13 Workshop on Irregular Applications: Architectures and Algorithms, Denver, CO, January 1, 2013,

A. J. Chorin, M. Morzfeld, X. Tu, "A survey of implicit particle filters for data assimilation", Statistics for Financial Engineering and Econometrics: State-Space Models and Applications in Economics and Finance, edited by S. Wu, (Springer. In print: 2013)

R. Pereira, R. C. Thomas, G. Aldering, P. Antilogus, C., S. Benitez-Herrera, S. Bongard, C., A. Canto, F. Cellier-Holzem, J. Chen, M., N. Chotard, Y. Copin, H. K. Fakhouri, M., D. Fouchez, E. Gangler, J. Guy, W., E. Y. Hsiao, M. Kerschhaggl, M., M. Kromer, J. Nordin, P. Nugent, K., R. Pain, E. Pécontal, S. Perlmutter, D., M. Rigault, K. Runge, C. Saunders, G., C. Tao, S. Taubenberger, A. Tilquin, C. Wu, "Spectrophotometric time series of SN 2011fe from the Nearby Supernova Factory", Astronomy and Astrophysics, 2013, 554:A27, doi: 10.1051/0004-6361/201221008

W. Gu, J. Choi, M. Gu, H. D. Simon, K., "Fast Change Point Detection for electricity market", 2013 IEEE International Conference on Big Data, 2013, 50--57, doi: 10.1109/BigData.2013.6691733

Seasonal extreme daily precipitation is analyzed in the ensemble of NARCCAP regional climate models. Significant variation in these models’ abilities to reproduce observed precipitation extremes over the contiguous United States is found. Model performance metrics are introduced to characterize overall biases, seasonality, spatial extent and the shape of the precipitation distribution. Comparison of the models to gridded observations that include an elevation correction is found to be better than to gridded observations without this correction. A complicated model weighting scheme based on model performance in simulating observations is found to cause significant improvements in ensemble mean skill only if some of the models are poorly performing outliers. The effect of lateral boundary conditions is explored by comparing the integrations driven by reanalysis to those driven by global climate models. Projected mid-century future changes in seasonal precipitation means and extremes are presented, along with discussions of the sources of uncertainty and the mechanisms causing these changes.

X. Rodó, M. Pascual, F. J. Doblas-Reyes, A. Gershunov, D. A. Stone, F. Giorgi, P. J. Hudson, J. Kinter, M.-À. Rodríguez-Arias, N. Ch. Stenseth, D. Alonso, J. García-Serrano, A. P. Dobson, "Climate change and infectious diseases: Can we meet the needs for better prediction?", Clim. Change, 2013, 118:625--640,

D. Levitan, T. Kupfer, P. J. Groot, S. R. Kulkarni, T. A., G. V. Simonian, I. Arcavi, J. S. Bloom, R., P. E. Nugent, E. O. Ofek, B. Sesar, J. Surace, "Five new outbursting AM CVn systems discovered by the Palomar Transient Factory", Monthly Notices of the RAS, 2013, 430:996-1007, doi: 10.1093/mnras/sts672

HJJ Van Dam, A Vishnu, WA De Jong, "A case for soft error detection and correction in computational chemistry", Journal of Chemical Theory and Computation, 2013, 9:3995--4005, doi: 10.1021/ct400489c

Kuan-Wu Lin, Surendra Byna, Jerry Chou, Wu, "Optimizing FastQuery performance on Lustre file", Proceedings of the 25th International Conference on and Statistical Database Management, 2013, 29,

A.S. Almgren, A.J. Aspden, J. B. Bell, and M. L. Minion, "On the Use of Higher-Order Projection Methods for Incompressible Turbulent Flow", SIAM J. Sci. Comput., 2013, 35(1):B35-B42,

E. O. Ofek, D. Fox, S. B. Cenko, M. Sullivan, O., D. A. Frail, A. Horesh, A. Corsi, R. M., N. Gehrels, S. R. Kulkarni, A., P. E. Nugent, O. Yaron, A. V. Filippenko, M. M., L. Bildsten, J. S. Bloom, D., I. Arcavi, R. R. Laher, D. Levitan, B. Sesar, J. Surace, "X-Ray Emission from Supernovae in Dense Circumstellar Matter Environments: A Search for Collisionless Shocks", Astrophysical Journal, 2013, 763:42, doi: 10.1088/0004-637X/763/1/42

The optical light curve of some supernovae (SNe) may be powered by the
outward diffusion of the energy deposited by the explosion shock (the
so-called shock breakout) in optically thick (

Alex Romosan, Arie Shoshani, Kesheng Wu, Markowitz, Kostas Mavrommatis, "Accelerating gene context analysis using bitmaps", Proceedings of the 25th International Conference on and Statistical Database Management, 2013, 26, LBNL 6397E, doi: 10.1145/2484838.2484856

S. Tang, Y. Cao, L. Bildsten, P. Nugent, E., S. R. Kulkarni, R. Laher, D. Levitan, F., E. O. Ofek, T. A. Prince, B. Sesar, J. Surace, "R Coronae Borealis Stars in M31 from the Palomar Transient Factory", Astrophysical Journal Letters, 2013, 767:L23, doi: 10.1088/2041-8205/767/2/L23

SO Odoh, EJ Bylaska, WA De Jong, "Coordination and hydrolysis of plutonium ions in aqueous solution using Car-Parrinello molecular dynamics free energy simulations", Journal of Physical Chemistry A, 2013, 117:12256--122, doi: 10.1021/jp4096248

Jarrod R McClean, John A Parkhill, Alan Aspuru-Guzik, "Feynman’s clock, a new variational principle, and parallel-in-time quantum dynamics", Proceedings of the National Academy of Sciences, 2013, 110:E3901--E39,

C. Buton, Y. Copin, G. Aldering, P. Antilogus, C., S. Bailey, C. Baltay, S. Bongard, A., F. Cellier-Holzem, M. Childress, N., H. K. Fakhouri, E. Gangler, J. Guy, E. Y., M. Kerschhaggl, M. Kowalski, S., P. Nugent, K. Paech, R. Pain, E., R. Pereira, S. Perlmutter, D., M. Rigault, K. Runge, R. Scalzo, G., C. Tao, R. C. Thomas, B. A. Weaver, C. Wu, Nearby SuperNova Factory, "Atmospheric extinction properties above Mauna Kea from the Nearby SuperNova Factory spectro-photometric data set", Astronomy and Astrophysics, 2013, 549:A8, doi: 10.1051/0004-6361/201219834

G. Hansen, D. Stone, M. Auffhammer, "Detection and attribution of climate change impacts -- is a universal discipline possible?", Proceedings of Impacts World 2013, Potsdam, Germany, 2013, http://www,

E. Y. Hsiao, G. H. Marion, M. M. Phillips, C. R., C. Winge, N. Morrell, C. Contreras, W. L., M. Kromer, E. E. E. Gall, C. L., P. Höflich, M. Im, Y. Jeon, R. P., P. E. Nugent, S. E. Persson, G., M. Roth, V. Stanishev, M. Stritzinger, N. B. Suntzeff, "The Earliest Near-infrared Time-series Spectroscopy of a Type Ia Supernova", Astrophysical Journal, 2013, 766:72, doi: 10.1088/0004-637X/766/2/72

David H. Bailey, "A New Kind of Science: Ten years later", Irreducibility and Computational Equivalence, edited by Hector Zenil, (Springer: 2013)

C. Huggel, D. Stone, M. Auffhammer, G. Hansen, "Loss and damage attribution", Nature Climate Change, 2013, 3:694--696,

E. O. Ofek, M. Sullivan, S. B. Cenko, M. M. Kasliwal, A., S. R. Kulkarni, I. Arcavi, L. Bildsten, J. S., A. Horesh, D. A. Howell, A. V. Filippenko, R., D. Murray, E. Nakar, P. E. Nugent, J. M., N. J. Shaviv, J. Surace, O. Yaron, "An outburst from a massive star 40 days before a supernova explosion", Nature, 2013, 494:65-67, doi: 10.1038/nature11877

Jarrod McClean, Christopher Stull, Charles Farrar, David Mascarenas, "A preliminary cyber-physical security assessment of the Robot Operating System (ROS)", SPIE Defense, Security, and Sensing, 2013, 874110--87,

Juan A Colmenares, Gage Eads, Steven Hofmeyr, Sarah Bird, Miquel Moretó, David Chou, Brian Gluzman, Eric Roman, Davide B Bartolini, Nitesh Mor, others, "Tessellation: refactoring the OS around explicit resource containers with continuous adaptation", Proceedings of the 50th Annual Design Automation Conference, 2013, 76,

D. Stone, M. Auffhammer, M. Carey, G. Hansen, C. Huggel, W. Cramer, D. Lobell, U. Molau, A. Solow, L. Tibig, G. Yohe, "The challenge to detect and attribute effects of climate change on human and natural systems", Clim. Change, 2013, 121:381--395, doi: 10.1007/s10584-013-0873-6

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki, Nicholas J. Wright, "Magellan - A Testbed to Explore Cloud Computing for Science", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2013)

Jack Dongarra, Mathieu Faverge, Thomas Herault, Mathias Jacquelin, Julien Langou, Yves Robert, "Hierarchical QR factorization algorithms for multi-core clusters", Parallel Computing, 2013, 39:212--232,

Steven Hofmeyr, Tyler Moore, Stephanie Forrest, Benjamin Edwards, George Stelle, "Modeling internet-scale policies for cleaning up malware", Economics of Information Security and Privacy III, Springer New York, 2013, 149--170,

A. Horesh, C. Stockdale, D. B. Fox, D. A. Frail, J., S. R. Kulkarni, E. O. Ofek, A., M. M. Kasliwal, I. Arcavi, R. Quimby, S. B., P. E. Nugent, J. S. Bloom, N. M. Law, D., E. Gorbikov, D. Polishook, O. Yaron, S., K. W. Weiler, F. Bauer, S. D. Van Dyk, S., N. Panagia, D. Pooley, N. Kassim, "An early and comprehensive millimetre and centimetre wave and X-ray study of SN 2011dh: a non-equipartition blast wave expanding into a massive stellar wind", Monthly Notices of the RAS, 2013, 436:1258-1267, doi: 10.1093/mnras/stt1645

Valerie Hendrix, Lavanya Ramakrishnan, Youngryel Ryu, Catharine van Ingen, Keith R. Jackson, Deborah Agarwal, "CAMP: Community access MODIS pipeline", Future Generation Computer Systems, 2013,

WA De Jong, AM Walker, MD Hanwell, "From data to analysis: Linking NWChem and Avogadro with the syntax and semantics of Chemical Markup Language", Journal of Cheminformatics, 2013, 5, doi: 10.1186/1758-2946-5-25

2012

Lavanya Ramakrishnan, Adam Scovel, Iwona Sakrejda, Susan Coghlan, Shane Canon, Anping Liu, Devarshi Ghoshal, Krishna Muriki, Nicholas J. Wright, "Magellan - A Testbed to Explore Cloud Computing for Science", On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, (Chapman & Hall/CRC Press: 2012)

Mitesh Meswani, Laura Carrington, Didem Unat, Allan Snavely, Scott Baden, Stephen Poole, "Modeling and predicting performance of high performance computing applications on hardware accelerators", International Journal of High Performance Computing Applications, December 28, 2012,

P. Maris, H. M. Aktulga, M. A. Caprio, U. V. Catalyurek, E. G. Ng, D. Oryspayev, H. Potter, E. Saule, M. Sosonkina, J. P. Vary, C. Yang, Z. Zhou, "Large-scale Ab-initio Configuration Interaction Calculations for Light Nuclei", J. Phys. Conf. Ser., IOP Publishing, December 18, 2012, 403:012019, doi: 10.1088/1742-6596/403/1/012019

H.-M. Eiter, M. Lavagnini, R. Hackl, E.A. Nowadnick, A.F. Kemper, T.P. Devereaux, J.-H. Chu, J.G. Analytis, I.R. Fisher, L. Degiorgi, "Alternative route to charge density wave formation in multiband systems", Proceedings of the National Academy of Sciences, 2012, doi: 10.1073/pnas.1214745110

Charge and spin density waves, periodic modulations of the electron and magnetization densities, respectively, are among the most abundant and nontrivial low-temperature ordered phases in condensed matter. The ordering direction is widely believed to result from the Fermi surface topology. However, several recent studies indicate that this common view needs to be supplemented. Here, we show how an enhanced electron–lattice interaction can contribute to or even determine the selection of the ordering vector in the model charge density wave system ErTe3. Our joint experimental and theoretical study allows us to establish a relation between the selection rules of the electronic light scattering spectra and the enhanced electron–phonon coupling in the vicinity of band degeneracy points. This alternative proposal for charge density wave formation may be of general relevance for driving phase transitions into other broken-symmetry ground states, particularly in multiband systems, such as the iron-based superconductors.

Planck Collaboration, "Planck Intermediate Results. X. Physics of the hot gas in the Coma cluster", ArXiv e-prints, December 10, 2012,

Xiao Li, Zhifang Wang, Vishak Muthukumar, Anna Scaglione, Chuck McParland, Sean Peisert, "Networked Loads in the Distribution Grid", Proceedings of the 2012 APSIPA Annual Summit and Conference, Hollywood, CA, December 3, 2012,

David H. Bailey and Jonathan M. Borwein, "Compressed lattice sums arising from the Poisson equation: Dedicated to Professor Hari Sirvastava", Boundary Value Problems, special issue, Proceedings of the International Congress in Honour of Professor Hari M. Srivastava, December 2, 2012, to appear,

H. Hu, C. Yang, K. Zhao, "Absorption correction A* for cylindrical and spherical crystals with extended range and high accuracy calculated by Thorkildsen & Larsen analytical method", in press Acta Crystallographica, A, 2012,

Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian Van Straalen, Mikhail Smelyanskiy, Ann Almgren, Pradeep Dubey, John Shalf, Leonid Oliker, "Implementation and Optimization of miniGMG - a Compact Geometric Multigrid Benchmark", December 2012, LBNL 6676E,

David H. Bailey, Jonathan M. Borwein, Christian S. Calude, Michael J. Dinneen, Monica Dumitrescu, Alex Yee, "An empirical approach to the normality of pi", Experimental Mathematics, 2012, 21:375-384,

Seung-Ki Min, Xuebin Zhang, Francis Zwiers, Hideo Shiogama, Yu-Shiang Tung, and Michael Wehner, "Multi-Model Detection and Attribution of Extreme Temperature Changes", Journal of Climate (submitted), 2012,

Donald Wuebbles, Gerald Meehl, Katharine Hayhoe, Thomas R. Karl, Kenneth Kunkel, Benjamin Santer, Michael Wehner, Brian Colle, Erich M. Fischer, Rong Fu, Alex Goodman, Emily Janssen, Huikyo Lee, Wenhong Li, Lindsey N. Long, Seth Olsen, Anji Seth, Justin Sheffield, Liqiang Sun, "CMIP5 Climate Model Analyses: Climate Extremes in the United States", Bulletin of the American Meteorological Society (submitted), 2012,

C. L. Morris, Konstantin Borozdin, Jeffrey Bacon, Elliott Chen, Zarija Lukić, Edward Milner, Haruo Miyadera, John Perry, Dave Schwellenbach, Derek Aberle, Wendi Dreesen, J. Andrew Green, George G. McDuff, Kanetada Nagamine, Michael Sossong, Candace Spore, Nathan Toleman, "Obtaining material identification with cosmic ray radiography", AIP Advances, 2012, 2:042128,

Samuel Williams, Optimization of Geometric Multigrid for Emerging Multi- and Manycore Processors, Supercomputing (SC), November 2012,

Cyrus Harrison, Paul Navratil, Maysam Mossalem, Ming Jiang, Hank Childs, "Efficient Dynamic Derived Field Generation on Many-Core Architectures Using Python", Workshop on Python for High Performance and Scientific Computing (PyHPC 2012), held in conjunction with the ACM/IEEE Conference on SuperComputing (SC12), November 2012, 11-20,

Babak Behzad, Joey Huchette, Huong Luu, Ruth Aydt, Quincey Koziol, Prabhat, Suren Byna, Mohamad Chaarawi, Yushu Yao, "Auto-Tuning of Parallel IO Parameters for HDF5 Applications", Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, 2012,

Abhinav Sarje, Xiaoye S. Li, Slim Chourou, Elaine R. Chan, Alexander Hexemer, "Massively Parallel X-ray Scattering Simulations", Supercomputing, November 2012,

Although present X-ray scattering techniques can provide tremendous information on the nano-structural properties of materials that are valuable in the design and fabrication of energy-relevant nano-devices, a primary challenge remains in the analyses of such data. In this paper we describe a high-performance, flexible, and scalable Grazing Incidence Small Angle X-ray Scattering simulation algorithm and codes that we have developed on multi-core/CPU and many-core/GPU clusters. We discuss in detail our implementation, optimization and performance on these platforms. Our results show speedups of ~125x on a Fermi-GPU and ~20x on a Cray-XE6 24-core node, compared to a sequential CPU code, with near linear scaling on multi-node clusters. To our knowledge, this is the first GISAXS simulation code that is flexible to compute scattered light intensities in all spatial directions allowing full reconstruction of GISAXS patterns for any complex structures and with high-resolutions while reducing simulation times from months to minutes.

Abhinav Sarje, Xiaoye S. Li, Slim Chourou, Alexander Hexemer, Massively Parallel X-Ray Scattering Simulations, Supercomputing (SC), November 2012,

Mehmet Balman, "MemzNet: Memory-Mapped Zero-copy Network Channel for Moving Large Datasets over 100Gbps Networks", technical poster in ACM/IEEE international Conference For High Performance Computing, Networking, Storage and Analysis (SC'12), LBNL 6175E, November 13, 2012, doi: http://doi.ieeecomputersociety.org/10.1109/SC.Companion.2012.294

High-bandwidth networks are poised to provide new opportunities in tackling large data challenges in today's scientific applications. However, increasing the bandwidth is not sufficient by itself; we need careful evaluation of future high-bandwidth networks from the applications' perspective. We have experimented with current state-of-the-art data movement tools, and realized that file-centric data transfer protocols do not perform well with managing the transfer of many small files in high-bandwidth networks, even when using parallel streams or concurrent transfers. We require enhancements in current middleware tools to take advantage of future networking frameworks. To improve performance and efficiency, we develop an experimental prototype, called MemzNet: Memory-mapped Zero-copy Network Channel, which uses a block-based data movement method in moving large scientific datasets. We have implemented MemzNet that takes the approach of aggregating files into blocks and providing dynamic data channel management. In this work, we present our initial results in 100Gbps networks.
http://dx.doi.org/10.1109/SC.Companion.2012.294               
http://dx.doi.org/10.1109/SC.Companion.2012.295

S. Williams, D. Kalamkar, A. Singh, A. Deshpande, B. Van Straalen, M. Smelyanskiy, A. Almgren, P. Dubey, J. Shalf, L. Oliker, "Optimization of Geometric Multigrid for Emerging Multi- and Manycore Processors", Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), November 2012, doi: 10.1109/SC.2012.85

Michael Garland, Manjunath Kudlur, Yili Zheng, "Designing a Unified Programming Model for Heterogeneous Machines", Supercomputing (SC), November 2012,

Evangelos Georganas, Jorge González-Domínguez, Edgar Solomonik, Yili Zheng, Juan Touriño, Katherine Yelick, "Communication Avoiding and Overlapping for Numerical Linear Algebra", Supercomputing (SC), November 2012,

Devarshi Ghoshal, Lavanya Ramakrishnan, "FRIEDA: Flexible Robust Intelligent Elastic Data Management in Cloud Environments", The Third International Workshop on Data Intensive Computing in the Clouds (DataCloud 2012), 2012,

Surendra Byna, Jerry Chou, Oliver Rübel, Prabhat, Homa Karimabadi, William S. Daughton, Vadim Roytershteyn, E. Wes Bethel, Mark Howison, Ke-Jou Hsu, Kuan-Wu Lin, Arie Shoshani, Andrew Uselton, and Kesheng Wu, "Parallel I/O, Analysis, and Visualization of a Trillion Particle Simulation", SuperComputing 2012 (SC12), Salt Lake City, Utah, November 2012,

Mehmet Balman, "Streaming Exascale Data over 100Gbps Networks", IEEE Computing Now, November 8, 2012, LBNL 6173E,

C. Mendl, L. Lin, "Towards the Kantorovich dual solution for strictly correlated electrons in atoms and molecules", submitted to Phys. Rev. B, 2012,

B. Wang, S. Ethier, W. Tang, K. Ibrahim, K. Madduri, S. Williams, "Advances in gyrokinetic particle in cell simulation for fusion plasmas to Extreme scale", Supercomputing (SC), 2012,

E. Wes Bethel and Mark Howison, "Multi-core and Many-core Shared-memory Parallel Raycasting Volume Rendering Optimization and Tuning", International Journal of High Performance Computing Applications, 2012, LBNL 5362E,

Salman Habib, Vitali Morozov, Hal Finkel, Adrian Pope, Katrin Heitmann, Kalyan Kumaran, Tom Peterka, Joe Insley, David Daniel, Patricia Fasel, Nicholas Frontiere, Zarija Lukić, "The universe at extreme scale: multi-petaflop sky simulation on the BG/Q", Supercomputing, 2012, 4,

Junmin Gu, David Smith, Ann L. Chervenak, Alex Sim, "Adaptive Data Transfers that Utilize Policies for Resource Sharing", The 2nd International Workshop on Network-Aware Data Management Workshop (NDM2012), 2012,

Pak Shing Li, Andrew Myers, Christopher McKee, "Ambipolar Diffusion Heating in Turbulent Systems", The Astrophysical Journal, Volume 760, Issue 1, article id. 33, November 1, 2012,

Hank Childs, "Parallel Visualization Frameworks", High Performance Visualization---Enabling Extreme-Scale Scientific Insight, (October 2012) Pages: 9--24

Hank Childs, Kwan-Liu Ma, Hongfeng Yu, Brad Whitlock, Jeremy Meredith, Jean Favre, Scott Klasky, Norbert Podhorszki, Karsten Schwan, Matthew Wolf, Manish Parashar, Fan Zhang, "In Situ Processing", High Performance Visualization---Enabling Extreme-Scale Scientific Insight, (October 2012) Pages: 171--198

E. Wes Bethel, David Camp, Hank Childs, Christoph Garth, Mark Howison, Kenneth I. Joy, David Pugmire, "Hybrid Parallelism", High Performance Visualization---Enabling Extreme-Scale Scientific Insight, (October 2012) Pages: 261--290

Hank Childs, David Pugmire, Sean Ahern, Brad Whitlock, Mark Howison, Prabhat, Gunther Weber, E. Wes Bethel, "Visualization at Extreme Scale Concurrency", High Performance Visualization---Enabling Extreme-Scale Scientific Insight, (October 2012) Pages: 291--306

Hank Childs, Eric Brugger, Brad Whitlock, Jeremy Meredith, Sean Ahern, David Pugmire, Kathleen Biagas, Mark Miller, Cyrus Harrison, Gunther H. Weber, Hari Krishnan, Thomas Fogal, Allen Sanderson, Christoph Garth, E. Wes Bethel, David Camp, Oliver Rubel, Marc Durant, Jean M. Favre, Paul Navratil, "VisIt: An End-User Tool For Visualizing and Analyzing Very Large Data", High Performance Visualization---Enabling Extreme-Scale Scientific Insight, (October 2012) Pages: 357--372

E. Wes Bethel, Hank Childs, Charles Hansen (editors), High Performance Visualization---Enabling Extreme-Scale Scientific Insight, Chapman & Hall, CRC Computational Science, (CRC Press/Francis--Taylor Group: October 2012)

L. Lin, S. Shao, W.E, "Efficient iterative method for solving the Dirac-Kohn-Sham density functional theory", submitted to J. Comput. Phys., 2012,

Memory and performance issues in parallel multifrontal factorizations and triangular solutions with sparse right-hand sides, François-Henry Rouet, PhD, Institut National Polytechnique de Toulouse, October 17, 2012,

S. Chourou, A. Sarje, X. Li, E. Chan, A. Hexemer, GISAXS School: The HipGISAXS Software, Advanced Light Source User Meeting, October 2012,

Tutorial session

Sophie Engle, Sean Whalen, "Visualizing Distributed Memory Computations with Hive Plots", Proceedings of the 9th ACM International Symposium on Visualization for Cyber Security (VizSec), Seattle, WA, ACM, October 15, 2012, 56-63, doi: 10.1145/2379690.2379698

Konstantin Borozdin, Steven Greene, Zarija Lukić, Edward Milner, Haruo Miyadera, Christopher Morris, John Perry, "Cosmic Ray Radiography of the Damaged Cores of the Fukushima Reactors", Physical Review Letters, 2012, 109:152501,

Bin Dong, Xiuqiao Li, Limin Xiao, Li Ruan, "A New File-Specific Stripe Size Selection Method for Highly Concurrent Data Access", The 13th ACM/IEEE International Conference on Grid Computing (Grid 2012), 2012,

Hongzhang Shan, Brian Austin, Nicholas Wright, Erich Strohmaier, John Shalf, Katherine Yelick, "Accelerating Applications at Scale Using One-Sided Communication", Santa Barbara, CA, The 6th Conference on Partitioned Global Address Programming Models, October 10, 2012,

Sean Peisert, Institute for Information Infrastructure Protection (I3P), 10th Anniversary Event, The National Press Club, October 10, 2012,

Bin Dong, Xiuqiao Li, Qimeng Wu, Limin Xiao, Li Ruan, "A dynamic and adaptive load balancing strategy for parallel file system with large-scale I/O servers", Journal of Parallel and Distributed Computing (JPDC), Volume 72, Issue 10, October 2012, Pages 1254-1268,

Jason Ansel, Maciej Pacula, Yee Lok Wong, Cy Chan, Marek Olszewski, Una-May O'Reilly, Saman Amarasinghe, "Siblingrivalry: online autotuning through local competitions", Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES '12), ACM, October 7, 2012, 91-100,

Gunther H. Weber, Hank Childs, Jeremy S. Meredith, "Efficient Parallel Extraction of Crack-free Isosurfaces from Adaptive Mesh Refinement (AMR) Data", Proceedings of IEEE Symposium on Large Data Analysis and Visualization (LDAV), October 2012, 31--38, LBNL 5799E,

Oliver Rübel, Cameron G. R. Geddes, Min Chen, Estelle Cormier-Michel, and E. Wes Bethel, "Query-driven Analysis of Plasma-based Particle Acceleration Data", Poster Abstracts of IEEE VisWeek, October 2012,

K. Madduri, J. Su, S. Williams, L. Oliker, S. Ethier, K. Yelick, "Optimization of Parallel Particle-to-Grid Interpolation on Leading Multicore Platforms", Transactions on Parallel and Distributed Systems (TPDS), October 1, 2012, doi: 10.1109/TPDS.2012.28

Daniel U. Becker, Nan Jiang, George Michelogiannakis, William J. Dally, "Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks", International Conference on Computer Design, IEEE Computer Society, 2012,

This paper introduces Adaptive Backpressure, a novel scheme that improves the utilization of dynamically managed router input buffers by continuously adjusting the stiffness of the flow control feedback loop in response to observed traffic conditions. Through a simple extension to the router’s flow control mechanism, the proposed scheme heuristically limits the number of credits available to individual virtual channels based on estimated downstream congestion, aiming to minimize the amount of buffer space that is occupied unproductively. This leads to more efficient distribution of buffer space and improves isolation between multiple concurrently executing workloads with differing performance characteristics.

Experimental results for a 64-node mesh network show that Adaptive Backpressure improves network stability, leading to an average 2.6× increase in throughput under heavy load across traffic patterns. In the presence of background traffic, the proposed scheme reduces zero-load latency by an average of 31%. Finally, it mitigates the performance degradation encountered when latency- and throughput-optimized execution cores contend for network resources in a heterogeneous chip multi-processor; across a set of PARSEC benchmarks, we observe an average reduction in execution time of 34%.

David H. Bailey, Jonathan M. Borwein, "Ancient Indian square roots: An exercise in forensic paleo-mathematics", American Mathematical Monthly, October 2012, 119:646-657,

Paul Navratil, Bill Barth, Hank Childs, "Virtual Rheoscopic Fluids for Dense, Large-Scale Fluid Flow Visualizations", Proceedings of IEEE Symposium on Large Data Analysis and Visualization (LDAV), October 2012, 79--84,

David Camp, Hank Childs, Christoph Garth, David Pugmire, Kenneth I. Joy, "Parallel Stream Surface Computation for Large Data Sets", Proceedings of IEEE Symposium on Large Data Analysis and Visualization (LDAV), October 2012, 39--47, LBNL 5776E,

Tamay M. Ozgokmen, Andrew C. Poje, Paul F. Fischer, Hank Childs, Harinarayan Krishnan, Christoph Garth, Angelique C. Haza, Edward Ryan, "On multi-scale dispersion under the influence of surface mixed layer instabilities", Ocean Modelling, October 2012, 56:16-30,

A. Buluç, A. Fox, J. R. Gilbert, S. Kamil, A. Lugowski, L. Oliker, S. Williams, "High-performance analysis of filtered semantic graphs", PACT '12 Proceedings of the 21st international conference on Parallel architectures and compilation techniques (extended abstract), 2012, doi: 10.1145/2370816.2370897

Z. Zhou, E. Saule, H. M. Aktulga, C. Yang, E. G. Ng, P. Maris, J. P. Vary, U. V. Catalyurek, "An Out-of-core Eigensolver on SSD-equipped Clusters", 2012 IEEE International Conference on Cluster Computing (CLUSTER), Beijing, China, September 26, 2012, 248 - 256, doi: 10.1109/CLUSTER.2012.76

K. Kandalla, A. Buluç, H. Subramoni, K. Tomko, J. Vienne, L. Oliker, D. K. Panda, "Can network-offload based non-blocking neighborhood MPI collectives improve communication overheads of irregular graph algorithms?", International Workshop on Parallel Algorithms and Parallel Software (IWPAPS 2012), 2012,

Sean Peisert, Ed Talbot, Matt Bishop, "Turtles All the Way Down: A Clean-Slate, Ground-Up, First-Principles Approach to Secure Systems", Proceedings of the 2012 New Security Paradigms Workshop (NSPW), ACM, September 2012, doi: 10.1145/2413296.2413299

Z. Zhou, E. Saule, H. M. Aktulga, C. Yang, E. G. Ng, P. Maris, J. P. Vary, U. V. Catalyurek, "An Out-Of-Core Dataflow Middleware to Reduce the Cost of Large Scale Iterative Solvers", 2012 41st International Conference on Parallel Processing Workshops (ICPPW), Pittsburgh, PA, September 10, 2012, 71 - 80, doi: 10.1109/ICPPW.2012.13

Jun Zhou, Didem Unat, Dong Ju Choi, Clark C. Guest, Yifeng Cui, "Hands-on Performance Tuning of 3D Finite Difference Earthquake Simulation on GPU Fermi Chipset", Procedia CS, 2012, Vol 9:976-985,

Matt Bishop, Sean Peisert, "Security and Elections", IEEE Security & Privacy, September 2012, 10(5):64-67, doi: 10.1109/MSP.2012.127

Douglas F Levinson, Jianxin Shi, Kai Wang, Sang Oh, Brien Riley, Ann E Pulver, Dieter B Wildenauer, Claudine Laurent, Bryan J Mowry, Pablo V Gejman, Michael J Owen, Kenneth S Kendler, Gerald Nestadt, Sibylle G Schwab, Jacques Mallet, Deborah Nertney, Alan R Sanders, Nigel M Williams, Brandon Wormley, Virginia K Lasseter, Margot Albus, Stephanie Godard-Bauché, Madeline Alexander, Jubao Duan, Michael C O’Donovan, Dermot Walsh, Anthony O’Neill, George N Papadimitriou, Dimitris Dikeos, Wolfgang Maier, Bernard Lerer, Dominique Campion, David Cohen, Maurice Jay, Ayman Fanous, Peter Eichhammer, Jeremy M Silverman, Nadine Norton, Nancy Zhang, Hakon Hakonarson, Cynthia Gao, Ami Citri, Mark Hansen, Stephan Ripke, Frank Dudbridge, Peter A Holmans, "Genome-wide association study of multiplex schizophrenia pedigrees", The American Journal of Psychiatry, September 1, 2012, doi: 10.1176/appi.ajp.2012.11091423

B. Kallemov, G. H. Miller, S. Mitran and D. Trebotich, "Calculation of Viscoelastic Bead-Rod Flow Mediated by a Homogenized Kinetic Scale with Holonomic Constraints", Molecular Simulation, 2012, doi: 10.1080/08927022.2011.654206

H. M. Aktulga, C. Yang, P. Maris, J. P. Vary, E. G. Ng, "Topology-Aware Mappings for Large-Scale Eigenvalue Problems", Euro-Par 2012 Parallel Processing Conference, Rhodes Island, Greece, August 31, 2012, LNCS 748:830-842, doi: 10.1007/978-3-642-32820-6_82

Nils E. R. Zimmermann, Berend Smit, Frerich J. Keil, "Predicting Local Transport Coefficients at Solid-Gas Interfaces", J. Phys. Chem. C, 2012, 116:18878-1888, doi: 10.1021/jp3059855

The regular nanoporous structure makes zeolite membranes attractive candidates for separating molecules on the basis of differences in transport rates (diffusion). Since improvements in synthesis have led to membranes as thin as several hundred nanometers by now, the slow transport in the boundary layer separating bulk gas and core of the nanoporous membrane is becoming increasingly important. Therefore, we investigate the predictability of the coefficient quantifying this local process, the surface permeability α, by means of a two-scale simulation approach. Methane tracer-release from the one-dimensional nanopores of an AFI-type zeolite is employed. Besides a pitfall in determining α on the basis of tracer exchange, we, importantly, present an accurate prediction of the surface permeability using readily available information from molecular simulations. Moreover, we show that the prediction is strongly influenced by the degree of detail with which the boundary region is modeled. It turns out that not accounting for the fact that molecules aiming to escape the host structure must indeed overcome two boundary regions yields too large a permeability by a factor of 1.7–3.3, depending on the temperature. Finally, our results have far-reaching implications for the design of future membrane applications.

Watch a movie illustrating the conditions of self- or tracer-diffusion here.

Richard L. Martin, Thomas F. Willems, Li-Chiang Lin, Jihan Kim, Joseph A. Swisher, Berend Smit & Maciej Haranczyk, "Similarity-Driven Discovery of Zeolite Materials for Adsorption-Based Separations", ChemPhysChem, August 22, 2012, 13:3595-3597,

Crystalline porous materials can be exploited in many applications. Discovery of materials with optimum adsorption properties typically involves expensive brute-force characterization of large sets of materials. An alternative approach based on similarity searching that enables discovery of materials with optimum adsorption for CO2 and other molecules at a fraction of the cost of brute-force characterization is demonstrated.

This work was featured on the front cover of the journal, available here: http://onlinelibrary.wiley.com/doi/10.1002/cphc.201290074/abstract

Richard L. Martin, Thomas F. Willems, Li-Chiang Lin, Jihan Kim, Joseph A. Swisher, Berend Smit & Maciej Haranczyk, "Similarity-Driven Discovery of Zeolite Materials for Adsorption-Based Separations", ChemPhysChem, Pages: 3561, August 22, 2012,

A tool for identifying optimum zeolite frameworks for gas separations at a fraction of the cost of molecular simulation is presented on p. 3595 ff. by M. Haranczyk et al. The method is based on identifying property-determining substructure features and searching material databases for geometrically similar arrangements of framework atoms. The approach is deployed to screen a database an order of magnitude larger than has been examined in previous studies.

As performance gains in sequential programming have stagnated due to power constraints, parallel computing has become the primary tool for increasing performance. Parallel computing has long been used in scientific computing, and programmers of the future will likely face many of the same challenges that occur in programming large-scale machines. One such challenge is that of hierarchy: machines are built in a hierarchical fashion, with a wide range of communication costs between different parts of a machine, and applications such as divide-and-conquer algorithms often have hierarchical structure. Large-scale parallel machines are programmed primarily with the single program, multiple data (SPMD) model of parallelism. This model combines independent threads of execution with global collective communication and synchronization operations. Previous work has demonstrated the advantages of SPMD over other models: its simplicity enables productive programming and avoids many classes of parallel errors, and at the same time it is easy to implement and amenable to compiler analysis and optimization. Its local-view execution model allows programmers to take advantage of data locality, resulting in good performance and scalability on large-scale machines. However, it is a flat model that does not fit well with hierarchical machines or algorithms. In this dissertation, we introduce the recursive single program, multiple data (RSPMD) execution model. This model extends SPMD with hierarchical, structured teams, or groupings of threads. We design RSPMD extensions for the Titanium language, including a hierarchical team data structure and lexically-scoped constructs for operating over teams. We demonstrate that these extensions prevent erroneous use of teams that would result in deadlock. In addition, we present a runtime mechanism for ensuring proper use of both global collective operations and collectives over teams, eliminating more potential sources of deadlock. 
As analyzable as SPMD is, we demonstrate that RSPMD can also be analyzed precisely and efficiently. We define a hierarchical pointer analysis for determining which data a pointer can reference, as well as on which threads the referenced data may reside. We then present a series of analyses for computing the set of concurrent statements in both SPMD and RSPMD programs. We show that these analyses improve the results of multiple client analyses, including data-locality and sharing inference, race detection, and memory-model enforcement. Finally, we present application case studies demonstrating the expressiveness and performance of the RSPMD model. We show that the model enables divide-and-conquer algorithms such as sorting to be elegantly expressed, and that team collective operations increase performance of a conjugate gradient benchmark by up to a factor of two. The model also facilitates optimizations for hierarchical machines, improving scalability of a particle in cell application by 8x, performance of sorting by up to 40%, and execution time of a stencil code by as much as 14%.

Mehmet Balman, "Analyzing Data Movements and Identifying Techniques for Next-generation High-bandwidth Networks", LBNL Tech Report, 2012, LBNL 6177E,

High-bandwidth networks are poised to provide new opportunities in tackling large data challenges in today's scientific applications. However, increasing the bandwidth is not sufficient by itself; we need careful evaluation of future high-bandwidth networks from the applications’ perspective. We have investigated data transfer requirements of climate applications as a typical scientific example and evaluated how the scientific community can benefit from next generation high-bandwidth networks.  We develop a new block-based data movement method (in contrast to the current file-based methods) to improve data movement performance and efficiency in moving large scientific datasets that contain many small files. We implemented the new block-based data movement tool, which takes the approach of aggregating files into blocks and providing dynamic data channel management. One of the major obstacles in use of high-bandwidth networks is the limitation in host system resources. We have conducted a large number of experiments with our new block-based method and with current available file-based data movement tools.  In this white paper, we describe future research problems and challenges for efficient use of next-generation science networks, based on the lessons learnt and the experiences gained with 100Gbps network applications.

David H. Bailey, Jonathan M. Borwein and Richard E. Crandall, "Computation and theory of extended Mordell-Tornheim-Witten sums", Mathematics of Computation, July 31, 2012, to appear,

Rui Sun, Kyoyeon Park, Wibe A. de Jong, Hans Lischka, Theresa L. Windus, William L. Hase, "Direct dynamics simulation of dioxetane formation and decomposition via the singlet ·O-O-CH2-CH2· biradical: Non-RRKM dynamics", Journal of Chemical Physics, 2012, 137, doi: 10.1063/1.4736843

David H. Bailey and Jonathan M. Borwein, "Nonnormality of Stoneham constants", Ramanujan Journal, July 24, 2012, 29:409-422,

J. Krueger, P. Micikevicius, S. Williams, "Optimization of Forward Wave Modeling on Contemporary HPC Architectures", LBNL Technical Report, 2012, LBNL 5751E,

Paul H. Hargrove, UPC Language Full-day Tutorial, Workshop at UC Berkeley, July 12, 2012,

Jihan Kim, Li-Chiang Lin, Richard L. Martin, Joseph A. Swisher, Maciej Haranczyk & Berend Smit, "Large-Scale Computational Screening of Zeolites for Ethane/Ethene Separation", Langmuir, July 11, 2012, 28:11914–1191,

Large-scale computational screening of thirty thousand zeolite structures was conducted to find optimal structures for separation of ethane/ethene mixtures. Efficient grand canonical Monte Carlo (GCMC) simulations were performed with graphics processing units (GPUs) to obtain pure component adsorption isotherms for both ethane and ethene. We have utilized the ideal adsorbed solution theory (IAST) to obtain the mixture isotherms, which were used to evaluate the performance of each zeolite structure based on its working capacity and selectivity. In our analysis, we have determined that specific arrangements of zeolite framework atoms create sites for the preferential adsorption of ethane over ethene. The majority of optimum separation materials can be identified by utilizing this knowledge, and screening structures for the presence of this feature will enable the efficient selection of promising candidate materials for ethane/ethene separation prior to performing molecular simulations.

Kelly P. Gaither, Hank Childs, Karl Schulz, Cyrus Harrison, Bill Barth, Diego Donzis, P.K. Yeung, "Using Visualization and Data Analysis to Understand Critical Structures in Massive Time Varying Turbulent Flow Simulations", IEEE Computer Graphics and Applications, July 2012, 32:34-45, LBNL 5911E,

Patrick R. Amestoy, Iain S. Duff, Jean-Yves L'Excellent, Yves Robert, François-Henry Rouet, Bora Uçar, "On computing inverse entries of a sparse matrix in an out-of-core environment", SIAM Journal on Scientific Computing, July 2012, 34:1975--1999, doi: 10.1137/100799411

G. H. Miller and D. Trebotich, "An Embedded Boundary Method for the Navier-Stokes Equations on a Time-Dependent Domain", Communications in Applied Mathematics and Computational Science, 7(1):1-31, 2012,

J. K. Freericks, A. Y. Liu, A. F. Kemper, T. P. Devereaux, "Pulsed high harmonic generation of light due to pumped Bloch oscillations in noninteracting metals", Physica Scripta, 2012, T151:014062, doi: 10.1088/0031-8949/2012/T151/014062

We derive a simple theory for high-order harmonic generation due to pumping a noninteracting metal with a large amplitude oscillating electric field. The model assumes that the radiated light field arises from the acceleration of electrons due to the time-varying current generated by the pump, and also assumes that the system has a constant density of photoexcited carriers, hence it ignores the dipole excitation between bands (which would create carriers in semiconductors). We examine the circumstances under which odd harmonic frequencies would be expected to dominate the spectrum of radiated light, and we also apply the model to real materials like ZnO, for which high-order harmonic generation has already been demonstrated in experiments.

Santer, B.D., J. Painter, C. Mears, C. Doutriaux, P. Caldwell, J.M. Arblaster, P. Cameron-Smith, N.P. Gillett, P.J. Gleckler, J.R. Lanzante, J. Perlwitz, S. Solomon, P.A. Stott, K.E. Taylor, L. Terray, P.W. Thorne, M.F. Wehner, F.J. Wentz, T.M.L. Wigley, L. Wilcox and C.-Z. Zou, "Identifying Human Influences on Atmospheric Temperature: Are Results Robust to Uncertainties?", Proceedings of the National Academy of Sciences, June 22, 2012,

Mehmet Balman, Eric Pouyoul, Yushu Yao, E. Wes Bethel, Burlen Loring, Prabhat, John Shalf, Alex Sim, and Brian L. Tierney, "Experiences with 100G Network Applications", In Proceedings of the Fifth International Workshop on Data-intensive Distributed Computing, in conjunction with ACM High Performance Distributed Computing (HPDC) Conference, 2012, Delft, Netherlands, June 2012, LBNL 5603E, doi: 10.1145/2286996.2287004

100Gbps networking has finally arrived, and many research and educational institutions have begun to deploy 100Gbps routers and services. ESnet and Internet2 worked together to make 100Gbps networks available to researchers at the Supercomputing 2011 conference in Seattle, Washington. In this paper, we describe two of the first applications to take advantage of this network. We demonstrate a visualization application that enables remotely located scientists to gain insights from large datasets. We also demonstrate climate data movement and analysis over the 100Gbps network. We describe a number of application design issues and host tuning strategies necessary for enabling applications to scale to 100Gbps rates.

Zacharia Fadika, Madhusudhan Govindaraju, Shane Canon, Lavanya Ramakrishnan, "Evaluating Hadoop for Data-Intensive Scientific Operations", IEEE Cloud Computing, 2012,

W. Yoo, K. Larson, L. Baugh, S. Kim, R. H. Campbell, "ADP: automated diagnosis of performance pathologies using hardware events", SIGMETRICS '12: Proc. of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems, 2012,

Hongzhang Shan, Erich Strohmaier, James Amundson, Eric G. Stern, "Optimizing The Advanced Accelerator Simulation Framework Synergia Using OpenMP", IWOMP'12 Proceedings of the 8th International Conference on OpenMP, June 11, 2012,

Application of GPUs to the high-throughput screening of porous materials for carbon capture, 18th IEEE Real-Time Conference, Berkeley, CA, USA, June 10, 2012,

Miao Luo, Dhabaleswar K. Panda, Khaled Z. Ibrahim, Costin Iancu, "Congestion avoidance on manycore high performance computing systems", International Conference on Supercomputing (ICS), 2012,

S Hofmeyr, J Colmenares, J Kubiatowicz, C Iancu, "Juggle: Addressing Extrinsic Load Imbalances in SPMD Applications on Multicore Computers", Cluster Computing, 2012,

L. Lin, L. Ying, "Element orbitals for Kohn-Sham density functional theory", Phys. Rev. B, 2012, 85:235144,

Jessica B. Voytek, Bradley Voytek, "Automated cognome construction and semi-automated hypothesis generation", Journal of Neuroscience Methods, June 2012, 208:92–100,

Karen L. Schuchardt, Deborah A. Agarwal, Stefan A. Finsterle, Carl W. Gable, Ian Gorton, Luke J. Gosink, Elizabeth H. Keating, Carina S. Lansing, Joerg Meyer, William A.M. Moeglein, George S.H. Pau, Ellen A. Porter, Sumit Purohit, Mark L. Rockhold, Arie Shoshani, and Chandrika Sivaramakrishnan, "Akuna: Integrated Toolsets Supporting Advanced Subsurface Flow and Transport Simulations for Environmental Management", XIX International Conference on Computational Methods in Water Resources (CMWR 2012), University of Illinois at Urbana-Champaign, June 2012,

A. Buluç, J. Gilbert, "Parallel sparse matrix-matrix multiplication and indexing: Implementation and experiments", SIAM Journal on Scientific Computing (SISC), 2012,

J. Nordhaus, T.D. Brandt, A. Burrows, A. Almgren, "The Hydrodynamic Origin of Neutron Star Kicks", Monthly Notices of the Royal Astronomical Society, 2012, 423:2,

Energy-Efficient Flow-Control for On-Chip Networks, George Michelogiannakis, Stanford University, 2012,

With the emergence of on-chip networks, the power consumed by router buffers has become a primary concern. Bufferless flow control has been proposed to address this issue by removing router buffers and handling contention by dropping or deflecting flits. In this thesis, we compare virtual-channel (buffered) and deflection (packet-switched bufferless) flow control. Our study shows that unless process constraints lead to excessively costly buffers, the performance, cost and increased complexity of deflection flow control outweigh its potential gains. To provide buffering in the network but without the cost and timing overhead of router buffers, we propose elastic buffer (EB) flow control which adds simple control logic in the channels to use pipeline flip-flops (FFs) as EBs with two storage locations. This way, channels act as distributed FIFOs and input buffers as well as the complexity for virtual channels (VCs) are no longer required. Therefore, EB networks have a shorter cycle time and offer more throughput per unit power than VC networks. We also propose a hybrid EB-VC router which is used to provide traffic separation for a number of traffic classes large enough for duplicate physical channels to be inefficient. These hybrid routers offer more throughput per unit power than both EB and VC routers. Finally, this thesis proposes packet chaining, which addresses the tradeoff between allocation quality and cycle time traditionally present in routers with VCs. Packet chaining is a simple and effective method to increase allocator matching efficiency to be comparable or superior to more complex and slower allocators without extending cycle time, particularly suited to networks with short packets.

Prabhat, Oliver Rübel, Surendra Byna, Kesheng Wu, Fuyu Li, Michael Wehner and E. Wes Bethel, "TECA: A Parallel Toolkit for Extreme Climate Analysis", Procedia Computer Science, Proceedings of the International Conference on Computational Science, ICCS 2012, Presented at Third Workshop on Data Mining in Earth System Science (DMESS 2012), Omaha, Nebraska, June 2012, 9:866–876, LBNL 5352E, doi: 10.1016/j.procs.2012.04.093

We present TECA, a parallel toolkit for detecting extreme events in large climate datasets. Modern climate datasets expose parallelism across a number of dimensions: spatial locations, timesteps and ensemble members. We design TECA to exploit these modes of parallelism and demonstrate a prototype implementation for detecting and tracking three classes of extreme events: tropical cyclones, extra-tropical cyclones and atmospheric rivers. We process a modern TB-sized CAM5 simulation dataset with TECA, and demonstrate good runtime performance for the three case studies.

"New materials could slash energy costs for CO2 capture", Jade Boyd, David Ruth, May 30, 2012,

A detailed analysis of more than 4 million absorbent minerals has determined that new materials could help electricity producers slash as much as 30 percent of the “parasitic energy” costs associated with removing carbon dioxide from power plant emissions...

When power plants begin capturing their carbon emissions to reduce greenhouse gases – and to most in the electric power industry, it’s a question of when, not if – it will be an expensive undertaking...

Li-Chiang Lin, Adam H. Berger, Richard L. Martin, Jihan Kim, Joseph A. Swisher, Kuldeep Jariwala, Chris H. Rycroft, Abhoyjit S. Bhown, Michael W. Deem, Maciej Haranczyk & Berend Smit, "In Silico Screening of Carbon Capture Materials", Nature Materials, May 27, 2012, 11:633–641,

One of the main bottlenecks to deploying large-scale carbon dioxide capture and storage (CCS) in power plants is the energy required to separate the CO2 from flue gas. For example, near-term CCS technology applied to coal-fired power plants is projected to reduce the net output of the plant by some 30% and to increase the cost of electricity by 60–80%. Developing capture materials and processes that reduce the parasitic energy imposed by CCS is therefore an important area of research. We have developed a computational approach to rank adsorbents for their performance in CCS. Using this analysis, we have screened hundreds of thousands of zeolite and zeolitic imidazolate framework structures and identified many different structures that have the potential to reduce the parasitic energy of CCS by 30–40% compared with near-term technologies.

David H. Bailey, Orianna DeMasi, Juan Meza, "Feature selection and multi-class classification using a rule ensemble method", May 25, 2012,

Mads Kristensen, Yili Zheng, Brian Vinter, "PGAS for Distributed Numerical Python Targeting Multi-core Clusters", IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2012,

Mahmoud K. F. Abouelnasr & Berend Smit, "Diffusion in confinement: kinetic simulations of self- and collective diffusion behavior of adsorbed gases", Physical Chemistry Chemical Physics, Pages: 11559, May 18, 2012,

The relationship between the self- and collective-diffusion behavior of adsorbed gases is investigated with various simulations.

W.S. Lee, Y.D. Chuang, R.G. Moore, Y. Zhu, L. Patthey, M. Trigo, D.H. Lu, P.S. Kirchmann, O. Krupin, M. Yi, M. Langner, N. Huse, J.S. Robinson, Y. Chen, S.Y. Zhou, G. Coslovich, B. Huber, D.A. Reis, R.A. Kaindl, R.W. Schoenlein, D. Doering, P. Denes, W.F. Schlotter, J.J. Turner, S.L. Johnson, M. Först, T. Sasagawa, Y.F. Kung, A.P. Sorini, A.F. Kemper, B. Moritz, T.P. Devereaux, D.-H. Lee, Z.X. Shen & Z. Hussain, "Phase fluctuations and the absence of topological defects in a photo-excited charge-ordered nickelate", Nature Communications 3, Article number: 838, May 15, 2012,

The dynamics of an order parameter's amplitude and phase determines the collective behaviour of novel states emerging in complex materials. Time- and momentum-resolved pump-probe spectroscopy, by virtue of measuring material properties at atomic and electronic time scales out of equilibrium, can decouple entangled degrees of freedom by visualizing their corresponding dynamics in the time domain. Here we combine time-resolved femtosecond optical and resonant X-ray diffraction measurements on charge ordered La1.75Sr0.25NiO4 to reveal unforeseen photoinduced phase fluctuations of the charge order parameter. Such fluctuations preserve long-range order without creating topological defects, distinct from thermal phase fluctuations near the critical temperature in equilibrium. Importantly, relaxation of the phase fluctuations is found to be an order of magnitude slower than that of the order parameter's amplitude fluctuations, and thus limits charge order recovery. This new aspect of phase fluctuations provides a more holistic view of the phase's importance in ordering phenomena of quantum matter.

Abhinav Sarje, Jack Pien, Xiaoye S. Li, Elaine Chan, Slim Chourou, Alexander Hexemer, Arthur Scholz, Edward Kramer, "Large-scale Nanostructure Simulations from X-ray Scattering Data On Graphics Processor Clusters", LBNL Tech Report, May 15, 2012, LBNL 5351E,

X-ray scattering is a valuable tool for measuring the structural properties of materials used in the design and fabrication of energy-relevant nanodevices (e.g., photovoltaic, energy storage, battery, fuel, and carbon capture and sequestration devices) that are key to the reduction of carbon emissions. Although today's ultra-fast X-ray scattering detectors can provide tremendous information on the structural properties of materials, a primary challenge remains in the analyses of the resulting data. We are developing novel high-performance computing algorithms, codes, and software tools for the analyses of X-ray scattering data. In this paper we describe two such HPC algorithm advances. Firstly, we have implemented a flexible and highly efficient Grazing Incidence Small Angle Scattering (GISAXS) simulation code based on the Distorted Wave Born Approximation (DWBA) theory with C++/CUDA/MPI on a cluster of GPUs. Our code can compute the scattered light intensity from any given sample in all directions of space, thus allowing full construction of the GISAXS pattern. Preliminary tests on a single GPU show speedups over 125x compared to the sequential code, and almost linear speedup when executing across a GPU cluster with 42 nodes, resulting in an additional 40x speedup compared to using one GPU node. Secondly, for the structural fitting problems in inverse modeling, we have implemented a Reverse Monte Carlo simulation algorithm with C++/CUDA using one GPU. Since there are large numbers of parameters for fitting in the X-ray scattering simulation model, the earlier single CPU code required weeks of runtime. Deploying the AccelerEyes Jacket/Matlab wrapper to use GPU gave around 100x speedup over the pure CPU code. Our further C++/CUDA optimization delivered an additional 9x speedup.

A. Azad, M. Halappanavar, S. Rajamanickam, A. Khan, E. Boman, A. Pothen, "Multithreaded Algorithms for Maximum Matching in Bipartite Graphs", IPDPS, May 2012, doi: 10.1109/IPDPS.2012.82

Hank Childs, Torsten Kuhlen, Fabio Marton (editors), Proceedings of the EuroGraphics Symposium on Parallel Graphics and Visualization (EGPGV), EuroGraphics Association, (EuroGraphics Association: May 2012)

Y. Yin, S. Byna, H. Song, X.-H. Sun, and R. Thakur, "Boosting Application-Specific Parallel I/O Optimization Using IOSIG", IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Ottawa, Canada, May 13, 2012,

Didem Unat, Jun Zhou, Yifeng Cui, Scott B. Baden, Xing Cai, "Accelerating a 3D Finite Difference Earthquake Simulation with a C-to-CUDA Translator", Computing in Science and Engineering, May 2012, Vol 14:48-59,

Ariful Azad, Alex Pothen, "Multithreaded algorithms for matching in graphs with application to data analysis in flow cytometry", IPDPSW, May 2012, doi: 10.1109/IPDPSW.2012.310

Sean Whalen, Sophie Engle, Sean Peisert, Matt Bishop, "Network-Theoretic Classification of Parallel Computation Patterns", International Journal of High Performance Computing Applications (IJHPCA), May 2012, 26(2):159-169, doi: 10.1177/1094342012436618

Ariful Azad, Saumyadipta Pyne, Alex Pothen, "Matching phosphorylation response patterns of antigen-receptor-stimulated T cells via flow cytometry", BMC Bioinformatics, 2012,

Paul Navratil, Donald Fussell, Calvin Lin, Hank Childs, "Dynamic Scheduling for Large-Scale Distributed-Memory Ray Tracing", Proceedings of EuroGraphics Symposium on Parallel Graphics and Visualization (EGPGV), May 2012, 61-70, LBNL 5930E,

M. Balman, A. Sim, "Scaling the Earth System Grid to 100Gbps Networks", 2012, LBNL 5794E,

A. Lugowski, D. Alber, A. Buluç, J. Gilbert, S. Reinhardt, Y. Teng, A. Waranis, "A flexible open-source toolbox for scalable complex graph analysis", SIAM Conference on Data Mining (SDM), 2012,

Fuyu Li, Daniele Rosa, William D. Collins, and Michael F. Wehner, "“Super-parameterization”: A better way to simulate regional extreme precipitation?", Journal of Advances in Modeling Earth Systems, April 4, 2012, 4, doi: 10.1029/2011MS000106

Extreme precipitation is generally underestimated by current climate models relative to observations of present-day rainfall distributions. Possible causes of this systematic error include the convective parameterization in these models that have been designed to reproduce measurements of climatological mean precipitation. One possible approach to improve the interaction of subgrid-scale physical processes and large-scale climate is to replace the conventional convective parameterizations with a high-resolution cloud-system resolving model. A “super-parameterized” Community Atmosphere Model (SP-CAM) utilizing this approach is used in this study to investigate the distribution of extreme precipitation in the United States. Results show that SP-CAM better simulates the distributions of both light and intense precipitation compared to the standard version of CAM based upon conventional parameterizations. The improvements are mostly seen in regions dominated by convective precipitation, suggesting that super-parameterization provides a better representation of subgrid convective processes.

H. M. Aktulga, J. C. Fogarty, S. A. Pandit, A. Y. Grama, "Parallel reactive molecular dynamics: Numerical methods and algorithmic techniques", Parallel Computing, 38 (4-5), 245-259, April 2, 2012, doi: 10.1016/j.parco.2011.08.005

S. Chourou, A. Sarje, X. Li, E. Chan, A. Hexemer, "High-Performance GISAXS Code for Polymer Science", Synchrotron Radiation in Polymer Science, April 2012,

Gunther H. Weber, Peer-Timo Bremer, "In-situ Analysis: Challenges and Opportunities", Position paper for DOE Exascale Research Conference, April 2012, LBNL 5692E,

Gunther H. Weber, Kenes Beketayev, Peer-Timo Bremer, Bernd Hamann, Maciej Haranczyk, Mario Hlawitschka, Valerio Pascucci, "Comprehensible Presentation of Topological Information", Status report for DOE Exascale Research Conference, April 2012, LBNL 5693E,

E. Wes Bethel, David Camp, Hank Childs, Mark Howison, Hari Krishnan, Burlen Loring, Joerg Meyer, Prabhat, Oliver Ruebel, Daniela Ushizima, Gunther Weber, "Towards Exascale: High Performance Visualization and Analytics – Project Status Report. Technical Report", DOE Exascale Research Conference, April 2012,

Kamer Kaya, François-Henry Rouet, Bora Uçar, "On partitioning problems with complex objectives", Lecture Notes in Computer Science, Springer, April 2012, 7155:334–344, doi: 10.1007/978-3-642-29737-3_38

Gunther H. Weber, Dmitriy Morozov, Kenes Beketayev, John Bell, Peer-Timo Bremer, Marc Day, Bernd Hamann, Christian Heine, Maciej Haranczyk, Mario Hlawitschka, Valerio Pascucci, Patrick Oesterling, Gerik Scheuermann, "Topology-based Visualization and Analysis of High-dimensional Data and Time-varying Data at the Extreme Scale", DOE Exascale Research Conference, April 2012,

S. Molins, D. Trebotich, C. I. Steefel and C. Shen, "An Investigation of the Effect of Pore Scale Flow on Average Geochemical Reaction Rates Using Direct Numerical Simulation", Water Resour. Res., 48(3) W03527, DOI:10.1029/2011WR011404, 2012,

Domain-specific Translator and Optimizer for Massive On-Chip Parallelism, Didem Unat, University of California, San Diego, March 28, 2012,

A. Lugowski, A. Buluç, J. Gilbert, S. Reinhardt, "Scalable complex graph analysis with the knowledge discovery toolbox", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2012,

Erjun Kan, Wei Hu, Chuanyun Xiao, Ruifeng Lu, Kaiming Deng, Jinlong Yang and Haibin Su, "Half-Metallicity in Organic Single Porous Sheets", J. Am. Chem. Soc., 2012, 134 (13), 5718–5721, March 22, 2012, doi: 10.1021/ja210822c

The unprecedented applications of two-dimensional (2D) atomic sheets in spintronics are formidably hindered by the lack of ordered spin structures. Here we present first-principles calculations demonstrating that the recently synthesized dimethylmethylene-bridged triphenylamine (DTPA) porous sheet is a ferromagnetic half-metal and that the size of the band gap in the semiconducting channel is roughly 1 eV, which makes the DTPA sheet an ideal candidate for a spin-selective conductor. In addition, the robust half-metallicity of the 2D DTPA sheet under external strain increases the possibility of applications in nanoelectric devices. In view of the most recent experimental progress on controlled synthesis, organic porous sheets pave a practical way to achieve new spintronics.

Jihan Kim, Richard L. Martin, Oliver Rübel, Maciej Haranczyk & Berend Smit, "High-throughput Characterization of Porous Materials Using Graphics Processing Units", Journal of Chemical Theory and Computation, March 16, 2012, 8:1684–1693, LBNL 5409E, doi: 10.1021/ct200787v

We have developed a high-throughput graphics processing unit (GPU) code that can characterize a large database of crystalline porous materials. In our algorithm, the GPU is utilized to accelerate energy grid calculations, where the grid values represent interactions (i.e., Lennard-Jones + Coulomb potentials) between gas molecules (i.e., CH4 and CO2) and materials’ framework atoms. Using a parallel flood fill central processing unit (CPU) algorithm, inaccessible regions inside the framework structures are identified and blocked, based on their energy profiles. Finally, we compute the Henry coefficients and heats of adsorption through statistical Widom insertion Monte Carlo moves in the domain restricted to the accessible space. The code offers significant speedup over a single core CPU code and allows us to characterize a set of porous materials at least an order of magnitude larger than those considered in earlier studies. For structures selected from such a prescreening algorithm, full adsorption isotherms can be calculated by conducting multiple Grand Canonical Monte Carlo (GCMC) simulations concurrently within the GPU.

L. Lin, J. Lu, L. Ying and W. E, "Optimized local basis set for Kohn-Sham density functional theory", J. Comput. Phys., 2012, 231:4515,

Xiaodan Gu, Zuwei Liu, Ilja Gunkel, Slim Chourou, Sung Woo Hong, Deirdre Olynick, Thomas P. Russell, "High Aspect Ratio Sub-15nm Silicon Trenches From Block Copolymer Templates", Advanced Materials, 2012, 24:5688,

High-aspect-ratio sub-15-nm silicon trenches are fabricated directly from plasma etching of a block copolymer mask. A novel method that combines a block copolymer reconstruction process and reactive ion etching is used to make the polymer mask. Silicon trenches are characterized by various methods and used as a master for subsequent imprinting of different materials. Silicon nanoholes are generated from a block copolymer with cylindrical microdomains oriented normal to the surface.

Eliot Gann, Slim Chourou, Abhinav Sarje, Harald Ade, Cheng Wang, Elaine Chan, Xiaodong Ding, Alexander Hexemer, An Interactive 3D Interface to Model Complex Surfaces and Simulate Grazing Incidence X-ray Scatter Patterns, American Physical Society March Meeting 2012, March 2012,

Grazing Incidence Scattering is becoming critical in characterization of the ensemble statistical properties of complex layered and nano-structured thin-film systems over length scales of centimeters. A major bottleneck in the widespread implementation of these techniques is the quantitative interpretation of the complicated grazing incidence scatter. To fill this gap, we present the development of a new interactive program to model complex nano-structured and layered systems for efficient grazing incidence scattering calculation.

Soile V.E. Keränen, Oliver Rübel, David W. Knowles and Mark D. Biggin, "Computational modeling of cis-regulatory modules from 3D expression data in Drosophila blastoderm atlas", Drosophila Genetics, March 2012,

Andrew Canning, Slim Chourou, Stephen Derenzo, "First-principles studies of Ce and Eu doped inorganic materials as candidates for scintillator gamma ray detectors", American Physical Society March Meeting 2012, February 2012, 57,

We have performed high-throughput DFT based (GGA+U) band structure calculations for new Ce and Eu doped wide band gap inorganic materials to determine their potential as candidates for gamma ray scintillator detectors. These calculations are based on determining the 4f ground state level of the Ce and Eu relative to the valence band of the host as well as the position of the Ce and Eu 5d excited state relative to the conduction band of the host. We find many classes of candidate materials where the 5d is in the conduction band preventing scintillation. Even when the Eu and Ce 4f and 5d levels are placed well in the gap of the host, traps on the host can also prevent the energy of the gamma ray transferring to the Eu or Ce. We therefore also performed calculations for host hole traps and electron traps to compare their energies to the Ce and Eu 4f and 5d levels.

S. Chourou, A. Sarje, X. Li, E. Chan, A. Hexemer, GISAXS simulation and analysis on GPU clusters, American Physical Society March Meeting 2012, February 2012,

We have implemented a flexible Grazing Incidence Small-Angle Scattering (GISAXS) simulation code based on the Distorted Wave Born Approximation (DWBA) theory that effectively utilizes the parallel processing power provided by the GPUs. This constitutes a handy tool for experimentalists facing a massive flux of data, allowing them to accurately simulate the GISAXS process and analyze the produced data. The software computes the diffraction image for any given superposition of custom shapes or morphologies (e.g. obtained graphically via a discretization scheme) in a user-defined region of k-space (or region of the area detector) for all possible grazing incidence angles and in-plane sample rotations. This flexibility makes it easy to tackle a wide range of possible sample geometries such as nanostructures on top of or embedded in a substrate or a multilayered structure. In cases where the sample displays regions of significant refractive index contrast, an algorithm has been implemented to perform an optimal slicing of the sample along the vertical direction and compute the averaged refractive index profile to be used as the reference geometry of the unperturbed system. Preliminary tests on a single GPU show a speedup of over 200 times compared to the sequential code.

Abhinav Sarje, Next-Generation Scientific Computing with Graphics Processors, Beijing Computational Science Research Center, February 2012,

Ushizima, D.M., Weber, G., Morozov, D., Bethel, W., Sethian, J.A., "Algorithms for Microstructure Description applied to Microtomography", Carbon Cycle 2.0 Symposium, February 10, 2012,

Richard L. Martin, Prabhat, David D. Donofrio, James A. Sethian & Maciej Haranczyk, "Accelerating Analysis of void spaces in porous materials on multicore and GPU platforms", International Journal of High Performance Computing Applications, February 5, 2012, 26:347-357,

Developing computational tools that enable discovery of new materials for energy-related applications is a challenge. Crystalline porous materials are a promising class of materials that can be used for oil refinement, hydrogen or methane storage as well as carbon dioxide capture. Selecting optimal materials for these important applications requires analysis and screening of millions of potential candidates. Recently, we proposed an automatic approach based on the Fast Marching Method (FMM) for performing analysis of void space inside materials, a critical step preceding expensive molecular dynamics simulations. This breakthrough enables unsupervised, high-throughput characterization of large material databases. The algorithm has three steps: (1) calculation of the cost-grid which represents the structure and encodes the occupiable positions within the void space; (2) using FMM to segment out patches of the void space in the grid of (1), and find how they are connected to form either periodic channels or inaccessible pockets; and (3) generating blocking spheres that encapsulate the discovered inaccessible pockets and are used in subsequent molecular simulations. In this work, we expand upon our original approach through (A) replacement of the FMM-based approach with a more computationally efficient flood fill algorithm; and (B) parallelization of all steps in the algorithm, including a GPU implementation of the most computationally expensive step, the cost-grid generation. We report the acceleration achievable in each step and in the complete application, and discuss the implications for high-throughput material screening.
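
As an illustration of step (2) in the abstract above, the following is a minimal sketch of a flood fill on a periodic grid that separates channels from inaccessible pockets. It is not the authors' code: the 2D grid, the boolean occupiable mask, and the names `classify_void_space` and `grid` are assumptions made for this example, and the cost-grid calculation and blocking-sphere generation are left out. A region is treated as a channel when the fill reaches the same cell through two different periodic images, meaning the region connects to a copy of itself in a neighboring unit cell.

```python
from collections import deque


def classify_void_space(open_cells):
    """Flood-fill the open cells of a periodic 2D grid.

    open_cells: list of rows of booleans (True = occupiable position).
    Returns a list of (cells, is_channel) tuples, one per connected region.
    Regions with is_channel == False are inaccessible pockets.
    """
    ny, nx = len(open_cells), len(open_cells[0])
    image = {}  # cell -> periodic image offset at which it was first reached
    regions = []
    for sy in range(ny):
        for sx in range(nx):
            if not open_cells[sy][sx] or (sy, sx) in image:
                continue
            image[(sy, sx)] = (0, 0)
            cells, is_channel = [(sy, sx)], False
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                oy, ox = image[(y, x)]
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    # divmod gives (number of box crossings, wrapped index).
                    wrap_y, ty = divmod(y + dy, ny)
                    wrap_x, tx = divmod(x + dx, nx)
                    if not open_cells[ty][tx]:
                        continue
                    offset = (oy + wrap_y, ox + wrap_x)
                    if (ty, tx) not in image:
                        image[(ty, tx)] = offset
                        cells.append((ty, tx))
                        queue.append((ty, tx))
                    elif image[(ty, tx)] != offset:
                        # Same cell reached via a different periodic image:
                        # the region spans the unit cell, so it is a channel.
                        is_channel = True
            regions.append((cells, is_channel))
    return regions


# Hypothetical 4x4 unit cell: row 1 is an open channel along x,
# and the single open cell in row 3 is an isolated pocket.
grid = [
    [False, False, False, False],
    [True, True, True, True],
    [False, False, False, False],
    [False, True, False, False],
]
regions = classify_void_space(grid)
```

In this sketch the pocket regions returned with `is_channel == False` are the ones that a later step would cover with blocking spheres.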

David H. Bailey, Jonathan M. Borwein, Cristian S. Calude, Michael J. Dinneen, Monica Dumitrescu, Alex Yee, "Normality and the digits of Pi", Exploratory Experimentation in Mathematics: Selected Works, February 3, 2012,

Mitesh R. Meswani, Laura Carrington, Didem Unat, Allan Snavely, Scott B. Baden, Stephen Poole, "Modeling and Predicting Performance of High Performance Computing Applications on Hardware Accelerators", IPDPS Workshops, IEEE Computer Society, 2012,

D. Y. Parkinson, C. Yang, C. Knoechel, C. A. Larabell, M. Le Gros, "Automatic alignment and reconstruction of images for soft X-ray tomography", J Struct Biol, February 2012, 177:259–266, doi: 10.1016/j.jsb.2011.11.027

Nan Jiang, Daniel U. Becker, George Michelogiannakis, William J. Dally, "Network Congestion Avoidance through Speculative Reservation", International Symposium on High Performance Computer Architecture, IEEE Computer Society, 2012,

Congestion caused by hot-spot traffic can significantly degrade the performance of a computer network. In this study, we present the Speculative Reservation Protocol (SRP), a new network congestion control mechanism that relieves the effect of hot-spot traffic in high bandwidth, low latency, lossless computer networks. Compared to existing congestion control approaches like Explicit Congestion Notification (ECN), which react to network congestion through packet marking and rate throttling, SRP takes a proactive approach of congestion avoidance. Using a light-weight endpoint reservation scheme and speculative packet transmission, SRP avoids hot-spot congestion while incurring minimal overhead. Our simulation results show that SRP responds more rapidly to the onset of severe hot-spots than ECN and has a higher network throughput on bursty network traffic. SRP also performs comparably to networks without congestion control on benign traffic patterns by reducing the latency and throughput overhead commonly associated with reservation protocols.

P. Ghysels, P. Kłosiewicz, W. Vanroose, "Improving the arithmetic intensity of multigrid with the help of polynomial smoothers", Numerical Linear Algebra with Applications, February 1, 2012, 19:2, doi: 10.1002/nla.1808

D. Yu, D. Katramatos, A. Shoshani, A. Sim, J. Gu, V. Natarajan, "StorNet: Integrating Storage Resource Management with Dynamic Network Provisioning for Automated Data Transfer", International Committee for Future Accelerators (ICFA) Standing Committee on Inter-Regional Connectivity (SCIC) 2012 Report: Networking for High Energy Physics, 2012,

Emmanuel Agullo, Patrick R. Amestoy, Alfredo Buttari, Abdou Guermouche, Jean-Yves L'Excellent, François-Henry Rouet, Robust memory-aware mappings for parallel multifrontal factorizations, SIAM Conference on Parallel Processing for Scientific Computing (PP12), Savannah, GA, USA, February 2012,

H. M. Aktulga, S. A. Pandit, A. C. T. van Duin, A. Y. Grama, "Reactive Molecular Dynamics: Numerical Methods and Algorithmic Techniques", SIAM J. Sci. Comput., 34(1), C1–C23, January 31, 2012, doi: 10.1137/100808599

Nils E. R. Zimmermann, Sayee P. Balaji, Frerich J. Keil, "Surface Barriers of Hydrocarbon Transport Triggered by Ideal Zeolite Structures", J. Phys. Chem. C, 2012, 116:3677-3683, doi: 10.1021/jp2112389

Shedding light on the nature of surface barriers of nanoporous materials, molecular simulations (Monte Carlo, Reactive Flux) have been employed to investigate the tracer-exchange characteristics of hydrocarbons in defect-free single-crystal zeolite membranes. The concept of a critical membrane thickness as a quantitative measure of surface barriers is shown to be appropriate and advantageous. Nanopore smoothness, framework density, and thermodynamic state of the fluid phase have been identified as the most important influencing variables of surface barriers. Despite the ideal character of the adsorbent, our simulation results clearly support current experimental findings on MOF Zn(tbip) where a larger number of crystal defects caused exceptionally strong surface barriers. Most significantly, our study predicts that the ideal crystal structure without any such defects will already be a critical aspect of experimental analysis and process design in many cases of the upcoming class of extremely thin and highly oriented nanoporous membranes.

Brian Van Straalen, David Trebotich, Terry Ligocki, Daniel T. Graves, Phillip Colella, Michael Barad, "An Adaptive Cartesian Grid Embedded Boundary Method for the Incompressible Navier Stokes Equations in Complex Geometry", LBNL Report Number: LBNL-1003767, 2012,

We present a second-order accurate projection method to solve the incompressible Navier-Stokes equations on irregular domains in two and three dimensions. We use a finite-volume discretization obtained from intersecting the irregular domain boundary with a Cartesian grid. We address the small-cell stability problem associated with such methods by hybridizing a conservative discretization of the advective terms with a stable, nonconservative discretization at irregular control volumes, and redistributing the difference to nearby cells. Our projection is based upon a finite-volume discretization of Poisson's equation. We use a second-order, $L^\infty$-stable algorithm to advance in time. Block structured local refinement is applied in space. The resulting method is second-order accurate in $L^1$ for smooth problems. We demonstrate the method on benchmark problems for flow past a cylinder in 2D and a sphere in 3D as well as flows in 3D geometries obtained from image data.

Han Suk Kim, Didem Unat, Scott B. Baden, Jürgen P. Schulze, "Interactive Data-centric Viewpoint Selection", Visualization and Data Analysis, Proc. SPIE 8294, January 2012,

S. Guzik, P. McCorquodale, P. Colella, "A Freestream-Preserving High-Order Finite-Volume Method for Mapped Grids with Adaptive-Mesh Refinement", 50th AIAA Aerospace Sciences Meeting, Nashville, TN, 2012,

P.S. Li, D.F. Martin, R.I. Klein, and C.F. McKee, "A Stable, Accurate Methodology for High Mach Number, Strong Magnetic Field MHD Turbulence with Adaptive Mesh Refinement: Resolution and Refinement Studies", The Astrophysical Journal Supplement Series, 2012,

Planck Collaboration, "Planck Intermediate Results. IX. Detection of the Galactic haze with Planck", ArXiv e-prints, January 2012,

Planck Collaboration, "Planck intermediate results. VIII. Filaments between interacting clusters", ArXiv e-prints, January 2012,

D. Flammini, A. Pietropaolo, R. Senesi, C. Andreani, F. McBride, A. Hodgson, M. Adams, L. Lin, and R. Car, "Spherical momentum distribution of the protons in hexagonal ice from modeling of inelastic neutron scattering data", J. Chem. Phys., 2012, 136:024504,

Abhinav Sarje, Jack Pien, Xiaoye Li, "GPU Clusters for Large-Scale Analysis of X-ray Scattering Data", GPU Technology Conference (GTC), January 2012,

V. V. Kharin, F. W. Zwiers, X. Zhang, M. Wehner, "Changes in temperature and precipitation extremes in the CMIP5 ensemble", Climatic Change, 2012,

A. Napov and Y. Notay, "An Algebraic Multigrid Method with Guaranteed Convergence Rate", SIAM J. Sci. Comput., vol. 34, pp. A1079-A1109, 2012,

Mark Howison, E. Wes Bethel, Hank Childs, "Hybrid Parallelism for Volume Rendering on Large, Multi- and Many-core Systems", IEEE Transactions on Visualization and Computer Graphics, January 2012, 18:17-29, LBNL 4370E,

David H. Bailey, Marcos M. Lopez de Prado, "The Sharpe ratio efficient frontier", Journal of Risk, January 1, 2012, 15:3-44,

Elif Dede, Zacharia Fadika, Jessica Hartog, Madhusudhan Govindaraju, Lavanya Ramakrishnan, Daniel Gunter, Richard Shane Canon, "MARISSA: MApReduce Implementation for Streaming Science Applications", IEEE eScience Conference, 2012,

R. L. Barone-Nugent, C. Lidman, J. S. B. Wyithe, J., D. A. Howell, I. M. Hook, M. Sullivan, P. E., I. Arcavi, S. B. Cenko, J. Cooke, A., E. Y. Hsiao, M. M. Kasliwal, K. Maguire, E. Ofek, D. Poznanski, D. Xu, "Near-infrared observations of Type Ia supernovae: the best known standard candle for cosmology", Monthly Notices of the RAS, 2012, 425:1007-1012, doi: 10.1111/j.1365-2966.2012.21412.x

Maria Garzon, L. J. Gray, James A. Sethian, "Droplet and bubble pinch-off computations using level sets", Journal of Computational and Applied Mathematics, 2012, 236:3034--3041, doi: 10.1016/j.cam.2011.03.032

Ilia Lebedev, Christopher Fletcher, Shaoyi Cheng, James Martin, Austin Doupnik, Daniel Burke, Mingjie Lin, John Wawrzynek, "Exploring Many-Core Design Templates for FPGAs and ASICs", International Journal of Reconfigurable Computing, 2012, 2012, doi: 10.1155/2012/439141

R. Sisneros, C. Malone, A. Nonaka, and S. Woosley, "Investigation of Turbulence in the Early Stages of a High Resolution Supernova Simulation", Proceedings of the Supercomputing 2012 Conference, 2012,

J. C. van Eyken, D. R. Ciardi, K. von Braun, S. R., P. Plavchan, C. F. Bender, T. M. Brown, J. R., B. J. Fulton, A. W. Howard, S. B. Howell, S., G. W. Marcy, A. Shporer, P. Szkody, R. L., C. A. Beichman, A. F. Boden, D. M., D. W. Hoard, S. V. Ramírez, L. M., J. R. Stauffer, J. S. Bloom, S. B., M. M. Kasliwal, S. R. Kulkarni, N. M., P. E. Nugent, E. O. Ofek, D. Poznanski, R. M., R. Walters, C. J. Grillmair, R., D. B. Levitan, B. Sesar, J. A. Surace, "The PTF Orion Project: A Possible Planet Transiting a T-Tauri Star", Astrophysical Journal, 2012, 755:42, doi: 10.1088/0004-637X/755/1/42

Sanjay Govindjee, Per-Olof Persson, "A time-domain Discontinuous Galerkin method for mechanical resonator quality factor computations", Journal of Computational Physics, 2012, 231:6380--6392, doi: 10.1016/j.jcp.2012.05.034

David H. Bailey, Roberto Barrio, Jonathan M. Borwein, "High precision computation: Mathematical physics and dynamics", Applied Mathematics and Computation, 2012, 218:10106-1012,

R. L. Barone-Nugent, "Near-infrared observations of Type Ia supernovae: the best known candle for cosmology", Monthly Notices of the Royal Astronomical Society, 2012, 1007, doi: 10.1111/j.1365-2966.2012.21412.x

J. P. Bernstein, R. Kessler, S. Kuhlmann, R., E. Kovacs, G. Aldering, I. Crane, C. B., D. A. Finley, J. A. Frieman, T., M. J. Jarvis, A. G. Kim, J. Marriner, P., R. C. Nichol, P. Nugent, D. Parkinson, R. R. R., M. Sako, H. Spinka, M. Sullivan, "Supernova Simulations and Strategies for the Dark Energy Survey", Astrophysical Journal, 2012, 753:152, doi: 10.1088/0004-637X/753/2/152

Lavanya Ramakrishnan, Richard Shane Canon, Krishna Muriki, Iwona Sakrejda, Nicholas J. Wright, "Evaluating Interconnect and Virtualization Performance for High Performance Computing", Special Issue of ACM Performance Evaluation Review, 2012, 40(2),

Fuyu Li, William D. Collins, Michael F. Wehner, Ruby L. Leung, "Hurricanes in an Aquaplanet World: Implications of the Impacts of External Forcing and Model Horizontal Resolution", Journal of Advances in Modeling Earth Systems, 2012,

E. O. Ofek, R. Laher, J. Surace, D. Levitan, B., A. Horesh, N. Law, J. C. van Eyken, S. R., T. A. Prince, P. Nugent, M. Sullivan, O., A. Pickles, M. Agüeros, I. Arcavi, L., J. Bloom, S. B. Cenko, A. Gal-Yam, C., G. Helou, M. M. Kasliwal, D. Poznanski, R. Quimby, "The Palomar Transient Factory photometric catalog 1.0", Publications of the ASP, 2012, 124:854, doi: 10.1086/666978

J. T. Parrent, D. A. Howell, B. Friesen, R. C. Thomas, R. A., D. Milisavljevic, F. B. Bianco, B., P. Nugent, E. Baron, I. Arcavi, S., D. Bersier, L. Bildsten, J. Bloom, Y., S. B. Cenko, A. V. Filippenko, A. Gal-Yam, M. M., N. Konidaris, S. R. Kulkarni, N. M., D. Levitan, K. Maguire, P. A. Mazzali, E. O., Y. Pan, D. Polishook, D. Poznanski, R. M., J. M. Silverman, A. Sternberg, M., E. S. Walker, D. Xu, C. Buton, R. Pereira, "Analysis of the Early-time Optical Spectra of SN 2011fe in M101", Astrophysical Journal Letters, 2012, 752:L26, doi: 10.1088/2041-8205/752/2/L26

Per-Olof Persson, "High-order Navier-Stokes simulations using a sparse line-based discontinuous Galerkin method", 50th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition, Nashville, TN, American Institute of Aeronautics and Astronautics, 2012, 436:1--10, doi: 10.2514/6.2012-456

Matthew R. Siebert, Adelia J. A. Aquino, Wibe A. de Jong, Giovanni Granucci, William L. Hase, "Potential energy surface for C2H4I2+• dissociation including spin-orbit effects", Molecular Physics, 2012, 110:2599-2609, doi: 10.1080/00268976.2012.725137

A. Nonaka, J. B. Bell, M. S. Day, C. Gilet, A. S. Almgren, and M. L. Minion, "A Deferred Correction Coupling Strategy for Low Mach Number Flow with Complex Chemistry", Combustion Theory and Modelling, 2012, 16(6):1053-1088,

David H. Bailey, Jonathan M. Borwein, "Exploration, Experimentation and Computation", International Advances in Mathematics, 2012, 31:1-14,

N. Smith, S. B. Cenko, N. Butler, J. S. Bloom, M. M., A. Horesh, S. R. Kulkarni, N. M., P. E. Nugent, E. O. Ofek, D. Poznanski, R. M., B. Sesar, S. Ben-Ami, I. Arcavi, A., D. Polishook, D. Xu, O. Yaron, D. A. Frail, M. Sullivan, "SN 2010jp (PTF10aaxi): a jet in a Type II supernova", Monthly Notices of the RAS, 2012, 420:1135-1144, doi: 10.1111/j.1365-2966.2011.20104.x

Y. Cao, M. M. Kasliwal, J. D. Neill, S. R. Kulkarni, Y.-Q., S. Ben-Ami, J. S. Bloom, S. B. Cenko, N. M., P. E. Nugent, E. O. Ofek, D. Poznanski, R. M. Quimby, "Classical Novae in Andromeda: Light Curves from the Palomar Transient Factory and GALEX", Astrophysical Journal, 2012, 752:133, doi: 10.1088/0004-637X/752/2/133

P.-O. Persson, D.J. Willis, J. Peraire, "Numerical simulation of flapping wings using a panel method and a high-order Navier–Stokes solver", International Journal for Numerical Methods in Engineering, 2012, 89:1296--1316, doi: 10.1002/nme.3288

Ichitaro Yamazaki, Kesheng Wu, "A Communication-Avoiding Thick-Restart Lanczos Method on a Distributed-Memory System", Lecture Notes in Computer Science, 2012, 7155:345--354, doi: 10.1007/978-3-642-29737-3_39

A. Nonaka, A. J. Aspden, M. Zingale, A. S. Almgren, J. B. Bell, and S. E. Woosley, "High-Resolution Simulations of Convection Preceding Ignition in Type Ia Supernovae Using Adaptive Mesh Refinement", Astrophysical Journal, 745, 73, 2012,

E. Wes Bethel, "Exploration of Optimization Options for Increasing Performance of a GPU Implementation of a Three-dimensional Bilateral Filter", 2012, LBNL 5406E,

David H. Bailey and Marcos M. Lopez de Prado, "Balanced baskets: A new approach to trading and hedging risks", Journal of Investment Strategies, January 1, 2012, 14,

Karan Vahi, Ian Harvey, Taghrid Samak, Dan Gunter, Kieran Evans, David Rogers, Ian Taylor, Monte Goode, Fabio Silva, Eddie Al-Shakarchi, Gaurang Mehta, Andrew Jones, Ewa Deelman, "A General Approach to Real-time Workflow Monitoring", The Seventh Workshop on Workflows in Support of Large-Scale Science (WORKS12), 2012,

T.C. Peterson, R. Heim, R. Hirsch, D. Kaiser, H. Brooks, N.S. Diffenbaugh, R. Dole, J. Giovannettone, K. Guiguis, T.R. Karl, R.W. Katz, K. Kunkel, D. Lettenmaier, G. J. McCabe, C.J. Paciorek, K. Ryberg, S. Schubert, V.B.S. Silva, B. Stewart, A.V. Vecchia, G. Villarini, R.S. Vose, J. Walsh, M. Wehner, D. Wolock, K. Wolter, C.A. Woodhouse and D. Wuebbles, "Monitoring and Understanding Changes in Heatwaves, Coldwaves, Floods and Droughts in the United States: State of Knowledge", Bulletin of the American Meteorological Society (accepted), 2012,

E. O. Ofek, R. Laher, N. Law, J. Surace, D., B. Sesar, A. Horesh, D. Poznanski, J. C. Eyken, S. R. Kulkarni, P. Nugent, J., R. Walters, M. Sullivan, M. Agüeros, L., J. Bloom, S. B. Cenko, A. Gal-Yam, C., G. Helou, M. M. Kasliwal, R. Quimby, "The Palomar Transient Factory Photometric Calibration", Publications of the ASP, 2012, 124:62, doi: 10.1086/664065

D. Polishook, E. O. Ofek, A. Waszczak, S. R. Kulkarni, A., O. Aharonson, R. Laher, J. Surace, C., J. Bloom, N. Brosch, D. Prialnik, C., S. B. Cenko, M. Kasliwal, N. Law, D., P. Nugent, D. Poznanski, R. Quimby, "Asteroid rotation periods from the Palomar Transient Factory survey", Monthly Notices of the RAS, 2012, 421:2094-2108, doi: 10.1111/j.1365-2966.2012.20462.x

Chris H. Rycroft, Frédéric Gibou, "Simulations of a stretching bar using a plasticity model from the shear transformation zone theory", Journal of Computational Physics, 2012, 231:2155--2179, doi: 10.1016/j.jcp.2011.10.009

Benson Ma, Arie Shoshani, Alex Sim, Kesheng Wu, Yong-Ik Byun, Jaegyoon Hahm, Min-Su Shin, "Efficient Attribute-Based Data Access in Astronomy", The 2nd International Workshop on Network-Aware Data Management (NDM2012), 2012, 562--571,

Daniel Gunter, Shreyas Cholia, Anubhav Jain, Michael Kocher, Kristin Persson, Lavanya Ramakrishnan, Shyue Ping Ong, Gerbrand Ceder, "Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project", 5th workshop on Many-Task Computing on Grids and Supercomputers (MTAGS), 2012,

J. S. Bloom, D. Kasen, K. J. Shen, P. E. Nugent, N. R., M. L. Graham, D. A. Howell, U., S. Holmes, C. A. Haswell, V. Burwitz, J. Rodriguez, M. Sullivan, "A Compact Degenerate Primary-star Progenitor of SN 2011fe", Astrophysical Journal Letters, 2012, 744:L17, doi: 10.1088/2041-8205/744/2/L17

X. Wang, L. Wang, A. V. Filippenko, E. Baron, M., D. Jack, T. Zhang, G. Aldering, P., W. D. Arnett, D. Baade, B. J. Barris, S., P. Bouchet, A. S. Burrows, R. Canal, E., R. G. Carlberg, E. di Carlo, P. J., A. P. S. Crotts, J. I. Danziger, M. Valle, M. Fink, R. J. Foley, C. Fransson, A., P. M. Garnavich, C. L. Gerardy, G., M. Hamuy, W. Hillebrandt, P. Höflich, S. T., D. E. Holz, J. P. Hughes, D. J. Jeffery, S. W., D. Kasen, A. M. Khokhlov, R. P. Kirshner, R. A., C. Kozma, K. Krisciunas, B. C. Lee, B., E. J. Lentz, D. C. Leonard, W. H. G., W. Li, M. Livio, P. Lundqvist, D., T. Matheson, P. A. Mazzali, P. Meikle, G., P. A. Milne, S. W. Mochnacki, K., P. E. Nugent, E. S. Oran, N. Panagia, S., M. M. Phillips, P. Pinto, D. Poznanski, C. J., M. Reinecke, A. G. Riess, P., R. A. Scalzo, E. M. Schlegel, B. P., J. Siegrist, A. M. Soderberg, J., G. Sonneborn, A. Spadafora, J., R. A. Sramek, S. G. Starrfield, L. G., N. B. Suntzeff, R. C. Thomas, J. L., A. Tornambe, J. W. Truran, M. Turatto, M., S. D. Van Dyk, K. W. Weiler, J. C. Wheeler, M. Wood-Vasey, S. E. Woosley, H. Yamaoka, "Evidence for Type Ia Supernova Diversity from Ultraviolet Observations with the Hubble Space Telescope", Astrophysical Journal, 2012, 749:126, doi: 10.1088/0004-637X/749/2/126

D. Ambrose, J. Wilkening, "Computing Time-Periodic Solutions of Nonlinear Systems of Partial Differential Equations", Proceedings of Hyperbolic Problems: Theory, Numerics, Applications (HYP2010), Beijing, Higher Education Press, 2012, 273--280,

Chris H. Rycroft, Terttaliisa Lind, Salih Güntay, Abdel Dehbi, "Granular Flow in Pebble Bed Reactors: Dust Generation and Scaling", International Congress on Advances in Nuclear Power Plants 2012 (ICAPP 2012), Chicago, IL, Curran Associates, Inc., 2012, 1:447,

Sven Kotlarski, T Bosshard, D Lüthi, P Pall, C Schär, "Elevation gradients of European climate change in the regional climate model COSMO-CLM", Climatic Change, 2012, 112:189--215,

Bo Kågström, Daniel Kressner, Meiyue Shao, "On aggressive early deflation in parallel variants of the QR algorithm", (PARA 2010) Applied Parallel and Scientific Computing, Lecture Notes in Computer Science 7133, 2012, 1--10, doi: 10.1007/978-3-642-28151-8_1

J. B. Bell, M. S. Day and M. J. Lijewski, "Simulation of Nitrogen Emissions in a Premixed Hydrogen Flame Stabilized on a Low Swirl Burner", Proceedings of the Combustion Institute, 2012,

M. Day, S. Tachibana, J. Bell, M. Lijewski, V. Beckner and R. Cheng, "A Combined Computational and Experimental Characterization of Lean Premixed Turbulent Low Swirl Laboratory Flames. I. Methane Flames", Combustion and Flame, 159(1) 275-290, 2012,

Samuel Gerber, Oliver Rübel, Peer-Timo Bremer, Valerio Pascucci and Ross T. Whitaker, "Morse-Smale Regression", Journal of Computational and Graphical Statistics, January 2012, doi: 10.1080/10618600.2012.657132

Dan Gunter, Raj Kettimuthu, Ezra Kissel, Martin Swany, Jun Yi, Jason Zurawski, "Exploiting Network Parallelism for Improving Data Transfer Performance", SC12 Companion, 2012,

Russell S. Vose, Scott Applequist, Mark A. Bourassa, Sara C. Pryor, Rebecca J. Barthelmie, Brian Blanton, Peter D. Bromirski, Harold E. Brooks, Arthur T. DeGaetano, Randall M. Dole, David R. Easterling, Robert E. Jensen, Thomas R. Karl, Katherine Klink, Richard W. Katz, Michael C. Kruk, Kenneth E. Kunkel, Michael C. MacCracken, Thomas C. Peterson, Bridget R. Thomas, Xiaolan L. Wang, John E. Walsh, Michael F. Wehner, Donald J. Wuebbles, and Robert S. Young, "Monitoring and Understanding Changes in Extremes: Extratropical Storms, Winds, and Waves", Bulletin of the American Meteorological Society (submitted), 2012,

S. B. Cenko, J. S. Bloom, S. R. Kulkarni, L. E., A. A. Miller, N. R. Butler, R. M., A. Gal-Yam, E. O. Ofek, E. Quataert, L., D. Poznanski, D. A. Perley, A. N. Morgan, A. V., D. A. Frail, I. Arcavi, S., A. Cucchiara, C. D. Fassnacht, Y., I. M. Hook, D. A. Howell, D. J. Lagattuta, N. M., M. M. Kasliwal, P. E. Nugent, J. M. Silverman, M. Sullivan, S. P. Tendulkar, O. Yaron, "PTF10iya: a short-lived, luminous flare from the nuclear region of a star-forming galaxy", Monthly Notices of the RAS, 2012, 420:2684-2699, doi: 10.1111/j.1365-2966.2011.20240.x

Alexandre J. Chorin, Xuemin Tu, "An iterative implementation of the implicit nonlinear filter", ESAIM: Mathematical Modelling and Numerical Analysis, 2012, 46:535--543,

Implicit sampling is a sampling scheme for particle filters, designed to move particles one-by-one so that they remain in high-probability domains. We present a new derivation of implicit sampling, as well as a new iteration method for solving the resulting algebraic equations.

R. I. Saye, J. A. Sethian, "Analysis and applications of the Voronoi Implicit Interface Method", Journal of Computational Physics, 2012, 231:6051 - 608, doi: 10.1016/j.jcp.2012.04.004

Karen L. Schuchardt, Deborah A. Agarwal, Stefan A. Finsterle, Carl W. Gable, Ian Gorton, Luke J. Gosink, Elizabeth H. Keating, Carina S. Lansing, Joerg Meyer, William A.M. Moeglein, George S.H. Pau, Ellen A. Porter, Sumit Purohit, Mark L. Rockhold, Arie Shoshani, Chandrika Sivaramakrishnan, "Akuna-Integrated Toolsets Supporting Advanced Subsurface Flow and Transport Simulations for Environmental Management", XIX International Conference on Computational Methods in Water Resources (CMWR 2012), University of Illinois at Urbana-Champaign, June 17-22, 2012,

R Atta-Fynn, DF Johnson, EJ Bylaska, ES Ilton, GK Schenter, WA De Jong, "Structure and hydrolysis of the U(IV), U(V), and U(VI) aqua ions from Ab initio molecular simulations", Inorganic Chemistry, 2012, 51:3016--3024, doi: 10.1021/ic202338z

K. Balakrishnan, A. L. Kuhl, J. B. Bell and V. E. Beckner, "An Empirical Model for the Ignition of Explosively Dispersed Aluminum Particle Clouds", Shock Waves, 2012, 22:591,

Allen R. Sanderson, Brad Whitlock, Oliver Rübel, Hank Childs, Gunther H. Weber, Kesheng Wu, "A System for Query Based Analysis and Visualization", Third International EuroVis Workshop on Visual Analytics (EuroVA 2012), Vienna, Austria, January 2012, LBNL 5507E,

Taghrid Samak, Dan Gunter, Monte Goode, Ewa Deelman, Fabio Silva, Karan Vahi, "Failure Analysis of Distributed Scientific Workflows Executing in the Cloud", 8th International Conference on Network and Service Management (CNSM 2012), 2012,

John C. H. Chiang, C. Y. Chang and M.F. Wehner, "Long-term trends of the Atlantic Interhemispheric SST Gradient in the CMIP5 Historical Simulations", J. Climate, 2012,

U. Shumlak, J. Chadney, R.P. Golingo, D.J. Den Hartog, M.C. Hughes, S.D. Knecht, W. Lowrie, V.S. Lukin, B.A. Nelson, R.J. Oberto, J.L. Rohrbach, M.P. Ross, G.V. Vogman, "The Sheared-Flow Stabilized Z-Pinch", Fusion Science and Technology, 61 (1t), 119, 2012,

A. Corsi, E. O. Ofek, A. Gal-Yam, D. A. Frail, D., P. A. Mazzali, S. R. Kulkarni, M. M., I. Arcavi, S. Ben-Ami, S. B. Cenko, A. V., D. B. Fox, A. Horesh, J. L. Howell, I. K. W., E. Nakar, I. Rabinak, R. Sari, J. M., D. Xu, J. S. Bloom, N. M. Law, P. E. Nugent, R. M. Quimby, "Evidence for a Compact Wolf-Rayet Progenitor for the Type Ic Supernova PTF 10vgv", Astrophysical Journal Letters, 2012, 747:L5, doi: 10.1088/2041-8205/747/1/L5

Robert I. Saye, James A. Sethian, "The Voronoi Implicit Interface Method and Computational Challenges in Multiphase Physics", Milan Journal of Mathematics, 2012, 80:369--379, doi: 10.1007/s00032-012-0187-6

O. Rübel, S. Byna, K. Wu, F. Li, M., W. Bethel, others, "TECA: A Parallel Toolkit for Extreme Climate Analysis", Procedia Computer Science, Elsevier, 2012, 9:866--876, doi: 10.1016/j.procs.2012.04.093

F. Balboa, J. Bell, R. Delgado-Buscalioni, A. Donev, T. Fai, B. Griffith, C. Peskin, "Staggered Schemes for Fluctuating Hydrodynamics", Multiscale Modeling and Simulation, 2012, 10(4):1360-1408,

E. W. Bethel, D. Leinweber, O. Rübel, K. Wu, "Federal Market Information Technology in the Post Flash Crash Era: Roles of Supercomputing", The Journal of Trading, 2012, 7:9-24, LBNL 5263E, doi: 10.3905/jot.2012.7.2.009

A. Horesh, S. R. Kulkarni, D. B. Fox, J. Carpenter, M. M., E. O. Ofek, R. Quimby, A. Gal-Yam, S. B., A. G. de Bruyn, A. Kamble, R. A. M. J. Wijers, A. J. der Horst, C. Kouveliotou, P. Podsiadlowski, M., K. Maguire, D. A. Howell, P. E. Nugent, N., N. M. Law, D. Poznanski, M. Shara, "Early Radio and X-Ray Observations of the Youngest nearby Type Ia Supernova PTF 11kly (SN 2011fe)", Astrophysical Journal, 2012, 746:21, doi: 10.1088/0004-637X/746/1/21

Matthias Morzfeld, Xuemin Tu, Ethan Atkins, Alexandre J. Chorin, "A random map implementation of implicit filters", Journal of Computational Physics, 2012, 231:2049--2066, doi: 10.1016/j.jcp.2011.11.022

Kiran Bhaskaran-Nair, Jiri Brabec, Edoardo Apra, Hubertus J. J. van Dam, Jiri Pittner, Karol Kowalski, "Implementation of the multireference Brillouin-Wigner and Mukherjee's coupled cluster methods with non-iterative triple excitations utilizing reference-level parallelism", Journal of Chemical Physics, 2012, 137, doi: 10.1063/1.4747698

M. Kawai, T. Iwashita, H. Nakashima and O. Marques, "Parallel Smoother Based on Block Red-Black Ordering for Multigrid Poisson Solver", LNCS, Proc. VECPAR 2012, Kobe, Japan, Springer, 2012, 7851:292-299,

R. Tiron, A.S. Almgren, R. Camassa, "Shear Instability of Internal Solitary Waves in Euler Fluids with Thin Pycnoclines", Journal of Fluid Mechanics, 2012, 710:324-361,

Benjamin Edwards, Tyler Moore, George Stelle, Steven Hofmeyr, Stephanie Forrest, "Beyond the blacklist: modeling malware spread and the effect of interventions", Proceedings of the 2012 workshop on New security paradigms, January 1, 2012, 53--66,

Taghrid Samak, Dan Gunter, Valerie Hendrix, "Scalable Analysis of Network Measurements with Hadoop and Pig", Fifth International IFIP/IEEE Workshop on Distributed Autonomous Network Management Systems (DANMS 2012), IEEE, 2012,

Maxime Theillard, Chris H. Rycroft, Frédéric Gibou, "A Multigrid Method on Non-Graded Adaptive Octree and Quadtree Cartesian Grids", Journal of Scientific Computing, 2012, 1--15, doi: 10.1007/s10915-012-9619-2

Jiri Brabec, Kiran Bhaskaran-Nair, Niranjan Govind, Jiri Pittner, Karol Kowalski, "Communication: Application of state-specific multireference coupled cluster methods to core-level excitations", Journal of Chemical Physics, 2012, 137, doi: 10.1063/1.4764355