Publications

Journal Article

E. Vecharynski, A. Knyazev,"Preconditioned steepest descent-like methods for symmetric indefinite systems",Linear Algebra and its Applications, Vol. 511, pp. 274–295,2016,

We construct preconditioned steepest descent (PSD)-like methods for iterative solution of symmetric indefinite linear systems using symmetric and positive definite (SPD) preconditioners. Our construction is based on a locally optimal residual minimization over two-dimensional subspaces, mathematically equivalent in exact arithmetic to preconditioned MINRES (PMINRES) restarted after every two steps. A convergence bound is derived. If certain information on the spectrum of the preconditioned system is available, we present a simpler PSD-like algorithm that performs only one-dimensional residual minimization. Search direction randomization for accelerating this algorithm is discussed. Our primary goal is to bridge the theoretical gap between the optimal (PMINRES) and PSD-like methods for solving symmetric indefinite systems. We also demonstrate situations where the suggested PSD-like schemes can be preferable to the optimal PMINRES iteration.
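
The two-dimensional residual minimization described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `psd_like_solve`, the choice of search directions `T r` and `T A T r` (suggested by the stated equivalence to PMINRES restarted every two steps), the small test matrix, and the diagonal preconditioner are all assumptions made for illustration.

```python
def matvec(M, v):
    """Dense matrix-vector product for small list-of-lists matrices."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def psd_like_solve(A, b, T, tol=1e-10, max_iter=500):
    """Minimize the T-norm of the residual over span{T r, T A T r} at each step.

    A is symmetric (possibly indefinite) and T is a symmetric positive definite
    preconditioner. Illustrative sketch only, not the published algorithm.
    """
    def tdot(u, v):
        return dot(u, matvec(T, v))  # T-inner product (u, T v)

    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    r0 = tdot(r, r) ** 0.5
    for k in range(max_iter):
        if tdot(r, r) ** 0.5 <= tol * r0:
            return x, k
        p1 = matvec(T, r)                  # preconditioned residual
        p2 = matvec(T, matvec(A, p1))      # second search direction
        q1, q2 = matvec(A, p1), matvec(A, p2)
        # Normal equations of the 2x2 least-squares problem in the T-inner product.
        g11, g12, g22 = tdot(q1, q1), tdot(q1, q2), tdot(q2, q2)
        h1, h2 = tdot(q1, r), tdot(q2, r)
        det = g11 * g22 - g12 * g12
        if det > 1e-14 * g11 * g22:
            c1 = (h1 * g22 - h2 * g12) / det
            c2 = (h2 * g11 - h1 * g12) / det
        else:
            # Directions are (nearly) parallel: fall back to a 1-D minimization.
            c1, c2 = h1 / g11, 0.0
        x = [xi + c1 * u + c2 * w for xi, u, w in zip(x, p1, p2)]
        r = [ri - c1 * u - c2 * w for ri, u, w in zip(r, q1, q2)]
    return x, max_iter

# Hypothetical symmetric indefinite test problem with a diagonal SPD preconditioner.
A = [[3.0, 1.0, 0.0, 0.0],
     [1.0, -2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.5],
     [0.0, 0.0, 0.5, -1.0]]
T = [[1.0 / 3.0, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
b = [1.0, 1.0, 1.0, 1.0]

x, iters = psd_like_solve(A, b, T)
print("iterations:", iters)
print("solution:", [round(v, 6) for v in x])
```

The preconditioner here is the inverse of the absolute value of the diagonal of A, which keeps it positive definite even though A is indefinite. Each step solves only a 2-by-2 system, so no basis is stored beyond the two current directions.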

S.V. Venkatakrishnan, Jeffrey Donatelli, Dinesh Kumar, Abhinav Sarje, Sunil K. Sinha, Xiaoye S. Li, Alexander Hexemer,"A Multi-slice Simulation Algorithm for Grazing-Incidence Small-Angle X-ray Scattering",Journal of Applied Crystallography,December 2016,49-6, doi: 10.1107/S1600576716013273

Grazing-incidence small-angle X-ray scattering (GISAXS) is an important technique in the characterization of samples at the nanometre scale. A key aspect of GISAXS data analysis is the accurate simulation of samples to match the measurement. The distorted-wave Born approximation (DWBA) is a widely used model for the simulation of GISAXS patterns. For certain classes of sample such as nanostructures embedded in thin films, where the electric field intensity variation is significant relative to the size of the structures, a multi-slice DWBA theory is more accurate than the conventional DWBA method. However, simulating complex structures in the multi-slice setting is challenging and the algorithms typically used are designed on a case-by-case basis depending on the structure to be simulated. In this paper, an accurate algorithm for GISAXS simulations based on the multi-slice DWBA theory is presented. In particular, fundamental properties of the Fourier transform have been utilized to develop an algorithm that accurately computes the average refractive index profile as a function of depth and the Fourier transform of the portion of the sample within a given slice, which are key quantities required for the multi-slice DWBA simulation. The results from this method are compared with the traditionally used approximations, demonstrating that the proposed algorithm can produce more accurate results. Furthermore, this algorithm is general with respect to the sample structure, and does not require any sample-specific approximations to perform the simulations.

R. Li, Y. Xi, E. Vecharynski, C. Yang, and Y. Saad,"A Thick-Restart Lanczos algorithm with polynomial filtering for Hermitian eigenvalue problems",SIAM Journal on Scientific Computing, Vol. 38, Issue 4, pp. A2512–A2534,2016,doi: 10.1137/15M1054493

Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a Thick-Restart version of the Lanczos algorithm with deflation ('locking') and a new type of polynomial filters obtained from a least-squares technique. The resulting algorithm can be utilized in a 'spectrum-slicing' approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different sub-intervals independently from one another.
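
As a rough illustration of polynomial filtering, the sketch below builds a Chebyshev expansion of the indicator function of a target interval and evaluates it at a few points. This is a standard textbook construction used as a stand-in; it is not the new least-squares filter of the paper, and the function names, the interval [0.2, 0.4], and the degree 50 are assumptions. The spectrum is assumed to be already scaled to [-1, 1].

```python
import math

def chebyshev_window_coeffs(a, b, degree):
    """Chebyshev coefficients of the indicator of [a, b], with -1 <= a < b <= 1.

    These are the least-squares coefficients with respect to the Chebyshev
    weight; no damping is applied, so small oscillations remain.
    """
    ta, tb = math.acos(a), math.acos(b)   # ta > tb because a < b
    coeffs = [(ta - tb) / math.pi]
    for k in range(1, degree + 1):
        coeffs.append(2.0 * (math.sin(k * ta) - math.sin(k * tb)) / (k * math.pi))
    return coeffs

def evaluate_filter(coeffs, t):
    """Evaluate sum_k c_k T_k(t) using T_k(t) = cos(k * arccos(t))."""
    theta = math.acos(t)
    return sum(c * math.cos(k * theta) for k, c in enumerate(coeffs))

coeffs = chebyshev_window_coeffs(0.2, 0.4, degree=50)
for t in (-0.5, 0.0, 0.3, 0.8):
    print(f"p({t:+.1f}) = {evaluate_filter(coeffs, t):+.3f}")
```

Applied to a matrix, such a polynomial amplifies eigenvector components whose eigenvalues lie in the interval and damps the rest, so a Lanczos process run on the filtered operator converges first to the wanted eigenpairs.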

Nils E. R. Zimmermann, Maciej Haranczyk,"History and Utility of Zeolite Framework-Type Discovery from a Data-Science Perspective",Crystal Growth & Design,May 2, 2016,

Mature applications such as fluid catalytic cracking and hydrocracking rely critically on early zeolite structures. With a data-driven approach, we find that the discovery of exceptional zeolite framework types around the new millennium was spurred by exciting new utilization routes. The promising processes have not yet been successfully implemented (“valley of death” effect), mainly because of the lack of thermal stability of the crystals. This foreshadows limited deployability of recent zeolite discoveries that were achieved by novel crystal synthesis routes.

J. R. Jones, F.-H. Rouet, K. V. Lawler, E. Vecharynski, K. Z. Ibrahim, S. Williams, B. Abeln, C. Yang, C. W. McCurdy, D. J. Haxton, X. S. Li, T. N. Rescigno,"An efficient basis set representation for calculating electrons in molecules",Journal of Molecular Physics,2016,doi: 10.1080/00268976.2016.1176262

The method of McCurdy, Baertschy, and Rescigno, J. Phys. B, 37, R137 (2004) is generalized to obtain a straightforward, surprisingly accurate, and scalable numerical representation for calculating the electronic wave functions of molecules. It uses a basis set of product sinc functions arrayed on a Cartesian grid, and yields 1 kcal/mol precision for valence transition energies with a grid resolution of approximately 0.1 bohr. The Coulomb matrix elements are replaced with matrix elements obtained from the kinetic energy operator. A resolution-of-the-identity approximation renders the primitive one- and two-electron matrix elements diagonal; in other words, the Coulomb operator is local with respect to the grid indices. The calculation of contracted two-electron matrix elements among orbitals requires only O(N log(N)) multiplication operations, not O(N^4), where N is the number of basis functions; N = n^3 on cubic grids. The representation not only is numerically expedient, but also produces energies and properties superior to those calculated variationally. Absolute energies, absorption cross sections, transition energies, and ionization potentials are reported for one- (He^+, H_2^+ ), two- (H_2, He), ten- (CH_4) and 56-electron (C_8H_8) systems.

E. Vecharynski, C. Yang, and F. Xue,"Generalized preconditioned locally harmonic residual method for non-Hermitian eigenproblems",SIAM Journal on Scientific Computing, Vol. 38, No. 1, pp. A500–A527,2016,doi: 10.1137/15M1027413

We introduce the Generalized Preconditioned Locally Harmonic Residual (GPLHR) method for solving standard and generalized non-Hermitian eigenproblems. The method is particularly useful for computing a subset of eigenvalues, and their eigen- or Schur vectors, closest to a given shift. The proposed method is based on block iterations and can take advantage of a preconditioner if it is available. It does not need to perform exact shift-and-invert transformation. Standard and generalized eigenproblems are handled in a unified framework. Our numerical experiments demonstrate that GPLHR is generally more robust and efficient than existing methods, especially if the available memory is limited.

E. Vecharynski,"A generalization of Saad's bound on harmonic Ritz vectors of Hermitian matrices",Linear Algebra and its Applications, Vol. 494, pp. 219-235,2016,doi: 10.1016/j.laa.2016.01.013

We prove a Saad-type bound for harmonic Ritz vectors of a Hermitian matrix. The new bound reveals a dependence of the harmonic Rayleigh-Ritz procedure on the condition number of a shifted problem operator. Several practical implications are discussed. In particular, the bound motivates incorporation of preconditioning into the harmonic Rayleigh-Ritz scheme.

D. B. Szyld, E. Vecharynski, and F. Xue,"Preconditioned eigensolvers for large-scale nonlinear Hermitian eigenproblems with variational characterizations. II. Interior eigenvalues.",SIAM Journal on Scientific Computing, Vol. 37, Issue 6, pp. A2969-A2997,2015,

We consider the solution of large-scale nonlinear algebraic Hermitian eigenproblems of the form $T(\lambda)v=0$ that admit a variational characterization of eigenvalues. These problems arise in a variety of applications and are generalizations of linear Hermitian eigenproblems $Av\!=\!\lambda Bv$. In this paper, we propose a Preconditioned Locally Minimal Residual (PLMR) method for efficiently computing interior eigenvalues of problems of this type. We discuss the development of search subspaces, preconditioning, and eigenpair extraction procedure based on the refined Rayleigh-Ritz projection. Extension to the block methods is presented, and a moving-window style soft deflation is described. Numerical experiments demonstrate that PLMR methods provide a rapid and robust convergence towards interior eigenvalues. The approach is also shown to be efficient and reliable for computing a large number of extreme eigenvalues, dramatically outperforming standard preconditioned conjugate gradient methods.

E. Vecharynski, A. Knyazev,"Preconditioned Locally Harmonic Residual Method for computing interior eigenpairs of certain classes of Hermitian matrices",SIAM Journal on Scientific Computing, Vol. 37, Issue 5, pp. S3–S29,2015,

We propose a Preconditioned Locally Harmonic Residual (PLHR) method for computing several interior eigenpairs of a generalized Hermitian eigenvalue problem, without traditional spectral transformations, matrix factorizations, or inversions. PLHR is based on a short-term recurrence, easily extended to a block form, computing eigenpairs simultaneously. PLHR can take advantage of Hermitian positive definite preconditioning, e.g., based on an approximate inverse of an absolute value of a shifted matrix, introduced in [SISC, 35 (2013), pp. A696–A718]. Our numerical experiments demonstrate that PLHR is efficient and robust for certain classes of large-scale interior eigenvalue problems, involving Laplacian and Hamiltonian operators, especially if memory requirements are tight.

Tobias Titze, Alexander Lauerer, Lars Heinke, Christian Chmelik, Nils E. R. Zimmermann, Frerich J. Keil, Douglas M. Ruthven, Jörg Kärger,"Transport in Nanoporous Materials Including MOFs: The Applicability of Fick’s Laws",Angew. Chem. Int. Ed.,2015,doi: 10.1002/anie.201506954

Diffusion in nanoporous host–guest systems is often considered to be too complicated to comply with such “simple” relationships as Fick’s first and second law of diffusion. However, it is shown herein that the microscopic techniques of diffusion measurement, notably the pulsed field gradient (PFG) technique of NMR spectroscopy and microimaging by interference microscopy (IFM) and IR microscopy (IRM), provide direct experimental evidence of the applicability of Fick’s laws to such systems. This remains true in many situations, even when the detailed mechanism is complex. The limitations of the diffusion model are also discussed with reference to the extensive literature on this subject.

Nils E. R. Zimmermann, Bart Vorselaars, David Quigley, Baron Peters,"Nucleation of NaCl from Aqueous Solution: Critical Sizes, Ion-Attachment Kinetics, and Rates",J. Am. Chem. Soc.,2015,doi: 10.1021/jacs.5b08098

Nucleation and crystal growth are important in material synthesis, climate modeling, biomineralization, and pharmaceutical formulation. Despite tremendous efforts, the mechanisms and kinetics of nucleation remain elusive to both theory and experiment. Here we investigate sodium chloride (NaCl) nucleation from supersaturated brines using seeded atomistic simulations, polymorph-specific order parameters, and elements of classical nucleation theory. We find that NaCl nucleates via the common rock salt structure. Ion desolvation—not diffusion—is identified as the limiting resistance to attachment. Two different analyses give approximately consistent attachment kinetics: diffusion along the nucleus size coordinate and reaction-diffusion analysis of approach-to-coexistence simulation data from Aragones et al. (J. Chem. Phys. 2012, 136, 244508). Our simulations were performed at realistic supersaturations to enable the first direct comparison to experimental nucleation rates for this system. The computed and measured rates converge to a common upper limit at extremely high supersaturation. However, our rate predictions are between 15 and 30 orders of magnitude too fast. We comment on possible origins of the large discrepancy.

Watch a movie illustrating our seeded simulation strategy here.

Nathan Hanford, Vishal Ahuja, Mehmet Balman, Matthew Farrens, Dipak Ghosal, Eric Pouyoul, Brian Tierney,"Improving Network Performance on Multicore Systems: Impact of Core Affinities on High Throughput Flows",The International Journal of eScience, Elsevier,2015,doi: 10.1016/j.future.2015.09.012

Network throughput is scaling up to higher data rates while end-system processors are scaling out to multiple cores. In order to optimize high-speed data transfer into multicore end-systems, techniques such as network adaptor offloads and performance tuning have received a great deal of attention. Furthermore, several methods of multi-threading the network receive process have been proposed. However, thus far attention has been focused on how to set the tuning parameters and which offloads to select for higher performance, and little has been done to understand why the various parameter settings do (or do not) work. In this paper, we build on previous research to track down the sources of the end-system bottleneck for high-speed TCP flows. We define protocol processing efficiency to be the amount of system resources (such as CPU and cache) used per unit of achieved throughput (in Gbps). The amounts of the various system resources consumed are measured using low-level system event counters. In a multicore end-system, affinitization, or core binding, is the decision regarding how the various tasks of the network receive process, including interrupt, network, and application processing, are assigned to the different processor cores. We conclude that affinitization has a significant impact on protocol processing efficiency, and that the performance bottleneck of the network receive process changes significantly with different affinitization.

Štěpán Timr, Jiří Brabec, Alexey Bondar, Tomáš Ryba, Miloš Železný, Josef Lazar, Pavel Jungwirth,"Non-Linear Optical Properties of Fluorescent Dyes Allow for Accurate Determination of Their Molecular Orientations in Phospholipid Membranes",The Journal of Physical Chemistry,July 6, 2015,

Several methods based on single- and two-photon fluorescence detected linear dichroism have recently been used to determine the orientational distributions of fluorescent dyes in lipid membranes. However, these determinations relied on simplified descriptions of non-linear anisotropic properties of the dye molecules, using a transition dipole moment-like vector instead of an absorptivity tensor. To investigate the validity of the vector approximation, we have now carried out a combination of computer simulations and polarization microscopy experiments on two representative fluorescent dyes (DiI and F2N12S) embedded in aqueous phosphatidylcholine bilayers. Our results indicate that a simplified vector-like treatment of the two-photon transition tensor is applicable for molecular geometries sampled in the membrane at ambient conditions. Furthermore, our results allow evaluation of several distinct polarization microscopy techniques. In combination, our results point to a robust and accurate experimental and computational treatment of orientational distributions of DiI, F2N12S and related dyes (including Cy3, Cy5, and others), with implications to monitoring physiologically relevant processes in cellular membranes in a novel way.

P. McCorquodale, P.A. Ullrich, H. Johansen, P. Colella,"An adaptive multiblock high-order finite-volume method for solving the shallow-water equations on the sphere",Comm. App. Math. and Comp. Sci.,2015,

accepted for publication

E. Vecharynski, C. Yang, J. E. Pask,"A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix",Journal of Computational Physics, Vol. 290, pp. 73–89,2015,

We present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.

Wei Hu, Lin Lin and Chao Yang,"Edge reconstruction in armchair phosphorene nanoribbons revealed by discontinuous Galerkin density functional theory",Phys. Chem. Chem. Phys., 2015, Advance Article,February 11, 2015,doi: 10.1039/C5CP00333D

With the help of our recently developed massively parallel DGDFT (Discontinuous Galerkin Density Functional Theory) methodology, we perform large-scale Kohn–Sham density functional theory calculations on phosphorene nanoribbons with armchair edges (ACPNRs) containing a few thousand to ten thousand atoms. The use of DGDFT allows us to systematically achieve a conventional plane wave basis set type of accuracy, but with a much smaller number (about 15) of adaptive local basis (ALB) functions per atom for this system. The relatively small number of degrees of freedom required to represent the Kohn–Sham Hamiltonian, together with the use of the pole expansion and selected inversion (PEXSI) technique that circumvents the need to diagonalize the Hamiltonian, results in a highly efficient and scalable computational scheme for analyzing the electronic structures of ACPNRs as well as their dynamics. The total wall clock time for calculating the electronic structures of large-scale ACPNRs containing 1080–10 800 atoms is only 10–25 s per self-consistent field (SCF) iteration, with accuracy fully comparable to that obtained from conventional planewave DFT calculations. For the ACPNR system, we observe that the DGDFT methodology can scale to 5000–50 000 processors. We use DGDFT based ab initio molecular dynamics (AIMD) calculations to study the thermodynamic stability of ACPNRs. Our calculations reveal that a 2 × 1 edge reconstruction appears in ACPNRs at room temperature.

Thorsten Kurth, Andrew Pochinsky, Abhinav Sarje, Sergey Syritsyn, Andre Walker-Loud,"High-Performance I/O: HDF5 for Lattice QCD",arXiv:1501.06992,January 2015,

Practitioners of lattice QCD/QFT have been some of the primary pioneer users of the state-of-the-art high-performance-computing systems, and contribute towards the stress tests of such new machines as soon as they become available. As with all aspects of high-performance-computing, I/O is becoming an increasingly specialized component of these systems. In order to take advantage of the latest available high-performance I/O infrastructure, to ensure reliability and backwards compatibility of data files, and to help unify the data structures used in lattice codes, we have incorporated parallel HDF5 I/O into the SciDAC supported USQCD software stack. Here we present the design and implementation of this I/O framework. Our HDF5 implementation outperforms optimized QIO at the 10-20% level and leaves room for further improvement by utilizing appropriate dataset chunking.

D. Zuev, E. Vecharynski, C. Yang, N. Orms, and A.I. Krylov,"New algorithms for iterative matrix-free eigensolvers in quantum chemistry",Journal of Computational Chemistry, Vol. 36, Issue 5, pp. 273–284,2015,

New algorithms for iterative diagonalization procedures that solve for a small set of eigen-states of a large matrix are described. The performance of the algorithms is illustrated by calculations of low and high-lying ionized and electronically excited states using equation-of-motion coupled-cluster methods with single and double substitutions (EOM-IP-CCSD and EOM-EE-CCSD). We present two algorithms suitable for calculating excited states that are close to a specified energy shift (interior eigenvalues). One solver is based on the Davidson algorithm, a diagonalization procedure commonly used in quantum-chemical calculations. The second is a recently developed solver, called the “Generalized Preconditioned Locally Harmonic Residual (GPLHR) method.” We also present a modification of the Davidson procedure that allows one to solve for a specific transition. The details of the algorithms, their computational scaling, and memory requirements are described. The new algorithms are implemented within the EOM-CC suite of methods in the Q-Chem electronic structure program.

Wei Hu, Lin Lin, Chao Yang and Jinlong Yang,"Electronic structure and aromaticity of large-scale hexagonal graphene nanoflakes",J. Chem. Phys. 141, 214704 (2014),December 2, 2014,141:214704,doi: 10.1063/1.4902806

With the help of the recently developed SIESTA-PEXSI method [L. Lin, A. García, G. Huhs, and C. Yang, J. Phys.: Condens. Matter 26, 305503 (2014)], we perform Kohn-Sham density functional theory calculations to study the stability and electronic structure of hydrogen passivated hexagonal graphene nanoflakes (GNFs) with up to 11 700 atoms. We find the electronic properties of GNFs, including their cohesive energy, edge formation energy, highest occupied molecular orbital-lowest unoccupied molecular orbital energy gap, edge states, and aromaticity, depend sensitively on the type of edges (armchair graphene nanoflakes (ACGNFs) and zigzag graphene nanoflakes (ZZGNFs)), size and the number of electrons. We observe that, due to the edge-induced strain effect in ACGNFs, large-scale ACGNFs’ edge formation energy decreases as their size increases. This trend does not hold for ZZGNFs due to the presence of many edge states in ZZGNFs. We find that the energy gaps E_g of GNFs all decay with respect to 1/L, where L is the size of the GNF, in a linear fashion. But as their size increases, ZZGNFs exhibit more localized edge states. We believe the presence of these states makes their gap decrease more rapidly. In particular, when L is larger than 6.40 nm, we find that ZZGNFs exhibit metallic characteristics. Furthermore, we find that the aromatic structures of GNFs appear to depend only on whether the system has 4N or 4N + 2 electrons, where N is an integer.

Wenqi Xia, Wei Hu, Zhenyu Li and Jinlong Yang,"A first-principles study of gas adsorption on germanene",Phys. Chem. Chem. Phys., 2014,16, 22495-22498,August 29, 2014,doi: 10.1039/C4CP03292F

The adsorption of common gas molecules (N2, CO, CO2, H2O, NH3, NO, NO2, and O2) on germanene is studied with density functional theory. The results show that N2, CO, CO2, and H2O are physisorbed on germanene via van der Waals interactions, while NH3, NO, NO2, and O2 are chemisorbed on germanene via strong covalent (Ge–N or Ge–O) bonds. The chemisorption of gas molecules on germanene opens a band gap at the Dirac point of germanene. NO2 chemisorption on germanene shows strong hole doping in germanene. O2 is easily dissociated on germanene at room temperature. Different adsorption behaviors of common gas molecules on germanene provide a feasible way to exploit chemically modified germanene.

David H. Bailey, Jonathan M. Borwein, Marcos Lopez de Prado, Qiji Jim Zhu,"Pseudo-mathematics and financial charlatanism: The effects of backtest overfitting on out-of-sample performance",Notices of the American Mathematical Society,May 1, 2014,458-471,

Recent computational advances allow investment managers to search for profitable investment strategies. In many instances, that search involves a pseudo-mathematical argument, which is spuriously validated through a simulation of its historical performance (also called backtest).

We prove that high performance is easily achievable after backtesting a relatively small number of alternative strategy configurations, a practice we denote “backtest overfitting”. The higher the number of configurations tried, the greater is the probability that the backtest is overfit. Because financial analysts rarely report the number of configurations tried for a given backtest, investors cannot evaluate the degree of overfitting in most investment proposals.

The implication is that investors can be easily misled into allocating capital to strategies that appear to be mathematically sound and empirically supported by an outstanding backtest. This practice is particularly pernicious, because due to the nature of financial time series, backtest overfitting has a detrimental effect on the future strategy’s performance.
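
The selection effect behind backtest overfitting can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions, not the authors' analysis: every "strategy" is pure zero-mean Gaussian noise, the Sharpe ratio is computed per period without annualization, and the function names, the 250 observations, and the 1% volatility are hypothetical choices.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio (mean over sample standard deviation)."""
    return statistics.mean(returns) / statistics.stdev(returns)

def best_in_sample_sharpe(n_trials, n_obs=250, seed=0):
    """Best in-sample Sharpe ratio among n_trials skill-free strategies.

    Each trial draws zero-mean returns, so any positive Sharpe ratio is luck.
    With a fixed seed, the first k trials are identical for every n_trials >= k.
    """
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(n_trials):
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_obs)]
        best = max(best, sharpe(returns))
    return best

for n in (1, 10, 100, 1000):
    print(f"{n:5d} configurations tried -> best in-sample Sharpe "
          f"{best_in_sample_sharpe(n):.3f}")
```

Because the true expected return of every simulated strategy is zero, the rising best-case figure reflects only the number of configurations tried, which is exactly the quantity the abstract notes is rarely reported.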

E. Vecharynski and Y. Saad,"Fast updating algorithms for latent semantic indexing",SIAM Journal on Matrix Analysis and Applications, Vol. 35, Issue 3, pp. 1105–1131,2014,

This paper discusses a few algorithms for updating the approximate singular value decomposition (SVD) in the context of information retrieval by latent semantic indexing (LSI) methods. A unifying framework is considered which is based on Rayleigh–Ritz projection methods. First, a Rayleigh–Ritz approach for the SVD is discussed and it is then used to interpret the Zha and Simon algorithms [SIAM J. Sci. Comput., 21 (1999), pp. 782–791]. This viewpoint leads to a few alternatives whose goal is to reduce computational cost and storage requirement by projection techniques that utilize subspaces of much smaller dimension. Numerical experiments show that the proposed algorithms yield accuracies comparable to those obtained from standard ones at a much lower computational cost.

Richard L. Martin, Cory M. Simon, Berend Smit, Maciej Haranczyk,"In-silico design of porous polymer networks: high-throughput screening for methane storage materials",Journal of the American Chemical Society,March 10, 2014,

Porous polymer networks (PPNs) are a class of advanced porous materials that combine the advantages of cheap and stable polymers with the high surface areas and tunable chemistry of metal-organic frameworks. They are of particular interest for gas separation or storage applications, for instance as methane adsorbents for a vehicular natural gas tank or other portable applications.

Richard L. Martin, Maciej Haranczyk,"Construction and Characterization of Structure Models of Crystalline Porous Polymers",Crystal Growth & Design,March 6, 2014,

Metal-organic frameworks (MOFs) and covalent organic frameworks (COFs) are examples of advanced porous polymeric materials that have emerged in recent years. Their crystalline structure and modular synthesis offer unmatched versatility in their design. By exchanging chemical building blocks, one can both explore the unlimited space of possible structural chemistry within an isoreticular (same crystal topology) series, as well as achieve a wide range of alternative topologies.

Lev Sarkisov, Richard L. Martin, Maciej Haranczyk, Berend Smit,"On the Flexibility of Metal-Organic Frameworks",Journal of the American Chemical Society,January 24, 2014,

Occasional, large amplitude flexibility in metal-organic frameworks (MOFs) is one of the most intriguing recent discoveries in chemistry and material science. Yet, there is at present no theoretical framework that permits the identification of flexible structures in the rapidly expanding universe of MOFs. Here, we propose a simple method to predict whether a MOF is flexible, based on treating it as a system of rigid elements, connected by hinges. This proposition is correct in application to MOFs based on rigid carboxylate linkers.

Wei Hu, Nan Xia, Xiaojun Wu, Zhenyu Li and Jinlong Yang,"Silicene as a highly sensitive molecule sensor for NH3, NO and NO2",Phys. Chem. Chem. Phys., 2014,16, 6957-6962,January 23, 2014,doi: 10.1039/C3CP55250K

On the basis of first-principles calculations, we demonstrate the potential application of silicene as a highly sensitive molecule sensor for NH3, NO, and NO2 molecules. NH3, NO and NO2 molecules chemically adsorb on silicene via strong chemical bonds. With distinct charge transfer from silicene to molecules, silicene and chemisorbed molecules form charge-transfer complexes. The adsorption energy and charge transfer in NO2-adsorbed silicene are larger than those of NH3- and NO-adsorbed silicene. Depending on the adsorbate types and concentrations, the silicene-based charge-transfer complexes exhibit versatile electronic properties with tunable band gap opening at the Dirac point of silicene. The calculated charge carrier concentrations of NO2-chemisorbed silicene are 3 orders of magnitude larger than the intrinsic charge carrier concentration of graphene at room temperature. The results present a great potential of silicene for application as a highly sensitive molecule sensor.

J.A. Sobota, S.-L. Yang, D. Leuenberger, A.F. Kemper, J.G. Analytis, I.R. Fisher, P.S. Kirchmann, T.P. Devereaux, Z.-X. Shen,"Ultrafast electron dynamics in the topological insulator Bi2Se3 studied by time-resolved photoemission spectroscopy",Journal of Electron Spectroscopy and Related Phenomena,January 22, 2014,

We characterize the topological insulator Bi2Se3 using time- and angle-resolved photoemission spectroscopy. By employing two-photon photoemission, a complete picture of the unoccupied electronic structure from the Fermi level up to the vacuum level is obtained. We demonstrate that the unoccupied states host a second Dirac surface state which can be resonantly excited by 1.5 eV photons. We then study the ultrafast relaxation processes following optical excitation. We find that they culminate in a persistent non-equilibrium population of the first Dirac surface state, which is maintained by a meta-stable population of the bulk conduction band. Finally, we perform a temperature-dependent study of the electron–phonon scattering processes in the conduction band, and find the unexpected result that their rates decrease with increasing sample temperature. We develop a model of phonon emission and absorption from a population of electrons, and show that this counter-intuitive trend is the natural consequence of fundamental electron–phonon scattering processes. This analysis serves as an important reminder that the decay rates extracted by time-resolved photoemission are not in general equal to single electron scattering rates, but include contributions from filling and emptying processes from a continuum of states.

M.A. Sentef, M. Claassen, A.F. Kemper, B. Moritz, T. Oka, J.K. Freericks, T.P. Devereaux,"Theory of pump-probe photoemission in graphene and the generation of light-induced Haldane multilayers",arXiv pre-print,January 20, 2014,

The combination of time-reversal and inversion symmetry protects massless Dirac fermions in graphene and on the surface of topological insulators. In a milestone paper, Haldane envisioned that breaking either or both of these symmetries would open a gap at the Dirac points, allowing one to tune between a trivial insulator and a Chern insulator. While equilibrium band gap engineering has become a major theme since the first synthesis of monolayer graphene, it was only recently proposed that circularly polarized laser light could turn trivial equilibrium bands into topological nonequilibrium bands. Here we observe ultrafast band gap openings and paradoxical gap closings at a critical field strength. Importantly, the gap openings are accompanied by nontrivial changes of the band topology, realizing a photo-induced Haldane multilayer system. We show that pump-probe photoemission spectroscopy can track these transitions in real time via energy gaps exceeding 100 meV. The analogy with Haldane multilayers is revealed by nontrivial pseudospin textures, going from a monolayer p-wave to a bilayer d-wave symmetry at the critical field strength. We thus predict a nonequilibrium realization of a tunable Haldane multilayer model with a Berry curvature that can be tipped optically by small changes in external fields on femtosecond time scales. Since we are focused on the physics of chiral Dirac fermions, these results apply equally to all systems possessing Dirac points, such as surface states of topological insulators.

E. Vecharynski, Y. Saad, and M. Sosonkina,"Graph partitioning using matrix values for preconditioning symmetric positive definite systems",SIAM Journal on Scientific Computing Vol. 36, Issue 1, pp. A63-A87,2014,

Prior to the parallel solution of a large linear system, it is required to perform a partitioning of its equations/unknowns. Standard partitioning algorithms are designed using the considerations of the efficiency of the parallel matrix-vector multiplication, and typically disregard the information on the coefficients of the matrix. This information, however, may have a significant impact on the quality of the preconditioning procedure used within the chosen iterative scheme. In the present paper, we suggest a spectral partitioning algorithm, which takes into account the information on the matrix coefficients and constructs partitions with respect to the objective of enhancing the quality of the nonoverlapping additive Schwarz (block Jacobi) preconditioning for symmetric positive definite linear systems. For a set of test problems with large variations in magnitudes of matrix coefficients, our numerical experiments demonstrate a noticeable improvement in the convergence of the resulting solution scheme when using the new partitioning approach.

Michael Sentef, Alexander F. Kemper, Brian Moritz, James K. Freericks, Zhi-Xun Shen, and Thomas P. Devereaux,"Examining Electron-Boson Coupling Using Time-Resolved Spectroscopy",Phys. Rev. X 3, 041033 (2013),December 26, 2013,

Nonequilibrium pump-probe time-domain spectroscopies can become an important tool to disentangle degrees of freedom whose coupling leads to broad structures in the frequency domain. Here, using the time-resolved solution of a model photoexcited electron-phonon system, we show that the relaxational dynamics are directly governed by the equilibrium self-energy so that the phonon frequency sets a window for “slow” versus “fast” recovery. The overall temporal structure of this relaxation spectroscopy allows for a reliable and quantitative extraction of the electron-phonon coupling strength without requiring an effective temperature model or making strong assumptions about the underlying bare electronic band dispersion.

Daniel T. Graves, Phillip Colella, David Modiano, Jeffrey Johnson, Bjorn Sjogreen, Xinfeng Gao,"A Cartesian Grid Embedded Boundary Method for the Compressible Navier Stokes Equations",Communications in Applied Mathematics and Computational Science,December 23, 2013,

In this paper, we present an unsplit method for the time-dependent
compressible Navier-Stokes equations in two and three dimensions.
We use a conservative, second-order Godunov algorithm.
We use a Cartesian grid, embedded boundary method to resolve complex
boundaries.  We solve for viscous and conductive terms with a
second-order semi-implicit algorithm.  We demonstrate second-order
accuracy in solutions of smooth problems in smooth geometries and
demonstrate robust behavior for strongly discontinuous initial
conditions in complex geometries.

Cory M. Simon, Jihan Kim, Li-Chiang Lin, Richard L. Martin, Maciej Haranczyk, Berend Smit,"Optimizing nanoporous materials for gas storage",Physical Chemistry Chemical Physics,December 4, 2013,

Natural gas, mostly methane, is an attractive replacement for petroleum fuels in automotive vehicles because of its economic and environmental advantages. The technological obstacle to using methane as a vehicular fuel is its comparatively low volumetric energy density, necessitating densification strategies to yield reasonable driving ranges from a reasonably sized tank.

N. Plonka, A. F. Kemper, S. Graser, A. P. Kampf, T. P. Devereaux,"Tunneling spectroscopy for probing orbital anisotropy in iron pnictides",Phys. Rev. B 88, 174518 (2013),November 27, 2013,

Using realistic multiorbital tight-binding Hamiltonians and the T-matrix formalism, we explore the effects of a nonmagnetic impurity on the local density of states in Fe-based compounds. We show that scanning tunneling spectroscopy (STS) has very specific anisotropic signatures that track the evolution of orbital splitting (OS) and antiferromagnetic gaps. Both anisotropies exhibit two patterns that split in energy with decreasing temperature, but for OS these two patterns map onto each other under 90° rotation. STS experiments that observe these signatures should expose the underlying magnetic and orbital order as a function of temperature across various phase transitions.

Slim T. Chourou, Abhinav Sarje, Xiaoye Li, Elaine Chan and Alexander Hexemer,"HipGISAXS: a high-performance computing code for simulating grazing-incidence X-ray scattering data",Journal of Applied Crystallography,2013,46:1781-1795,doi: 10.1107/S0021889813025843

We have implemented a flexible Grazing Incidence Small-Angle Scattering (GISAXS) simulation code in the framework of the Distorted Wave Born Approximation (DWBA) that effectively utilizes the parallel processing power provided by graphics processors and multicore processors. This constitutes a handy tool for experimentalists facing a massive flux of data, allowing them to accurately simulate the GISAXS process and analyze the produced data. The software computes the diffraction image for any given superposition of custom shapes or morphologies in a user-defined region of the reciprocal space for all possible grazing incidence angles and sample orientations. This flexibility makes it easy to tackle a wide range of possible sample structures, such as nanoparticles on top of or embedded in a substrate or a multilayered structure. In cases where the sample displays regions of significant refractive index contrast, an algorithm has been implemented to perform a slicing of the sample and compute the averaged refractive index profile to be used as the reference geometry of the unperturbed system. Preliminary tests show good agreement with experimental data for a variety of commonly encountered nanostructures.

Maciej Haranczyk, Li-Chiang Lin, Kyuho Lee, Richard L. Martin, Jeffrey B. Neaton, Berend Smit,"Methane storage capabilities of diamond analogues",Physical Chemistry Chemical Physics,October 31, 2013,

Methane can be an alternative fuel for vehicular usage provided that new porous materials are developed for its efficient adsorption-based storage. Herein, we search for materials for this application within the family of diamond analogues. We used density functional theory to investigate structures in which tetrahedral C atoms of diamond are separated by -CC- or -BN- groups, as well as ones involving substitution of tetrahedral C atoms with Si and Ge atoms.

Wei Hu, Zhenyu Li and Jinlong Yang,"Structural, electronic, and optical properties of hybrid silicene and graphene nanocomposite",J. Chem. Phys. 139, 154704 (2013),October 16, 2013,doi: 10.1063/1.4824887

Structural, electronic, and optical properties of hybrid silicene and graphene (S/G) nanocomposite are examined with density functional theory calculations. It turns out that weak van der Waals interactions dominate between silicene and graphene with their intrinsic electronic properties preserved. Interestingly, interlayer interactions in hybrid S/G nanocomposite induce tunable p-type and n-type doping of silicene and graphene, respectively, showing their doping carrier concentrations can be modulated by their interfacial spacing.

Wei Hu, Zhenyu Li and Jinlong Yang,"Surface and size effects on the charge state of NV center in nanodiamonds",Computational and Theoretical Chemistry, 2013, 1021, 49-53,October 1, 2013,doi: 10.1016/j.comptc.2013.06.015

Electronic structures and stability of nitrogen–vacancy (NV) centers doped in nanodiamonds (NDs) have been investigated with large-scale density functional theory (DFT) calculations. Spin polarized defect states are not affected by the particle sizes and surface decorations, while the band gap is sensitive to these effects. Induced by the spherical surface electric dipole layer, surface functionalization has a long-ranged impact on the stability of charged NV centers doped in NDs. NV− center doped in NDs is more favorable for n-type fluorinated diamond, while NV0 is preferred for p-type hydrogenated NDs. Therefore, surface decoration provides a useful way for defect state engineering.

J. A. Sobota, S.-L. Yang, A. F. Kemper, J. J. Lee, F. T. Schmitt, W. Li, R. G. Moore, J. G. Analytis, I. R. Fisher, P. S. Kirchmann, T. P. Devereaux, and Z.-X. Shen,"Direct Optical Coupling to an Unoccupied Dirac Surface State in the Topological Insulator Bi2Se3",Phys. Rev. Lett. 111, 136802 (2013),September 24, 2013,

We characterize the occupied and unoccupied electronic structure of the topological insulator Bi2Se3 by one-photon and two-photon angle-resolved photoemission spectroscopy and slab band structure calculations. We reveal a second, unoccupied Dirac surface state with similar electronic structure and physical origin to the well-known topological surface state. This state is energetically located 1.5 eV above the conduction band, which permits it to be directly excited by the output of a Ti:sapphire laser. This discovery demonstrates the feasibility of direct ultrafast optical coupling to a topologically protected, spin-textured surface state.

Y. F. Kung, W.-S. Lee, C.-C. Chen, A. F. Kemper, A. P. Sorini, B. Moritz, and T. P. Devereaux,"Time-dependent charge-order and spin-order recovery in striped systems",Phys. Rev. B 88, 125114 (2013),September 24, 2013,

Using time-dependent Ginzburg-Landau theory, we study the role of amplitude and phase fluctuations in the recovery of charge-stripe and spin-stripe phases in response to a pump pulse that melts the orders. For parameters relevant to the case where charge order precedes spin order thermodynamically, amplitude recovery governs the initial time scales, while phase recovery controls behavior at longer times. In addition to these intrinsic effects, there is a longer spin reorientation time scale related to the scattering geometry that dominates the recovery of the spin phase. Coupling between the charge and spin orders locks the amplitude and similarly the phase recovery, reducing the number of distinct time scales. Our results well reproduce the major experimental features of pump-probe x-ray diffraction measurements on the striped nickelate La1.75Sr0.25NiO4. They highlight the main idea of this work, which is the use of time-dependent Ginzburg-Landau theory to study systems with multiple coexisting order parameters.

Richard L Martin, Mahdi Niknam Shahrak, Joseph A Swisher, Cory M Simon, Julian P Sculley, Hong-Cai Zhou, Berend Smit, Maciej Haranczyk,"Modeling Methane Adsorption in Interpenetrating Porous Polymer Networks",The Journal of Physical Chemistry C,September 19, 2013,

Porous polymer networks (PPNs) are a class of porous materials of particular interest in a variety of energy-related applications because of their stability, high surface areas, and gas uptake capacities. Computationally derived structures for five recently synthesized PPN frameworks, PPN-2, -3, -4, -5, and -6, were generated for various topologies, optimized using semiempirical electronic structure methods, and evaluated using classical grand-canonical Monte Carlo simulations.

Richard L. Martin, Maciej Haranczyk,"Insights into Multi-Objective Design of Metal–Organic Frameworks",Crystal Growth & Design,September 18, 2013,

Metal-organic framework (MOF) crystal topologies which permit the highest internal surface areas are identified by means of multiobjective optimization and abstract structure models. We demonstrate that MOF design efforts can be focused within five underlying nets to engineer distinct, Pareto-optimal compromises between high gravimetric and high volumetric surface area materials.
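As a rough illustration of what a Pareto-optimal compromise between two objectives means, the sketch below keeps only those candidates that no other candidate matches or beats in both gravimetric and volumetric surface area. The candidate names and numbers are invented for illustration and are not taken from the paper; the function name `pareto_front` is likewise hypothetical.

```python
from typing import Dict, List, Tuple


def pareto_front(candidates: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates.

    Each value is (gravimetric_area, volumetric_area); larger is better for both.
    A candidate is dominated if another one is at least as good in both
    objectives and strictly better in at least one.
    """
    front = []
    for name, (g, v) in candidates.items():
        dominated = any(
            (g2 >= g and v2 >= v) and (g2 > g or v2 > v)
            for other, (g2, v2) in candidates.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Hypothetical data: (gravimetric m^2/g, volumetric m^2/cm^3).
example = {
    "net_a": (6000.0, 1500.0),
    "net_b": (4500.0, 2100.0),
    "net_c": (4000.0, 1400.0),  # worse than net_a and net_b in both objectives
    "net_d": (3000.0, 2400.0),
}
print(pareto_front(example))
```

The quadratic pairwise comparison is adequate for small candidate sets; larger screens would normally sort by one objective first.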

Marielle Pinheiro, Richard L. Martin, Chris H. Rycroft, Maciej Haranczyk,"High accuracy geometric analysis of crystalline porous materials",CrystEngComm,September 5, 2013,

A number of algorithms to analyze crystalline porous materials and their porosity employ the Voronoi tessellation, whereby the space in the material is divided into irregular polyhedral cells that can be analyzed to determine the pore topology and structure. However, the Voronoi tessellation is only appropriate when atoms all have equal radii, and the natural generalization to structures with unequal radii leads to cells with curved boundaries, which are computationally expensive to compute.

B. Moritz, A. F. Kemper, M. Sentef, T. P. Devereaux, J. K. Freericks,"Electron-Mediated Relaxation Following Ultrafast Pumping of Strongly Correlated Materials: Model Evidence of a Correlation-Tuned Crossover between Thermal and Nonthermal States",Phys. Rev. Lett. 111, 077401 (2013),2013,

We examine electron-electron mediated relaxation following ultrafast electric field pump excitation of the fermionic degrees of freedom in the Falicov-Kimball model for correlated electrons. The results reveal a dichotomy in the temporal evolution of the system as one tunes through the Mott metal-to-insulator transition: in the metallic regime relaxation can be characterized by evolution toward a steady state well described by Fermi-Dirac statistics with an increased effective temperature; however, in the insulating regime this quasithermal paradigm breaks down with relaxation toward a nonthermal state with a complicated electronic distribution as a function of momentum. We characterize the behavior by studying changes in the energy, photoemission response, and electronic distribution as functions of time. This relaxation may be observable qualitatively on short enough time scales that the electrons behave like an isolated system not in contact with additional degrees of freedom which would act as a thermal bath, especially when using strong driving fields and studying materials whose physics may manifest the effects of correlations.

Marielle Pinheiro, Richard L. Martin, Chris H. Rycroft, Andrew Jones, Enrique Iglesia, Maciej Haranczyk,"Characterization and comparison of pore landscapes in crystalline porous materials",Journal of Molecular Graphics and Modelling,July 31, 2013,

Crystalline porous materials have many applications, including catalysis and separations. Identifying suitable materials for a given application can be achieved by screening material databases. Such a screening requires automated high-throughput analysis tools that characterize and represent pore landscapes with descriptors, which can be compared using similarity measures in order to select, group and classify materials. Here, we discuss algorithms for the calculation of two types of pore landscape descriptors.

Wei Hu, Xiaojun Wu, Zhenyu Li and Jinlong Yang,"Helium separation via porous silicene based ultimate membrane",Nanoscale, 2013, 5, 9062-9066,July 11, 2013,doi: 10.1039/C3NR02326E

Helium purification has become more important for increasing demands in scientific and industrial applications. In this work, we demonstrated that the porous silicene can be used as an effective ultimate membrane for helium purification on the basis of first-principles calculations. Pristine silicene monolayer is impermeable to helium gas with a high penetration energy barrier (1.66 eV). However, porous silicene with either Stone–Wales (SW) or divacancy (555 777 or 585) defect presents a surmountable barrier for helium (0.33 to 0.78 eV) but formidable for Ne, Ar, and other gas molecules. In particular, the porous silicene with divacancy defects shows high selectivity for He/Ne and He/Ar, superior to graphene, polyphenylene, and traditional membranes.

A.F. Kemper, M. Sentef, B. Moritz, C.C. Kao, Z.X. Shen, J.K. Freericks, T.P. Devereaux,"Mapping of the unoccupied states and relevant bosonic modes via the time dependent momentum distribution",Phys. Rev. B 87, 235139 (2013),June 28, 2013,

The unoccupied states of complex materials are difficult to measure, yet they play a key role in determining their properties. We propose a technique that can measure the unoccupied states, called time-resolved Compton scattering, which measures the time-dependent momentum distribution (TDMD). Using a nonequilibrium Keldysh formalism, we study the TDMD for electrons coupled to a lattice in a pump-probe setup. We find a direct relation between temporal oscillations in the TDMD and the dispersion of the underlying unoccupied states, suggesting that both can be measured by time-resolved Compton scattering. We demonstrate the experimental feasibility by applying the method to a model of MgB2 with realistic material parameters.

Y. S. Lee, S. J. Moon, Scott C. Riggs, M. C. Shapiro, I. R. Fisher, Bradford W. Fulfer, Julia Y. Chan, A. F. Kemper, and D. N. Basov,"Infrared study of the electronic structure of the metallic pyrochlore iridate Bi2Ir2O7",Phys. Rev. B 87, 195143 (2013),May 30, 2013,

We investigated the electronic properties of a single crystal of metallic pyrochlore iridate Bi2Ir2O7 by means of infrared spectroscopy. Our optical conductivity data show the splitting of t2g bands into Jeff ones due to strong spin-orbit coupling. We observed a sizable midinfrared absorption near 0.2 eV which can be attributed to the optical transition within the Jeff,1/2 bands. More interestingly, we found an abrupt suppression of optical conductivity in the very far-infrared region. Our results suggest that the electronic structure of Bi2Ir2O7 is governed by the strong spin-orbit coupling and correlation effects, which are a prerequisite for theoretically proposed nontrivial topological phases in pyrochlore iridates.

Richard L. Martin, Maciej Haranczyk,"Optimization-Based Design of Metal-Organic Framework Materials",Journal of Chemical Theory and Computation,May 16, 2013,

Metal–organic frameworks (MOFs) are a class of porous materials constructed from metal or metal oxide building blocks connected by organic linkers. MOFs are highly tunable structures that can in theory be custom designed to meet the specific pore geometry and chemistry required for a given application such as methane storage or carbon capture. However, due to the sheer number of potential materials, identification of optimal MOF structures is a significant challenge.

Richard L. Martin, Li-Chiang Lin, Kuldeep Jariwala, Berend Smit, Maciej Haranczyk,"Mail-Order Metal–Organic Frameworks (MOFs): Designing Isoreticular MOF-5 Analogues Comprising Commercially Available Organic Molecules",The Journal of Physical Chemistry C,April 17, 2013,

Metal–organic frameworks (MOFs), a class of porous materials, are of particular interest in gas storage and separation applications due largely to their high internal surface areas and tunable structures. MOF-5 is perhaps the archetypal MOF; in particular, many isoreticular analogues of MOF-5 have been synthesized, comprising alternative dicarboxylic acid ligands. In this contribution we introduce a new set of hypothesized MOF-5 analogues, constructed from commercially available organic molecules.

Nils E. R. Zimmermann, Timm J. Zabel, Frerich J. Keil,"Transport into Nanosheets: Diffusion Equations Put to Test",J. Phys. Chem. C,2013,117:7384-7390,doi: 10.1021/jp400152q

Ultrathin porous materials, such as zeolite nanosheets, are prominent candidates for performing catalysis, drug supply, and separation processes in a highly efficient manner due to exceptionally short transport paths. Predictive design of such processes requires the application of diffusion equations that were derived for macroscopic, homogeneous surroundings to nanoscale, nanostructured host systems. Therefore, we tested different analytical solutions of Fick’s diffusion equations for their applicability to methane transport into two different zeolite nanosheets (AFI, LTA) under instationary conditions. Transient molecular dynamics simulations provided hereby concentration profiles and uptake curves to which the different solutions were fitted. Two central conclusions were deduced by comparing the fitted transport coefficients. First, the transport can be described correctly only if concentration profiles are used and the transport through the solid–gas interface is explicitly accounted for by the surface permeability. Second and most importantly, we have unraveled a size limitation to applying the diffusion equations to nanoscale objects. This is because transport-diffusion coefficients, DT, and surface permeabilities, α, of methane in AFI become dependent on nanosheet thickness. Deviations can amount to factors of 2.9 and 1.4 for DT and α, respectively, when, in the worst case, results from the thinnest AFI nanosheet are compared with data from the thickest sheet. We present a molecular explanation of the size limitation that is based on memory effects of entering molecules and therefore only observable for smooth pores such as AFI and carbon nanotubes. Hence, our work provides important tools to accurately predict and intuitively understand transport of guest molecules into porous host structures, a fact that will become the more valuable the more tiny nanotechnological objects get.
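For orientation, the simplest of the analytical solutions of Fick's second law for a sheet is the textbook series for fractional uptake into a plane slab whose two faces are held at constant concentration. The sketch below evaluates that series. It deliberately omits the surface permeability that the abstract identifies as necessary for a correct description, so it illustrates the baseline model only, not the authors' fitted model. The function name and the parameter values in the usage lines are hypothetical.

```python
import math


def slab_uptake_fraction(D: float, thickness: float, t: float, n_terms: int = 1000) -> float:
    """Fractional uptake M(t)/M_inf for Fickian diffusion into a plane sheet.

    Assumes an initially empty sheet of the given thickness, both faces held at
    constant surface concentration, a constant diffusivity D, and no surface
    barrier. The series is truncated after n_terms terms.
    """
    total = 0.0
    for n in range(n_terms):
        k = 2 * n + 1
        total += math.exp(-D * (k * math.pi / thickness) ** 2 * t) / k ** 2
    return 1.0 - 8.0 / math.pi ** 2 * total


# Hypothetical values in SI units: D in m^2/s, thickness in m, time in s.
D_example = 1.0e-9
thickness_example = 5.0e-9
for t_example in (1.0e-10, 1.0e-9, 1.0e-8):
    print(t_example, slab_uptake_fraction(D_example, thickness_example, t_example))
```

Because the series converges slowly near t = 0, the truncation length is set generously; at later times a handful of terms would suffice.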


Wei Hu, Zhenyu Li and Jinlong Yang,"Electronic and optical properties of graphene and graphitic ZnO nanocomposite structures",J. Chem. Phys. 138, 124706 (2013),March 28, 2013,doi: 10.1063/1.4796602

Electronic and optical properties of graphene and graphitic ZnO (G/g-ZnO) nanocomposites have been investigated with density functional theory. Graphene interacts overall weakly with g-ZnO monolayer via van der Waals interaction. There is no charge transfer between the graphene and g-ZnO monolayer, while a charge redistribution does happen within the graphene layer itself, forming well-defined electron-hole puddles. When Al or Li is doped in the g-ZnO monolayer, substantial electron (n-type) and hole (p-type) doping can be induced in graphene, leading to well-separated electron-hole pairs at their interfaces. Improved optical properties in graphene/g-ZnO nanocomposite systems are also observed, with potential photocatalytic and photovoltaic applications.

E. Vecharynski and A. Knyazev,"Absolute value preconditioning for symmetric indefinite linear systems",SIAM Journal on Scientific Computing Vol. 35, Issue 2, pp. A696-A718,2013,

We introduce a novel strategy for constructing symmetric positive definite (SPD) preconditioners for linear systems with symmetric indefinite matrices. The strategy, called absolute value preconditioning, is motivated by the observation that the preconditioned minimal residual method with the inverse of the absolute value of the matrix as a preconditioner converges to the exact solution of the system in at most two steps. Neither the exact absolute value of the matrix nor its exact inverse are computationally feasible to construct in general. However, we provide a practical example of an SPD preconditioner that is based on the suggested approach. In this example we consider a model problem with a shifted discrete negative Laplacian and suggest a geometric multigrid (MG) preconditioner, where the inverse of the matrix absolute value appears only on the coarse grid, while operations on finer grids are based on the Laplacian. Our numerical tests demonstrate practical effectiveness of the new MG preconditioner, which leads to a robust iterative scheme with minimalist memory requirements.
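The two-step convergence mentioned in this abstract follows from a short spectral argument, sketched here for a nonsingular symmetric matrix $A$ with eigendecomposition $A = V \Lambda V^{T}$. The symbols $V$, $\Lambda$, and $T$ are notation chosen for this sketch and may differ from the paper's.

```latex
\begin{aligned}
|A| &= V\,|\Lambda|\,V^{T}, \qquad T = |A|^{-1} = V\,|\Lambda|^{-1}\,V^{T}, \\
TA  &= V\,\operatorname{sign}(\Lambda)\,V^{T}, \qquad (TA)^{2} = I .
\end{aligned}
```

The preconditioned matrix $TA$ therefore has only the eigenvalues $\pm 1$, its minimal polynomial has degree at most two, and in exact arithmetic a minimal residual method reaches the solution in at most two steps.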

Wei Hu, Xiaojun Wu, Zhenyu Li and Jinlong Yang,"Porous silicene as a hydrogen purification membrane",Phys. Chem. Chem. Phys., 2013, 15, 5753-5757,February 22, 2013,doi: 10.1039/C3CP00066D

We investigated theoretically the hydrogen permeability and selectivity of a porous silicene membrane via first-principles calculations. The subnanometer pores of the silicene membrane are designed as divacancy defects with octagonal and pentagonal rings (585-divacancy). The porous silicene exhibits high selectivity comparable with graphene-based membranes for hydrogen over various gas molecules (N2, CO, CO2, CH4, and H2O). The divacancy defects in silicene are chemically inert to the considered gas molecules. Our results suggest that the porous silicene membrane is expected to find great potential in gas separation and filtering applications.

Abhinav Sarje, Srinivas Aluru,"All-pairs computations on many-core graphics processors",Parallel Computing,2013,39-2:79-93,doi: 10.1016/j.parco.2013.01.002

Developing high-performance applications on emerging multi- and many-core architectures requires efficient mapping techniques and architecture-specific tuning methodologies to realize performance closer to their peak compute capability and memory bandwidth. In this paper, we develop architecture-aware methods to accelerate all-pairs computations on many-core graphics processors. Pairwise computations occur frequently in numerous application areas in scientific computing. While they appear easy to parallelize due to the independence of computing each pairwise interaction from all others, techniques that address multi-layered memory hierarchies, map within the restrictions imposed by the small, low-latency on-chip memories, and strike the right balance between concurrency, reuse, and memory traffic are crucial to obtaining high performance. We present a hierarchical decomposition scheme for GPUs based on decomposition of the output matrix and input data. We demonstrate that a careful tuning of the involved set of decomposition parameters is essential to achieve high efficiency on the GPUs. We also compare the performance of our strategies with an implementation on the STI Cell processor as well as multi-core CPU parallelizations using OpenMP and Intel Threading Building Blocks.
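The idea of decomposing the output matrix can be illustrated with a minimal serial sketch that fills the all-pairs result one tile at a time, so that each tile touches only a small block of the input. This is plain Python rather than GPU code, and it is not the authors' scheme: the function name `all_pairs_tiled`, the `tile` parameter, and the sample data are assumptions made for illustration. On a GPU, each tile would typically be assigned to a thread block with its input blocks staged in on-chip memory.

```python
from typing import Callable, List, Sequence


def all_pairs_tiled(
    items: Sequence[float],
    pair_fn: Callable[[float, float], float],
    tile: int = 2,
) -> List[List[float]]:
    """Compute out[i][j] = pair_fn(items[i], items[j]) one output tile at a time."""
    n = len(items)
    out = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        row_block = items[i0:i0 + tile]  # inputs needed for this band of rows
        for j0 in range(0, n, tile):
            col_block = items[j0:j0 + tile]  # inputs needed for this band of columns
            # Every entry of the tile reuses the two small input blocks above.
            for di, a in enumerate(row_block):
                for dj, b in enumerate(col_block):
                    out[i0 + di][j0 + dj] = pair_fn(a, b)
    return out


# Hypothetical usage: pairwise absolute differences of five 1-D points.
points = [0.0, 1.0, 3.0, 6.0, 10.0]
distances = all_pairs_tiled(points, lambda a, b: abs(a - b), tile=2)
print(distances[0])
```

The tile size plays the role of a tunable decomposition parameter: larger tiles increase reuse of each loaded input block, while smaller tiles expose more independent units of work.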

Richard L. Martin, Maciej Haranczyk,"Exploring frontiers of high surface area metal-organic frameworks",Chemical Science,February 6, 2013,4:1781-1785,

Metal–organic frameworks (MOFs) have enjoyed considerable interest due to their high internal surface areas as well as tunable pore geometry and chemistry. However, design of optimal MOFs is a great challenge due to the significant number of possible structures. In this work, we present a strategy to rapidly explore the frontiers of these high surface area materials. Here, organic ligands are abstracted by geometrical (alchemical) building blocks, and an optimization of their defining geometrical parameters is performed to identify shapes of ligands which maximize gravimetric surface area of the resulting MOFs. A strength of our approach is that the space of ligands to be explored can be rigorously bounded, allowing discovery of the optimum ligand shape within any criteria, conforming to synthetic requirements or arbitrary exploratory limits. By modifying these bounds, we can project to what extent achievable surface area increases when moving beyond the present limits of organic synthesis. Projecting optimal ligand shapes onto real chemical species, we achieve blueprints for MOFs of various topologies that are predicted to achieve up to 70% higher surface area than the current benchmark materials.

Kumari Gaurav Rana, Takeaki Yajima, Subir Parui, Alexander F. Kemper, Thomas P. Devereaux, Yasuyuki Hikita, Harold Y. Hwang, Tamalika Banerjee,"Hot electron transport in a strongly correlated transition-metal oxide",Nature Scientific Reports, Volume 3, id. 1274 (2013),February 2013,

Oxide heterointerfaces are ideal for investigating strong correlation effects on electron transport, relevant for oxide-electronics. Using hot-electrons, we probe electron transport perpendicular to the La0.7Sr0.3MnO3 (LSMO)-Nb-doped SrTiO3 (Nb:STO) interface and find the characteristic hot-electron attenuation length in LSMO to be 1.48 ± 0.10 unit cells (u.c.) at -1.9 V, increasing to 2.02 ± 0.16 u.c. at -1.3 V at room temperature. Theoretical analysis of this energy dispersion reveals the dominance of electron-electron and polaron scattering. Direct visualization of the local electron transport shows different transmission at the terraces and at the step-edges.

Wei Hu, Zhenyu Li and Jinlong Yang,"Diamond as an inert substrate of graphene",J. Chem. Phys. 138, 054701 (2013),February 1, 2013,doi: 10.1063/1.4789420

The interaction between graphene and a semiconducting diamond substrate has been examined with large-scale density functional theory calculations. Clean and hydrogenated diamond (100) and (111) surfaces have been studied. It turns out that weak van der Waals interactions dominate for graphene on all these surfaces. The high carrier mobility of graphene is almost unaffected, except for a negligible energy gap opening at the Dirac point. No charge transfer between graphene and diamond (100) surfaces is detected, while different charge-transfer complexes are formed between graphene and diamond (111) surfaces, inducing either p-type or n-type doping on graphene. Therefore, diamond can be used as an excellent substrate of graphene, which largely preserves its electronic structure while providing the flexibility of charge doping.

M. Dandouna, N. Emad and L.A. Drummond,"A Proposed Programming Model for Writing Sustainable Numerical Libraries for Extreme Scale Computing",Conc. and Compt.,January 16, 2013,

The promise of computer systems with very large orders of processing elements cannot be realized without an effective solution that targets the programming model with a suitable programming environment. Nowadays, it is necessary to identify and rapidly make available robust software technologies to enable high-end computer applications to run efficiently on these emerging systems, and to enable the development of more complex and capable simulation codes for scientific and engineering applications. We review some of the numerical libraries that have achieved modularity, scalability and extensibility thanks to their use of object-oriented programming approaches. However, only a few of these libraries have managed to effectively implement sequential and parallel code reusability.

Here, we discuss what is currently missing from existing library implementations and propose a programming model based on a modular and multi-level parallelism approach that has a strict separation between computational operations, data management and communication. We illustrate how this model makes it possible to design more scalable libraries by better exploiting their functionalities, and even enables the formulation of hybrid numerical schemes to be run efficiently on multi-level parallel systems with a large number of heterogeneous processing units without confining the parallelism to the programming model of the communication library. We use the multiple explicitly restarted Arnoldi method as our test case, and our implementations require full reuse of serial/parallel kernels. Our experiments include comparisons with state-of-the-art numerical libraries on high-end computing systems.

Kesheng Wu, Wes Bethel, Ming Gu, David, Oliver Rübel,"A Big Data Approach to Analyzing Market Volatility",Algorithmic Finance,2013,2:241–267,LBNL LBNL-6382E, doi: 10.3233/AF-13030

Understanding the microstructure of the financial market requires the processing of a vast amount of data related to individual trades, and sometimes even multiple levels of quotes. Analyzing such a large volume of data requires tremendous computing power that is not easily available to financial academics and regulators. Fortunately, publicly funded High Performance Computing (HPC) power is widely available at the National Laboratories in the US. In this paper we demonstrate that HPC resources and the techniques for data-intensive sciences can be used to greatly accelerate the computation of an early warning indicator called Volume-synchronized Probability of Informed trading (VPIN). The test data used in this study contains five and a half years' worth of trading data for about 100 of the most liquid futures contracts, includes about 3 billion trades, and occupies 140 GB as text files. By (1) using a more efficient file format for storing the trading records, (2) using more effective data structures and algorithms, and (3) parallelizing the computations, we are able to explore 16,000 different ways of computing VPIN in less than 20 hours on a 32-core IBM DataPlex machine. Our test demonstrates that a modest computer is sufficient to monitor a vast number of trading activities in real time, an ability that could be valuable to regulators.

Our test results also confirm that VPIN is a strong predictor of liquidity-induced volatility. With appropriate parameter choices, the false positive rates are about 7% averaged over all the futures contracts in the test data set. More specifically, when VPIN values rise above a threshold (CDF > 0.99), the volatility in the subsequent time windows is higher than the average in 93% of the cases.

E. O. Ofek, D. Fox, S. B. Cenko, M. Sullivan, O., D. A. Frail, A. Horesh, A. Corsi, R. M., N. Gehrels, S. R. Kulkarni, A., P. E. Nugent, O. Yaron, A. V. Filippenko, M. M., L. Bildsten, J. S. Bloom, D., I. Arcavi, R. R. Laher, D. Levitan, B. Sesar, J. Surace,"X-Ray Emission from Supernovae in Dense Circumstellar Matter Environments: A Search for Collisionless Shocks",Astrophysical Journal,2013,763:42,doi: 10.1088/0004-637X/763/1/42

The optical light curve of some supernovae (SNe) may be powered by the
outward diffusion of the energy deposited by the explosion shock (the
so-called shock breakout) in optically thick (

Michael F. Wehner,"Very extreme seasonal precipitation in the NARCCAP ensemble: model performance and projections",Climate Dynamics,January 2013,40:59-80,doi: 10.1007/s00382-012-1393-1

Seasonal extreme daily precipitation is analyzed in the ensemble of NARCCAP regional climate models. Significant variation in these models’ abilities to reproduce observed precipitation extremes over the contiguous United States is found. Model performance metrics are introduced to characterize overall biases, seasonality, spatial extent and the shape of the precipitation distribution. Comparison of the models to gridded observations that include an elevation correction is found to be better than to gridded observations without this correction. A complicated model weighting scheme based on model performance in simulating observations is found to cause significant improvements in ensemble mean skill only if some of the models are poorly performing outliers. The effect of lateral boundary conditions is explored by comparing the integrations driven by reanalysis to those driven by global climate models. Projected mid-century changes in seasonal precipitation means and extremes are presented, along with discussions of the sources of uncertainty and the mechanisms causing these changes.

George Michelogiannakis, William J. Dally,"Elastic Buffer Flow Control for On-Chip Networks",Transactions on Computers,2013,

Networks-on-chip (NoCs) were developed to meet the communication requirements of large-scale systems. The majority of current NoCs spend considerable area and power for router buffers. In our past work, we have developed elastic buffer (EB) flow control which adds simple control logic in the channels to use pipeline flip-flops (FFs) as EBs with two storage locations. This way, channels act as distributed FIFOs and input buffers are no longer required. Removing buffers and virtual channels (VCs) significantly simplifies router design. Compared to VC networks, EB networks provide an up to 45% shorter cycle time, 16% more throughput per unit power or 22% more throughput per unit area. EB networks provide traffic classes using duplicate physical subnetworks. However, this approach negates the cost gains or becomes infeasible for a large number of traffic classes. Therefore, in this paper we propose a hybrid EB-VC router which provides an arbitrary number of traffic classes by using an input buffer to drain flits facing severe contention or deadlock. Thus, hybrid routers operate as EB routers in the common case, and as VC routers when necessary. For this reason, the hybrid EB-VC scheme offers 21% more throughput per unit power than VC networks and 12% than EB networks.

H.-M. Eiter, M. Lavagnini, R. Hackl, E.A. Nowadnick, A.F. Kemper, T.P. Devereaux, J.-H. Chu, J.G. Analytis, I.R. Fisher, L. Degiorgi,"Alternative route to charge density wave formation in multiband systems",Proceedings of the National Academy of Sciences,2012,doi: 10.1073/pnas.1214745110

Charge and spin density waves, periodic modulations of the electron, and magnetization densities, respectively, are among the most abundant and nontrivial low-temperature ordered phases in condensed matter. The ordering direction is widely believed to result from the Fermi surface topology. However, several recent studies indicate that this common view needs to be supplemented. Here, we show how an enhanced electron–lattice interaction can contribute to or even determine the selection of the ordering vector in the model charge density wave system ErTe3. Our joint experimental and theoretical study allows us to establish a relation between the selection rules of the electronic light scattering spectra and the enhanced electron–phonon coupling in the vicinity of band degeneracy points. This alternative proposal for charge density wave formation may be of general relevance for driving phase transitions into other broken-symmetry ground states, particularly in multiband systems, such as the iron-based superconductors.

Nils E. R. Zimmermann, Berend Smit, Frerich J. Keil,"Predicting Local Transport Coefficients at Solid-Gas Interfaces",J. Phys. Chem. C,2012,116:18878-1888,doi: 10.1021/jp3059855

The regular nanoporous structure makes zeolite membranes attractive candidates for separating molecules on the basis of differences in transport rates (diffusion). Since improvements in synthesis have led to membranes as thin as several hundred nanometers by now, the slow transport in the boundary layer separating bulk gas and core of the nanoporous membrane is becoming increasingly important. Therefore, we investigate the predictability of the coefficient quantifying this local process, the surface permeability α, by means of a two-scale simulation approach. Methane tracer-release from the one-dimensional nanopores of an AFI-type zeolite is employed. Besides a pitfall in determining α on the basis of tracer exchange, we, importantly, present an accurate prediction of the surface permeability using readily available information from molecular simulations. Moreover, we show that the prediction is strongly influenced by the degree of detail with which the boundary region is modeled. It turns out that not accounting for the fact that molecules aiming to escape the host structure must indeed overcome two boundary regions yields too large a permeability by a factor of 1.7–3.3, depending on the temperature. Finally, our results have far-reaching implications for the design of future membrane applications.

Watch a movie illustrating the conditions of self- or tracer-diffusion here.

Richard L. Martin, Thomas F. Willems, Li-Chiang Lin, Jihan Kim, Joseph A. Swisher, Berend Smit & Maciej Haranczyk,"Similarity-Driven Discovery of Zeolite Materials for Adsorption-Based Separations",ChemPhysChem,August 22, 2012,13:3595-3597,

Crystalline porous materials can be exploited in many applications. Discovery of materials with optimum adsorption properties typically involves expensive brute-force characterization of large sets of materials. An alternative approach based on similarity searching that enables discovery of materials with optimum adsorption for CO2 and other molecules at a fraction of the cost of brute-force characterization is demonstrated.

This work was featured on the front cover of the journal, available here: http://onlinelibrary.wiley.com/doi/10.1002/cphc.201290074/abstract

Jihan Kim, Li-Chiang Lin, Richard L. Martin, Joseph A. Swisher, Maciej Haranczyk & Berend Smit,"Large-Scale Computational Screening of Zeolites for Ethane/Ethene Separation",Langmuir,July 11, 2012,28:11914–1191,

Large-scale computational screening of thirty thousand zeolite structures was conducted to find optimal structures for separation of ethane/ethene mixtures. Efficient grand canonical Monte Carlo (GCMC) simulations were performed with graphics processing units (GPUs) to obtain pure component adsorption isotherms for both ethane and ethene. We have utilized the ideal adsorbed solution theory (IAST) to obtain the mixture isotherms, which were used to evaluate the performance of each zeolite structure based on its working capacity and selectivity. In our analysis, we have determined that specific arrangements of zeolite framework atoms create sites for the preferential adsorption of ethane over ethene. The majority of optimum separation materials can be identified by utilizing this knowledge and screening structures for the presence of this feature will enable the efficient selection of promising candidate materials for ethane/ethene separation prior to performing molecular simulations.

J. K. Freericks, A. Y. Liu, A. F. Kemper, T. P. Devereaux,"Pulsed high harmonic generation of light due to pumped Bloch oscillations in noninteracting metals",Physica Scripta,2012,T151:014062,doi: 10.1088/0031-8949/2012/T151/014062

We derive a simple theory for high-order harmonic generation due to pumping a noninteracting metal with a large amplitude oscillating electric field. The model assumes that the radiated light field arises from the acceleration of electrons due to the time-varying current generated by the pump, and also assumes that the system has a constant density of photoexcited carriers, hence it ignores the dipole excitation between bands (which would create carriers in semiconductors). We examine the circumstances under which odd harmonic frequencies would be expected to dominate the spectrum of radiated light, and we also apply the model to real materials like ZnO, for which high-order harmonic generation has already been demonstrated in experiments.

Li-Chiang Lin, Adam H. Berger, Richard L. Martin, Jihan Kim, Joseph A. Swisher, Kuldeep Jariwala, Chris H. Rycroft, Abhoyjit S. Bhown, Michael W. Deem, Maciej Haranczyk & Berend Smit,"In Silico Screening of Carbon Capture Materials",Nature Materials,May 27, 2012,11:633–641,

One of the main bottlenecks to deploying large-scale carbon dioxide capture and storage (CCS) in power plants is the energy required to separate the CO2 from flue gas. For example, near-term CCS technology applied to coal-fired power plants is projected to reduce the net output of the plant by some 30% and to increase the cost of electricity by 60–80%. Developing capture materials and processes that reduce the parasitic energy imposed by CCS is therefore an important area of research. We have developed a computational approach to rank adsorbents for their performance in CCS. Using this analysis, we have screened hundreds of thousands of zeolite and zeolitic imidazolate framework structures and identified many different structures that have the potential to reduce the parasitic energy of CCS by 30–40% compared with near-term technologies.

W.S. Lee, Y.D. Chuang, R.G. Moore, Y. Zhu, L. Patthey, M. Trigo, D.H. Lu, P.S. Kirchmann, O. Krupin, M. Yi, M. Langner, N. Huse, J.S. Robinson, Y. Chen, S.Y. Zhou, G. Coslovich, B. Huber, D.A. Reis, R.A. Kaindl, R.W. Schoenlein, D. Doering, P. Denes, W.F. Schlotter, J.J. Turner, S.L. Johnson, M. Först, T. Sasagawa, Y.F. Kung, A.P. Sorini, A.F. Kemper, B. Moritz, T.P. Devereaux, D.-H. Lee, Z.X. Shen & Z. Hussain,"Phase fluctuations and the absence of topological defects in a photo-excited charge-ordered nickelate",Nature Communications 3, Article number: 838,May 15, 2012,

The dynamics of an order parameter's amplitude and phase determines the collective behaviour of novel states emerging in complex materials. Time- and momentum-resolved pump-probe spectroscopy, by virtue of measuring material properties at atomic and electronic time scales out of equilibrium, can decouple entangled degrees of freedom by visualizing their corresponding dynamics in the time domain. Here we combine time-resolved femtosecond optical and resonant X-ray diffraction measurements on charge ordered La1.75Sr0.25NiO4 to reveal unforeseen photoinduced phase fluctuations of the charge order parameter. Such fluctuations preserve long-range order without creating topological defects, distinct from thermal phase fluctuations near the critical temperature in equilibrium. Importantly, relaxation of the phase fluctuations is found to be an order of magnitude slower than that of the order parameter's amplitude fluctuations, and thus limits charge order recovery. This new aspect of phase fluctuations provides a more holistic view of the phase's importance in ordering phenomena of quantum matter.

Fuyu Li, Daniele Rosa, William D. Collins, and Michael F. Wehner,"“Super-parameterization”: A better way to simulate regional extreme precipitation?",Journal of Advances in Modeling Earth Systems,April 4, 2012,4, doi: 10.1029/2011MS000106

Extreme precipitation is generally underestimated by current climate models relative to observations of present-day rainfall distributions. Possible causes of this systematic error include the convective parameterizations in these models, which have been designed to reproduce measurements of climatological mean precipitation. One possible approach to improve the interaction of subgrid-scale physical processes and large-scale climate is to replace the conventional convective parameterizations with a high-resolution cloud-system resolving model. A “super-parameterized” Community Atmosphere Model (SP-CAM) utilizing this approach is used in this study to investigate the distribution of extreme precipitation in the United States. Results show that SP-CAM better simulates the distributions of both light and intense precipitation compared to the standard version of CAM based upon conventional parameterizations. The improvements are mostly seen in regions dominated by convective precipitation, suggesting that super-parameterization provides a better representation of subgrid convective processes.

Erjun Kan, Wei Hu, Chuanyun Xiao, Ruifeng Lu, Kaiming Deng, Jinlong Yang and Haibin Su,"Half-Metallicity in Organic Single Porous Sheets",J. Am. Chem. Soc., 2012, 134 (13), 5718–5721,March 22, 2012,doi: 10.1021/ja210822c

The unprecedented applications of two-dimensional (2D) atomic sheets in spintronics are formidably hindered by the lack of ordered spin structures. Here we present first-principles calculations demonstrating that the recently synthesized dimethylmethylene-bridged triphenylamine (DTPA) porous sheet is a ferromagnetic half-metal and that the size of the band gap in the semiconducting channel is roughly 1 eV, which makes the DTPA sheet an ideal candidate for a spin-selective conductor. In addition, the robust half-metallicity of the 2D DTPA sheet under external strain increases the possibility of applications in nanoelectric devices. In view of the most recent experimental progress on controlled synthesis, organic porous sheets pave a practical way to achieve new spintronics.

Jihan Kim, Richard L. Martin, Oliver Rübel, Maciej Haranczyk & Berend Smit,"High-throughput Characterization of Porous Materials Using Graphics Processing Units",Journal of Chemical Theory and Computation,March 16, 2012,8:1684–1693,LBNL 5409E, doi: 10.1021/ct200787v

We have developed a high-throughput graphics processing unit (GPU) code that can characterize a large database of crystalline porous materials. In our algorithm, the GPU is utilized to accelerate energy grid calculations, where the grid values represent interactions (i.e., Lennard-Jones + Coulomb potentials) between gas molecules (i.e., CH4 and CO2) and materials’ framework atoms. Using a parallel flood fill central processing unit (CPU) algorithm, inaccessible regions inside the framework structures are identified and blocked, based on their energy profiles. Finally, we compute the Henry coefficients and heats of adsorption through statistical Widom insertion Monte Carlo moves in the domain restricted to the accessible space. The code offers significant speedup over a single core CPU code and allows us to characterize a set of porous materials at least an order of magnitude larger than those considered in earlier studies. For structures selected from such a prescreening algorithm, full adsorption isotherms can be calculated by conducting multiple Grand Canonical Monte Carlo (GCMC) simulations concurrently within the GPU.

Xiaodan Gu, Zuwei Liu, Ilja Gunkel, Slim Chourou, Sung Woo Hong, Deirdre Olynick, Thomas P. Russell,"High Aspect Ratio Sub-15nm Silicon Trenches From Block Copolymer Templates",Advanced Materials,2012,24:5688,

High-aspect-ratio sub-15-nm silicon trenches are fabricated directly from plasma etching of a block copolymer mask. A novel method that combines a block copolymer reconstruction process and reactive ion etching is used to make the polymer mask. Silicon trenches are characterized by various methods and used as a master for subsequent imprinting of different materials. Silicon nanoholes are generated from a block copolymer with cylindrical microdomains oriented normal to the surface.

Richard L. Martin, Prabhat, David D. Donofrio, James A. Sethian & Maciej Haranczyk,"Accelerating Analysis of void spaces in porous materials on multicore and GPU platforms",International Journal of High Performance Computing Applications,February 5, 2012,26:347-357,

Developing computational tools that enable discovery of new materials for energy-related applications is a challenge. Crystalline porous materials are a promising class of materials that can be used for oil refinement, hydrogen or methane storage as well as carbon dioxide capture. Selecting optimal materials for these important applications requires analysis and screening of millions of potential candidates. Recently, we proposed an automatic approach based on the Fast Marching Method (FMM) for performing analysis of void space inside materials, a critical step preceding expensive molecular dynamics simulations. This breakthrough enables unsupervised, high-throughput characterization of large material databases. The algorithm has three steps: (1) calculation of the cost-grid which represents the structure and encodes the occupiable positions within the void space; (2) using FMM to segment out patches of the void space in the grid of (1), and find how they are connected to form either periodic channels or inaccessible pockets; and (3) generating blocking spheres that encapsulate the discovered inaccessible pockets and are used in subsequent molecular simulations. In this work, we expand upon our original approach through (A) replacement of the FMM-based approach with a more computationally efficient flood fill algorithm; and (B) parallelization of all steps in the algorithm, including a GPU implementation of the most computationally expensive step, the cost-grid generation. We report the acceleration achievable in each step and in the complete application, and discuss the implications for high-throughput material screening.
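The flood fill idea mentioned in (A) can be sketched on a toy grid as follows. This is only a simplified illustration, not the authors' code: it assumes a small two-dimensional, non-periodic grid, and it treats "touches the grid edge" as a stand-in for "belongs to a channel", whereas the real method works on periodic three-dimensional cost-grids. The names `find_pockets` and `layout` are hypothetical.

```python
from collections import deque


def find_pockets(grid):
    """Return the set of (row, col) cells lying in enclosed open regions.

    grid[r][c] is True where a probe could sit (void space) and False
    where the framework blocks it. A connected open region that never
    reaches the grid edge is treated as an inaccessible pocket
    (an assumption made for this toy example).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    pockets = set()
    for r0 in range(rows):
        for c0 in range(cols):
            if not grid[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill of one connected open region.
            region = []
            touches_edge = False
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_edge = True
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if not touches_edge:
                pockets.update(region)
    return pockets


# '.' marks open space, '#' marks framework; the centre cell is walled in.
layout = [
    ".....",
    ".###.",
    ".#.#.",
    ".###.",
    ".....",
]
grid = [[ch == "." for ch in row] for row in layout]
pockets = find_pockets(grid)
```

In this toy case the only cell reported is the enclosed centre cell; cells identified this way are the ones a blocking step would exclude from later simulations.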

Nils E. R. Zimmermann, Sayee P. Balaji, Frerich J. Keil,"Surface Barriers of Hydrocarbon Transport Triggered by Ideal Zeolite Structures",J. Phys. Chem. C,2012,116:3677-3683,doi: 10.1021/jp2112389

Shedding light on the nature of surface barriers of nanoporous materials, molecular simulations (Monte Carlo, Reactive Flux) have been employed to investigate the tracer-exchange characteristics of hydrocarbons in defect-free single-crystal zeolite membranes. The concept of a critical membrane thickness as a quantitative measure of surface barriers is shown to be appropriate and advantageous. Nanopore smoothness, framework density, and thermodynamic state of the fluid phase have been identified as the most important influencing variables of surface barriers. Despite the ideal character of the adsorbent, our simulation results clearly support current experimental findings on MOF Zn(tbip) where a larger number of crystal defects caused exceptionally strong surface barriers. Most significantly, our study predicts that the ideal crystal structure without any such defects will already be a critical aspect of experimental analysis and process design in many cases of the upcoming class of extremely thin and highly oriented nanoporous membranes.

Watch here a movie that highlights how n-hexane molecules are adsorbed in a zeolite slab.

Alexandre J. Chorin, Xuemin Tu,"An iterative implementation of the implicit nonlinear filter",ESAIM: Mathematical Modelling and Numerical Analysis,2012,46:535–543,

Implicit sampling is a sampling scheme for particle filters, designed to move particles one-by-one so that they remain in high-probability domains. We present a new derivation of implicit sampling, as well as a new iteration method for solving the resulting algebraic equations.

T.C. Peterson, R. Heim, R. Hirsch, D. Kaiser, H. Brooks, N.S. Diffenbaugh, R. Dole, J. Giovannettone, K. Guiguis, T.R. Karl, R.W. Katz, K. Kunkel, D. Lettenmaier, G.J. McCabe, C.J. Paciorek, K. Ryberg, S. Schubert, V.B.S. Silva, B. Stewart, A.V. Vecchia, G. Villarini, R.S. Vose, J. Walsh, M. Wehner, D. Wolock, K. Wolter, C.A. Woodhouse and D. Wuebbles,"Monitoring and Understanding Trends in Extreme Storms: State of Knowledge",Bulletin of the American Meteorological Society,2012,doi: 10.1175/BAMS-D-11-00262.1

The state of knowledge regarding trends and an understanding of their causes is presented for a specific subset of extreme weather and climate types. For severe convective storms (tornadoes, hail storms, and severe thunderstorms), differences in time and space of practices of collecting reports of events make the use of the reporting database to detect trends extremely difficult. Overall, changes in the frequency of environments favorable for severe thunderstorms have not been statistically significant. For extreme precipitation, there is strong evidence for a nationally-averaged upward trend in the frequency and intensity of events. The causes of the observed trends have not been determined with certainty, although there is evidence that increasing atmospheric water vapor may be one factor. For hurricanes and typhoons, robust detection of trends in Atlantic and western North Pacific tropical cyclone (TC) activity is significantly constrained by data heterogeneity and deficient quantification of internal variability. Attribution of past TC changes is further challenged by a lack of consensus on the physical linkages between climate forcing and TC activity. As a result, attribution of trends to anthropogenic forcing remains controversial. For severe snowstorms and ice storms, the number of severe regional snowstorms that occurred since 1960 was more than twice that of the preceding 60 years. There are no significant multi-decadal trends in the areal percentage of the contiguous U.S. impacted by extreme seasonal snowfall amounts since 1900. There is no distinguishable trend in the frequency of ice storms for the U.S. as a whole since 1950.

Richard L. Martin, Berend Smit & Maciej Haranczyk,"Addressing Challenges of Identifying Geometrically Diverse Sets of Crystalline Porous Materials",Journal of Chemical Information and Modeling,November 18, 2011,52:308–318,

Crystalline porous materials have a variety of uses, such as for catalysis and separations. Identifying suitable materials for a given application can, in principle, be done by screening material databases. Such a screening requires automated high-throughput analysis tools that calculate topological and geometrical parameters describing pores. These descriptors can be used to compare, select, group, and classify materials. Here, we present a descriptor that captures shape and geometry characteristics of pores. Together with proposed similarity measures, it can be used to perform diversity selection on a set of porous materials. Our representations are histogram encodings of the probe-accessible fragment of the Voronoi network representing the void space of a material. We discuss and demonstrate the application of our approach on the International Zeolite Association (IZA) database of zeolite frameworks and the Deem database of hypothetical zeolites, as well as zeolitic imidazolate frameworks constructed from IZA zeolite structures. The diverse structures retrieved by our method are complementary to those expected by emphasizing diversity in existing one-dimensional descriptors, e.g., surface area, and similar to those obtainable by a (subjective) manual selection based on materials’ visual representations. Our technique allows for reduction of large sets of structures and thus enables the material researcher to focus efforts on maximally dissimilar structures.

T. Kosar, M. Balman, E. Yildirim, S. Kulasekaran, B. Ross,"Stork Data Scheduler: Mitigating the Data Bottleneck in e-Science",Philosophical Transactions of the Royal Society A, Vol.369 (2011), pp. 3254-3267,July 18, 2011,doi: 10.1098/rsta.2011.0148

In this paper, we present the Stork data scheduler as a solution for mitigating the data bottleneck in e-Science and data-intensive scientific discovery. Stork focuses on planning, scheduling, monitoring and management of data placement tasks and application-level end-to-end optimization of networked inputs/outputs for petascale distributed e-Science applications. Unlike existing approaches, Stork treats data resources and the tasks related to data access and movement as first-class entities just like computational resources and compute tasks, and not simply the side-effect of computation. Stork provides unique features such as aggregation of data transfer jobs considering their source and destination addresses, and an application-level throughput estimation and optimization service. We describe how these two features are implemented in Stork and their effects on end-to-end data transfer performance.

George Michelogiannakis, Nan Jiang, Daniel U. Becker, William J. Dally,"Packet Chaining: Efficient Single-Cycle Allocation for On-Chip Networks",Computer Architecture Letters,July 1, 2011,

This paper introduces packet chaining, a simple and effective method to increase allocator matching efficiency and hence network performance, particularly suited to networks with short packets and short cycle times. Packet chaining operates by chaining packets destined to the same output together, to reuse the switch connection of a departing packet. This allows an allocator to build up an efficient matching over a number of cycles, like incremental allocation, but not limited by packet length. For a 64-node 2D mesh at maximum injection rate and with single-flit packets, packet chaining increases network throughput by 15% compared to a conventional single-iteration separable iSLIP allocator, outperforms a wavefront allocator, and gives comparable throughput with an augmenting paths allocator. Packet chaining achieves this performance with a cycle time comparable to a single-iteration separable allocator. Packet chaining also reduces average network latency by 22.5%. Finally, packet chaining increases IPC up to 46% (16% average) for application benchmarks because short packets are critical in a typical cache-coherent CMP. These are considerable improvements given the maturity of network-on-chip routers and allocators.

Nils E. R. Zimmermann, Maciej Haranczyk, Manju Sharma, Bei Liu, Berend Smit, Frerich J. Keil,"Adsorption and diffusion in zeolites: The pitfall of isotypic crystal structures",Mol. Simul.,2011,37:986-989,doi: 10.1080/08927022.2011.562502

The influence of isotypic crystal structures on adsorption and diffusion of methane in all-silica LTA, SAS and ITE zeolites is studied. Results obtained with the experimental structures are compared with structure predictions and approximations that are commonly employed. The results indicate that diffusion coefficients are much more affected than Henry coefficients. In fact, orders of magnitude deviations in the diffusivity can be observed and a systematic parameter study finally gives rise to the correlation between structure sensitivity and diffusion-window size.

George Michelogiannakis, Daniel U. Becker, William J. Dally,"Evaluating Elastic Buffer and Wormhole Flow Control",Transactions on Computers,2011,

With the emergence of on-chip networks, router buffer power has become a primary concern. Elastic buffer (EB) flow control utilizes existing pipeline flip-flops in the channels to implement distributed FIFOs, eliminating the need for input buffers at the routers. EB routers have been shown to be more efficient than virtual channel routers, as they do not require input buffers or complex logic for managing virtual channels and tracking credits. Wormhole routers are more comparable in terms of complexity because they also lack virtual channels. This paper compares EB and wormhole routers and explores novel hybrid designs to more closely examine the effect of design simplicity and input buffer cost. Our results show that EB routers have up to 25 percent smaller cycle time compared to wormhole and hybrid routers. Moreover, EB flow control requires 10 percent less energy to transfer a single bit through a router and offers three percent more throughput per unit energy as well as 62 percent more throughput per unit area. The main contributor to these results is the cost and delay overhead of the input buffer.

Michael Wehner, David R. Easterling, Jay H. Lawrimore, Richard R. Heim Jr., Russell S. Vose, and Benjamin Santer,"Projections of Future Drought in the Continental United States and Mexico",Journal of Hydrometeorology,2011,12:1359–1377,doi: 10.1175/2011JHM1351.1

Using the Palmer drought severity index, the ability of 19 state-of-the-art climate models to reproduce observed statistics of drought over North America is examined. It is found that correction of substantial biases in the models’ surface air temperature and precipitation fields is necessary. However, even after a bias correction, there are significant differences in the models’ ability to reproduce observations. Using metrics based on the ability to reproduce observed temporal and spatial patterns of drought, the relationship between model performance in simulating present-day drought characteristics and their differences in projections of future drought changes is investigated. It is found that all models project increases in future drought frequency and severity. However, using the metrics presented here to increase confidence in the multimodel projection is complicated by a correlation between models’ drought metric skill and climate sensitivity. The effect of this sampling error can be removed by changing how the projection is presented, from a projection based on a specific time interval to a projection based on a specified temperature change. This modified class of projections has reduced intermodel uncertainty and could be suitable for a wide range of climate change impacts projections.

Fuyu Li, William Collins, Michael Wehner, David Williamson, Jerry Olson,"Response of precipitation extremes to idealized global warming in an aqua-planet climate model: towards a robust projection across different horizontal resolutions",Tellus,January 1, 2011,63, No.:876-883,doi: 10.1111/j.1600-0870.2011.00543.x

Current climate models produce quite heterogeneous projections for the responses of precipitation extremes to future climate change. To help understand the range of projections from multimodel ensembles, a series of idealized ‘aquaplanet’ Atmospheric General Circulation Model (AGCM) runs have been performed with the Community Atmosphere Model CAM3. These runs have been analysed to identify the effects of horizontal resolution on precipitation extreme projections under two simple global warming scenarios. We adopt the aquaplanet framework for our simulations to remove any sensitivity to the spatial resolution of external inputs and to focus on the roles of model physics and dynamics. Results show that a uniform increase of sea surface temperature (SST) and an increase of low-to-high latitude SST gradient both lead to increase of precipitation and precipitation extremes for most latitudes. The perturbed SSTs generally have stronger impacts on precipitation extremes than on mean precipitation. Horizontal model resolution strongly affects the global warming signals in the extreme precipitation in tropical and subtropical regions but not in high latitude regions. This study illustrates that the effects of horizontal resolution have to be taken into account to develop more robust projections of precipitation extremes.

Fuyu Li, William Collins, Michael Wehner, David Williamson, Jerry Olson, and Christopher Algieri,"Impact of horizontal resolution on simulation of precipitation extremes in an aqua-planet version of Community Atmospheric Model (CAM3)",Tellus,2011,63, No. :884-823,doi: 10.1111/j.1600-0870.2011.00544.x

One key question regarding current climate models is whether the projection of climate extremes converges to a realistic representation as the spatial and temporal resolutions of the model are increased. Ideally the model extreme statistics should approach a fixed distribution once the resolutions are commensurate with the characteristic length and time scales of the processes governing the formation of the extreme phenomena of interest. In this study, a series of AGCM runs with idealized ‘aquaplanet-steady-state’ boundary conditions have been performed with the Community Atmosphere Model CAM3 to investigate the effect of horizontal resolution on climate extreme simulations. The use of the aquaplanet framework highlights the roles of model physics and dynamics and removes any apparent convergence in extreme statistics due to better resolution of surface boundary conditions and other external inputs. Assessed at the same large spatial scale, the results show that the horizontal resolution and time step have strong effects on the simulations of precipitation extremes. The horizontal resolution has a much stronger impact on precipitation extremes than on mean precipitation. Updrafts are strongly correlated with extreme precipitation in the tropics at all resolutions, while positive low-tropospheric temperature anomalies are associated with extreme precipitation at mid-latitudes.

Chang, C.-Y., J. C. H. Chiang, M. F. Wehner, A. R. Friedman, R. Ruedy,"Sulfate Aerosol Control of Tropical Atlantic Climate over the Twentieth Century",Journal of Climate,2011,24:2540–2555,doi: 10.1175/2010JCLI4065.1

The tropical Atlantic interhemispheric gradient in sea surface temperature significantly influences the rainfall climate of the tropical Atlantic sector, including droughts over West Africa and Northeast Brazil. This gradient exhibits a secular trend from the beginning of the twentieth century until the 1980s, with stronger warming in the south relative to the north. This trend behavior is on top of a multidecadal variation associated with the Atlantic multidecadal oscillation. A similar long-term forced trend is found in a multimodel ensemble of forced twentieth-century climate simulations. Through examining the distribution of the trend slopes in the multimodel twentieth-century and preindustrial models, the authors conclude that the observed trend in the gradient is unlikely to arise purely from natural variations; this study suggests that at least half the observed trend is a forced response to twentieth-century climate forcings. Further analysis using twentieth-century single-forcing runs indicates that sulfate aerosol forcing is the predominant cause of the multimodel trend. The authors conclude that anthropogenic sulfate aerosol emissions, originating predominantly from the Northern Hemisphere, may have significantly altered the tropical Atlantic rainfall climate over the twentieth century.

Robert I. Saye, James A. Sethian,"The Voronoi Implicit Interface Method for computing multiphase physics",Proceedings of the National Academy of Sciences,2011,108:19498--195,doi: 10.1073/pnas.1111557108

We introduce a numerical framework, the Voronoi Implicit Interface Method for tracking multiple interacting and evolving regions (phases) whose motion is determined by complex physics (fluids, mechanics, elasticity, etc.), intricate jump conditions, internal constraints, and boundary conditions. The method works in two and three dimensions, handles tens of thousands of interfaces and separate phases, and easily and automatically handles multiple junctions, triple points, and quadruple points in two dimensions, as well as triple lines, etc., in higher dimensions. Topological changes occur naturally, with no surgery required. The method is first-order accurate at junction points/lines, and of arbitrarily high-order accuracy away from such degeneracies. The method uses a single function to describe all phases simultaneously, represented on a fixed Eulerian mesh. We test the method’s accuracy through convergence tests, and demonstrate its applications to geometric flows, accurate prediction of von Neumann’s law for multiphase curvature flow, and robustness under complex fluid flow with surface tension and large shearing forces.

A. Napov and Y. Notay,"Algebraic Analysis of Aggregation-Based Multigrid",Numer. Lin. Alg. Appl., vol.18, pp. 539-564,2011,

Short version of the paper: winner of the Student Paper Competition of the 11th Copper Mountain Conference on Iterative Methods.

T. Kosar, I. Akturk, M. Balman, X. Wang,"PetaShare: A Reliable, Efficient, and Transparent Distributed Storage Management System",Scientific Programming, Vol. 19, Issue 1, pp. 27-43,January 2011,

Modern collaborative science has placed an increasing burden on data management infrastructure to handle the increasingly large data archives generated. Besides functionality, reliability and availability are also key factors in delivering a data management system that can efficiently and effectively meet the challenges posed and compounded by the unbounded increase in the size of data generated by scientific applications. We have developed a reliable and efficient distributed data storage system, PetaShare, which spans multiple institutions across the state of Louisiana. At the back-end, PetaShare provides a unified name space and efficient data movement across geographically distributed storage sites. At the front-end, it provides light-weight clients that enable easy, transparent and scalable access. In PetaShare, we have designed and implemented an asynchronously replicated multi-master metadata system for enhanced reliability and availability, and an advanced buffering system for improved data transfer performance. In this paper, we present the details of our design and implementation, show performance results, and describe our experience in developing a reliable and efficient distributed data management system for data-intensive science.

M.O. Williams, J. Wilkening, E. Shlizerman, J. N. Kutz,"Continuation of periodic solutions in the waveguide array mode-locked laser",Physica D: Nonlinear Phenomena,2011,240:1791--1804,doi: 10.1016/j.physd.2011.06.018

We apply the adjoint continuation method to construct highly accurate, periodic solutions that are observed to play a critical role in the multi-pulsing transition of mode-locked laser cavities. The method allows for the construction of solution branches and the identification of their bifurcation structure. Supplementing the adjoint continuation method with a computation of the Floquet multipliers allows for explicit determination of the stability of each branch. This method reveals that, when gain is increased, the multi-pulsing transition starts with a Hopf bifurcation, followed by a period-doubling bifurcation, and a saddle-node bifurcation for limit cycles. Finally, the system exhibits chaotic dynamics and transitions to the double-pulse solutions. Although this method is applied specifically to the waveguide array mode-locking model, the multi-pulsing transition is conjectured to be ubiquitous and these results agree with experimental and computational results from other models.

Abhinav Sarje, Srinivas Aluru,"Accelerating Pairwise Computations on Cell Processors",Transactions on Parallel and Distributed Systems (TPDS),January 2011,22-1:69-77,doi: 10.1109/TPDS.2010.65

Direct computation of all pairwise distances or interactions is a fundamental problem that arises in many application areas including particle or atomistic simulations, fluid dynamics, computational electromagnetics, materials science, genomics and systems biology, and clustering and data mining. In this paper, we present methods for performing such pairwise computations efficiently in parallel on Cell processors. This problem is particularly challenging on the Cell processor due to the small-sized Local Stores of the Synergistic Processing Elements, the main computational cores of the processor. We present techniques for different variants of this problem, including those with a large number of entities or where the dimensionality of the information per entity is large. We demonstrate our methods in the context of multiple applications drawn from fluid dynamics, materials science and systems biology, and present detailed experimental results. Our software library is open source and can be readily used by application scientists to accelerate pairwise computations using Cell accelerators.

E. Vecharynski and J. Langou,"Any admissible cycle-convergence behavior is possible for restarted GMRES at its initial cycles",Numerical Linear Algebra with Applications Vol. 18, Issue 3, pp. 499-511,2011,

We show that any admissible cycle-convergence behavior is possible for restarted GMRES at a number of initial cycles, moreover the spectrum of the coefficient matrix alone does not determine this cycle-convergence. The latter can be viewed as an extension of the result of Greenbaum, Pták and Strakoš (SIAM Journal on Matrix Analysis and Applications 1996; 17(3):465–469) to the case of restarted GMRES.

Daniel Sanchez, George Michelogiannakis, Christos Kozyrakis,"An Analysis of Interconnection Networks for Large Scale Chip Multiprocessors",Transactions on Architecture and Code Optimization,2010,

With the number of cores of chip multiprocessors (CMPs) rapidly growing as technology scales down, connecting the different components of a CMP in a scalable and efficient way becomes increasingly challenging. In this article, we explore the architectural-level implications of interconnection network design for CMPs with up to 128 fine-grain multithreaded cores. We evaluate and compare different network topologies using accurate simulation of the full chip, including the memory hierarchy and interconnect, and using a diverse set of scientific and engineering workloads.

We find that the interconnect has a large impact on performance, as it is responsible for 60% to 75% of the miss latency. Latency, and not bandwidth, is the primary performance constraint, since, even with many threads per core and workloads with high miss rates, networks with enough bandwidth can be efficiently implemented for the system scales we consider. From the topologies we study, the flattened butterfly consistently outperforms the mesh and fat tree on all workloads, leading to performance advantages of up to 22%. We also show that considering interconnect and memory hierarchy together when designing large-scale CMPs is crucial, and neglecting either of the two can lead to incorrect conclusions. Finally, the effect of the interconnect on overall performance becomes more important as the number of cores increases, making interconnection choices especially critical when scaling up.

David M. Ambrose, Jon Wilkening,"Computation of symmetric, time-periodic solutions of the vortex sheet with surface tension",Proceedings of the National Academy of Sciences,2010,doi: 10.1073/pnas.0910830107

A numerical method is introduced for the computation of time-periodic vortex sheets with surface tension separating two immiscible, irrotational, two-dimensional ideal fluids of equal density. The approach is based on minimizing a nonlinear functional of the initial conditions and supposed period that is positive unless the solution is periodic, in which case it is zero. An adjoint-based optimal control technique is used to efficiently compute the gradient of this functional. Special care is required to handle singular integrals in the adjoint formulation. Starting with a solution of the linearized problem about the flat rest state, a family of smooth, symmetric breathers is found that, at quarter-period time intervals, alternately pass through a flat state of maximal kinetic energy, and a rest state in which all the energy is stored as potential energy in the interface. In some cases, the interface overturns before returning to the initial, flat configuration. It is found that the bifurcation diagram describing these solutions contains several disjoint curves separated by near-bifurcation events.

Mehmet Balman, Tevfik Kosar,"Error Detection and Error Classification: Failure Awareness in Data Transfer Scheduling",International Journal of Autonomic Computing, Vol. 1, No. 4, pp. 425-446,2010,doi: 10.1504/IJAC.2010.037516

Data transfer in distributed environments is prone to frequent failures resulting from back-end system-level problems, such as connectivity failures, which are technically untraceable by users. Error messages are not logged efficiently, and sometimes are not relevant or useful from the users' point of view. Our study explores the possibility of an efficient error detection and reporting system for such environments. Prior knowledge about the environment and awareness of the actual reason behind a failure would enable higher-level planners to make better and more accurate decisions. It is necessary to have well-defined error detection and error reporting methods to increase the usability and serviceability of existing data transfer protocols and data management systems. We investigate the applicability of early error detection and error classification techniques and propose an error reporting framework and a failure-aware data transfer life cycle to improve the arrangement of data transfer operations and to enhance the decision making of data transfer schedulers.

L. C. Lee, S. J. S. Morris, J. Wilkening,"Stress concentrations, diffusionally accommodated grain boundary sliding and the viscoelasticity of polycrystals",Proceedings of the Royal Society A: Mathematical, Physical and Engineering Science,2010,doi: 10.1098/rspa.2010.0447

Using analytical and numerical methods, we analyse the Raj–Ashby bicrystal model of diffusionally accommodated grain-boundary sliding for finite interface slopes. Two perfectly elastic layers of finite thickness are separated by a given fixed spatially periodic interface. Dissipation occurs by time-periodic shearing of the viscous interfacial region, and by time-periodic grain-boundary diffusion. Although two time scales govern these processes, of particular interest is the characteristic time $t_D$ for grain-boundary diffusion to occur over distances of order of the grain size. For seismic frequencies $\omega t_D \gg 1$, we find that the spectrum of mechanical loss $Q^{-1}$ is controlled by the local stress field near corners. For a simple piecewise linear interface having identical corners, this localization leads to a simple asymptotic form for the loss spectrum: for $\omega t_D \gg 1$, $Q^{-1} \sim \mathrm{const}\cdot\omega^{-\alpha}$. The positive exponent $\alpha$ is determined by the structure of the stress field near the corners, but depends both on the angle subtended by the corner and on the orientation of the interface; the value of $\alpha$ for a sawtooth interface having 120° angles differs from that for a truncated sawtooth interface whose corners subtend the same 120° angle. When corners on an interface are not all identical, the behaviour is even more complex. Our analysis suggests that the loss spectrum of a finely grained solid results from volume averaging of the dissipation occurring in the neighbourhood of a randomly oriented three-dimensional network of grain boundaries and edges.

Wehner, M.F.,"Sources of uncertainty in the extreme value statistics of climate data",Extremes,2010,13:205-217,doi: 10.1007/s10687-010-0105-7

We investigate three sources of uncertainty in the calculation of extreme value statistics for observed and modeled climate data. Inter-model differences in formulation, unforced internal variability and choice of statistical model all contribute to uncertainty. Using fits to the GEV distribution to obtain 20 year return values, we quantify these uncertainties for the annual maximum daily mean surface air temperatures of pre-industrial control runs from 15 climate models in the CMIP3 dataset.
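
As a rough illustration of what a "20 year return value" means for a fitted GEV distribution, the sketch below evaluates the standard return-level formula. It is not taken from the paper: the function names and the parameter values in the usage lines are hypothetical, and the shape parameter `xi` follows the convention in which `xi > 0` gives a heavy upper tail (some libraries use the opposite sign).

```python
import math


def gev_return_value(mu, sigma, xi, period_years):
    """Level exceeded on average once every `period_years` years.

    mu, sigma, xi are the GEV location, scale and shape parameters
    (assumed to come from a fit to annual maxima; not computed here).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if period_years <= 1:
        raise ValueError("period_years must exceed 1")
    # Non-exceedance probability is 1 - 1/T; y is the matching reduced variate.
    y = -math.log(1.0 - 1.0 / period_years)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family (xi -> 0).
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function, same sign convention as above."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Hypothetical parameters for annual maximum temperature, in degrees C.
level_20yr = gev_return_value(mu=30.0, sigma=2.0, xi=-0.2, period_years=20)
print(level_20yr, gev_cdf(level_20yr, 30.0, 2.0, -0.2))
```

By construction, the CDF evaluated at the 20 year return value equals 1 - 1/20, which is a convenient consistency check on any fitted parameters.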

Alexandre J. Chorin, Xuemin Tu,"Interpolation and iteration for nonlinear filters",ESAIM: Mathematical Modelling and Numerical Analysis. Submitted,2010,arXiv:0910,

We present a general form of the iteration and interpolation process used in implicit particle filters. Implicit filters are based on a pseudo-Gaussian representation of posterior densities, and are designed to focus the particle paths so as to reduce the number of particles needed in nonlinear data assimilation. Examples are given.

Wehner, M.F., R. Smith, P. Duffy, G. Bala,"The effect of horizontal resolution on simulation of very extreme US precipitation events in a global atmosphere model.",Climate Dynamics,2010,32:241-247,doi: 10.1007/s00382-009-0656-y

We investigate the ability of a global atmospheric general circulation model (AGCM) to reproduce observed 20 year return values of the annual maximum daily precipitation totals over the continental United States as a function of horizontal resolution. We find that at the high resolutions enabled by contemporary supercomputers, the AGCM can produce values of comparable magnitude to high quality observations. However, at the resolutions typical of the coupled general circulation models used in the Fourth Assessment Report of the Intergovernmental Panel on Climate Change, the precipitation return values are severely underestimated.

E. Vecharynski, J. Langou,"The cycle-convergence of restarted GMRES for normal matrices is sublinear",SIAM Journal on Scientific Computing Vol. 32, Issue 1, pp. 186-196,2010,

We prove that the cycle-convergence of the restarted GMRES applied to a system of linear equations with a normal coefficient matrix is sublinear.

Nils E. R. Zimmermann, Berend Smit, Frerich J. Keil,"On the Effects of the External Surface on the Equilibrium Transport in Zeolite Crystals",J. Phys. Chem. C,2009,114:300-310,doi: 10.1021/jp904267a

With the aid of molecular simulation techniques (molecular dynamics, grand-canonical Monte Carlo, and reactive flux correlation function RFCF), the influence of the external surface on the equilibrium permeation of methane and ethane into and out of an AFI-type zeolite crystal has been studied. In particular, “extended dynamically corrected transition state theory”, which has been proven to describe the transport of tracers in periodic crystals correctly, has been applied to surface problems. The results suggest that the molecules follow paths that are close to the pore wall in the interior and also at the crystal surface. Moreover, the recrossing rate at the surface turns out to be non-negligible, yet, in contrast to the intracrystalline recrossing rate, remains almost constant over loading, which indicates diffusive barrier crossing at the crystal surface. As a consequence of very different adsorption and desorption barriers, the corresponding permeabilities are shown to be unequal for one and the same condition (T and p). The critical crystal length, beyond which surface effects can certainly be neglected, is computed on the basis of flux densities. Entrance/exit effects, in the present cases, are practically important solely for ethane at low pressures. The influence of the type of external surface on the surface flux is, hereby, rather small, because the transport at the surface is controlled by the slow supply from the gas phase. This has been evidenced by a simplified thermodynamic model that has been derived within this work and which is based on rapidly assessable simulation data. Finally, we propose a procedure for estimating the importance of different factors that have an impact on surface effects.

M. Wehner, L. Oliker, and J. Shalf,"Low Power Supercomputers",IEEE Spectrum,October 2009,

High-performance computing for such things as climate modeling is not going to advance at anything like the pace it has during the last two decades unless we apply fundamentally new ideas. Here we describe one possible approach. Rather than constructing supercomputers from the kinds of microprocessors found in fast desktop computers or servers, we propose adopting designs and design principles drawn, oddly enough, from the portable-electronics marketplace.

Easterling, D. R., and M. F. Wehner,"Is the climate warming or cooling?",Geophys. Res. Lett.,April 2009,36, L087, doi: 10.1029/2009GL037810

Numerous websites, blogs and articles in the media have claimed that the climate is no longer warming, and is now cooling. Here we show that periods of no trend or even cooling of the globally averaged surface air temperature are found in the last 34 years of the observed record, and in climate model simulations of the 20th and 21st century forced with increasing greenhouse gases. We show that the climate over the 21st century can and likely will produce periods of a decade or two where the globally averaged surface air temperature shows no trend or even slight cooling in the presence of longer-term warming.

Chris H. Rycroft,"VORO++: A three-dimensional Voronoi cell library in C++",Chaos: An Interdisciplinary Journal of Nonlinear Science,2009,19:041111,LBNL 1432E, doi: 10.1063/1.3215722

Voro++ is a free software library for the computation of three-dimensional Voronoi cells. It is primarily designed for applications in physics and materials science, where the Voronoi tessellation can be a useful tool in the analysis of densely-packed particle systems, such as granular materials or glasses. The software comprises several C++ classes that can be modified and incorporated into other programs. A command-line utility is also provided that can use most features of the code. Voro++ makes use of a direct cell-by-cell construction, which is particularly suited to handling special boundary conditions and walls. It employs algorithms which are tolerant of numerical precision errors, and it has been successfully employed on very large particle systems.

Alexandre J. Chorin, Xuemin Tu,"Implicit sampling for particle filters",Proceedings of the National Academy of Sciences,2009,106:17249-1725,doi: 10.1073/pnas.0909196106

We present a particle-based nonlinear filtering scheme, related to recent work on chainless Monte Carlo, designed to focus particle paths sharply so that fewer particles are required. The main features of the scheme are a representation of each new probability density function by means of a set of functions of Gaussian variables (a distinct function for each particle and step) and a resampling based on normalization factors and Jacobians. The construction is demonstrated on a standard, ill-conditioned test problem.

M. Haranczyk, J. A. Sethian,"Navigating molecular worms inside chemical labyrinths",Proceedings of the National Academy of Sciences,2009,106:21472-2147,doi: 10.1073/pnas.0910016106

Predicting whether a molecule can traverse chemical labyrinths of channels, tunnels, and buried cavities usually requires performing computationally intensive molecular dynamics simulations. Often one wants to screen molecules to identify ones that can pass through a given chemical labyrinth or screen chemical labyrinths to identify those that allow a given molecule to pass. Because it is impractical to test each molecule/labyrinth pair using computationally expensive methods, faster, approximate methods are used to prune possibilities, “triaging” the ability of a proposed molecule to pass through the given chemical labyrinth. Most pruning methods estimate chemical accessibility solely on geometry, treating atoms or groups of atoms as hard spheres with appropriate radii. Here, we explore geometric configurations for a moving “molecular worm,” which replaces spherical probes and is assembled from solid blocks connected by flexible links. The key is to extend the fast marching method, which is an ordered upwind one-pass Dijkstra-like method to compute optimal paths by efficiently solving an associated Eikonal equation for the cost function. First, we build a suitable cost function associated with each possible configuration, and second, we construct an algorithm that works in the ensuing high-dimensional configuration space: at least seven dimensions are required to account for translational, rotational, and internal degrees of freedom. We demonstrate the algorithm to study shortest paths, compute accessible volume, and derive information on the topology of the accessible part of a chemical labyrinth. As a model example, we consider an alkane molecule in a porous material, which is relevant to designing catalysts for oil processing.
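
The abstract describes the fast marching method as Dijkstra-like. The sketch below shows only that underlying idea, under strong simplifying assumptions: it runs plain Dijkstra (not fast marching, and with no Eikonal solve) on a small 2-D grid of per-cell costs, whereas the paper works in a configuration space of at least seven dimensions. The function name `grid_min_cost` and the example grid are hypothetical.

```python
import heapq
import math


def grid_min_cost(cost, start, goal):
    """Cheapest 4-connected path cost from start to goal on a 2-D cost grid.

    cost[r][c] is the price of entering cell (r, c); math.inf marks a wall.
    The start cell's own cost is included in the total.
    Returns math.inf if the goal cannot be reached.
    """
    rows, cols = len(cost), len(cost[0])
    if cost[start[0]][start[1]] == math.inf:
        return math.inf
    best = {start: cost[start[0]][start[1]]}
    heap = [(best[start], start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > best.get((r, c), math.inf):
            continue  # stale heap entry; a cheaper route was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] != math.inf:
                nd = d + cost[nr][nc]
                if nd < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return math.inf


# Hypothetical "labyrinth": a wall forces a detour around the right-hand side.
W = math.inf
labyrinth = [
    [1, 1, 1],
    [W, W, 1],
    [1, 1, 1],
]
print(grid_min_cost(labyrinth, (0, 0), (2, 0)))
```

Like fast marching, this visits cells in order of increasing accumulated cost, so each cell is finalized once; an infinite result plays the role of "the probe cannot pass".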

B.D. Santer, K.E. Taylor, P.J. Gleckler, C. Bonfils, T.P. Barnett, D.W. Pierce, T.M.L. Wigley, C. Mears, F.J. Wentz, W. Brueggemann, N.P. Gillett, S.A. Klein, S. Solomon, P.A. Stott, and M.F. Wehner,"Incorporating Model Quality Information in Climate Change Detection and Attribution Studies.",Proceedings of the National Academy of Sciences,2009,

In a recent multimodel detection and attribution (D&A) study using the pooled results from 22 different climate models, the simulated “fingerprint” pattern of anthropogenically caused changes in water vapor was identifiable with high statistical confidence in satellite data. Each model received equal weight in the D&A analysis, despite large differences in the skill with which they simulate key aspects of observed climate. Here, we examine whether water vapor D&A results are sensitive to model quality. The “top 10” and “bottom 10” models are selected with three different sets of skill measures and two different ranking approaches. The entire D&A analysis is then repeated with each of these different sets of more or less skillful models. Our performance metrics include the ability to simulate the mean state, the annual cycle, and the variability associated with El Niño. We find that estimates of an anthropogenic water vapor fingerprint are insensitive to current model uncertainties, and are governed by basic physical processes that are well-represented in climate models. Because the fingerprint is both robust to current model uncertainties and dissimilar to the dominant noise patterns, our ability to identify an anthropogenic influence on observed multidecadal changes in water vapor is not affected by “screening” based on model quality.

Baron Peters, Nils E. R. Zimmermann, Gregg T. Beckham, Jefferson W. Tester, Bernhardt L. Trout,"Path Sampling Calculation of Methane Diffusivity in Natural Gas Hydrates from a Water-Vacancy Assisted Mechanism",J. Am. Chem. Soc.,2008,130:17342-1735,doi: 10.1021/ja802014m

Increased interest in natural gas hydrate formation and decomposition, coupled with experimental difficulties in diffusion measurements, makes estimating transport properties in hydrates an important technological challenge. This research uses an equilibrium path sampling method for free energy calculations [Radhakrishnan, R.; Schlick, T. J. Chem. Phys. 2004, 121, 2436] with reactive flux and kinetic Monte Carlo simulations to estimate the methane diffusivity within a structure I gas hydrate crystal. The calculations support a water-vacancy assisted diffusion mechanism where methane hops from an occupied “donor” cage to an adjacent “acceptor” cage. For pathways between cages that are separated by five-membered water rings, the free energy landscape has a high barrier with a shallow well at the top. For pathways between cages that are separated by six-membered water rings, the free energy calculations show a lower barrier with no stable intermediate. Reactive flux simulations confirm that many reactive trajectories become trapped in the shallow intermediate at the top of the barrier leading to a small transmission coefficient for these paths. Stable intermediate configurations are identified as doubly occupied off-pathway cages and methane occupying the position of a water vacancy. Rate constants are computed and used to simulate self-diffusion with a kinetic Monte Carlo algorithm. Self-diffusion rates were much slower than the Einstein estimate because of lattice connectivity and methane’s preference for large cages over small cages. Specifically, the fastest pathways for methane hopping are arranged in parallel (nonintersecting) channels, so methane must hop via a slow pathway to escape the channel. From a computational perspective, this paper demonstrates that equilibrium path sampling can compute free energies for a broader class of coordinates than umbrella sampling with molecular dynamics. 
From a technological perspective, this paper provides one estimate for an important transport property that has been difficult to measure. In a hydrate I crystal at 250 K with nearly all cages occupied by methane, we estimate D ≈ 7 × 10⁻¹⁵ X m²/s, where X is the fraction of unoccupied cages.

J.M. Rabaey, D. Burke, K. Lutz, J. Wawrzynek,"Workloads of the Future",IEEE Design & Test of Computers,July 8, 2008,25:358-365,doi: 10.1109/MDT.2008.118

Along with changing technologies and design techniques, target applications span a wide range: from large-scale computing to personal services and perceptual interfaces. The authors of this article characterize these workloads of the future and argue for a new set of benchmarks to guide the exploration and optimization of future systems.

L. Pilipchuk, E. Vecharynski, Yu. H. Pesheva,"Solution of Large Linear Systems with Embedded Network Structure for a Non-Homogeneous Network Flow Programming Problem",Mathematica Balkanica Vol. 22, Fasc. 3-4, p. 235-254,2008,

This paper considers a linear underdetermined system of a special type. Systems of this type appear as constraint systems in non-homogeneous network flow programming problems and are characterized by a large sparse submatrix representing the embedded network structure. A direct method for solving the system is developed. The algorithm is based on graph-theoretic properties of the structure of the support and on properties of a basis of the solution space of the homogeneous system. One of the key steps is decomposition of the system. A simple example is given at the end of the paper.

Nils E. R. Zimmermann, Sven Jakobtorweihen, Edith Beerdsen, Berend Smit, Frerich J. Keil,"In-Depth Study of the Influence of Host-Framework Flexibility on the Diffusion of Small Gas Molecules in One-Dimensional Zeolitic Pore Systems",J. Phys. Chem. C,2007,111:17370-1738,doi: 10.1021/jp0746446

Molecular-dynamics simulations are performed to understand the role of host−framework flexibility on the diffusion of methane molecules in the one-dimensional pores of AFI-, LTL-, and MTW-type zeolites. In particular, the impact of the choice of the host model is studied. Dynamically corrected Transition State Theory is used to provide insights into the diffusion mechanism on a molecular level. Free-energy barriers and dynamical correction factors can change significantly by introducing lattice flexibility. In order to understand the phenomenon of free-energy barriers reduction, we investigate the motion of the window atoms. The influence that host−framework flexibility exerts on gas diffusion in zeolites is, generally, a complex function of material, host model, and loading such that transferability of conclusions from one zeolite to the other is not guaranteed.

Katherine Yelick, Paul Hilfinger, Susan Graham, Dan Bonachea, Jimmy Su, Amir Kamil, Kaushik Datta, Phillip Colella, and Tong Wen,"Parallel Languages and Compilers: Perspective from the Titanium Experience",The International Journal Of High Performance Computing Applications,June 16, 2006,21,

We describe the rationale behind the design of key features of Titanium—an explicitly parallel dialect of Java™ for high-performance scientific programming—and our experiences in building applications with the language. Specifically, we address Titanium’s Partitioned Global Address Space model, SPMD parallelism support, multi-dimensional arrays and array-index calculus, memory management, immutable classes (class-like types that are value types rather than reference types), operator overloading, and generic programming. We provide an overview of the Titanium compiler implementation, covering various parallel analyses and optimizations, Titanium runtime technology and the GASNet network communication layer. We summarize results and lessons learned from implementing the NAS parallel benchmarks, elliptic and hyperbolic solvers using Adaptive Mesh Refinement, and several applications of the Immersed Boundary method.

Best Paper Award