# Chao Yang

## Short Bio

Chao Yang received his Ph.D. in computational mathematics from Rice University in 1998. He worked at NEC Systems Lab, Inc., a subsidiary of NEC, from 1998 to 1999. He was awarded the 1999 Householder Fellowship in scientific computing by Oak Ridge National Laboratory. He joined Lawrence Berkeley National Laboratory in 2000, and is currently a senior scientist. His research interests include:

- developing numerical algorithms and fast implementations to accelerate scientific simulations
- developing efficient and robust algorithms and implementations for solving inverse problems

His core expertise is in numerical linear algebra, optimization, large-scale data analysis, and high performance computing. Over the last several years, he has successfully used techniques developed in these areas to solve problems in electronic structure calculations, nuclear structure calculations, cavity design for accelerator models, single-particle analysis for cryo-electron microscopy, single-molecule diffractive imaging, phase retrieval, ptychography, etc.

## Journal Articles

### Fabien Bruneval, Tonatiuh Rangel, Samia M. Hamed, Meiyue Shao, Chao Yang, Jeffrey B. Neaton, "MOLGW 1: many-body perturbation theory software for atoms, molecules, and clusters", Computer Physics Communications (accepted), 2016

### J. R. Jones, F.-H. Rouet, K. V. Lawler, E. Vecharynski, K. Z. Ibrahim, S. Williams, B. Abeln, C. Yang, C. W. McCurdy, D. J. Haxton, X. S. Li, T. N. Rescigno, "An efficient basis set representation for calculating electrons in molecules", Journal of Molecular Physics, 2016, doi: 10.1080/00268976.2016.1176262

The method of McCurdy, Baertschy, and Rescigno, J. Phys. B, 37, R137 (2004) is generalized to obtain a straightforward, surprisingly accurate, and scalable numerical representation for calculating the electronic wave functions of molecules. It uses a basis set of product sinc functions arrayed on a Cartesian grid, and yields 1 kcal/mol precision for valence transition energies with a grid resolution of approximately 0.1 bohr. The Coulomb matrix elements are replaced with matrix elements obtained from the kinetic energy operator. A resolution-of-the-identity approximation renders the primitive one- and two-electron matrix elements diagonal; in other words, the Coulomb operator is local with respect to the grid indices. The calculation of contracted two-electron matrix elements among orbitals requires only O(N log(N)) multiplication operations, not O(N^4), where N is the number of basis functions; N = n^3 on cubic grids. The representation not only is numerically expedient, but also produces energies and properties superior to those calculated variationally. Absolute energies, absorption cross sections, transition energies, and ionization potentials are reported for one- (He^+, H_2^+ ), two- (H_2, He), ten- (CH_4) and 56-electron (C_8H_8) systems.

### R. Li, Y. Xi, E. Vecharynski, C. Yang, and Y. Saad, "A Thick-Restart Lanczos algorithm with polynomial filtering for Hermitian eigenvalue problems", to appear in SISC, 2016

Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a Thick-Restart version of the Lanczos algorithm with deflation ('locking') and a new type of polynomial filters obtained from a least-squares technique. The resulting algorithm can be utilized in a 'spectrum-slicing' approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different sub-intervals independently from one another.

### Z. Wen, C. Yang, X. Liu and Y. Zhang, "A Penalty-based Trace Minimization Method for Large-scale Eigenspace Computation", J. Sci. Comp., March 1, 2016, 66:1175-1203, doi: 10.1007/s10915-015-0061-0

### E. Vecharynski, C. Yang, and F. Xue, "Generalized preconditioned locally harmonic residual method for non-Hermitian eigenproblems", SIAM Journal on Scientific Computing, Vol. 38, No. 1, pp. A500–A527, 2016, doi: 10.1137/15M1027413

We introduce the Generalized Preconditioned Locally Harmonic Residual (GPLHR) method for solving standard and generalized non-Hermitian eigenproblems. The method is particularly useful for computing a subset of eigenvalues, and their eigen- or Schur vectors, closest to a given shift. The proposed method is based on block iterations and can take advantage of a preconditioner if it is available. It does not need to perform exact shift-and-invert transformation. Standard and generalized eigenproblems are handled in a unified framework. Our numerical experiments demonstrate that GPLHR is generally more robust and efficient than existing methods, especially if the available memory is limited.

### Wei Hu, Lin Lin, Chao Yang, Jun Dai and Jinlong Yang, "Edge-Modified Phosphorene Nanoflake Heterojunctions as Highly Efficient Solar Cells", Nano Lett., February 5, 2016, 16:1675–1682, doi: 10.1021/acs.nanolett.5b04593

### L. Lin, Y. Saad and C. Yang, "Approximating spectral densities of large matrices", SIAM Review, February 1, 2016, 58:34–65, doi: 10.1137/130934283

### P. Li, X. Liu, M. Chen, P. Lin, X. Ren, L. Lin, C. Yang, L. He, "Large-scale ab initio simulations based on systematically improvable atomic basis", Computational Materials Science, February 1, 2016, 112:503–517, doi: 10.1016/j.commatsci.2015.07.004

### J. Brabec, C. Yang, E. Epifanovsky, A.I. Krylov, and E. Ng, "Reduced-cost sparsity-exploiting algorithm for solving coupled-cluster equations", Journal of Computational Chemistry, January 24, 2016, 37:1059–1067, doi: 10.1002/jcc.24293

### Meiyue Shao, Felipe H. da Jornada, Chao Yang, Jack Deslippe, Steven G. Louie, "Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem", Linear Algebra and its Applications, 2016, 488:148–167, doi: 10.1016/j.laa.2015.09.036

### M. van Setten, F. Caruso, S. Sharifzadeh, X. Ren, M. Scheffler, F. Liu, J. Lischner, L. Lin, J. Deslippe, S. Louie, C. Yang, F. Weigend, J. Neaton, F. Evers, P. Rinke, "GW100: Benchmarking G0W0 for molecular systems", Journal of Chemical Theory and Computation, October 22, 2015

### Jiri Brabec, Lin Lin, Meiyue Shao, Niranjan Govind, Chao Yang, Yousef Saad, Esmond G. Ng, "Fast Algorithms for Estimating the Absorption Spectrum within Linear Response Time-dependent Density Functional Theory", Journal of Chemical Theory and Computation, 2015, doi: 10.1021/acs.jctc.5b00887

### M. Ulbrich, Z. Wen, C. Yang, D. Klockner, Z. Lu, "A proximal gradient method for ensemble density functional theory", SIAM J. Sci. Comp., June 20, 2015, 37:A1975--A20, doi: 10.1137/14098973X

### Mathias Jacquelin, Lin Lin, Chao Yang, "A Distributed Memory Parallel Algorithm for Selected Inversion: the Symmetric Case", to appear in ACM Transactions on Mathematical Software (TOMS), May 28, 2015

### Fang Liu, Lin Lin, Derek Vigil-Fowler, Johannes Lischner, Alexander F. Kemper, Sahar Sharifzadeh, Felipe H. da Jornada, Jack Deslippe, Chao Yang, Jeffrey B. Neaton, Steven G. Louie, "Numerical integration for ab initio many-electron self energy calculations within the GW approximation", Journal of Computational Physics, April 1, 2015

### E. Vecharynski, C. Yang, J. E. Pask, "A projected preconditioned conjugate gradient algorithm for computing many extreme eigenpairs of a Hermitian matrix", Journal of Computational Physics, Vol. 290, pp. 73–89, 2015

We present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix *A*. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of *A*. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.

### Wei Hu, Lin Lin and Chao Yang, "Edge reconstruction in armchair phosphorene nanoribbons revealed by discontinuous Galerkin density functional theory", Phys. Chem. Chem. Phys., 2015, Advance Article, February 11, 2015, doi: 10.1039/C5CP00333D

With the help of our recently developed massively parallel DGDFT (Discontinuous Galerkin Density Functional Theory) methodology, we perform large-scale Kohn–Sham density functional theory calculations on phosphorene nanoribbons with armchair edges (ACPNRs) containing a few thousand to ten thousand atoms. The use of DGDFT allows us to systematically achieve a conventional plane wave basis set type of accuracy, but with a much smaller number (about 15) of adaptive local basis (ALB) functions per atom for this system. The relatively small number of degrees of freedom required to represent the Kohn–Sham Hamiltonian, together with the use of the pole expansion and selected inversion (PEXSI) technique that circumvents the need to diagonalize the Hamiltonian, results in a highly efficient and scalable computational scheme for analyzing the electronic structures of ACPNRs as well as their dynamics. The total wall clock time for calculating the electronic structures of large-scale ACPNRs containing 1080–10 800 atoms is only 10–25 s per self-consistent field (SCF) iteration, with accuracy fully comparable to that obtained from conventional plane wave DFT calculations. For the ACPNR system, we observe that the DGDFT methodology can scale to 5000–50 000 processors. We use DGDFT based ab initio molecular dynamics (AIMD) calculations to study the thermodynamic stability of ACPNRs. Our calculations reveal that a 2 × 1 edge reconstruction appears in ACPNRs at room temperature.

### D. Zuev, E. Vecharynski, C. Yang, N. Orms, and A.I. Krylov, "New algorithms for iterative matrix-free eigensolvers in quantum chemistry", Journal of Computational Chemistry, Vol. 36, Issue 5, pp. 273–284, 2015

New algorithms for iterative diagonalization procedures that solve for a small set of eigen-states of a large matrix are described. The performance of the algorithms is illustrated by calculations of low and high-lying ionized and electronically excited states using equation-of-motion coupled-cluster methods with single and double substitutions (EOM-IP-CCSD and EOM-EE-CCSD). We present two algorithms suitable for calculating excited states that are close to a specified energy shift (interior eigenvalues). One solver is based on the Davidson algorithm, a diagonalization procedure commonly used in quantum-chemical calculations. The second is a recently developed solver, called the “Generalized Preconditioned Locally Harmonic Residual (GPLHR) method.” We also present a modification of the Davidson procedure that allows one to solve for a specific transition. The details of the algorithms, their computational scaling, and memory requirements are described. The new algorithms are implemented within the EOM-CC suite of methods in the Q-Chem electronic structure program.

### Wei Hu, Lin Lin, Chao Yang and Jinlong Yang, "Electronic structure and aromaticity of large-scale hexagonal graphene nanoflakes", J. Chem. Phys., December 2, 2014, 141:214704, doi: 10.1063/1.4902806

With the help of the recently developed SIESTA-PEXSI method [L. Lin, A. García, G. Huhs, and C. Yang, J. Phys.: Condens. Matter 26, 305503 (2014)], we perform Kohn-Sham density functional theory calculations to study the stability and electronic structure of hydrogen passivated hexagonal graphene nanoflakes (GNFs) with up to 11 700 atoms. We find the electronic properties of GNFs, including their cohesive energy, edge formation energy, highest occupied molecular orbital-lowest unoccupied molecular orbital energy gap, edge states, and aromaticity, depend sensitively on the type of edges (armchair graphene nanoflakes (ACGNFs) and zigzag graphene nanoflakes (ZZGNFs)), size and the number of electrons. We observe that, due to the edge-induced strain effect in ACGNFs, large-scale ACGNFs’ edge formation energy decreases as their size increases. This trend does not hold for ZZGNFs due to the presence of many edge states in ZZGNFs. We find that the energy gaps E_g of GNFs all decay with respect to 1/L, where L is the size of the GNF, in a linear fashion. But as their size increases, ZZGNFs exhibit more localized edge states. We believe the presence of these states makes their gap decrease more rapidly. In particular, when L is larger than 6.40 nm, we find that ZZGNFs exhibit metallic characteristics. Furthermore, we find that the aromatic structures of GNFs appear to depend only on whether the system has 4N or 4N + 2 electrons, where N is an integer.

### J. Kaye, L. Lin and C. Yang, "A posteriori error estimator for adaptive local basis functions to solve Kohn-Sham density functional theory", Comm. Math. Sci., January 5, 2014, 13:1741--1740, doi: 10.4310/CMS.2015.v13.n7.a5

### H. M. Aktulga, L. Lin, C. Haine, E. G. Ng, C. Yang, "Parallel Eigenvalue Calculation based on Multiple Shift-invert Lanczos and Contour Integral based Spectral Projection Method", Parallel Computing, December 6, 2013, in press

### H. M. Aktulga, C. Yang, E. G. Ng, P. Maris, J. P. Vary, "Improving the Scalability of a Symmetric Iterative Eigensolver for Multi-core Platforms", Concurrency and Computation: Practice & Experience, September 12, 2013, online, doi: 10.1002/cpe.3129

### L. Lin, M. Chen, C. Yang, L. He, "Accelerating Atomic Orbital-based Electronic Structure Calculation via Pole Expansion and Selected Inversion", J. Phys.: Condens. Matter, 2013

### L. Lin, C. Yang, "Elliptic preconditioner for accelerating the self-consistent field iteration in Kohn-Sham Density Functional Theory", SIAM J. Sci. Comp., 2013

### H. Hu, C. Yang, K. Zhao, "Absorption correction A* for cylindrical and spherical crystals with extended range and high accuracy calculated by Thorkildsen & Larsen analytical method", Acta Crystallographica A, in press, 2012

### Zaiwen Wen, Chao Yang, Xin Liu, Stefano Marchesini, "Alternating direction methods for classical and ptychographic phase retrieval", Inverse Problems, November 2012, 28, doi: 10.1088/0266-5611/28/11/115010

### D. Y. Parkinson, C. Yang, C. Knoechel, C. A. Larabell, M. Le Gros, "Automatic alignment and reconstruction of images for soft X-ray tomography", J Struct Biol, February 2012, 177:259--266, doi: 10.1016/j.jsb.2011.11.027

### Filipe R.N.C. Maia, Chao Yang, Stefano Marchesini, "Compressive auto-indexing in femtosecond nanocrystallography", Ultramicroscopy, 2011, 111:807--811, LBNL 4598E

### L. Lin, C. Yang, J. Lu, L. Ying, W. E, "A fast parallel algorithm for selected inversion of structured sparse matrices with application to 2D electronic structure calculations", SIAM J. Sci. Comput., 2011, 33:1329

### L. Lin, C. Yang, J. Meza, J. Lu, L. Ying, W. E, "SelInv -- An algorithm for selected inversion of a sparse symmetric matrix", ACM Trans. Math. Software, 2011, 37:40

## Conference Papers

### Mathias Jacquelin, Lin Lin, Weile Jia, Yonghua Zhao, Chao Yang, "A Left-Looking Selected Inversion Algorithm and Task Parallelism on Shared Memory Systems", submitted to SuperComputing'16, May 10, 2016

### M. Jacquelin, L. Lin, N. Wichmann and C. Yang, "Enhancing the scalability tree-based asynchronous communication", accepted to IPDPS16, November 25, 2015

### W.A. de Jong, L. Lin, H. Shan, C. Yang and L. Oliker, "Towards modelling complex mesoscale molecular environments", International Conference on Computational and Mathematical Methods in Science and Engineering (CMMSE), 2014

### M. Jung, E. H. Wilson III, W. Choi, J. Shalf, H. M. Aktulga, C. Yang, E. Saule, U. V. Catalyurek, M. Kandemir, "Exploring the Future of Out-of-core Computing with Compute-Local Non-Volatile Memory", International Conference for High Performance Computing, Networking, Storage and Analysis 2013 (SC13), NY, USA, ACM New York, November 17, 2013, doi: 10.1145/2503210.2503261

### P. Maris, H. M. Aktulga, S. Binder, A. Calci, U. V. Catalyurek, J. Langhammer, E. G. Ng, E. Saule, R. Roth, J. P. Vary, C. Yang, "No Core CI calculations for light nuclei with chiral 2- and 3-body forces", J. Phys. Conf. Ser., IOP Publishing, August 1, 2013, 454:012063, doi: 10.1088/1742-6596/454/1/012063

### P. Maris, H. M. Aktulga, M. A. Caprio, U. V. Catalyurek, E. G. Ng, D. Oryspayev, H. Potter, E. Saule, M. Sosonkina, J. P. Vary, C. Yang, Z. Zhou, "Large-scale Ab-initio Configuration Interaction Calculations for Light Nuclei", J. Phys. Conf. Ser., IOP Publishing, December 18, 2012, 403:012019, doi: 10.1088/1742-6596/403/1/012019

### Z. Zhou, E. Saule, H. M. Aktulga, C. Yang, E. G. Ng, P. Maris, J. P. Vary, U. V. Catalyurek, "An Out-of-core Eigensolver on SSD-equipped Clusters", 2012 IEEE International Conference on Cluster Computing (CLUSTER), Beijing, China, September 26, 2012, 248 - 256, doi: 10.1109/CLUSTER.2012.76

### Z. Zhou, E. Saule, H. M. Aktulga, C. Yang, E. G. Ng, P. Maris, J. P. Vary, U. V. Catalyurek, "An Out-Of-Core Dataflow Middleware to Reduce the Cost of Large Scale Iterative Solvers", 2012 41st International Conference on Parallel Processing Workshops (ICPPW), Pittsburgh, PA, September 10, 2012, 71 - 80, doi: 10.1109/ICPPW.2012.13

### H. M. Aktulga, C. Yang, P. Maris, J. P. Vary, E. G. Ng, "Topology-Aware Mappings for Large-Scale Eigenvalue Problems", Euro-Par 2012 Parallel Processing Conference, Rhodes Island, Greece, August 31, 2012, LNCS 748:830-842, doi: 10.1007/978-3-642-32820-6_82

### H. M. Aktulga, C. Yang, U. V. Catalyurek, P. Maris, J. P. Vary, E. G. Ng, "On Reducing I/O Overheads in Large-Scale Invariant Subspace Projections", Euro-Par 2011: Parallel Processing Workshops, Bordeaux, France, August 29, 2011, LNCS 715:305-314, doi: 10.1007/978-3-642-29737-3_35

### E. G. Ng, J. Sarich, S. M. Wild, T. Munson, H. M. Aktulga, C. Yang, P. Maris, J. P. Vary, N. Schunck, M. G. Bertolli, M. Kortelainen, W. Nazarewicz, T. Papenbrock, M. V. Stoitsov, "Advancing Nuclear Physics Through TOPS Solvers and Tools", SciDAC 2011 Conference, Denver, CO, July 10, 2011, arXiv:1110.1708

### H. M. Aktulga, C. Yang, P. Maris, J. P. Vary, E. G. Ng, "Large-scale Parallel Null Space Calculation for Nuclear Configuration Interaction", 2011 International Conference on High Performance Computing and Simulation (HPCS), Istanbul, Turkey, July 8, 2011, 176 - 185, doi: 10.1109/HPCSim.2011.5999822

## Book Chapters

### E. Saule, H. M. Aktulga, C. Yang, E. G. Ng, U. V. Catalyurek, "An Out-of-core Task-based Middleware for Data Intensive Scientific Computing", Handbook on Data Centers, in press, (Springer: February 1, 2014)

## Presentation/Talks

### C. Yang, Absorption Spectrum Estimation via Linear Response TDDFT, Applied Math Seminar, Stanford University, May 13, 2015

### C. Yang, Fast Numerical Algorithms for Large-scale Electronic Structure Calculations, DOE BES Computational and Theoretical Chemistry PI Meeting, April 28, 2015

### C. Yang, Fast Numerical Methods for Electronic Structure Calculations, Math Colloquium, Michigan Tech University, April 24, 2015

### C. Yang, Fast Numerical Methods for Electronic Structure Calculations, Applied Math & PDE Seminar, UC Davis, April 14, 2015

### C. Yang, Fast Numerical Methods for Computational Materials Science and Chemistry, CRD All-hands meeting, March 4, 2015

### C. Yang, Fast Numerical Methods for Electronic Structure Calculations, Workshop on High Performance and Parallel Computing Methods and Algorithms for Materials Defects, Singapore, February 9, 2015

## Reports

### E. Vecharynski and C. Yang, "Preconditioned iterative methods for eigenvalue counts", submitted, 2016

We describe preconditioned iterative methods for estimating the number of eigenvalues of a Hermitian matrix within a given interval. Such estimation is useful in a number of applications. In particular, it can be used to develop an efficient spectrum-slicing strategy to compute many eigenpairs of a Hermitian matrix. Our method is based on Lanczos- and Arnoldi-type iterations. We show that with a properly defined preconditioner, only a few iterations may be needed to obtain a good estimate of the number of eigenvalues within a prescribed interval. We also demonstrate that the number of iterations required by the proposed preconditioned schemes is independent of the size and condition number of the matrix. The efficiency of the methods is illustrated on several problems arising from density functional theory based electronic structure calculations.

### Meiyue Shao and Chao Yang, "Properties of Definite Bethe--Salpeter Eigenvalue Problems", submitted to Lecture Notes in Computational Science, February 1, 2016,

### Meiyue Shao, Lin Lin, Chao Yang, Fang Liu, Felipe H. da Jornada, Jack Deslippe and Steven G. Louie, "Low rank approximation in G0W0 calculations", accepted Science China Mathematics, ICIAM special issue, January 1, 2016, doi: 10.1007/s11425-016-0296-x

### E. Vecharynski, J. Brabec, M. Shao, N. Govind, C. Yang, "Efficient Block Preconditioned Eigensolvers for Linear Response Time-dependent Density Functional Theory", submitted to JCC, 2015

We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from time-dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into a product eigenvalue problem that is self-adjoint with respect to a K-inner product. This product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. However, the other component of the eigenvector can be easily recovered in a postprocessing procedure. Therefore, the algorithms we present here are more efficient than existing algorithms that try to approximate both components of the eigenvectors simultaneously. The efficiency of the new algorithms is demonstrated by numerical examples.
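
The structure described in this abstract can be illustrated on a small dense example. The sketch below is not the paper's Davidson or LOBPCG solver. It assumes the common block form H = [[A, B], [-B, -A]] with symmetric A and B, assumes that K = A - B and M = A + B are both positive definite, and uses a dense Cholesky-based solve in place of an iterative method. The function names `linear_response_eigs` and `make_test_problem` are hypothetical.

```python
import numpy as np


def linear_response_eigs(A, B):
    """Positive eigenpairs of H = [[A, B], [-B, -A]] via the product form M K.

    Assumes A and B are symmetric and that K = A - B and M = A + B are
    positive definite (an assumption of this sketch, not taken from the text).
    """
    K = A - B
    M = A + B
    L = np.linalg.cholesky(K)           # K = L L^T
    # M K is similar to the symmetric matrix L^T M L, so a symmetric
    # eigensolver applies; its eigenvalues are omega^2.
    omega_sq, W = np.linalg.eigh(L.T @ M @ L)
    omega = np.sqrt(omega_sq)
    U = np.linalg.solve(L.T, W)         # u = x - y: eigenvectors of M K
    V = (L @ W) / omega                 # v = x + y = K u / omega (postprocessing)
    X = 0.5 * (V + U)                   # first component of the eigenvector of H
    Y = 0.5 * (V - U)                   # second component of the eigenvector of H
    return omega, X, Y


def make_test_problem(n=6, seed=0):
    """Random symmetric A, B with A strongly dominant so that A +/- B are SPD."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    C = rng.standard_normal((n, n))
    A = G @ G.T + n * np.eye(n)
    B = 0.05 * (C + C.T)
    return A, B
```

Calling `linear_response_eigs(*make_test_problem())` returns the positive eigenvalues together with both eigenvector components. Only an n-by-n symmetric problem is solved, rather than the 2n-by-2n nonsymmetric one, which is the source of the memory and operation savings the abstract mentions.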