Computer Architecture Group

Tan Nguyen

Tan Thanh Nhat Nguyen
Research Scientist
Phone: 510-495-2517

Biographical Sketch

Tan Nguyen is a research scientist at Lawrence Berkeley National Laboratory. Nguyen's research focuses on performance analysis and code optimizations for various processor architectures, including multi- and many-core CPUs, GPUs, FPGAs, and specialized (e.g., AI) accelerators. He is also interested in compiler analysis and code generation, programming models, and runtime systems for large-scale applications.

Nguyen received his Ph.D. in Computer Science from the University of California, San Diego in 2014. Nguyen is an alumnus of the Vietnam Education Foundation.

Current Projects

  • PARADISE++: Large Scale Optimistic Synchronization based simulation of Post Moore Systems (ARO Project)
  • ExaEPI/RADIUM: develop agent-based models for epidemiological simulation 
  • ICAR: downscaling method for meteorological simulations
  • Veracity: develop compiler techniques that facilitate hardware coDesign
  • Collaborative research activities: scalable AI algorithms, Graph Neural Networks, disaggregated systems 

Past Projects

  • NERSC10: Explore alternative processor architectures for the next-generation supercomputer at NERSC
  • ECP-Hardware Evaluation: Use Performance Analysis Tools to predict application performance on future DOE systems
  • AMReX: Optimize Massively Parallel, Block-structured Adaptive Mesh Refinement (AMR) Applications
  • ExaSAT: An Exascale Static Analysis Tool for Hardware/Software Design Space Evaluation

Journal Articles

Tan Nguyen, Colin MacLean, Marco Siracusa, Douglas Doerfler, Nicholas J. Wright, Samuel Williams, "FPGA‐based HPC accelerators: An evaluation on performance and energy efficiency", CCPE, August 22, 2021, doi: 10.1002/cpe.6570

Weiqun Zhang, Ann Almgren, Vince Beckner, John Bell, Johannes Blaschke, Cy Chan, Marcus Day, Brian Friesen, Kevin Gott, Daniel Graves, Max P. Katz, Andrew Myers, Tan Nguyen, Andrew Nonaka, Michele Rosso, Samuel Williams, Michael Zingale, "AMReX: a framework for block-structured adaptive mesh refinement", Journal of Open Source Software, May 2019, doi: 10.21105/joss.01370

Tan Nguyen, Pietro Cicotti, Eric Bylaska, Dan Quinlan, and Scott Baden, "Automatic Translation of MPI Source into a Latency-tolerant, Data-driven Form", Journal of Parallel and Distributed Computing, February 21, 2017

Weiqun Zhang, Ann Almgren, Marcus Day, Tan Nguyen, John Shalf, Didem Unat, "BoxLib with Tiling: An AMR Software Framework", SIAM Journal on Scientific Computing, 2016

Tan Nguyen, Daniel Hefenbrock, Jason Oberg, Ryan Kastner, and Scott Baden, "A software-based dynamic-warp scheduling approach for load-balancing the Viola-Jones face detection algorithm on GPUs", Journal of Parallel and Distributed Computing, January 31, 2013

Conference Papers

Maximilian Bremer, Nirmalendu Patra, Tan Nguyen, Dilip Vasudevan, Cy Chan, "Benefits of Optimistic Parallel Discrete Event Simulation for Network-on-Chip Simulation", 2023 IEEE/ACM 27th International Symposium on Distributed Simulation and Real Time Applications (DS-RT), Singapore, October 2, 2023, doi: 10.1109/DS-RT58998.2023.00013

Khaled Z. Ibrahim, Tan Nguyen, Hai Ah Nam, Wahid Bhimji, Steven Farrell, Leonid Oliker, Michael Rowan, Nicholas J. Wright, Samuel Williams, "Architectural Requirements for Deep Learning Workloads in HPC Environments", (BEST PAPER), Performance Modeling, Benchmarking, and Simulation (PMBS), November 2021

Tan Nguyen, Erich Strohmaier, John Shalf, "Facilitating CoDesign with Automatic Code Similarity Learning", 7th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), November 14, 2021

Douglas Doerfler, Farzad Fatollahi-Fard, Colin MacLean, Tan Nguyen, Samuel Williams, Nicholas J. Wright, Marco Siracusa, "Experiences Porting the SU3_Bench Microbenchmark to the Intel Arria 10 and Xilinx Alveo U280 FPGAs", International Workshop on OpenCL (iWOCL), April 2021, doi: 10.1145/3456669.3456671

Tan Nguyen, Samuel Williams, Marco Siracusa, Colin MacLean, Douglas Doerfler, Nicholas J. Wright, "The Performance and Energy Efficiency Potential of FPGAs in Scientific Computing", (BEST PAPER), Performance Modeling, Benchmarking, and Simulation of High Performance Computer Systems (PMBS), November 2020

T Nguyen, D Unat, W Zhang, A Almgren, N Farooqi, J Shalf, "Perilla: Metadata-Based Optimizations of an Asynchronous Runtime for Adaptive Mesh Refinement", International Conference for High Performance Computing, Networking, Storage and Analysis, SC, January 1, 2017, 945--956, doi: 10.1109/SC.2016.80

MN Farooqi, D Unat, T Nguyen, W Zhang, A Almgren, J Shalf, "Nonintrusive AMR asynchrony for communication optimization", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), January 1, 2017, 10417:682--694, doi: 10.1007/978-3-319-64203-1_49

D Unat, T Nguyen, W Zhang, MN Farooqi, B Bastem, G Michelogiannakis, A Almgren, J Shalf, "TiDA: High-level programming abstractions for data locality management", Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), January 2016, 9697:116--135, doi: 10.1007/978-3-319-41321-1_7

Tan Nguyen and Scott Baden, "LU Factorization: Towards Hiding Communication Overheads With A Lookahead-free Algorithm", IEEE Cluster 2015, Chicago, IL, September 8, 2015

Tan Nguyen and Scott Baden, "Bamboo - Preliminary scaling results on multiple hybrid nodes of Knights Corner and Sandy Bridge processors", WOLFHPC: Workshop on Domain-Specific Languages and High-Level Frameworks for HPC, November 19, 2013

T. Nguyen, P. Cicotti, E. Bylaska, D. Quinlan, and S. B. Baden, "Bamboo: Translating MPI applications to a latency-tolerant, data-driven form", Proceedings of the 2012 ACM/IEEE conference on Supercomputing (SC12), November 14, 2012

Daniel Hefenbrock, Jason Oberg, Nhat Tan Nguyen Thanh, Ryan Kastner, and Scott B. Baden, "Accelerating Viola-Jones Face Detection to FPGA-Level using GPUs", Proc. 18th Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '10), May 3, 2010


Presentations

Tan Nguyen, Data and Loop Abstractions for Portable Locality Management, 2016 High Performance Data Analysis and Visualization (HPDAV) Workshop, May 23, 2016

Tan Nguyen, Tolerating Communication Overheads with Tiling Abstractions, 2015 Intel Xeon Phi User's Group (IXPUG), October 1, 2015

Tan Nguyen and Scott Baden, Automating the communication-computation overlap with Bamboo, 2013 SIAM Conference on Computational Science and Engineering, February 25, 2013