Dilip Vasudevan

Research Scientist

Applied Mathematics & Computational Research Division

dilipv@lbl.gov

Lawrence Berkeley National Lab

One Cyclotron Road

Berkeley, California 94720

Biographical Sketch

Dilip Vasudevan is a research scientist in CAG. His current research interests include design space exploration of post-Moore architectures, reconfigurable spintronic devices, superconducting architectures, hardware/software co-design, and the development of new computing models and systems for future exascale systems. His recent work was in ultra-low-power sub-threshold system design for the Internet of Things (IoT). He obtained his Ph.D. in Informatics (Computer Engineering) from the University of Edinburgh, Scotland, UK.

Current Projects

Continuing the Scaling of Digital Computing Post Moore’s Law
IARPA SuperTools
PARADISE: PostMoore Architecture and Accelerator Design Space Exploration Using Device Level Simulation and Experiments
PARADISE++: Large Scale Optimistic Synchronization based simulation of Post Moore Systems (ARO Project)
PINE: An Energy Efficient Flexibly Interconnected Photonic Data Center Architecture for Extreme Scalability
Project 38: A set of vendor-agnostic architectural explorations involving NSA, the DOE Office of Science, and NNSA
Skyrmion-Based Beyond Moore Computing
Superconducting Race Logic Accelerators: Making computation as efficient as possible in superconducting circuits, which operate at temperatures around 4 K and have close to zero resistance

Journal Articles

Ran Cheng, Christoph Kirst, Dilip Vasudevan, "Superconducting-Oscillatory Neural Network With Pixel Error Detection for Image Recognition", IEEE Transactions on Applied Superconductivity, August 2023, 33:1-7

Dilip Vasudevan, George Michelogiannakis, "Efficient Temporal Arithmetic Logic Design for Superconducting RSFQ Logic", IEEE Transactions on Applied Superconductivity, March 2023

Georgios Tzimpragos, Jennifer Volk, Dilip Vasudevan, Nestan Tsiskaridze, George Michelogiannakis, Advait Madhavan, John Shalf, Timothy Sherwood, "Temporal Computing With Superconductors", IEEE Micro, March 2021, 41:71-79, doi: 10.1109/MM.2021.3066377

W Cui, G Tzimpragos, Y Tao, J Mcmahan, D Dangwal, N Tsiskaridze, G Michelogiannakis, DP Vasudevan, T Sherwood, "Language Support for Navigating Architecture Design in Closed Form", ACM Journal on Emerging Technologies in Computing Systems, January 2019, 16:1--28, doi: 10.1145/3360047

A. Roy, A. Klinefelter, F. B. Yahya, X. Chen, L. P. Gonzalez-Guerrero, C. J. Lukas, D. A. Kamakshi, J. Boley, K. Craig, M. Faisal, S. Oh, N. E. Roberts, Y. Shakhsheer, A. Shrivastava, D. P. Vasudevan, D. D. Wentzloff, B. H. Calhoun, "A 6.45 μW Self-Powered SoC With Integrated Energy-Harvesting Power Management and ULP Asymmetric Radios for Portable Biomedical Systems", IEEE Transactions on Biomedical Circuits and Systems, December 28, 2015, 9:862 - 874, doi: 10.1109/TBCAS.2015.2498643

This paper presents a batteryless system-on-chip (SoC) that operates off energy harvested from indoor solar cells and/or thermoelectric generators (TEGs) on the body. Fabricated in a commercial 0.13 μm process, this SoC sensing platform consists of an integrated energy harvesting and power management unit (EH-PMU) with maximum power point tracking, multiple sensing modalities, programmable core and a low power microcontroller with several hardware accelerators to enable energy-efficient digital signal processing, ultra-low-power (ULP) asymmetric radios for wireless transmission, and a 100 nW wake-up radio. The EH-PMU achieves a peak end-to-end efficiency of 75% delivering power to a 100 μA load. In an example motion detection application, the SoC reads data from an accelerometer through SPI, processes it, and sends it over the radio. The SPI and digital processing consume only 2.27 μW, while the integrated radio consumes 4.18 μW when transmitting at 187.5 kbps for a total of 6.45 μW.

Andrew A. Chien, Tung Thanh-Hoang, Dilip Vasudevan, Yuanwei Fang, Amirali Shambayati, "10x10: A Case Study in Highly-Programmable and Energy-Efficient Heterogeneous Federated Architecture", SIGARCH Comput. Archit. News, December 2015, 43:2 - 9, doi: 10.1145/2856113.2856115

Customized architecture is widely recognized as an important approach for improved performance and energy efficiency. To balance generality and customization benefit, researchers have proposed to federate heterogeneous micro-engines. Using the 10x10 architecture and an integrated image and vision benchmark as a case study, we explore the performance and energy benefits achievable. Results for current 32nm technology and DDR3 memory show 10x10 architecture benefits of 140x (performance) and 72x (energy) overall. Adding 3D-stacked DRAM increases benefits to 171x (performance) and 100x (energy). Finally, considering a future 7nm transistor process, benefits as large as 597x (performance) and 137x (energy) are observed.

J Chen, D Vasudevan, M Schellekens, E Popovici, "Ultra Low Power Asynchronous Charge Sharing Logic", Journal of Low Power Electronics, 2012, 8:526--534, doi: 10.1166/jolpe.2012.1213

D. P. Vasudevan, P. K. Lala, J. P. Parkerson, "Self-Checking Carry-Select Adder Design Based on Two-Rail Encoding", IEEE Transactions on Circuits and Systems I: Regular Papers, 2007, 54:2696-2705, doi: 10.1109/TCSI.2007.910537

DP Vasudevan, PK Lala, J Di, JP Parkerson, "Reversible-Logic Design With Online Testability", IEEE Transactions on Instrumentation and Measurement, January 1, 2006, 55:406--414, doi: 10.1109/tim.2006.870319

Conventional digital circuits dissipate a significant amount of energy because bits of information are erased during the logic operations. Thus, if logic gates are designed such that the information bits are not destroyed, the power consumption can be reduced dramatically. The information bits are not lost in case of a reversible computation. This has led to the development of reversible gates. This paper proposes three new reversible logic gates; two of the proposed gates can be employed to design online testable reversible logic circuits. Furthermore, they can be used to implement any Boolean logic function. The application of the reversible gates in implementing several benchmark functions has been presented.

Conference Papers

Maximilian Bremer, Nirmalendu Patra, Tan Nguyen, Dilip Vasudevan, Cy Chan, "Benefits of Optimistic Parallel Discrete Event Simulation for Network-on-Chip Simulation", 2023 IEEE/ACM 27th International Symposium on Distributed Simulation and Real Time Applications (DS-RT), Singapore, October 2, 2023, doi: 10.1109/DS-RT58998.2023.00013

Georgios Tzimpragos, Dilip Vasudevan, Nestan Tsiskaridze, George Michelogiannakis, Advait Madhavan, Jennifer Volk, John Shalf, Timothy Sherwood, "A Computational Temporal Logic for Superconducting Accelerators", ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, March 2020

S Werner, P Fotouhi, X Xiao, M Fariborz, SJB Yoo, G Michelogiannakis, D Vasudevan, "3D photonics as enabling technology for deep 3D DRAM stacking", Proceedings of the International Symposium on Memory Systems - MEMSYS 19, ACM Press, September 2019, doi: 10.1145/3357526.3357559

G Tzimpragos, A Madhavan, D Vasudevan, D Strukov, T Sherwood, "Boosted Race Trees for Low Energy Classification" (Best Paper Award), ASPLOS 2019, April 2019, doi: 10.1145/3297858.3304036

D Vasudevan, G Michelogiannakis, D Donofrio, J Shalf, "PARADISE - Post-Moore Architecture and Accelerator Design Space Exploration Using Device Level Simulation and Experiments", 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), IEEE, January 2019, doi: 10.1109/ispass.2019.00022

Dilip Vasudevan, George Michelogiannakis, John Shalf, "CASPER - Configurable Design Space Exploration of Programmable Architectures for Machine Learning using Beyond Moore Devices", IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), July 2017

D Vasudevan, A Butko, G Michelogiannakis, D Donofrio, J Shalf, "Towards an Integrated Strategy to Preserve Digital Computing Performance Scaling Using Emerging Technologies", Springer International Publishing, January 1, 2017, 115--123, doi: 10.1007/978-3-319-67630-2_10

With the decline and eventual end of historical rates of lithographic scaling, we arrive at a crossroad where synergistic and holistic decisions are required to preserve Moore's law technology scaling. Numerous emerging technologies aim to extend digital electronics scaling of performance, energy efficiency, and computational power/density, ranging from devices (transistors), memories, 3D integration capabilities, specialized architectures, photonics, and others. The wide range of technology options creates the need for an integrated strategy to understand the impact of these emerging technologies on future large-scale digital systems for diverse application requirements and optimization metrics. In this paper, we argue for a comprehensive methodology that spans the different levels of abstraction, from materials, to devices, to complex digital systems and applications. Our approach integrates compact models of low-level characteristics of the emerging technologies to inform higher-level simulation models to evaluate their responsiveness to application requirements. The integrated framework can then automate the search for an optimal architecture using available emerging technologies to maximize a targeted optimization metric.

Jiaoyan Chen, Emanuel Popovici, Dilip Vasudevan, Michel Schellekens, "Ultra Low Power Booth Multiplier Using Asynchronous Logic", 2012 18th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), Lyngby, Denmark, IEEE, July 19, 2012, 81 - 88, doi: 10.1109/ASYNC.2012.15

Asynchronous logic shows promising applicability in ASIC design due to its potentially low power and high robustness properties. For deep submicron technologies, static power is becoming very significant, and many applications require that this power component be reduced. A new logic called Positive Feedback Charge Sharing Logic (PFCSL) is proposed, which reduces both dynamic and especially static power and can also be implemented with asynchronous logic. This new logic combines adiabatic logic with charge sharing technology, avoiding the penalty of a power clock generator. A novel 16-by-16-bit Radix-4 Booth Multiplier is built based on PFCSL and implemented in 45nm technology. We achieve reductions of around 30% in dynamic power and 60% in static power compared to the same design implemented using static dual-rail logic. Also, the area of the multiplier is significantly smaller.

X. Wang, D. Vasudevan, H. S. Lee, "Global Built-In Self-Repair for 3D memories with redundancy sharing and parallel testing", 2011 IEEE International 3D Systems Integration Conference (3DIC), 2012, 1-8, doi: 10.1109/3DIC.2012.6262967

T Ye, D Vasudevan, J Chen, E Popovici, M Schellekens, "Static Average Case Power Estimation Technique for Block Ciphers", 2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools, IEEE, 2010, doi: 10.1109/dsd.2010.105

J Chen, DP Vasudevan, E Popovici, M Schellekens, "Reversible online BIST using bidirectional BILBO", Proceedings of the 7th ACM international conference on Computing frontiers - CF 10, ACM Press, 2010, doi: 10.1145/1787275.1787333

Jia Di, P. K. Lala, D. Vasudevan, "Synthesis of nanoelectronic circuits on delay-insensitive cellular arrays", Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA 06), 2006, 5 pp.-156, doi: 10.1109/DELTA.2006.84