Dilip Vasudevan is a computer science research fellow in CAG. His current research interests include design space exploration of post-Moore architectures, reconfigurable spintronic devices, superconducting architectures, hardware/software co-design, and the development of new computing models and systems for future exascale platforms. His recent work was in ultra-low-power sub-threshold system design for the Internet of Things (IoT). He obtained his Ph.D. in Informatics (Computer Engineering) from the University of Edinburgh, Scotland, UK.
- Continuing the Scaling of Digital Computing Post Moore’s Law
- IARPA SuperTools
- PARADISE: Post-Moore Architecture and Accelerator Design Space Exploration Using Device Level Simulation and Experiments
- PINE: An Energy Efficient Flexibly Interconnected Photonic Data Center Architecture for Extreme Scalability
- Project 38: A set of vendor-agnostic architectural explorations involving NSA, the DOE Office of Science, and NNSA
A. Roy, A. Klinefelter, F. B. Yahya, X. Chen, L. P. Gonzalez-Guerrero, C. J. Lukas, D. A. Kamakshi, J. Boley, K. Craig, M. Faisal, S. Oh, N. E. Roberts, Y. Shakhsheer, A. Shrivastava, D. P. Vasudevan, D. D. Wentzloff, B. H. Calhoun, "A 6.45 μW Self-Powered SoC With Integrated Energy-Harvesting Power Management and ULP Asymmetric Radios for Portable Biomedical Systems", IEEE Transactions on Biomedical Circuits and Systems, December 28, 2015, 9:862 - 874, doi: 10.1109/TBCAS.2015.2498643
This paper presents a batteryless system-on-chip (SoC) that operates off energy harvested from indoor solar cells and/or thermoelectric generators (TEGs) on the body. Fabricated in a commercial 0.13 μm process, this SoC sensing platform consists of an integrated energy harvesting and power management unit (EH-PMU) with maximum power point tracking, multiple sensing modalities, a programmable core and a low-power microcontroller with several hardware accelerators to enable energy-efficient digital signal processing, ultra-low-power (ULP) asymmetric radios for wireless transmission, and a 100 nW wake-up radio. The EH-PMU achieves a peak end-to-end efficiency of 75% delivering power to a 100 μA load. In an example motion detection application, the SoC reads data from an accelerometer through SPI, processes it, and sends it over the radio. The SPI and digital processing consume only 2.27 μW, while the integrated radio consumes 4.18 μW when transmitting at 187.5 kbps, for a total of 6.45 μW.
Andrew A. Chien, Tung Thanh-Hoang, Dilip Vasudevan, Yuanwei Fang, Amirali Shambayati, "10x10: A Case Study in Highly-Programmable and Energy-Efficient Heterogeneous Federated Architecture", SIGARCH Comput. Archit. News, December 2015, 43:2 - 9, doi: 10.1145/2856113.2856115
Customized architecture is widely recognized as an important approach for improved performance and energy efficiency. To balance generality and customization benefit, researchers have proposed to federate heterogeneous micro-engines. Using the 10x10 architecture and an integrated image and vision benchmark as a case study, we explore the performance and energy benefits achievable. Results for current 32nm technology and DDR3 memory show 10x10 architecture benefits of 140x (performance) and 72x (energy) overall. Adding 3D-stacked DRAM increases the benefits to 171x (performance) and 100x (energy). Finally, considering a future 7nm transistor process, benefits as large as 597x (performance) and 137x (energy) are observed.
D.P. Vasudevan, P.K. Lala, Jia Di, J.P. Parkerson, "Reversible-logic design with online testability", IEEE Transactions on Instrumentation and Measurement, March 20, 2006, 55:406 - 414, doi: 10.1109/TIM.2006.870319
Conventional digital circuits dissipate a significant amount of energy because bits of information are erased during logic operations. Thus, if logic gates are designed such that the information bits are not destroyed, the power consumption can be reduced dramatically. Information bits are not lost in a reversible computation, which has led to the development of reversible gates. This paper proposes three new reversible logic gates; two of the proposed gates can be employed to design online testable reversible logic circuits. Furthermore, they can be used to implement any Boolean logic function. The application of the reversible gates in implementing several benchmark functions is presented.
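As a rough illustration of what "reversible" means in the abstract above, the sketch below uses the well-known Toffoli (CCNOT) gate. This is an assumption made for illustration only: it is not one of the three gates proposed in the paper, and the function and variable names are hypothetical. It checks that the gate maps the eight 3-bit input patterns one-to-one onto the eight output patterns, so no information is erased, and that it can still compute AND when its third input is held at 0.

```python
from itertools import product


def toffoli(a, b, c):
    """Toffoli (CCNOT) gate: passes a and b through, flips c only when a = b = 1."""
    return a, b, c ^ (a & b)


# All eight 3-bit input patterns, in ascending order.
inputs = list(product((0, 1), repeat=3))
outputs = [toffoli(*bits) for bits in inputs]

# Reversible: the outputs are a permutation of the inputs (a bijection).
is_bijection = sorted(outputs) == inputs

# The Toffoli gate is its own inverse: applying it twice restores the input.
is_self_inverse = all(toffoli(*toffoli(*bits)) == bits for bits in inputs)

# With the third input fixed to 0, the third output equals a AND b.
computes_and = all(
    toffoli(a, b, 0)[2] == (a & b) for a, b in product((0, 1), repeat=2)
)

print(is_bijection, is_self_inverse, computes_and)
```

An ordinary two-input AND gate, by contrast, maps four input patterns onto two output values, so its inputs cannot be recovered from its output.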
Dilip Vasudevan, George Michelogiannakis, David Donofrio, John Shalf, "PARADISE - Post-Moore Architecture and Accelerator Design Space Exploration Using Device Level Simulation and Experiments", IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), March 2019
Dilip Vasudevan, George Michelogiannakis, John Shalf, "CASPER - Configurable Design Space Exploration of Programmable Architectures for Machine Learning using Beyond Moore Devices", IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), July 2017
Dilip Vasudevan, Anastasiia Butko, George Michelogiannakis, David Donofrio, John Shalf, "Towards an Integrated Strategy to Preserve Digital Computing Performance Scaling Using Emerging Technologies", Workshop on HPC computing in a Post Moore’s law world (HCPM), June 22, 2017
With the decline and eventual end of historical rates of lithographic scaling, we arrive at a crossroads where synergistic and holistic decisions are required to preserve Moore's law technology scaling. Numerous emerging technologies aim to extend digital electronics scaling of performance, energy efficiency, and computational power/density, ranging from devices (transistors) and memories to 3D integration capabilities, specialized architectures, photonics, and others. The wide range of technology options creates the need for an integrated strategy to understand the impact of these emerging technologies on future large-scale digital systems for diverse application requirements and optimization metrics. In this paper, we argue for a comprehensive methodology that spans the different levels of abstraction, from materials to devices to complex digital systems and applications. Our approach integrates compact models of low-level characteristics of the emerging technologies to inform higher-level simulation models to evaluate their responsiveness to application requirements. The integrated framework can then automate the search for an optimal architecture using available emerging technologies to maximize a targeted optimization metric.
Jiaoyan Chen, Emanuel Popovici, Dilip Vasudevan, Michel Schellekens, "Ultra Low Power Booth Multiplier Using Asynchronous Logic", 2012 18th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), Lyngby, Denmark, IEEE, July 19, 2012, 81 - 88, doi: 10.1109/ASYNC.2012.15
Asynchronous logic shows promising applicability in ASIC design due to its potentially low power and high robustness properties. For deep submicron technologies, static power is becoming very significant, and many applications require that this power component be reduced. A new logic called Positive Feedback Charge Sharing Logic (PFCSL) is proposed, which reduces both dynamic and especially static power and can also be implemented with asynchronous logic. This new logic combines adiabatic logic with charge sharing technology, avoiding the penalty of a power clock generator. A novel 16-by-16-bit Radix-4 Booth Multiplier is built based on PFCSL and implemented in 45nm technology. We achieve around a 30% reduction in dynamic power and a 60% reduction in static power compared to the same design implemented using static dual-rail logic. Also, the area of the multiplier is significantly smaller.
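For readers unfamiliar with the arithmetic behind a Radix-4 Booth multiplier, the following is a minimal software sketch of standard radix-4 Booth recoding for a 16-bit signed multiplier. It models only the arithmetic, not the PFCSL circuit or the asynchronous implementation described in the paper, and the function names are hypothetical. Each pair of multiplier bits, together with the bit below it, is recoded into a digit in {-2, -1, 0, 1, 2}, which halves the number of partial products from 16 to 8.

```python
def booth_radix4_digits(multiplier, bits=16):
    """Recode a signed `bits`-wide multiplier into bits/2 digits in {-2..2}.

    Digits are returned least significant first; digit i has weight 4**i.
    """
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    y = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit y[-1] = 0 below the least significant bit
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Standard radix-4 Booth digit: -2*y[i+1] + y[i] + y[i-1]
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(x, y, bits=16):
    """Multiply x by a signed `bits`-wide y by summing the recoded partial products."""
    return sum(d * x * 4**i for i, d in enumerate(booth_radix4_digits(y, bits)))


print(booth_radix4_digits(457))
print(booth_multiply(-123, 457) == -123 * 457)
```

In hardware, each digit selects 0, ±x, or ±2x (a shift and an optional negation), so no general multiplication is needed to form a partial product.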