Speaker: Dr. Yanzhi Wang, Assistant Professor, Syracuse University
Title: Towards the limits of energy efficiency and performance of hardware deep learning systems
Deep learning systems have achieved unprecedented progress in a number of fields such as computer vision, robotics, game playing, unmanned driving and aerial systems, and other AI-related fields. However, rapidly growing model sizes place significant demands on both computation and weight storage, for both inference and training, and on both high-performance computing systems and low-power embedded and IoT applications. To overcome these limitations, we propose a holistic framework that incorporates block-circulant matrices into deep learning systems, achieving: (i) a simultaneous reduction in weight storage and computational complexity, (ii) simultaneous speedup of training and inference, and (iii) generality, allowing adoption across different neural network types, sizes, and scales.
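The core idea can be sketched in software: a circulant block is fully defined by a single length-b vector, and its matrix-vector product is a circular convolution computable via FFT in O(b log b) time instead of O(b^2). The NumPy sketch below (function names are illustrative, not from the authors' released code) shows one circulant block and a block-circulant weight matrix built from such blocks.

```python
import numpy as np

def circulant_matvec_fft(c, x):
    """y = C @ x for a circulant C with first column c, via the
    circular convolution theorem: O(b log b) instead of O(b^2)."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

def circulant_from_first_col(c):
    """Materialize the dense b x b circulant matrix (reference only)."""
    b = len(c)
    return np.array([[c[(i - j) % b] for j in range(b)] for i in range(b)])

def block_circulant_matvec(blocks, x, b):
    """Matvec for a weight matrix partitioned into p x q circulant blocks.

    blocks[i][j] is the first column of block (i, j): only b values are
    stored per block instead of b*b, cutting weight storage by a factor of b."""
    p, q = len(blocks), len(blocks[0])
    xs = x.reshape(q, b)
    y = np.zeros(p * b)
    for i in range(p):
        for j in range(q):
            y[i * b:(i + 1) * b] += circulant_matvec_fft(blocks[i][j], xs[j])
    return y
```

For a layer with a 1024 x 1024 weight matrix and block size b = 64, this representation stores 1024*1024/64 weights, a 64x reduction, while the FFT keeps each block multiply sub-quadratic.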
Beyond algorithm-level results, our framework has: (i) a solid theoretical foundation proving that our approach converges to the same "effectiveness" as deep learning without compression; and (ii) high-performance, high-efficiency reconfigurable hardware implementations. The hardware implementations are based on the key principle of FFT-IFFT decoupling and on reconfigurable basic computing modules that support different DNN layers, models, and types, and different computing platforms (smartphones, FPGAs, ASICs). Our FPGA-based implementations for deep learning systems and LSTM-based recurrent neural networks achieve an 11X+ energy efficiency improvement over the best state-of-the-art, and an even higher energy efficiency gain over the IBM TrueNorth neurosynaptic processor. Our framework achieves 3.5 TOPS of computation performance on FPGAs, and is the first to enable nanosecond-level recognition speed for image recognition tasks.
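In software terms, FFT-IFFT decoupling means separating the transforms from the multiply-accumulate stage: each input chunk is FFT'd once and reused across all output blocks, per-block products are accumulated in the frequency domain, and a single IFFT is applied per output block. The sketch below illustrates that dataflow under our own naming; it mirrors the principle, not the actual hardware module design.

```python
import numpy as np

def block_circulant_matvec_decoupled(blocks, x, b):
    """Block-circulant matvec with FFT/IFFT decoupled from the multiplies.

    blocks[i][j] holds the first column of circulant block (i, j).
    Each of the q input chunks is FFT'd once and shared by all p output
    blocks; one IFFT per output block replaces one per (i, j) pair."""
    p, q = len(blocks), len(blocks[0])
    xs = x.reshape(q, b)
    x_freq = [np.fft.fft(chunk) for chunk in xs]  # q FFTs, computed once
    w_freq = [[np.fft.fft(blocks[i][j]) for j in range(q)] for i in range(p)]
    y = np.zeros(p * b)
    for i in range(p):
        acc = np.zeros(b, dtype=complex)
        for j in range(q):
            acc += w_freq[i][j] * x_freq[j]  # elementwise multiply-accumulate
        y[i * b:(i + 1) * b] = np.real(np.fft.ifft(acc))  # one IFFT per block
    return y
```

Compared with transforming inside every block multiply, this cuts the transform count from 2pq + pq to q + pq + p, which is why decoupled FFT/IFFT units can be shared and reconfigured across layers in hardware.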
Finally, I will briefly introduce additional research topics, including stochastic computing-based deep learning systems and implementations using superconducting Josephson junctions, security aspects of deep learning systems, emerging deep learning algorithms, and applications of deep reinforcement learning.
Yanzhi Wang has been an assistant professor in the Department of Electrical Engineering and Computer Science at Syracuse University since August 2015. He received his Ph.D. degree in Computer Engineering from the University of Southern California (USC) in 2014, under the supervision of Prof. Massoud Pedram, and his B.S. degree with distinction in Electronic Engineering from Tsinghua University in 2009.
Dr. Wang's current research interests are energy-efficient and high-performance implementations of deep learning and artificial intelligence systems, and emerging deep learning algorithms and systems such as Bayesian neural networks, generative adversarial networks (GANs), and deep reinforcement learning. He also works on applications of deep learning and machine intelligence in mobile and IoT systems, medical systems, and UAVs, as well as the integration of security protection into deep learning systems. His group works on both algorithms and actual implementations (FPGAs, circuit tapeouts, mobile and embedded systems, and UAVs). His work has been published in top conferences and journals (e.g., ASPLOS, MICRO, AAAI, ICML, FPGA, DAC, ICCAD, DATE, ISLPED, LCTES, INFOCOM, ICDCS, TComputer, TCAD, PLOS ONE), and has been cited around 3,000 times according to Google Scholar. He has received four Best Paper or Top Paper Awards from major conferences including IEEE ICASSP (top 3 among all 2,000+ submissions), ISLPED, IEEE CLOUD, and ISVLSI, along with seven additional Best Paper Nominations and two Popular Papers in IEEE TCAD. His group is sponsored by NSF, DARPA, IARPA, AFRL/AFOSR, the Syracuse CASE Center, and industry sources.