Efficient HW and SW Interface Design for Convolutional Neural Networks Using High-Level Synthesis and TensorFlow
Time: Monday, 15 November 2021, 11:30am - 12pm CST
Description: Hardware accelerators are widely used to deploy convolutional neural networks (CNNs) because they deliver speedup by exploiting the parallelism inherent in CNNs. Developing such accelerators involves a large design space. An accelerator's figures of merit are its operating frequency, the number of operations it performs per unit time, and the configurations it supports, which makes the design a multi-objective optimization problem. This work presents a systematic approach to developing an efficient CNN framework, built with Xilinx Vitis-HLS, that addresses these figures of merit and scales to different configurations. The framework uses four copies of a single unified module to execute convolution and pooling in hardware, and uses TensorFlow with multiprocessing to run certain layers in software. The framework has been evaluated with SqueezeNet1.0, VGG16, and ResNet50 at a 250 MHz clock frequency on the Xilinx Alveo U250 board, achieving 750 GOPS.
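As a rough sanity check on the figures quoted above, the short Python sketch below converts the reported throughput and clock frequency into operations per clock cycle. It assumes, which the description does not state, that the 750 GOPS figure is the aggregate across all four hardware modules and that the work is split evenly among them. The names `ops_per_cycle`, `total_ops_per_cycle`, and `per_module_ops_per_cycle` are illustrative only and are not part of the presented framework.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the description.
# Assumption (not stated in the source): 750 GOPS is the aggregate
# throughput of all four modules, shared evenly between them.

CLOCK_HZ = 250 * 10**6        # 250 MHz clock frequency
THROUGHPUT_OPS = 750 * 10**9  # 750 GOPS = 750e9 operations per second
NUM_MODULES = 4               # four copies of the unified conv/pool module


def ops_per_cycle(throughput_ops: float, clock_hz: float) -> float:
    """Return the average number of operations completed per clock cycle."""
    return throughput_ops / clock_hz


total_ops_per_cycle = ops_per_cycle(THROUGHPUT_OPS, CLOCK_HZ)
per_module_ops_per_cycle = total_ops_per_cycle / NUM_MODULES

print(f"Operations per cycle (all modules): {total_ops_per_cycle:.0f}")
print(f"Operations per cycle (per module):  {per_module_ops_per_cycle:.0f}")
```

Under these assumptions the design would need to sustain about 3000 operations per cycle in total, or about 750 per module, which gives a sense of the degree of parallelism each module would have to expose.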