Energy-efficient Hardware Acceleration of Shallow Machine Learning Applications

Abstract

ML accelerators have largely focused on building general platforms for deep neural networks (DNNs), but less so on shallow machine learning (SML) algorithms. This paper proposes Axiline, a compact, configurable, template-based generator for SML hardware acceleration. Axiline identifies computational kernels as templates that are common to these algorithms and builds a pipelined accelerator for efficient execution. The dataflow graphs of individual ML instances, with different data dimensions, are mapped to the pipeline stages and then optimized by customized algorithms. The approach generates energy-efficient hardware for training and inference of various ML algorithms, as demonstrated with post-layout FPGA and ASIC results.

Link to Publication Webpage