FPGA 2017, the top conference in the FPGA field, concluded on February 24 in Monterey, California. At the conference, DeePhi Tech's paper "ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA" won the Best Paper Award.
ESE: Efficient Speech Recognition Engine with Compressed LSTM on FPGA

Song Han^{1,2}, Junlong Kang^2, Huizi Mao^{1,2}, Yiming Hu^{2,3}, Xin Li^2, Yubin Li^2, Dongliang Xie^2, Hong Luo^2, Song Yao^2, Yu Wang^{2,3}, Huazhong Yang^3 and William J. Dally^{1,4}

^1 Stanford University, ^2 DeePhi Tech, ^3 Tsinghua University, ^4 NVIDIA
^1 {songhan,dally}@stanford.edu, ^2 song.yao@deephi.tech, ^3 yu-wang@mail.tsinghua.edu.cn
Abstract

Long Short-Term Memory (LSTM) is widely used in speech recognition. To achieve higher prediction accuracy, machine learning scientists have built larger and larger models. Such large models are both computation-intensive and memory-intensive. Deploying such bulky models results in high power consumption under latency constraints and leads to a high total cost of ownership (TCO) for a data center. To speed up prediction and make it energy efficient, we first propose a load-balance-aware pruning method that can compress the LSTM model size by 20× (10× from pruning and 2× from quantization) with negligible loss of prediction accuracy. The pruned model is friendly for parallel processing. Next, we propose a scheduler that encodes and partitions the compressed model across PEs for parallelism and schedules the complicated LSTM data flow. Finally, we design the hardware architecture, named Efficient Speech Recognition Engine (ESE), that works directly on the compressed model. Implemented on a Xilinx XCKU060 FPGA running at 200 MHz, ESE achieves a performance of 282 GOPS working directly on the compressed LSTM network, corresponding to 2.52 TOPS on the uncompressed one, and processes a full LSTM for speech recognition with a power dissipation of 41 Watts. Evaluated on the LSTM speech recognition benchmark, ESE is 43× and 3× faster than Core i7 5930k CPU and Pascal Titan X GPU implementations. It achieves 40× and 11.5× higher energy efficiency compared with the CPU and GPU, respectively.
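The idea behind load-balance-aware pruning is that pruning a matrix globally can leave some PEs with many more nonzeros than others, so the slowest PE bounds throughput. Pruning each PE's row group to the same sparsity avoids this. The sketch below illustrates the idea under stated assumptions: the function name, the interleaved row-to-PE assignment, and the per-group magnitude threshold are illustrative, not the paper's exact implementation.

```python
import numpy as np

def load_balance_prune(W, sparsity=0.9, num_pes=4):
    """Prune W so every PE's row group hits the same target sparsity,
    keeping the sparse matrix-vector workload balanced across PEs."""
    W = W.copy()
    for pe in range(num_pes):
        # Assume rows are interleaved across PEs (row i -> PE i % num_pes).
        rows = W[pe::num_pes]              # view into W: edits apply in place
        k = int(sparsity * rows.size)      # number of weights to zero out
        if k > 0:
            # Per-group threshold: zero the k smallest-magnitude weights.
            thresh = np.sort(np.abs(rows), axis=None)[k - 1]
            rows[np.abs(rows) <= thresh] = 0.0
    return W

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
Wp = load_balance_prune(W, sparsity=0.5, num_pes=4)
# Each PE's group now holds the same number of nonzeros, so no PE
# becomes a straggler during the sparse matrix-vector multiply.
for pe in range(4):
    print(pe, np.count_nonzero(Wp[pe::4]))
```

A globally pruned matrix would reach the same overall sparsity but could assign uneven nonzero counts to the PE groups; the per-group threshold trades a tiny amount of pruning freedom for balanced latency.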
1 Introduction

Deep neural networks have surpassed traditional acoustic models and become the state-of-the-art method for speech recognition [1, 2]. Long Short-Term Memory (LSTM) [3], Gated Recurrent Unit (GRU) [4] and vanilla recurrent neural networks (RNNs) are popular in speech recognition. In this work, we designed a hardware accelerator called ESE for the most complex one: the LSTM.

ESE takes the approach of EIE [5] one step further to address the more general problem of accelerating not only feed-forward neural networks but also recurrent neural networks and LSTMs. The recurrent nature of RNNs produces complicated data dependencies, which are more challenging than those of feed-forward neural nets. To deal with this problem, we designed a data flow that can effectively schedule the complex RNN operations using multiple EIE cores.
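The data dependency in question is the serial chain through time: every gate at step t consumes h_{t-1}, so step t cannot begin before step t-1 completes. A minimal sketch of one standard LSTM step makes this visible (this is the textbook formulation, not ESE's hardware data flow; shapes and names are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One standard LSTM step. All four gates read h_prev, which is
    the recurrent dependency a hardware scheduler must respect."""
    n = h_prev.size
    z = W @ x_t + U @ h_prev + b      # fused gate pre-activations
    i = sigmoid(z[0:n])               # input gate
    f = sigmoid(z[n:2*n])             # forget gate
    o = sigmoid(z[2*n:3*n])           # output gate
    g = np.tanh(z[3*n:4*n])           # candidate cell state
    c_t = f * c_prev + i * g
    h_t = o * np.tanh(c_t)
    return h_t, c_t

# Hidden size n=3, input size d=2; W: (4n, d), U: (4n, n).
rng = np.random.default_rng(1)
n, d, T = 3, 2, 4
W = rng.standard_normal((4 * n, d))
U = rng.standard_normal((4 * n, n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for t in range(T):                    # inherently serial loop over time
    h, c = lstm_step(rng.standard_normal(d), h, c, W, U, b)
```

The matrix-vector products W @ x_t and U @ h_prev dominate the compute, which is why working directly on the pruned, compressed weight matrices pays off.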
Among all factors contributing to the monthly bill of a data center, power consumption is a major one. Since a memory reference consumes more than two orders of magnitude more energy than an ALU operation, we focus on reducing the memory footprint.
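A back-of-envelope check of the "two orders of magnitude" claim, using commonly cited 45 nm per-operation energy estimates (the specific picojoule figures are an assumption here, not numbers from this paper):

```python
# Commonly cited 45 nm energy estimates (assumed, not from this paper):
DRAM_ACCESS_PJ = 640.0   # 32-bit off-chip DRAM access
SRAM_ACCESS_PJ = 5.0     # 32-bit on-chip SRAM access
FP32_MULT_PJ = 3.7       # 32-bit floating-point multiply

# One DRAM fetch costs on the order of 100x an arithmetic operation,
# which is the "two orders of magnitude" gap.
print(f"DRAM access / multiply: ~{DRAM_ACCESS_PJ / FP32_MULT_PJ:.0f}x")

# A 20x-compressed model that fits in on-chip SRAM replaces most DRAM
# traffic with far cheaper SRAM accesses -- the payoff of pruning plus
# quantization.
print(f"DRAM access / SRAM access: {DRAM_ACCESS_PJ / SRAM_ACCESS_PJ:.0f}x")
```

Under these assumed figures, shrinking the model until its weights stay on chip attacks the dominant energy term directly, rather than merely reducing the number of multiplies.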
To achieve this, we design a novel method that optimizes across the algorithm, software and hardware. At the algorithm level, ESE revisits the pruning algorithm from the hardware efficiency
1st International Workshop on Efficient Methods for Deep Neural Networks at NIPS 2016, Barcelona, Spain.
Full paper to appear at FPGA 2017.
arXiv:1612.00694v1 [cs.CL] 1 Dec 2016