Connectionist Temporal Classification: Labelling Unsegmented
Sequence Data with Recurrent Neural Networks
Alex Graves¹  alex@idsia.ch
Santiago Fernández¹  santiago@idsia.ch
Faustino Gomez¹  tino@idsia.ch
Jürgen Schmidhuber¹,²  juergen@idsia.ch

¹ Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), Galleria 2, 6928 Manno-Lugano, Switzerland
² Technische Universität München (TUM), Boltzmannstr. 3, 85748 Garching, Munich, Germany
Abstract
Many real-world sequence learning tasks re-
quire the prediction of sequences of labels
from noisy, unsegmented input data. In
speech recognition, for example, an acoustic
signal is transcribed into words or sub-word
units. Recurrent neural networks (RNNs) are
powerful sequence learners that would seem
well suited to such tasks. However, because
they require pre-segmented training data,
and post-processing to transform their out-
puts into label sequences, their applicability
has so far been limited. This paper presents a
novel method for training RNNs to label un-
segmented sequences directly, thereby solv-
ing both problems. An experiment on the
TIMIT speech corpus demonstrates its ad-
vantages over both a baseline HMM and a
hybrid HMM-RNN.
1. Introduction
Labelling unsegmented sequence data is a ubiquitous
problem in real-world sequence learning. It is partic-
ularly common in perceptual tasks (e.g. handwriting
recognition, speech recognition, gesture recognition)
where noisy, real-valued input streams are annotated
with strings of discrete labels, such as letters or words.
Currently, graphical models such as hidden Markov
Models (HMMs; Rabiner, 1989), conditional random
fields (CRFs; Lafferty et al., 2001) and their vari-
ants, are the predominant framework for sequence labelling.

[Footnote: Appearing in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, 2006. Copyright 2006 by the author(s)/owner(s).]

While these approaches have proved successful for many problems, they have several drawbacks:
(1) they usually require a significant amount of task
specific knowledge, e.g. to design the state models for
HMMs, or choose the input features for CRFs; (2)
they require explicit (and often questionable) depen-
dency assumptions to make inference tractable, e.g.
the assumption that observations are independent for
HMMs; (3) for standard HMMs, training is generative,
even though sequence labelling is discriminative.
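The observation-independence assumption in point (2) and the generative training in point (3) can be made concrete with the standard HMM likelihood (a textbook formula, not an equation from this paper), in which each observation o_t depends only on the current hidden state q_t:

```latex
% HMM joint likelihood of observations o_{1:T}, summed over
% hidden state sequences q_{1:T}, with initial distribution \pi,
% transition probabilities a, and emission probabilities b.
p(o_{1:T}) \;=\; \sum_{q_{1:T}} \pi_{q_1}\, b_{q_1}(o_1)
  \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t)
```

Standard training maximizes this generative quantity p(o_{1:T}), rather than the discriminative quantity p(label sequence | o_{1:T}) that sequence labelling actually requires.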
Recurrent neural networks (RNNs), on the other hand,
require no prior knowledge of the data, beyond the
choice of input and output representation. They can
be trained discriminatively, and their internal state
provides a powerful, general mechanism for modelling
time series. In addition, they tend to be robust to
temporal and spatial noise.
So far, however, it has not been possible to apply
RNNs directly to sequence labelling. The problem is
that the standard neural network objective functions
are defined separately for each point in the training se-
quence; in other words, RNNs can only be trained to
make a series of independent label classifications. This
means that the training data must be pre-segmented,
and that the network outputs must be post-processed
to give the final label sequence.
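This limitation can be illustrated with a minimal sketch (hypothetical code, not from the paper): a framewise objective is a sum of independent per-timestep cross-entropy terms, so every frame needs its own pre-segmented target label.

```python
import numpy as np

def framewise_cross_entropy(softmax_outputs, frame_targets):
    """Sum of independent per-frame cross-entropy losses.

    softmax_outputs: (T, K) array of per-frame class probabilities.
    frame_targets: length-T integer labels, one per frame -- this
    per-frame alignment is the pre-segmentation the paper argues
    CTC removes.
    """
    T = len(frame_targets)
    # One loss term per timestep; no term couples adjacent frames.
    per_frame = -np.log(softmax_outputs[np.arange(T), frame_targets])
    return per_frame.sum()

# Toy example: 3 frames, 2 classes.
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.7, 0.3]])
targets = np.array([0, 1, 0])
loss = framewise_cross_entropy(probs, targets)
# equals -(ln 0.9 + ln 0.8 + ln 0.7), roughly 0.685
```

Because each term depends only on its own frame, the network is trained to make a series of independent classifications, and some post-processing is still needed to collapse the per-frame outputs into a label sequence.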
At present, the most effective use of RNNs for se-
quence labelling is to combine them with HMMs in the
so-called hybrid approach (Bourlard & Morgan, 1994;
Bengio, 1999). Hybrid systems use HMMs to model
the long-range sequential structure of the data, and
neural nets to provide localised classifications. The
HMM component is able to automatically segment
the sequence during training, and to transform the
network classifications into label sequences. However,
as well as inheriting the aforementioned drawbacks of