Multistage Committees of Deep Feedforward Convolutional Sparse Denoising Autoencoders for Object Recognition
Shicao Luo¹, Yongsheng Ding¹,²,*, Kuangrong Hao¹,²
1. College of Information Science and Technology, Donghua University, Shanghai 201620, China
2. Engineering Research Center of Digitized Textile & Apparel Technology, Shanghai 201620, China
Email: shicaoLuo@163.com; * Corresponding author, email: ysding@dhu.edu.cn
Abstract—Deep learning and unsupervised feature learning systems are known to achieve good performance on benchmarks by using extremely large architectures with many features at each layer. However, we found that adding features contributes very little to performance once their number exceeds a threshold, whereas the size of the pooling layer has an important influence on performance. In this paper, we present an unsupervised method that improves classification by going deep and combining multistage classifiers in a committee, with a small number of features at each layer. The network is trained layer-wise via a denoising autoencoder (dA) with L-BFGS to optimize the convolutional kernels, and no backpropagation is used. In addition, we regularize the dA to encourage sparse representations at each coding layer. We apply it to the STL-10 dataset, which has very few training examples and a large amount of unlabeled data. Experimental results show that our method achieves higher performance than existing methods under the single-network condition.
Keywords—multistage classifiers; sparse denoising autoencoder; object recognition; deep learning
Recent theoretical studies indicate that unsupervised learning methods can automatically build feature extractors rather than relying on handcrafted ones. Classical methods for dimensionality reduction or clustering, such as principal component analysis and K-means, have been used routinely in numerous vision applications [1, 2].
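As one concrete illustration (a sketch of the common practice, not code from [1, 2]; the patch and dictionary sizes are assumptions), centroids learned by K-means on image patches can act directly as a feature extractor, here with the "triangle" encoding sometimes used in the literature:

import numpy as np
from sklearn.cluster import KMeans

# Learn a dictionary of patch centroids from unlabeled data
# (sizes are illustrative assumptions).
patches = np.random.rand(10000, 64)          # e.g., flattened 8x8 patches
kmeans = KMeans(n_clusters=100, n_init=10).fit(patches)

def encode(x):
    # Distance of each patch to every centroid...
    d = np.linalg.norm(x[:, None, :] - kmeans.cluster_centers_[None], axis=2)
    # ...mapped through the "triangle" activation: centroids closer than
    # average fire, the rest are zeroed, giving a sparse feature vector.
    return np.maximum(0.0, d.mean(axis=1, keepdims=True) - d)

features = encode(np.random.rand(5, 64))     # (5, 100) feature vectors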
In the context of object recognition, a key problem of
current machine vision systems is whether unsupervised
learning can be used to learn robust and invariant features.
Traditionally, the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF) can be understood and generalized as ways to go from pixels to patch descriptors [3]. However, such handcrafted descriptors are often difficult to adapt to new settings. Unlike these artificial feature learning systems, the primate visual system [24] accomplishes such tasks effortlessly. The groundbreaking work of
Hubel and Wiesel [28] played a major role in the computer
vision community via Marr’s work [29] on building visual
hierarchies analogous to the primate visual system. To narrow the gap with biological systems while exploiting the computer's own strengths, much recent research has focused on training deep, multi-layered feature networks, such as deep belief nets [4], deep autoencoders [5], deep convolutional neural networks [6], hierarchical sparse coding [7, 8], and SIFT-based MFs coding [25]. The main benefit of these models is their high genericity: deep learning approaches learn to push pixels through multiple layers of feature transforms without any use of prior knowledge. In addition, a technique called the "receptive field" dramatically reduces the number of parameters that must be trained and is a key element of several state-of-the-art systems [9, 10].
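As a back-of-the-envelope illustration of why local receptive fields help (the feature count and kernel size below are assumptions for illustration, not the paper's configuration):

# Illustrative parameter-count comparison: fully connected layer vs.
# local receptive fields with shared weights (i.e., convolution).
image_h = image_w = 96       # STL-10 images are 96x96 pixels
channels = 3                 # RGB
n_features = 100             # number of learned feature maps (assumed)
rf = 8                       # receptive-field (kernel) size (assumed)

# Fully connected: every hidden unit sees every input pixel.
fc_params = (image_h * image_w * channels) * n_features    # 2,764,800

# Convolutional: each feature map shares one rf x rf x channels kernel,
# independent of the image size.
conv_params = (rf * rf * channels) * n_features            # 19,200

print(f"fully connected: {fc_params:,} weights")
print(f"convolutional:   {conv_params:,} weights")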
It has been shown that deep models are potentially more capable than shallow models of handling complex tasks [11]. Deep belief nets [12] learn a hierarchy of features greedily, layer by layer, using unsupervised restricted Boltzmann machines. This pre-training helps the optimization escape poor local minima. The learned weights are then further adjusted to the current task using supervised information. To make deep belief nets applicable to full-size images, convolutional deep belief nets [13] were proposed; they use small receptive fields and share the weights between the hidden and visible layers among all locations in an image. Stacked denoising autoencoders [14] build deep networks by stacking layers of denoising autoencoders, each of which trains a one-layer neural network to reconstruct its input from a partially and randomly corrupted version. Spike-and-slab sparse coding (S3C) [26] is an existing model that achieves very good performance, particularly when the number of labeled examples is low.
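As a minimal sketch of this denoising-autoencoder principle (illustrative only; the masking-noise level, tied weights, and layer sizes are our assumptions, not details from [14] or this paper):

import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, level=0.3):
    # Masking noise: randomly zero a fraction `level` of the inputs.
    return x * (rng.random(x.shape) > level)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes (assumed): 8x8x3 patches -> 100 hidden features.
n_in, n_hidden = 192, 100
W = rng.normal(0.0, 0.01, (n_in, n_hidden))  # tied encoder/decoder weights
b, c = np.zeros(n_hidden), np.zeros(n_in)

x = rng.random((5, n_in))         # a toy minibatch of clean patches
h = sigmoid(corrupt(x) @ W + b)   # encode the *corrupted* input
x_hat = sigmoid(h @ W.T + c)      # decode back to the input space

# The loss compares the reconstruction with the *clean* input, which
# forces the hidden code h to be robust to the corruption.
loss = np.mean((x_hat - x) ** 2)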
In this paper, we propose an unsupervised learning model for large-size image recognition. In this network, we learn the following key components: 1) a sparse and overcomplete feature bank at each layer; 2) a committee of multistage classifiers for voting. The network is trained layer-wise via a denoising autoencoder, and no backpropagation is used.
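As a minimal sketch of the committee component listed above (a hypothetical interface; the paper does not give this code), soft voting averages the per-stage class probabilities and picks the winning class:

import numpy as np

def committee_vote(stage_probs):
    # stage_probs: list of (n_examples, n_classes) probability arrays,
    # one per stage classifier. Average them, then take the argmax.
    return np.argmax(np.mean(stage_probs, axis=0), axis=1)

# Toy example: three stage classifiers, four examples, ten STL-10 classes.
rng = np.random.default_rng(0)
stage_probs = [rng.dirichlet(np.ones(10), size=4) for _ in range(3)]
predicted = committee_vote(stage_probs)    # shape (4,)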
We apply it to the STL-10 dataset, an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. In particular, each class has few labeled training examples but a very large set of unlabeled examples. Zou [15] built a hierarchical network that learns invariant features via simulated fixations in video; this method gains a 4.5% improvement in STL-10 classification accuracy. Coates [16] introduced the "Selecting Receptive Fields" method to limit the number of connections from lower-level features to higher ones and achieved the