Reconstruction of Hidden Representation for Robust
Feature Extraction
ZENG YU and TIANRUI LI, Southwest Jiaotong University, China
NING YU, The College at Brockport State University of New York, USA
YI PAN, Georgia State University, USA
HONGMEI CHEN, Southwest Jiaotong University, China
BING LIU, University of Illinois at Chicago, USA
This article aims to develop a new and robust approach to feature representation. Motivated by the success
of Auto-Encoders, we first theoretically analyze and summarize the general properties of all algorithms that
are based on traditional Auto-Encoders: (1) The reconstruction error of the input cannot fall below a
lower bound, which can be viewed as a guiding principle for reconstructing the input; likewise, when the
input is corrupted with noise, the reconstruction error of the corrupted input cannot fall below a
lower bound. (2) The reconstruction of the hidden representation achieving its ideal state is a necessary
condition for the reconstruction of the input to reach the ideal state. (3) Minimizing the Frobenius norm of the
Jacobian matrix of the hidden representation has a deficiency and may result in a much worse local optimum.
We argue that minimizing the reconstruction error of the hidden representation is more robust than
minimizing the Frobenius norm of the Jacobian matrix of the hidden representation. Based on this
analysis, we propose a new model, termed Double Denoising Auto-Encoders (DDAEs), which applies corruption
and reconstruction to both the input and the hidden representation. We demonstrate that the proposed model
is highly flexible and extensible, and has a potentially better capability to learn invariant and robust feature
representations. We also show that our model is more robust than Denoising Auto-Encoders (DAEs) when
dealing with noise or inessential features. Furthermore, we detail how to train DDAEs with two different
pretraining methods, optimizing the objective function in a combined and a separate manner, respectively.
Comparative experiments illustrate that the proposed model is significantly better for representation learning
than the state-of-the-art models.
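The core idea described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the Gaussian corruption, sigmoid activations, tied re-encoding of the reconstructed input, and the weighting factor `lam` are all illustrative assumptions; the sketch only shows the DDAE principle of corrupting and reconstructing both the input and the hidden representation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def corrupt(v, noise_std):
    # Additive Gaussian corruption (one of several possible noise types).
    return v + rng.normal(0.0, noise_std, size=v.shape)

# Toy dimensions and randomly initialized parameters (illustrative only).
d_in, d_hid = 8, 4
W1, b1 = rng.normal(0, 0.1, (d_hid, d_in)), np.zeros(d_hid)  # encoder
W2, b2 = rng.normal(0, 0.1, (d_in, d_hid)), np.zeros(d_in)   # decoder

x = rng.random(d_in)

# Double denoising: corrupt the input, encode it, then also corrupt
# the hidden representation before decoding.
x_tilde = corrupt(x, 0.1)
h = sigmoid(W1 @ x_tilde + b1)        # hidden representation
h_tilde = corrupt(h, 0.1)
x_hat = sigmoid(W2 @ h_tilde + b2)    # reconstruction of the input

# Reconstruction of the hidden representation: re-encode the output.
h_hat = sigmoid(W1 @ x_hat + b1)

# Combined objective: input reconstruction error plus hidden
# reconstruction error, weighted by a hypothetical trade-off lam.
lam = 0.5
loss = np.mean((x - x_hat) ** 2) + lam * np.mean((h - h_hat) ** 2)
```

In this sketch both error terms are squared losses; training would minimize `loss` over `W1, b1, W2, b2` by gradient descent, which is where the two pretraining strategies (combined vs. separate optimization) would differ.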
CCS Concepts: • Computing methodologies → Machine learning; • Machine learning approaches →
Neural networks;
Additional Key Words and Phrases: Deep architectures, auto-encoders, unsupervised learning, feature repre-
sentation, reconstruction of hidden representation
This work is supported by the National Science Foundation of China (Nos. 61773324, 61573292, 61572406).
Authors’ addresses: Z. Yu and T. Li (corresponding author), Southwest Jiaotong University, School of Information Sci-
ence and Technology, National Engineering Laboratory of Integrated Transportation Big Data Application Technology,
Chengdu, 611756, China; emails: zyu7@gsu.edu, trli@swjtu.edu.cn; N. Yu, The College at Brockport State University of
New York, Department of Computing Sciences, Brockport, NY, 14420; email: nyu@brockport.edu; Y. Pan, Georgia State
University, Department of Computer Science, Atlanta, GA, 30302; email: yipan@gsu.edu; H. Chen, Southwest Jiaotong
University, School of Information Science and Technology, National Engineering Laboratory of Integrated Transportation
Big Data Application Technology, Chengdu, 611756, China; email: hmchen@swjtu.edu.cn; B. Liu, University of Illinois at
Chicago, Department of Computer Science, Chicago, IL, 60607; email: liub@cs.uic.edu.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from permissions@acm.org.
© 2019 Association for Computing Machinery.
2157-6904/2019/01-ART18 $15.00
https://doi.org/10.1145/3284174
ACM Transactions on Intelligent Systems and Technology, Vol. 10, No. 2, Article 18. Publication date: January 2019.