Learning Hierarchical Representations for Face Recognition using Deep Belief Network Embedded with Softmax Regress and Multiple Neural Networks
Hai-jun Zhang, Nan-feng Xiao
School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
This work is supported by the China National Science Foundation (Project No. 61171141).
Abstract—In face recognition and classification, feature extraction and classification based on insufficient labeled data is a well-known challenging problem. In this paper, a novel semi-supervised learning algorithm named deep belief network embedded with Softmax regress (DBNESR) is proposed to address this problem. DBNESR first learns hierarchical representations of features by deep learning and then performs more efficient classification with Softmax regress. At the same time, we design many kinds of classifiers based on supervised learning: BP, HBPNNs, RBF, HRBFNNs, SVM and a multiple classification decision fusion classifier (MCDFC), i.e., the hybrid HBPNNs-HRBFNNs-SVM classifier. The conducted experiments validate: Firstly, the proposed semi-supervised deep learning algorithm DBNESR is optimal for face recognition with the highest and most stable recognition rates; Second, the semi-supervised learning algorithm has a better effect than all supervised learning algorithms; Third, hybrid neural networks have a better effect than single neural networks; Fourth, the average recognition rate and the variance are respectively ordered as BP<HBPNNs≈RBF<HRBFNNs≈SVM<MCDFC<DBNESR and BP>RBF>HBPNNs>HRBFNNs>SVM>MCDFC>DBNESR; At last, the capability of DBNESR to model hard artificial intelligence tasks reflects its hierarchical representations of features.
Index Terms—Face recognition, Semi-supervised, Hierarchical representations, Hybrid neural networks, RBM, Deep belief network, Deep learning
I. INTRODUCTION
Face recognition (FR) is one of the main areas of investigation in biometrics and computer vision. It has a wide range of applications, including access control, information security, law enforcement and surveillance systems. FR has attracted great attention from a large number of research groups and has also achieved great development in the past few decades [1-3]. However, FR suffers from some difficulties because of varying illumination conditions, different poses, disguise, facial expressions and so on [4-6]. Plenty of FR algorithms have been designed to alleviate these difficulties [7-9]. FR includes three key steps: image preprocessing, feature extraction and classification. Image preprocessing is an essential process before feature extraction and also an important step in FR. Feature extraction mainly gives an effective representation of each image, which can reduce the computational complexity of the classification algorithm and enhance the separability of the images to get a higher recognition rate. Classification is to distinguish the extracted features with a good classifier. Therefore, an effective face recognition system greatly depends on the appropriate
representation of human face features and the good design of
classifier [10].
To select the features that can highlight classification, many
kinds of feature selection methods have been presented, such as:
spectral feature selection (SPEC) [11], multi-cluster feature
selection (MCFS) [12], minimum redundancy spectral feature
selection (MRSF) [13], and joint embedding learning and
sparse regression (JELSR) [14]. In addition, the wavelet transform is popular and widely applied in face recognition systems for its multi-resolution character, such as the 2-dimensional discrete wavelet transform [15], the discrete wavelet transform [16], fast beta wavelet networks [17], and wavelet-based feature selection [18-20].
After extracting the features, the following work is to design
an effective classifier. Classification aims to obtain the face
type for the input signal. Typically used classification
approaches include polynomial function, HMM [21-22], GMM
[23], K-NN [23], SVM [24], and Bayesian classifier [25]. In
addition, random weight network (RWN) is proposed in some
articles [26-27] and there are also other kinds of neural
networks used as the classifier for FR [28-29].
In this paper, we first make image preprocessing to eliminate
the interference of noise and redundant information, reduce the
effects of environmental factors on images and highlight the
important information of images. At the same time, to compensate for the deficiency of geometric features, it is well known that the original face images often need to be well represented instead of being input into the classifier directly because of the huge computational cost. Therefore, PCA and 2D-PCA are used to extract geometric features from the preprocessed images, reduce their dimensionality for computation and attain a higher level of separability. At last, we propose a novel semi-supervised learning algorithm called deep belief network embedded with Softmax regress (DBNESR) as the classifier for FR, design many kinds of classifiers based on supervised learning and conduct experiments to validate the effectiveness of the algorithm.
The main contributions of this paper can be concluded as
follows:
1) A novel semi-supervised learning algorithm called deep
belief network embedded with Softmax regress (DBNESR) is
proposed. DBNESR first learns hierarchical representations [30]
of feature by deep learning and then makes more efficient
classification with Softmax regress.
2) Many kinds of classifiers based on supervised learning: BP,
HBPNNs, RBF, HRBFNNs, SVM and multiple classification
decision fusion classifier (MCDFC), i.e., the hybrid HBPNNs-HRBFNNs-SVM classifier, are designed.
3) The analysis and experiments are performed on the recognition rate of face recognition. The conducted experiments validate: Firstly, the proposed semi-supervised deep learning algorithm DBNESR is optimal for face recognition with the highest and most stable recognition rates; Second, the semi-supervised learning algorithm has a better effect than all supervised learning algorithms; Third, hybrid neural networks have a better effect than single neural networks; Fourth, the average recognition rate and the variance are respectively ordered as BP<HBPNNs≈RBF<HRBFNNs≈SVM<MCDFC<DBNESR and BP>RBF>HBPNNs>HRBFNNs>SVM>MCDFC>DBNESR; At last, the capability of DBNESR to model hard artificial intelligence tasks reflects its hierarchical representations of features.
The remainder of this paper is organized as follows. Section
2 reviews the images preprocessing. Section 3 introduces the
feature extraction methods. Section 4 designs the classifiers of
supervised learning. Section 5 designs the proposed classifier based on semi-supervised learning. Experimental
results are presented and discussed in Section 6. Section 7 gives
the concluding remarks.
II. IMAGES PREPROCESSING
Images often exhibit phenomena such as low contrast and blurring in the processes of generation, acquisition and input due to the influence of environmental factors such as the imaging system, noise and lighting conditions. Therefore image preprocessing is needed. The purpose of preprocessing is to eliminate the interference of noise and redundant information, reduce the effects of environmental factors on images and highlight the important information of images [31]. Image preprocessing usually includes graying of images, image filtering, gray equalization of images, standardization of images, compression of images (or dimensionality reduction) and so on [32]. The process of image preprocessing is as follows.
A. Face images filtering
We use median filtering to smooth and denoise the images. This method can not only effectively suppress noise but also preserve boundaries well. The median filter is a kind of nonlinear operation: it sorts a pixel and all other pixels within its neighborhood by gray value and sets the median of the sequence as the gray value of that pixel, as shown in Eq. (1).
$f'(i,j) = \mathrm{Med}_{s}\{ f(i,j) \}$   (1)
where $s$ is the filter window. A 3×3 template is used for the median filtering in the later experiments.
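As a concrete illustration, a minimal NumPy sketch of the 3×3 median filtering of Eq. (1) is given below; the function name, the reflect-padding at the image borders and the use of NumPy itself are our own illustrative choices rather than details specified in the paper.

```python
import numpy as np

def median_filter_3x3(img):
    """3x3 median filtering (Eq. 1): each pixel is replaced by the median
    gray value of its 3x3 neighborhood s; borders use reflect padding."""
    padded = np.pad(img, 1, mode="reflect")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]   # the filter window s
            out[i, j] = np.median(window)
    return out
```

An equivalent result can be obtained with scipy.ndimage.median_filter(img, size=3), which uses the same 3×3 window.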
B. Histogram equalization
The purpose of histogram equalization is to enhance the images, improve their visual effect, reduce the redundant information in the preprocessed images and highlight their important information.
Let the gray range of image $A(x,y)$ be $[0, L]$ and its histogram be $H_A(r)$. The total number of pixels is then
$A_0 = \int_0^L H_A(r)\,dr$   (2)
Normalizing the histogram, the probability density function of each gray value is obtained:
$p(r) = \frac{H_A(r)}{A_0}$   (3)
The probability distribution function is
$P(r) = \int_0^r p(\rho)\,d\rho = \frac{1}{A_0}\int_0^r H_A(\rho)\,d\rho$   (4)
Let the gray transformation function of histogram equalization be a continuously differentiable, non-decreasing function $s = T(r)$ with bounded slope; applying it to $A(x,y)$ gives the output image $B(x,y)$. With $H_B(s)$ denoting the histogram of the output image, we have
$H_B(s)\,ds = H_A(r)\,dr$   (5)
$H_B(s) = \frac{H_A(r)}{ds/dr} = \frac{H_A(r)}{T'(r)}$   (6)
where $T'(r) = ds/dr$. Therefore, $H_B(s)$ is constant when its numerator and denominator differ only by a proportionality constant, namely
$T'(r) = \frac{C}{A_0} H_A(r)$   (7)
$s = T(r) = \frac{C}{A_0}\int_0^r H_A(\rho)\,d\rho = C\,P(r)$   (8)
In order to make the range of $s$ equal to $[0, L]$, we can take $C = L$. For the discrete case the gray transformation function is as follows:
$s_k = T(r_k) = C\,P(r_k) = C\sum_{i=0}^{k} p(r_i) = C\sum_{i=0}^{k}\frac{n_i}{n}$   (9)
where $r_k$ is the $k$th gray level, $n_k$ is the number of pixels with gray level $r_k$, $n$ is the total number of pixels in the image, and $k$ ranges over $[0, L-1]$. Histogram equalization is applied to the images in the later experiments.
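For reference, a minimal sketch of the discrete transformation of Eq. (9) for 8-bit gray images is shown below; taking L = 256 and C = L − 1 = 255 so that the output stays in the valid gray range is our own assumption, not a detail fixed by the paper.

```python
import numpy as np

def histogram_equalization(img, L=256):
    """Discrete histogram equalization (Eq. 9): s_k = C * sum_{i<=k} n_i / n,
    with C = L - 1 so an 8-bit image keeps its [0, 255] gray range."""
    img = img.astype(np.uint8)
    hist = np.bincount(img.ravel(), minlength=L)    # n_k for each gray level r_k
    cdf = np.cumsum(hist) / img.size                # P(r_k) = sum_{i<=k} n_i / n
    lut = np.round((L - 1) * cdf).astype(np.uint8)  # gray transformation s_k = T(r_k)
    return lut[img]                                 # map every pixel through T
```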
C. Compression of images (or dimensionality reduction)
It is well known that the original face images often need to be well represented instead of being input into the classifier directly because of the huge computational cost. As one of the popular representations, geometric features are often extracted to attain a higher level of separability. Here we employ the multi-scale two-dimensional wavelet transform to generate the initial geometric features for representing face images. The multi-scale two-dimensional wavelet transform is applied to the images in the later experiments.
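As an illustration, a minimal sketch of such a multi-scale decomposition using the PyWavelets library is given below; the choice of library, the Haar wavelet and the two-level depth are our own assumptions, and keeping only the coarsest approximation sub-band is just one simple way to obtain a compressed representation.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_features(img, wavelet="haar", level=2):
    """Multi-scale 2-D wavelet decomposition of a gray face image.
    Only the low-frequency approximation sub-band at the coarsest level
    is kept and flattened into an initial geometric feature vector."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    approx = coeffs[0]        # cA_level: roughly img downsampled by 2**level
    return approx.ravel()     # flatten into a feature vector
```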
III. FEATURE EXTRACTION
There are two main purposes for feature extraction: one is to extract characteristic information from the face images so that this feature information can classify all the samples; the other is to reduce the redundant information of the images and reduce the dimensionality of the data representing the human faces as much as possible, so as to improve the speed of subsequent processing. It is well known that image features are usually classified into four classes: statistical-pixel features, visual features, algebraic features, and geometric features (e.g. transform-coefficient features).
A. Extract features with PCA
Suppose that there are $N$ facial images $\{X_i\}_{i=1}^{N}$, where each $X_i$ is a column vector of dimension $M$. All samples can be expressed as follows:
$X = (X_1, X_2, \ldots, X_N)^T$   (10)
Calculate the average face of all sample images as follows:
$\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$   (11)
Calculate the difference faces, namely the difference of each face from the average face, as follows:
$d_i = X_i - \bar{X}, \quad i = 1, 2, \ldots, N$   (12)
Therefore, the image covariance matrix $C$ can be represented as follows:
$C = \frac{1}{N}\sum_{i=1}^{N} d_i d_i^T = \frac{1}{N} A A^T, \quad A = (d_1, d_2, \ldots, d_N)$   (13)
Using the theorem of singular value decomposition (SVD), calculate the eigenvalues $\lambda_i$ and orthonormal eigenvectors $v_i$ of $A^T A$; the orthonormal eigenvectors of the covariance matrix $C$ can then be obtained through Eq. (14):
$u_i = \frac{1}{\sqrt{\lambda_i}} A v_i, \quad i = 1, 2, \ldots, N$   (14)
Sort all the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$ in descending order and choose the smallest $t$ satisfying
$t = \min\left\{ k : \sum_{j=1}^{k}\lambda_j \Big/ \sum_{j=1}^{N}\lambda_j \ge a \right\}$   (15)
where $a$ is usually set to 90%. This yields the eigenface subspace $U = [u_1, u_2, \ldots, u_t]$. All the samples are projected onto the subspace $U$ as follows:
$Z = U^T X$   (16)
Therefore, using the first $t$ principal components instead of the original vector $X$ not only reduces the dimension of the facial feature parameters but also does not lose too much feature information of the original images.
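A minimal NumPy sketch of the training and projection steps in Eqs. (10)-(16) is given below; the eigen-decomposition of the small N×N matrix A^T A and the 90% energy threshold follow the text, while the function names, arranging the images as columns of X, and subtracting the average face before projecting test samples are our own illustrative choices.

```python
import numpy as np

def pca_train(X, a=0.90):
    """Eigenface-style PCA (Eqs. 10-16). X is M x N with one image per column.
    Returns the average face and the projection matrix U (M x t)."""
    mean = X.mean(axis=1, keepdims=True)       # Eq. (11): average face
    A = X - mean                               # Eq. (12): difference faces d_i
    small = A.T @ A / X.shape[1]               # small N x N matrix (SVD trick)
    vals, vecs = np.linalg.eigh(small)
    order = np.argsort(vals)[::-1]             # eigenvalues in descending order
    vals, vecs = vals[order], vecs[:, order]
    ratio = np.cumsum(vals) / vals.sum()
    t = int(np.searchsorted(ratio, a) + 1)     # Eq. (15): smallest t covering a
    U = A @ vecs[:, :t]                        # Eq. (14): eigenvectors of C
    U /= np.linalg.norm(U, axis=0)             # normalize columns to unit length
    return mean, U

def pca_project(X, mean, U):
    """Eq. (16): project (centered) samples onto the eigenface subspace."""
    return U.T @ (X - mean)
```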
B. Extract features with 2D-PCA
Suppose the sample set is $\{S_j^i \in R^{m \times n},\ i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, M\}$, where $i$ indexes the category, $j$ indexes the sample within the $i$th category, $N$ is the total number of categories, $M$ is the number of samples in each category, and $K = N \cdot M$ is the number of all samples.
Let $\bar{S}$ be the average of all samples as follows:
$\bar{S} = \frac{1}{K}\sum_{i=1}^{N}\sum_{j=1}^{M} S_j^i$   (17)
Therefore, the image covariance matrix $G$ can be represented as follows:
$G = \frac{1}{K}\sum_{i=1}^{N}\sum_{j=1}^{M} (S_j^i - \bar{S})^T (S_j^i - \bar{S})$   (18)
and the generalized total scatter criterion $J(X)$ can be expressed by:
$J(X) = X^T G X$   (19)
Let $X_{opt}$ be the unitary vector that maximizes the generalized total scatter criterion $J(X)$, that is:
$X_{opt} = \arg\max_X J(X)$   (20)
In general, there is more than one optimal solution. We usually select a set of optimal solutions $\{X_1, \ldots, X_t\}$ subject to the orthonormal constraints and maximizing the criterion $J(X)$, where $t$ is smaller than the dimension of the coefficient matrix. In fact, they are the orthonormal eigenvectors of the matrix $G$ corresponding to the $t$ largest eigenvalues.
Now for each sub-band coefficient matrix $S_i$, compute its principal components as follows:
$y_{ij} = S_i X_j, \quad j = 1, 2, \ldots, t$   (21)
Then we can get its reduced feature matrix $Y_i = [y_{i1}, \ldots, y_{it}]$, $i = 1, 2, \ldots, m$.
We extract features with PCA and 2D-PCA respectively and compare their effects in the later experiments.
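A minimal NumPy sketch of the 2D-PCA steps in Eqs. (17)-(21) is given below; the function names and the way the top-t eigenvectors of G are selected are our own illustrative choices, and the number of projection axes t is assumed to be chosen by the user.

```python
import numpy as np

def twod_pca_train(images, t):
    """2D-PCA (Eqs. 17-20). `images` is a list of m x n gray-image matrices.
    Returns the n x t projection matrix whose columns are the orthonormal
    eigenvectors of G with the t largest eigenvalues."""
    K = len(images)
    S_bar = sum(images) / K                                     # Eq. (17)
    G = sum((S - S_bar).T @ (S - S_bar) for S in images) / K    # Eq. (18)
    vals, vecs = np.linalg.eigh(G)                              # G is symmetric
    return vecs[:, np.argsort(vals)[::-1][:t]]                  # top-t eigenvectors

def twod_pca_features(image, X_opt):
    """Eq. (21): Y = S X_opt, an m x t reduced feature matrix."""
    return image @ X_opt
```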
IV. DESIGNING THE CLASSIFIERS OF SUPERVISED LEARNING
Classifiers based on supervised learning are usually used for FR. In this paper we design two types of classifiers: one type consists of supervised learning classifiers and the other is a semi-supervised learning classifier [33].
A. Single BP neural network
The BP neural network is a kind of multilayer feed-forward network trained with the error back-propagation algorithm and is currently one of the most widely used neural network models [34]. The recognition and classification of face images is an important application of the BP neural network in the field of pattern recognition and classification. The network consists