PLBP: An effective local binary patterns texture descriptor with
pyramid representation
Xueming Qian a, Xian-Sheng Hua b, Ping Chen a, Liangjun Ke a,*
a School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
b Microsoft Research Asia, Beijing 10080, China
Article info
Article history:
Received 8 August 2010
Received in revised form 11 March 2011
Accepted 30 March 2011
Available online 8 April 2011
Keywords: Local binary pattern; Pyramid transform; Texture; Gaussian filter; Wavelet
Abstract
Local binary pattern (LBP) is an effective texture descriptor with successful applications in texture classification and face recognition. Many extensions of the conventional LBP descriptor have been proposed. One is the dominant local binary pattern, which aims at extracting the dominant local structures in texture images. A second represents LBP descriptors in the Gabor transform domain (LGBP). A third is multi-resolution LBP (MLBP). Another is dynamic LBP for video texture extraction. In this paper, we extend the conventional local binary pattern to the pyramid transform domain (PLBP). By cascading the LBP information of hierarchical spatial pyramids, PLBP descriptors take texture resolution variations into account. PLBP descriptors show their effectiveness for texture representation. Comprehensive comparisons are made among LBP, MLBP, LGBP, and PLBP. The performances of the no sampling, partial sampling, and spatial pyramid sampling approaches for constructing PLBP texture descriptors are compared. The influences of the pyramid generation approach and the number of pyramid levels on PLBP-based image categorization performance are discussed. Compared to the existing multi-resolution LBP descriptors, PLBP achieves satisfactory performance at low computational cost.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
Recently, Bag-of-Words (BoW) models have shown their effectiveness in image classification and retrieval [1–11]. BoW-based scene categorization approaches model the objects in an image as geometry-free structures. Thus the BoW-based approaches are robust to illumination, occlusion, rotation, and resolution variations. In many applications, the co-occurrences, dependences, and linkages of BoW are modeled [3,4,8–10]. However, BoW models are less discriminative for texture classification, and classifying texture images effectively remains a challenging problem. Most texture classification approaches assume that the texture images to be classified are identical to the training images in resolution, contrast, orientation, and other visual appearances. In the real world, however, textures can occur at arbitrary spatial resolutions and orientations. Texture is also subject to variations in illumination and imaging conditions. Texture patches close to the camera usually have high resolution, while those far from the camera have low resolution, as the examples in Fig. 1 show.
To carry out effective texture classification, robust texture descriptors are required. GIST is an effective texture descriptor [12,13]. It is represented in a multi-resolution and multi-orientation Gabor transform domain, and it cascades the magnitudes of all the sub-bands. The hierarchical wavelet packet transform (HWVP) is used for texture representation in [14]. HWVP and GIST descriptors have shown their effectiveness in image categorization [15,31].
Rather than using a multi-resolution texture representation [11–15], Ojala et al. represent texture features through the statistical distributions of local binary patterns [17]. Uniform and rotation-invariant uniform LBP descriptors extend the original LBP to extract uniform and rotation-stable local binary patterns [17]. Various extensions of the conventional LBP descriptors have been proposed [19–25,28–30]. Owing to their excellent performance, LBP and its extensions have been successfully used in image classification and face recognition. Zhang et al. [21] represent the local binary pattern in the Gabor transform domain. Combining the Gabor transform with LBP descriptors further improves the discriminative power of LBP descriptors. First, multi-scale and multi-orientation Gabor filtering is carried out on a given image, and then local binary patterns are extracted from the Gabor-filtered images. Finally, the LBP histograms of all the Gabor-filtered sub-bands are concatenated into a single histogram sequence [21]. By inheriting the advantages of Gabor filtering, the local Gabor binary patterns (LGBP) are rotation invariant and less sensitive to illumination variations. A volume local binary pattern (VLBP) based dynamic texture recognition method is proposed for video sequences [22,23]. Compared to the basic LBP, VLBP uses both spatial and temporal information. It is a 3D LBP, obtained by extending the conventional LBP from 2D images to 3D video sequences, which makes it well suited to representing dynamic textures. In Ref. [25], Lei et al. further extend VLBP to the Gabor transform domain. A spatial-temporal local binary pattern is used to model dynamic backgrounds. The dominant local binary pattern (DLBP) aims at capturing the dominant patterns in texture images [19]. DLBP is extended from uniform LBP by counting the occurrence frequencies of all rotation-invariant patterns defined in LBP groups. DLBP is effective in describing texture images with complicated shapes, curved edges, crossing boundaries, and corners [19]. Similar to uniform LBP, DLBP is constructed by sorting the occurrence frequencies of the patterns in descending order.
Pattern Recognition 44 (2011) 2502–2515. Contents lists available at ScienceDirect. Journal homepage: www.elsevier.com/locate/pr
0031-3203/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2011.03.029
* Corresponding author. Tel.: +86 29 82667771.
E-mail addresses: qianxm@mail.xjtu.edu.cn, qianxueming@gmail.com (X. Qian), xshua@microsoft.com, huaxiansheng@gmail.com (X.-S. Hua), kelj163@163.com (L. Ke).
Although many extensions of the conventional LBP have been proposed [19–25,28–30], they all assume that the texture resolution of an image is fixed, as shown in Fig. 1(a). In practice, the texture patches in an image can have various resolutions, as shown in Fig. 1(b). Each of the three examples in Fig. 1(a) has a single resolution, although the resolution varies among the images. In contrast, the texture resolution within each image in Fig. 1(b) varies significantly. Usually, texture information at a fixed resolution does not have significant discriminative power. For example, the LBP labels of the center pixels of the three patches in Fig. 2 are all mapped to the same bin at the first resolution (resolution #1), while they are mapped to different bins at the other resolutions. That is, adding the LBP labels of the other resolutions improves the discriminative power. Thus taking resolution variations into account can benefit texture classification.
The rest of this paper is organized as follows. In Section 2, the conventional LBP and its extensions are briefly reviewed, and then LBP with pyramid representation is introduced in detail. Section 3 gives the performance evaluation approach. Section 4 presents experimental results and discussions. Finally, conclusions are drawn in Section 5.
2. Local binary pattern with pyramid representation
First, the conventional LBP descriptor and its extensions are
briefly reviewed, and then the proposed local binary pattern with
pyramid representation and its relationships with existing multi-
resolution LBP descriptors are illustrated in detail.
2.1. Conventional local binary pattern and its extensions
Ojala et al. represent texture information by defining the texture T in a local neighborhood of a gray-level image as the joint distribution of the gray levels of P image pixels:
$$T = t(g_c, g_0, \ldots, g_{P-1}) \approx t\big(s(g_0 - g_c), \ldots, s(g_{P-1} - g_c)\big) \qquad (1)$$
where $g_c$ is the gray value of the center pixel of the local neighborhood and $g_p$ ($p = 0, \ldots, P-1$) are the gray values of P equally spaced pixels on a circle of radius R (R > 0) that form a circularly symmetric neighbor set. The thresholding function is
$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$
Finally, according to the rule used in Ref. [17], a binary factor $2^p$ is assigned to each neighbor. The original LBP value of a pixel is given by
$$\mathrm{LBP} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p \qquad (2)$$
The LBP descriptor labels the pixels of an image by thresholding the gray levels of the P neighbors (at radius R) against the center pixel, as shown in Fig. 3. Finally, the histogram of the labels is used for texture description.
Fig. 1. Texture images with various resolutions. (a) Each image has a single texture resolution; the images show the same content at different resolutions. (b) The texture resolution varies within each image.
Fig. 2. Diagram of LBP labels at different resolutions (a local patch and its LBP labels at resolution #1, resolution #2, ..., resolution #n). The LBP labels of the center pixels of the three patches are the same at the first resolution (resolution #1) and are thus hard to discriminate there, while they become discriminative once the LBP labels at the following resolutions are added.
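To make Eq. (2) concrete, the following is a minimal Python sketch of the basic LBP label and its histogram for P = 8 and R = 1. It assumes the usual thresholding convention s(x) = 1 for x ≥ 0 and 0 otherwise. The function names `lbp_label` and `lbp_histogram`, the neighbor ordering, and the skipping of border pixels are illustrative assumptions, not details taken from the paper.

```python
def lbp_label(img, r, c):
    """LBP label of pixel (r, c) from its 8 neighbours at radius 1."""
    # Neighbour offsets (row, col) for p = 0..7, counter-clockwise starting
    # from the pixel to the right. This ordering is an assumption.
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    gc = img[r][c]
    label = 0
    for p, (dr, dc) in enumerate(offsets):
        gp = img[r + dr][c + dc]
        s = 1 if gp - gc >= 0 else 0   # thresholding function s(x)
        label += s * (2 ** p)          # binary factor 2^p from Eq. (2)
    return label


def lbp_histogram(img):
    """256-bin histogram of LBP labels over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_label(img, r, c)] += 1
    return hist


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_label(patch, 1, 1))  # 248
```

For this 3×3 patch, only the five neighbors that are at least as bright as the center (p = 3, ..., 7) contribute, giving 8 + 16 + 32 + 64 + 128 = 248.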
Ojala et al. [18] extend the original LBP descriptor to multi-resolution LBP by using neighborhoods of different sizes. Using circular neighborhoods and interpolating the pixel values allows any radius and any number of pixels in the neighborhood, as shown in Fig. 3. The final multi-resolution LBP (denoted MLBP) is constructed as follows:
$$\mathrm{MLBP} = \bigcup_{P,R} \mathrm{LBP}_{P,R} = \langle \mathrm{LBP}_{P_1,R_1}; \ldots; \mathrm{LBP}_{P_S,R_S} \rangle \qquad (3)$$
where $P_i$ and $R_i$ ($i = 1, \ldots, S$) denote the number of neighbors and the radius. The combination can improve the discriminative power of the texture descriptors. However, MLBP descriptors are obtained from the same image at a fixed resolution. MLBP represents multi-resolution texture information by sparse sampling [29]. Maenpaa et al. [28,29] point out that this representation is sensitive to noise, because each sample is taken at a single pixel position rather than over an effective region. Moreover, direct sampling usually causes aliasing effects [28,29]. To overcome these two shortcomings, multi-resolution LBP descriptors based on low-pass filtering (LBPF) and local averaging have been proposed [28–30]. In LBPF, each sample in the neighborhood collects intensity information from a large effective area rather than a single pixel, as shown in Fig. 6. The relationship between LBPF and the proposed PLBP is analyzed in detail in Section 2.3.
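The concatenation in Eq. (3) can be sketched in Python as follows. This is a rough illustration under stated assumptions: samples on the circle are bilinearly interpolated, R is an integer, each LBP_{P,R} histogram keeps all 2^P raw labels (no uniform-pattern mapping), histograms are normalized, and a small tolerance `eps` guards the threshold against floating-point rounding. The names `bilinear`, `lbp_pr_histogram`, and `mlbp` are hypothetical helpers, not taken from the paper.

```python
import math


def bilinear(img, y, x):
    """Bilinearly interpolate the gray value at real-valued (row y, col x)."""
    y0 = max(0, min(int(math.floor(y)), len(img) - 1))
    x0 = max(0, min(int(math.floor(x)), len(img[0]) - 1))
    y1 = min(y0 + 1, len(img) - 1)
    x1 = min(x0 + 1, len(img[0]) - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def lbp_pr_histogram(img, P, R, eps=1e-9):
    """Normalised histogram (2**P bins) of LBP_{P,R} labels; R is an integer."""
    hist = [0.0] * (2 ** P)
    count = 0
    for r in range(R, len(img) - R):
        for c in range(R, len(img[0]) - R):
            label = 0
            for p in range(P):
                angle = 2.0 * math.pi * p / P
                # p-th sample on the circle of radius R around (r, c)
                gp = bilinear(img, r - R * math.sin(angle),
                              c + R * math.cos(angle))
                if gp - img[r][c] >= -eps:   # s(x) with rounding tolerance
                    label += 2 ** p
            hist[label] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist


def mlbp(img, settings):
    """Concatenate the LBP_{P_i,R_i} histograms as in Eq. (3)."""
    feature = []
    for P, R in settings:
        feature.extend(lbp_pr_histogram(img, P, R))
    return feature


flat = [[5] * 7 for _ in range(7)]
feature = mlbp(flat, [(8, 1), (8, 2)])
print(len(feature))  # 512
```

Note that every (P, R) setting is evaluated on the same image, which is the fixed-resolution limitation the text points out.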
The local Gabor binary pattern (LGBP) is extracted by applying a Gabor transform with n-scale and m-orientation filtering to the image [21]. This method improves the discriminative power of the original LBP [17]. In this paper, we also represent LGBP in the spatial pyramid domain. Experimental results are given in Section 4.2.
2.2. Local binary pattern represented in spatial pyramid domain
The pyramid transform is an effective multi-resolution analysis approach. In this paper, we represent the local binary pattern in the spatial pyramid domain.
During the pyramid transform, each pixel in a lower-resolution pyramid level is obtained by down-sampling the low-pass filtered image of the adjacent higher-resolution level, as shown in Fig. 4(b). Thus a pixel in a low-resolution image corresponds to a region in the higher-resolution images. In Refs. [28,29], Maenpaa et al. call this region the ‘‘effective area’’. Sequential pyramid images are constructed as shown in Fig. 5. The resolution of each two neighboring images differs by a factor of 2; that is, the down-sampling ratios in the x- and y-directions are both $\sqrt{2}$. Pyramid images can be generated by the low-pass filters of the wavelet transform, Gaussian smoothing [28,29], symmetric weighting, and block averaging [30].
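As a rough illustration of cascading LBP information over a spatial pyramid, the Python sketch below builds each level by 2×2 block averaging (one of the generation approaches listed above, used here as a combined low-pass filter and down-sampler) and concatenates the per-level LBP histograms. It is not the paper's exact construction: the down-sampling ratio of 2 in each direction, the basic LBP with P = 8 and R = 1, the per-level normalization, and the names `lbp_histogram`, `block_average_downsample`, and `plbp` are all assumptions made for illustration.

```python
def lbp_histogram(img):
    """Normalised 256-bin histogram of basic LBP labels (P = 8, R = 1)."""
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    hist = [0.0] * 256
    count = 0
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            label = 0
            for p, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= img[r][c]:   # s(g_p - g_c) = 1
                    label += 2 ** p
            hist[label] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist


def block_average_downsample(img):
    """Halve the resolution in x and y by averaging each 2x2 block."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]


def plbp(img, levels):
    """Concatenate LBP histograms of successive pyramid levels.

    Assumes the image is large enough that every level is at least 3x3.
    """
    feature = []
    level_img = img
    for level in range(levels):
        feature.extend(lbp_histogram(level_img))
        if level < levels - 1:
            level_img = block_average_downsample(level_img)
    return feature


image = [[(7 * r + 13 * c) % 32 for c in range(16)] for r in range(16)]
descriptor = plbp(image, levels=3)
print(len(descriptor))  # 768
```

Each coarser pixel summarizes a 2×2 region of the level above, so the LBP labels at coarse levels are computed over larger effective areas of the original image. Swapping the block average for a Gaussian or wavelet low-pass filter would change only `block_average_downsample`.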
The pyramid generation approach consists of low-pass filtering
and down sampling images of the preceding pyramid level. Let f(x, y)
Fig. 3. Four examples of LBP neighborhoods: (a) circular (8,1) with P = 8 and R = 1; (b) circular (8,2) with P = 8 and R = 2; (c) circular (12,2) with P = 12 and R = 2; and (d) circular (8,3) with P = 8 and R = 3. The pixel values are interpolated whenever a sampling point is not at the center of a pixel.
Fig. 4. Diagram of the pyramid transform and spatial pyramid sampling. (a) A four-level spatial pyramid (Level 1 to Level 4). (b) Pyramid sampling across three neighboring resolutions. The down-sampling ratios in the x- and y-directions are both 2, so the resolution of two neighboring pyramid levels differs by a factor of 4.
Fig. 5. Gaussian pyramid images. The original image (level 1) and its Gaussian pyramid levels are shown from left to right. The resolution of two neighboring pyramid levels differs by a factor of 2; that is, the down-sampling ratio is $\sqrt{2}$ in both the x- and y-directions.