1438 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 6, DECEMBER 2010
Multiview Spectral Embedding
Tian Xia, Dacheng Tao, Member, IEEE, Tao Mei, Member, IEEE, and Yongdong Zhang, Member, IEEE
Abstract—In computer vision and multimedia search, it is common to use multiple features from different views to represent an object. For example, to characterize a natural scene image well, it is essential to find a set of visual features that represent its color, texture, and shape information and to encode each feature into a vector. Therefore, we have a set of vectors in different spaces to represent the image. Conventional spectral-embedding algorithms cannot deal with such data directly, so we have to concatenate these vectors together as a new vector. This concatenation is not physically meaningful because each feature has a specific statistical property. Therefore, we develop a new spectral-embedding algorithm, namely, multiview spectral embedding (MSE), which can encode different features in different ways to achieve a physically meaningful embedding. In particular, MSE finds a low-dimensional embedding wherein the distribution of each view is sufficiently smooth, and MSE explores the complementary property of different views. Because there is no closed-form solution for MSE, we derive an alternating optimization-based iterative algorithm to obtain the low-dimensional embedding. Empirical evaluations based on the applications of image retrieval, video annotation, and document clustering demonstrate the effectiveness of the proposed approach.
Index Terms—Dimensionality reduction, multiple views,
spectral embedding.
I. INTRODUCTION
IN COMPUTER vision and multimedia search [5], [6], objects are usually represented in several different ways. This kind of data is termed multiview data. A typical example is a color image, which has different views from different modalities, e.g., color, texture, and shape. Different views form different feature spaces, which have particular statistical properties.
Manuscript received May 14, 2009; revised August 31, 2009 and November
18, 2009; accepted December 6, 2009. Date of publication February 17, 2010;
date of current version November 17, 2010. This work was supported in part
by the National Basic Research Program of China (973 Program) under Grant
2007CB311100; by the National High-Technology Research and Development
Program of China (863 Program) under Grant 2007AA01Z416; by the National
Natural Science Foundation of China under Grants 60873165, 60802028, and
60902090; by the Beijing New Star Project on Science and Technology under
Grant 2007B071; by the Co-building Program of Beijing Municipal Education
Commission; by the Nanyang Technological University Nanyang SUG Grant
under Project M58020010; by the Microsoft Operations PTE LTD-NTU Joint
R&D under Grant M48020065; and by the K. C. Wong Education Foundation
Award. This paper was recommended by Associate Editor S. Sarkar.
T. Xia and Y. Zhang are with the Center for Advanced Computing Technology Research, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China (e-mail: txia@ict.ac.cn; zhyd@ict.ac.cn).
D. Tao is with the School of Computer Engineering, Nanyang Technological
University, Singapore 639798 (e-mail: dctao@ntu.edu.sg).
T. Mei is with Microsoft Research Asia, Beijing 100190, China (e-mail:
tmei@microsoft.com).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSMCB.2009.2039566
Because of the popularity of multiview data in practical applications, particularly in the multimedia domain, learning from multiview data, which is also known as multiple-view learning, has attracted increasing attention. Although a great deal of effort has been devoted to multiview data learning [1], including classification [21], clustering [4], [19], and feature selection [20], little progress has been made in dimensionality reduction, even though it has many applications in multimedia [28], e.g., image retrieval and video annotation. Multimedia data generally have multiple modalities, and each modality is usually represented in a high-dimensional feature space, which frequently leads to the “curse of dimensionality” problem. In this case, multiview dimensionality reduction provides an effective way to solve or at least reduce this problem.
In this paper, we consider the problem of spectral embedding for multiple-view data based on our previous patch alignment framework [29]. The major challenge is learning a low-dimensional embedding that effectively explores the complementary nature of multiple views of a data set. The learned low-dimensional embedding should be better than a low-dimensional embedding learned from any single view of the data set.
Existing spectral-embedding algorithms assume that samples are drawn from a vector space and thus cannot deal with multiview data directly. A possible solution is to concatenate vectors from different views together as a new vector and then apply spectral-embedding algorithms directly on the concatenated vector. However, this concatenation is not physically meaningful because each view has a specific statistical property. This concatenation ignores the diversity of multiple views and thus cannot efficiently explore the complementary nature of different views. Another solution is the distributed spectral embedding (DSE) proposed in [3]. DSE performs a spectral-embedding algorithm on each view independently, and then, based on the learned low-dimensional representations, it learns a common low-dimensional embedding that is as “close” to each representation as possible. Although DSE allows selecting different spectral-embedding algorithms for different views, the original multiple-view data are invisible to the final learning process, and thus, it cannot explore the complementary nature of different views well. Moreover, its computational cost is high because it runs a spectral-embedding algorithm for each view independently.
To effectively and efficiently learn the complementary nature of different views, we propose a new algorithm, i.e., multiview spectral embedding (MSE), which learns a low-dimensional and sufficiently smooth embedding over all views simultaneously. Empirical evaluations based on image retrieval, video annotation, and document clustering show the effectiveness of the proposed approach.
1083-4419/$26.00 © 2010 IEEE
The rest of this paper is organized as follows. In Section II, we provide a short review of related work. In Section III, we present the proposed MSE and its solution. Experimental results are shown in Section IV, and Section V concludes.
II. RELATED WORKS
In this section, we first provide a short review of conventional spectral-embedding algorithms, which are all designed for single-view data. Although there are some previous works on multiview spectral methods, they address multiview learning for clustering [4] and classification [21] but not for dimensionality reduction. As for spectral embedding for multiple-view data, only a preliminary effort [3] is known to us; thus, we then give a brief introduction to the distributed method proposed in [3].
A. Spectral Embedding
The task of dimensionality reduction is to find a low-dimensional representation for high-dimensional observations. Methods generally fall into two classes: linear methods, e.g., principal component analysis and multidimensional scaling; and nonlinear methods, e.g., locally linear embedding (LLE) [8] and Laplacian eigenmaps (LE) [9].
Spectral methods for dimensionality reduction find the low-dimensional representations by using eigenvectors of specially constructed matrices [29]. Since, in the traditional problem setting of dimensionality reduction, it is assumed that the data are represented in a single vector space, the conventional spectral-embedding algorithms can all be regarded as methods with a single view.
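As an illustration of what "eigenvectors of specially constructed matrices" can look like for single-view data, the following is a minimal sketch in the spirit of Laplacian eigenmaps [9]. It is not the authors' code. The function name `laplacian_eigenmaps`, the parameters `k` and `t`, the heat-kernel weights, and the use of the symmetric normalized Laplacian are all assumptions made for this example.

```python
import numpy as np

def laplacian_eigenmaps(X, d, k=5, t=1.0):
    """Embed the columns of X (shape m x n) into d dimensions.

    Illustrative single-view sketch; returns Y with shape d x n.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between samples (columns).
    sq = np.sum(X ** 2, axis=0)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X.T @ X, 0.0)
    np.fill_diagonal(D2, np.inf)          # a sample is not its own neighbor
    # k-nearest-neighbor graph with heat-kernel weights (assumed choice).
    W = np.zeros((n, n))
    for j in range(n):
        nbrs = np.argsort(D2[j])[:k]
        W[j, nbrs] = np.exp(-D2[j, nbrs] / t)
    W = np.maximum(W, W.T)                # symmetrize the graph
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(n) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; skip the first eigenvector.
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:d + 1].T
```

The input is a single matrix whose columns live in one vector space, which is exactly the single-view assumption discussed above.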
Existing algorithms can be classified into two groups based
on whether they are supervised or unsupervised. The focus
of this paper is the latter. Representative algorithms include
Isomap [7], LLE [8], LE [9], Hessian eigenmaps [10], local
tangent space alignment [11], transductive component analysis
[2], discriminative locality alignment [30], and DLLE [27].
They perform well for single-view data but cannot deal with
multiview data directly.
B. Distributed Approach for Spectral Embedding With
Multiple Views
As mentioned earlier, spectral embedding with multiple views is a new topic; it was first studied in [3], which proposed a distributed approach, i.e., DSE. The following is a brief summary of DSE.
Given a multiple-view data set with $n$ objects having $m$ views, i.e., a set of matrices $X = \{X^{(i)} \in \mathbb{R}^{m_i \times n}\}_{i=1}^{m}$, each representation $X^{(i)}$ is a feature matrix from view $i$. DSE assumes that the low-dimensional embedding of each view $X^{(i)}$ is already known, i.e., $A = \{A^{(i)} \in \mathbb{R}^{n \times k_i}\}_{i=1}^{m}$, $k_i < m_i$ $(1 \le i \le m)$. DSE focuses on how to learn a consensus low-dimensional embedding $B \in \mathbb{R}^{n \times k}$ based on $A$; the objective function of DSE is defined as

$$\min_{B,P} \sum_{i=1}^{m} \left\| A^{(i)} - B P^{(i)} \right\|^2 \quad \text{s.t. } B^T B = I \tag{1}$$

where $P = \{P^{(i)} \in \mathbb{R}^{k \times k_i}\}_{i=1}^{m}$ is a set of mapping matrices. The global optimal solution to DSE is given by performing eigendecomposition of the matrix $CC^T$, $C = [A^{(1)}, \ldots, A^{(m)}]$.
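The eigendecomposition solution of (1) can be sketched as follows. This is a minimal illustration rather than the implementation from [3]; the function names `dse_consensus` and `dse_objective` are hypothetical, and the norm in (1) is assumed to be the Frobenius norm. The sketch relies on the fact that, for a fixed $B$ with orthonormal columns, the least-squares optimal mapping is $P^{(i)} = B^T A^{(i)}$, so that minimizing (1) amounts to maximizing $\mathrm{tr}(B^T C C^T B)$.

```python
import numpy as np

def dse_consensus(A_list, k):
    """Solve min_{B,P} sum_i ||A_i - B P_i||^2  s.t.  B^T B = I.

    A_list: per-view embeddings, each of shape (n, k_i).
    Returns B (n x k) and the list of mappings P_i (k x k_i).
    """
    C = np.hstack(A_list)                 # n x (k_1 + ... + k_m)
    # eigh sorts eigenvalues in ascending order, so the last k columns
    # are the eigenvectors of C C^T with the largest eigenvalues.
    _, vecs = np.linalg.eigh(C @ C.T)
    B = vecs[:, -k:][:, ::-1]             # largest eigenvalue first
    # For fixed orthonormal B, the optimal mapping is B^T A_i.
    P_list = [B.T @ A for A in A_list]
    return B, P_list

def dse_objective(A_list, B, P_list):
    """Value of the objective in (1), assuming the Frobenius norm."""
    return sum(np.linalg.norm(A - B @ P) ** 2 for A, P in zip(A_list, P_list))
```

Note that only the per-view embeddings $A^{(i)}$ enter this computation; the original features $X^{(i)}$ never appear, which is the limitation of DSE pointed out in the Introduction.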
III. MSE
In this section, we introduce a new spectral-embedding algorithm, i.e., MSE, which finds a low-dimensional and sufficiently smooth embedding over all views simultaneously. To better present the technical details of the proposed MSE, we first introduce the notation used in the rest of this paper. Capital letters, e.g., $X$, represent matrices or sets, and $[X]_{ij}$ is the $(i, j)$th entry of $X$. Lowercase letters, e.g., $x$, represent vectors, and $(x)_i$ is the $i$th element of $x$. The superscript $(i)$, e.g., $X^{(i)}$ and $x^{(i)}$, represents data from the $i$th view.
Based on the aforementioned notations, MSE can be described as follows according to our previous patch alignment framework [29]. Given a multiview data set with $n$ objects and $m$ representations, i.e., a set of matrices $X = \{X^{(i)} \in \mathbb{R}^{m_i \times n}\}_{i=1}^{m}$, wherein $X^{(i)}$ is the feature matrix for the $i$th view representation, MSE finds a low-dimensional and sufficiently smooth embedding of $X$, i.e., $Y \in \mathbb{R}^{d \times n}$, wherein $d < m_i$ $(1 \le i \le m)$ and $d$ is a predefined number according to different applications.
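To make the problem setting concrete, a multiview data set can be held as a list of arrays, one per view, each with one column per object. The helper below is a hypothetical sketch (the name `check_multiview` is not from the paper) that only verifies the two conditions stated above: all views share the same $n$, and $d < m_i$ for every view.

```python
import numpy as np

def check_multiview(views, d):
    """Validate a multiview data set X = {X^(i)} and a target dimension d.

    views: list of arrays; view i has shape (m_i, n), one column per object.
    Returns (n, [m_1, ..., m_m]).
    """
    n = None
    dims = []
    for i, X in enumerate(views):
        if X.ndim != 2:
            raise ValueError(f"view {i} must be a 2-D array")
        if n is None:
            n = X.shape[1]
        elif X.shape[1] != n:
            raise ValueError(f"view {i} has {X.shape[1]} objects, expected {n}")
        if d >= X.shape[0]:
            raise ValueError(f"d={d} must be smaller than m_i={X.shape[0]}")
        dims.append(X.shape[0])
    return n, dims
```

For example, an image collection with a 64-dimensional color view and a 32-dimensional texture view (made-up sizes) would be passed as two arrays of shapes `(64, n)` and `(32, n)`.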
Fig. 1 shows the working principle of MSE. MSE first builds a patch for a sample on a view. Based on the patches from different views, the part optimization can be performed to get the optimal low-dimensional embedding for each view. Afterward, all low-dimensional embeddings from different patches are unified into a whole by global coordinate alignment. Finally, the solution of MSE is derived by using alternating optimization.
A. Part Optimization
Given the $i$th view $X^{(i)} = [x_1^{(i)}, \ldots, x_n^{(i)}] \in \mathbb{R}^{m_i \times n}$, consider an arbitrary point $x_j^{(i)}$ and its $k$ related ones in the same view (e.g., nearest neighbors) $x_{j_1}^{(i)}, \ldots, x_{j_k}^{(i)}$; the patch of $x_j^{(i)}$ is defined as $X_j^{(i)} = [x_j^{(i)}, x_{j_1}^{(i)}, \ldots, x_{j_k}^{(i)}] \in \mathbb{R}^{m_i \times (k+1)}$. For $X_j^{(i)}$, there is a part mapping $f_j^{(i)} : X_j^{(i)} \to Y_j^{(i)}$, wherein $Y_j^{(i)} = [y_j^{(i)}, y_{j_1}^{(i)}, \ldots, y_{j_k}^{(i)}] \in \mathbb{R}^{d \times (k+1)}$. To preserve the locality in the projected low-dimensional space, the part optimization for the $j$th patch on the $i$th view is

$$\arg\min_{Y_j^{(i)}} \sum_{l=1}^{k} \left\| y_j^{(i)} - y_{j_l}^{(i)} \right\|^2 \left( w_j^{(i)} \right)_l \tag{2}$$
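The patch construction and the objective in (2) can be sketched for one view as follows. This is an illustrative sketch, not the authors' implementation. All function names are hypothetical, and because the text above does not define the weights $(w_j^{(i)})_l$, the heat-kernel form with bandwidth `t` used here is an assumption. The last helper builds a $(k+1) \times (k+1)$ matrix $L$ for which the weighted sum in (2) equals $\mathrm{tr}(Y_j^{(i)} L (Y_j^{(i)})^T)$, a standard algebraic identity that is convenient for checking the objective numerically.

```python
import numpy as np

def build_patch(X, j, k):
    """Return indices [j, j_1, ..., j_k]: sample j and its k nearest neighbors.

    X is a float array of shape (m_i, n) with one column per sample.
    """
    dist2 = np.sum((X - X[:, [j]]) ** 2, axis=0)
    dist2[j] = np.inf                      # exclude the sample itself
    return np.concatenate(([j], np.argsort(dist2)[:k]))

def heat_kernel_weights(X, idx, t=1.0):
    """Assumed weights w_l = exp(-||x_j - x_{j_l}||^2 / t) for l = 1..k."""
    diff = X[:, idx[1:]] - X[:, [idx[0]]]
    return np.exp(-np.sum(diff ** 2, axis=0) / t)

def part_objective(Y_patch, w):
    """Evaluate sum_l ||y_j - y_{j_l}||^2 w_l for Y_patch of shape (d, k+1)."""
    diff = Y_patch[:, 1:] - Y_patch[:, [0]]
    return float(np.sum(np.sum(diff ** 2, axis=0) * w))

def patch_laplacian(w):
    """Matrix L such that part_objective(Y, w) == trace(Y @ L @ Y.T)."""
    k = len(w)
    L = np.zeros((k + 1, k + 1))
    L[0, 0] = w.sum()
    L[0, 1:] = -w
    L[1:, 0] = -w
    L[1:, 1:] = np.diag(w)
    return L
```

Given a candidate embedding `Y` of shape `(d, n)`, the patch embedding is `Y[:, idx]` with `idx = build_patch(X, j, k)`, so the same index list selects the patch in both the feature space and the embedded space.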