1438 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 6, DECEMBER 2010

Multiview Spectral Embedding

Tian Xia, Dacheng Tao, Member, IEEE, Tao Mei, Member, IEEE, and Yongdong Zhang, Member, IEEE

Abstract—In computer vision and multimedia search, it is common to use multiple features from different views to represent an object. For example, to characterize a natural scene image well, it is essential to find a set of visual features to represent its color, texture, and shape information and encode each feature into a vector. Therefore, we have a set of vectors in different spaces to represent the image. Conventional spectral-embedding algorithms cannot deal with such data directly, so we have to concatenate these vectors together as a new vector. This concatenation is not physically meaningful because each feature has a specific statistical property. Therefore, we develop a new spectral-embedding algorithm, namely, multiview spectral embedding (MSE), which can encode different features in different ways, to achieve a physically meaningful embedding. In particular, MSE finds a low-dimensional embedding wherein the distribution of each view is sufficiently smooth, and MSE explores the complementary property of different views. Because there is no closed-form solution for MSE, we derive an alternating optimization-based iterative algorithm to obtain the low-dimensional embedding. Empirical evaluations based on the applications of image retrieval, video annotation, and document clustering demonstrate the effectiveness of the proposed approach.

Index Terms—Dimensionality reduction, multiple views, spectral embedding.

I. INTRODUCTION

IN COMPUTER vision and multimedia search [5], [6], objects are usually represented in several different ways. This kind of data is termed multiview data. A typical example is a color image, which has different views from different modalities, e.g., color, texture, and shape. Different views form different feature spaces, which have particular statistical properties.

Manuscript received May 14, 2009; revised August 31, 2009 and November 18, 2009; accepted December 6, 2009. Date of publication February 17, 2010; date of current version November 17, 2010. This work was supported in part by the National Basic Research Program of China (973 Program) under Grant 2007CB311100; by the National High-Technology Research and Development Program of China (863 Program) under Grant 2007AA01Z416; by the National Natural Science Foundation of China under Grants 60873165, 60802028, and 60902090; by the Beijing New Star Project on Science and Technology under Grant 2007B071; by the Co-building Program of Beijing Municipal Education Commission; by the Nanyang Technological University Nanyang SUG Grant under Project M58020010; by the Microsoft Operations PTE LTD-NTU Joint R&D under Grant M48020065; and by the K. C. Wong Education Foundation Award. This paper was recommended by Associate Editor S. Sarkar.

T. Xia and Y. Zhang are with the Center for Advanced Computing Technology Research, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China (e-mail: txia@ict.ac.cn; zhyd@ict.ac.cn).

D. Tao is with the School of Computer Engineering, Nanyang Technological University, Singapore 639798 (e-mail: dctao@ntu.edu.sg).

T. Mei is with Microsoft Research Asia, Beijing 100190, China (e-mail: tmei@microsoft.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCB.2009.2039566

Because of the popularity of multiview data in practical applications, particularly in the multimedia domain, learning from multiview data, which is also known as multiple-view learning, has attracted increasing attention. Although a great deal of effort has been devoted to multiview data learning [1], including classification [21], clustering [4], [19], and feature selection [20], little progress has been made in dimensionality reduction, even though it has many applications in multimedia [28], e.g., image retrieval and video annotation. Multimedia data generally have multiple modalities, and each modality is usually represented in a high-dimensional feature space, which frequently leads to the “curse of dimensionality” problem. In this case, multiview dimensionality reduction provides an effective way to solve or at least alleviate this problem.

In this paper, we consider the problem of spectral embedding for multiple-view data based on our previous patch alignment framework [29]. The major challenge is learning a low-dimensional embedding that effectively explores the complementary nature of multiple views of a data set. The learned low-dimensional embedding should be better than a low-dimensional embedding learned from any single view of the data set.

Existing spectral-embedding algorithms assume that samples are drawn from a vector space and thus cannot deal with multiview data directly. A possible solution is to concatenate the vectors from different views into a new vector and then apply a spectral-embedding algorithm directly to the concatenated vector. However, this concatenation is not physically meaningful because each view has a specific statistical property. It ignores the diversity of multiple views and thus cannot efficiently explore the complementary nature of different views. Another solution is the distributed spectral embedding (DSE) proposed in [3]. DSE performs a spectral-embedding algorithm on each view independently, and then, based on the learned low-dimensional representations, it learns a common low-dimensional embedding that is as “close” to each representation as possible. Although DSE allows selecting different spectral-embedding algorithms for different views, the original multiple-view data are invisible to the final learning process, and thus, it cannot explore the complementary nature of different views well. Moreover, its computational cost is high because it runs a spectral-embedding algorithm for each view independently.

To effectively and efficiently learn the complementary nature of different views, we propose a new algorithm, i.e., multiview spectral embedding (MSE), which learns a low-dimensional and sufficiently smooth embedding over all views simultaneously. Empirical evaluations based on image retrieval, video annotation, and document clustering show the effectiveness of the proposed approach.

The rest of this paper is organized as follows. In Section II, we provide a short review of related works. In Section III, we present the proposed MSE and its solution. Experimental results are shown in Section IV, and Section V concludes.

II. RELATED WORKS

In this section, we first provide a short review of conventional spectral-embedding algorithms, which are all designed for single-view data. Although there are some previous works on multiview spectral methods, they address multiview learning for clustering [4] and classification [21] but not for dimensionality reduction. As for spectral embedding for multiple-view data, only one preliminary effort [3] is known to us; thus, we then give a brief introduction to the distributed method proposed in [3].

A. Spectral Embedding

The task of dimensionality reduction is to find a low-dimensional representation for high-dimensional observations. Methods generally fall into two classes: linear methods, e.g., principal component analysis and multidimensional scaling; and nonlinear methods, e.g., locally linear embedding (LLE) [8] and Laplacian eigenmaps (LE) [9].

Spectral methods for dimensionality reduction find the low-dimensional representations by using eigenvectors of specially constructed matrices [29]. Since, in the traditional problem setting of dimensionality reduction, it is assumed that the data are represented in a single vector space, the conventional spectral-embedding algorithms can all be regarded as methods with a single view.
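
As a concrete illustration of a single-view spectral method, the following Python sketch implements a basic variant of Laplacian eigenmaps (LE) [9] with NumPy. The function name `laplacian_eigenmaps`, the k-nearest-neighbor graph, the heat-kernel weights, and the defaults `k=5` and `t=1.0` are illustrative assumptions rather than settings taken from this paper.

```python
import numpy as np

def laplacian_eigenmaps(X, d=2, k=5, t=1.0):
    """Embed the n columns of X (shape m x n) into d dimensions.

    k (number of neighbors) and t (heat-kernel width) are assumed defaults.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between samples (columns of X).
    sq = np.sum(X ** 2, axis=0)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    # k-nearest-neighbor graph with heat-kernel weights, then symmetrized.
    W = np.zeros((n, n))
    for j in range(n):
        nbrs = np.argsort(D2[j])[1:k + 1]  # position 0 is the point itself
        W[j, nbrs] = np.exp(-D2[j, nbrs] / t)
    W = np.maximum(W, W.T)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(n) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; skip the trivial first one.
    _, vecs = np.linalg.eigh(L_sym)
    # Map back to solutions of the generalized problem (D - W) y = lambda D y.
    Y = inv_sqrt[:, None] * vecs[:, 1:d + 1]
    return Y.T  # shape d x n, matching the d x n convention used for embeddings
```

The sketch operates on one feature matrix only, which is exactly the single-view setting described above; applying it to multiview data would require either concatenating the views or running it once per view.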

Existing algorithms can be classified into two groups based on whether they are supervised or unsupervised. The focus of this paper is the latter. Representative algorithms include Isomap [7], LLE [8], LE [9], Hessian eigenmaps [10], local tangent space alignment [11], transductive component analysis [2], discriminative locality alignment [30], and DLLE [27]. They perform well for single-view data but cannot deal with multiview data directly.

B. Distributed Approach for Spectral Embedding With Multiple Views

As mentioned earlier, spectral embedding with multiple views is a new topic; it was first proposed in [3], together with a distributed approach, i.e., DSE. The following is a brief summary of DSE.

Given a multiple-view data set with $n$ objects having $m$ views, i.e., a set of matrices $X = \{X^{(i)} \in \mathbb{R}^{m_i \times n}\}_{i=1}^{m}$, each representation $X^{(i)}$ is a feature matrix from view $i$. DSE assumes that the low-dimensional embedding of each view $X^{(i)}$ is already known, i.e., $A = \{A^{(i)} \in \mathbb{R}^{n \times k_i}\}_{i=1}^{m}$, $k_i < m_i$ ($1 \le i \le m$). DSE focuses on how to learn a consensus low-dimensional embedding $B \in \mathbb{R}^{n \times k}$ based on $A$; the objective function of DSE is defined as

$$\min_{B,P} \sum_{i=1}^{m} \left\| A^{(i)} - B P^{(i)} \right\|^{2} \quad \text{s.t.} \quad B^{T} B = I \tag{1}$$

where $P = \{P^{(i)} \in \mathbb{R}^{k \times k_i}\}_{i=1}^{m}$ is a set of mapping matrices. The global optimal solution to DSE is given by performing eigendecomposition of the matrix $CC^{T}$, $C = [A^{(1)}, \ldots, A^{(m)}]$.
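
The eigendecomposition step of DSE can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from [3]: the function name `dse_consensus` is hypothetical, the norm in (1) is read as the Frobenius norm, and $B$ is taken to be the eigenvectors of $CC^{T}$ with the $k$ largest eigenvalues. That choice follows because, for a fixed orthonormal $B$, the least-squares mapping is $P^{(i)} = B^{T} A^{(i)}$, which turns (1) into maximizing $\mathrm{tr}(B^{T} C C^{T} B)$.

```python
import numpy as np

def dse_consensus(A_list, k):
    """Consensus embedding B (n x k) from per-view embeddings A^(i) (n x k_i)."""
    # C = [A^(1), ..., A^(m)], shape n x (k_1 + ... + k_m).
    C = np.hstack(A_list)
    # eigh returns eigenvalues of the symmetric matrix C C^T in ascending order.
    _, vecs = np.linalg.eigh(C @ C.T)
    # Keep the eigenvectors belonging to the k largest eigenvalues.
    B = vecs[:, ::-1][:, :k]
    # For fixed orthonormal B, the optimal mapping matrices are P^(i) = B^T A^(i).
    P_list = [B.T @ A for A in A_list]
    return B, P_list
```

Note that only the per-view embeddings $A^{(i)}$ enter this computation; the original features $X^{(i)}$ never appear, which is the limitation of DSE pointed out in Section I.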

III. MSE

In this section, we introduce a new spectral-embedding algorithm, i.e., MSE, which finds a low-dimensional and sufficiently smooth embedding over all views simultaneously. To better present the technical details of the proposed MSE, we first provide the important notations used in the rest of this paper. Capital letters, e.g., $X$, represent matrices or sets, and $[X]_{ij}$ is the $(i, j)$th entry of $X$. Lowercase letters, e.g., $x$, represent vectors, and $(x)_i$ is the $i$th element of $x$. A superscript $(i)$, e.g., $X^{(i)}$ and $x^{(i)}$, represents data from the $i$th view.

Based on the aforementioned notations, MSE can be described as follows according to our previous patch alignment framework [29]. Given a multiview data set with $n$ objects and $m$ representations, i.e., a set of matrices $X = \{X^{(i)} \in \mathbb{R}^{m_i \times n}\}_{i=1}^{m}$, wherein $X^{(i)}$ is the feature matrix for the $i$th view representation, MSE finds a low-dimensional and sufficiently smooth embedding of $X$, i.e., $Y \in \mathbb{R}^{d \times n}$, wherein $d < m_i$ ($1 \le i \le m$) and $d$ is a predefined number chosen according to the application.

Fig. 1 shows the working principle of MSE. MSE first builds a patch for a sample on a view. Based on the patches from different views, the part optimization can be performed to get the optimal low-dimensional embedding for each view. Afterward, all low-dimensional embeddings from different patches are unified into a whole by global coordinate alignment. Finally, the solution of MSE is derived by using alternating optimization.

A. Part Optimization

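Given the $i$th view $X^{(i)} = [x_1^{(i)}, \ldots, x_n^{(i)}] \in \mathbb{R}^{m_i \times n}$, consider an arbitrary point $x_j^{(i)}$ and its $k$ related ones in the same view (e.g., nearest neighbors) $x_{j_1}^{(i)}, \ldots, x_{j_k}^{(i)}$; the patch of $x_j^{(i)}$ is defined as $X_j^{(i)} = [x_j^{(i)}, x_{j_1}^{(i)}, \ldots, x_{j_k}^{(i)}] \in \mathbb{R}^{m_i \times (k+1)}$. For $X_j^{(i)}$, there is a part mapping $f_j^{(i)} : X_j^{(i)} \to Y_j^{(i)}$, wherein $Y_j^{(i)} = [y_j^{(i)}, y_{j_1}^{(i)}, \ldots, y_{j_k}^{(i)}] \in \mathbb{R}^{d \times (k+1)}$. To preserve the locality in the projected low-dimensional space, the part optimization for the $j$th patch on the $i$th view is

$$\arg\min_{Y_j^{(i)}} \sum_{l=1}^{k} \left\| y_j^{(i)} - y_{j_l}^{(i)} \right\|^{2} \left( w_j^{(i)} \right)_l \tag{2}$$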
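
The patch construction and the objective in (2) can be sketched as follows in Python. This is an illustration under stated assumptions, not the authors' code: the helper names `build_patch`, `heat_kernel_weights`, and `part_objective` are hypothetical, neighbors are chosen by Euclidean distance, and the weights $(w_j^{(i)})_l$ are assumed to be heat-kernel weights with a width parameter `t`, since their definition does not appear in this excerpt.

```python
import numpy as np

def build_patch(X, j, k):
    """Indices [j, j_1, ..., j_k]: sample j followed by its k nearest neighbors.

    X has shape (m_i, n); each column is one sample of a single view.
    """
    d2 = np.sum((X - X[:, [j]]) ** 2, axis=0).astype(float)
    d2[j] = np.inf                     # exclude the point itself
    nbrs = np.argsort(d2)[:k]
    return np.concatenate(([j], nbrs))

def heat_kernel_weights(X, idx, t=1.0):
    """ASSUMED weights exp(-||x_j - x_{j_l}||^2 / t); not defined in this excerpt."""
    diffs = X[:, idx[1:]] - X[:, [idx[0]]]
    return np.exp(-np.sum(diffs ** 2, axis=0) / t)

def part_objective(Y_patch, w):
    """Evaluate sum_l ||y_j - y_{j_l}||^2 * w_l for Y_patch of shape (d, k+1)."""
    diffs = Y_patch[:, 1:] - Y_patch[:, [0]]
    return float(np.sum(np.sum(diffs ** 2, axis=0) * w))
```

Under this assumed weighting, neighbors that are close in the original feature space receive larger weights, so the objective penalizes an embedding most when it pulls apart points that were close in that view.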
