中国科技论文在线
http://www.paper.edu.cn
A Distance Metric Learning Method for Multi-dimensional Classification
Zhongchen Ma, Songcan Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract: Multi-dimensional classification (MDC) refers to learning the relationship between individual input samples and their multi-dimensional discrete output variables. Different output dimensions can have different ranges and correspond to heterogeneous semantic structures. MDC is therefore more general and more challenging than multi-label classification (MLC). One of the keys to effective learning lies in how to fully exploit the explicit and/or implicit relationships both between and within dimensions. To this end, an effective strategy is to first transform the output space and then learn in the transformed space. This paper first proposes a new transformation with the following favorable characteristics: i) subsequent learning in the transformed space is relatively easier than in the original space, ii) it reflects the explicit within-dimension relationships, iii) it keeps the output space size invariant, iv) it overcomes the drawbacks of existing MDC transformation methods, and v) it is decomposable for each output dimension of MDC. Next, to effectively model the dependencies of the transformed problem, we propose a novel Mahalanobis distance metric learning method that admits a closed-form solution. Interestingly, the method is of independent interest in its own right. Finally, extensive experiments show that combining the above two procedures outperforms the state-of-the-art MDC methods in classification performance in most cases, while on MLC data sets our distance metric learning method by itself achieves classification performance competitive with methods designed specifically for MLC.
Keywords: computer application technology, multi-dimensional classification, problem transformation, distance metric learning, closed-form solution
CLC number: Computer Engineering
A distance metric learning method for
multi-dimensional classification
Zhongchen Ma , Songcan Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and
Astronautics (NUAA), Nanjing 211106, China
Abstract: Multi-dimensional classification (MDC) refers to learning an association between
individual inputs and their multi-dimensional discrete output variables. Different output
dimensions can have different ranges and are usually heterogeneous, corresponding to
different semantics. Thus MDC is more general and more challenging than multi-label
classification (MLC). One of the keys to successful learning lies in how to take good
advantage of the explicit and/or implicit relationships both between and within dimensions.
To this end, an effective strategy is to first transform the output space and then learn in the
transformed space. In this paper, we first propose a new transformation approach that
possesses the following favorable characteristics: i) it is relatively easier for
Foundations: National Natural Science Foundation of China under Grant No. 61672281; The Specialized Research Fund for the Doctoral Program of Higher Education under Grant No. 20133218110032
Author Introduction: Zhongchen Ma (1990-), male, major research direction: pattern classification. Corresponding author: Songcan Chen (1962-), male, professor, major research direction: pattern classification, information science.
subsequent learning in the transformed space, ii) it can reflect the explicit within-dimension
relationships, iii) it can keep the output space size invariant, iv) it can overcome the
drawbacks of existing transformation approaches for MDC, and v) it is decomposable for each
output dimension of MDC. Next, to effectively model the dependencies of the transformed
problem, we present a novel Mahalanobis distance metric learning method for which a
closed-form solution can be obtained. Interestingly, the method itself can be of independent
interest. Finally, we conduct extensive experiments whose results show that our approach,
combining the above two procedures, can beat the state-of-the-art MDC methods in most
cases in terms of classification performance, while on MLC data sets our distance metric
learning method itself obtains competitive classification performance compared to its
counterparts designed specifically for MLC.
Key words: computer application technology, multi-dimensional classification, problem
transformation, distance metric learning, closed-form solution
0 Introduction
Multi-dimensional classification (MDC) refers to learning an association between individual
inputs and their multi-dimensional discrete output variables. It can be applied to a variety
of real-world problems. For example, in computer vision [1], a landscape image may convey
various kinds of information such as the month, the season, or the type of subject; in
information retrieval [2][3], documents can be classified into different kinds of categories
such as mood or topic; in computational advertising [4], a social media post may reveal the
user's gender, age, personality, happiness, or political polarity.
Clearly, MDC problems allow class variables to be heterogeneous (different semantics) and
diverse (different ranges), such as age and gender. Therefore, the MDC task is more general
than multi-label classification (MLC), which only involves binary class variables, namely
labels; consequently, MDC admits a larger number of possible class combinations for the same
number of output dimensions.
Formally, for m class variables taking K_1, K_2, ..., K_m possible values respectively, there
are (K_1 × K_2 × ··· × K_m) possible classes in total in the MDC setting, compared to 2^m
possible classes in the MLC setting. In fact, mathematically, MDC has a super-exponentially
large number of possible classes, while multi-label classification has an exponentially large
number. Naturally, MLC problems are special cases of MDC problems, but the reverse is
not true.
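As a concrete check of these counts, the following sketch computes the joint class-space sizes for a hypothetical problem (the dimension sizes are made up for illustration):

```python
from functools import reduce
from operator import mul

def mdc_class_count(dims):
    """Number of joint classes in MDC: K_1 * K_2 * ... * K_m."""
    return reduce(mul, dims, 1)

def mlc_class_count(m):
    """Number of joint label combinations in MLC with m binary labels: 2^m."""
    return 2 ** m

# Hypothetical example: three output dimensions with 3, 4, and 2 possible values.
dims = [3, 4, 2]
print(mdc_class_count(dims))       # 3 * 4 * 2 = 24
print(mlc_class_count(len(dims)))  # 2^3 = 8
```

With every K_j = 2 the two counts coincide, which is exactly the sense in which MLC is a special case of MDC.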
Like MLC, the core goal of MDC is to achieve effective classification performance by
modeling the output structure. The simplest modeling assumption is that the class variables are
completely unrelated, so that it suffices to design a separate, independent model for each class.
However, such an idealized assumption hardly holds for real-world problems in general, as
correlation (structure) often exists among class variables; for example, a user's age can have
a strong impact on his or her political polarity, with the young generally more radical and the
elderly often more conservative. Even within each output dimension, there exist explicit within-
dimension relationships among its values, since the values of a class variable are mutually
exclusive. Therefore, one of the keys to effective learning lies in how to take sufficient
advantage of the explicit and/or implicit relationships both among output dimensions and among
values within each output dimension.
To model such output structures, two main strategies have been proposed: (i)
explicitly modeling the dependence structures between class variables, e.g., by imposing a chain
structure [5][6][7], using a multi-dimensional Bayesian network structure [8][9], or adopting a
Markov random field [10]; (ii) implicitly modeling the output structure via transformation
approaches [11][12][13].
A major limitation of the former strategy is that it requires a pre-defined output structure
(e.g., a chain or a Bayesian network), thus partly losing flexibility in characterizing structure. In
contrast, the transformation approach of the latter strategy enjoys more flexibility due to its
ability to model various structures. Moreover, such a transformation method has demonstrated
convincing performance in [13]. Therefore, in this paper, we follow the transformation strategy
to model the output structures of MDC.
To the best of our knowledge, all the existing transformation methods can be classified as
label power-set (LP)-based transformation approaches. LP [11] transforms the MDC problem
into a corresponding multi-class classification problem by defining a new compound class
variable whose range contains exactly all the possible combinations of values of the original
class variables. Though it implicitly considers the interaction between different classes, LP
suffers from class imbalance and class overfitting problems, where class imbalance refers to
the great differences in the number of instances across classes and class overfitting refers to
classes with zero instances. To address these issues of LP, [13] proposed to first form
super-class partitions by modeling the dependence between class variables and then make each
super-class partition correspond to a compound class variable defined by LP. Although this
super-class partitioning can reduce the original problem to a set of subproblems, these newly
formed subproblems still need to be transformed by LP, so the approach naturally inherits
LP's problems.
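The LP transformation and its two failure modes can be sketched as follows. This is a minimal illustration with made-up label vectors; `lp_transform` and the sample data are our own sketch, not code from [11]:

```python
from collections import Counter
from itertools import product

def lp_transform(Y):
    """Map each multi-dimensional label vector to one compound class: its value tuple."""
    return [tuple(y) for y in Y]

# Hypothetical training labels: two dimensions with ranges {0, 1, 2} and {0, 1}.
Y = [(0, 0), (0, 0), (0, 0), (0, 0), (1, 1), (2, 0)]
counts = Counter(lp_transform(Y))

# Class imbalance: (0, 0) has 4 instances while (1, 1) and (2, 0) have 1 each.
print(counts)

# Class "overfitting": value combinations that never occur get zero training instances,
# yet LP still treats each of them as a distinct class.
all_combos = set(product(range(3), range(2)))
unseen = all_combos - set(counts)
print(sorted(unseen))  # (0, 1), (1, 0), (2, 1) never appear in Y
```

The super-class partitioning of [13] shrinks each compound variable to a subset of dimensions, but within each partition the same enumeration, and hence the same two problems, reappears.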
In this paper, we first present a novel transformation approach that removes the above
drawbacks of the LP-based transformation approach. Specifically, the output space of the
original MDC problem is transformed into {0, 1}^L with some additionally imposed constraints
(see Section 3 for details), where L is the dimensionality of the transformed problem. In this
way, our transformation approach possesses the following favorable characteristics: i) it can
keep the space size of MDC invariant, ii) it can reflect the explicit within-dimension
relationships, iii)
it is relatively easier for subsequent modeling in the transformed space, iv) it can overcome
the class overfitting and class imbalance problems suffered by the LP-based transformation
approach, and v) it is decomposable for each output dimension of MDC.
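The precise definition of the transformation is given in Section 3, outside this excerpt. One reading consistent with the listed properties (exclusive within-dimension values, decomposability per dimension, size preservation) is a per-dimension one-hot encoding into {0, 1}^L with L = K_1 + ··· + K_m, constrained to exactly one 1 per dimension block. The sketch below encodes that assumption; `encode`, `decode`, and the dimension sizes are ours, not the authors' definition:

```python
def encode(y, dims):
    """One-hot encode each output dimension; exactly one 1 per block (our assumed MLKT)."""
    out = []
    for value, K in zip(y, dims):
        block = [0] * K
        block[value] = 1  # within-dimension exclusivity: one active value per dimension
        out.extend(block)
    return out

def decode(z, dims):
    """Invert the encoding dimension by dimension (decomposable per output dimension)."""
    y, start = [], 0
    for K in dims:
        y.append(z[start:start + K].index(1))
        start += K
    return tuple(y)

dims = [3, 2]             # hypothetical: K_1 = 3, K_2 = 2, so L = 5
z = encode((2, 0), dims)
print(z)                  # [0, 0, 1, 1, 0]
print(decode(z, dims))    # (2, 0)
```

Under this reading, every original class vector maps to exactly one constrained binary vector and back, so the output space size is preserved rather than blown up as in LP.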
At first glance, our transformed problem seems similar to an MLC problem, since each
class variable is converted to binary variables (hence we name it the MLC-like transformation
approach, MLKT). However, a major difference from MLC is that our newly transformed problem
is subject to additional constraints, as described in a later section. To the best of our
knowledge, this is the first use of this transformation approach for MDC.
The key to our next step is to learn effectively in the transformed space by modeling the
output structure. Considering that our transformed problem is partially similar to MLC, we can
turn to MLC methods for inspiration in developing a method well suited to our scenario. At
present, the methods for modeling output structure in MLC include probabilistic graphical
models [14, 15], output kernel methods [16], Mahalanobis distance metric methods [17, 18], and
others [19, 20, 21, 22, 23, 24, 25]. Among them, the Mahalanobis distance metric based methods
[17] and [18] are better suited to our scenario owing to their superiority in computational
efficiency and classification performance. Unfortunately, however, neither [17] nor [18] is
directly applicable to our scenario, because their training and prediction procedures would have
to be re-designed, which is quite non-trivial. Therefore, instead of adapting them to our
scenario, we develop a novel alternative metric learning method that admits a closed-form
solution. Moreover, our metric learning method is also naturally applicable to MLC, and thus
can be of independent interest as well.
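The paper's own closed-form learner is presented in Section 4, outside this excerpt. As background only, the following sketch shows a generic Mahalanobis distance parameterized by a positive semi-definite matrix M, with the classical closed-form choice M = Σ⁻¹ (the inverse sample covariance); the data are synthetic:

```python
import numpy as np

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semi-definite M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))               # toy data, not from the paper
M = np.linalg.inv(np.cov(X, rowvar=False))  # classical closed form: inverse covariance

# With M = I the Mahalanobis distance reduces to the Euclidean distance.
x, y = X[0], X[1]
print(mahalanobis(x, y, np.eye(3)))
print(float(np.linalg.norm(x - y)))
```

Metric learning methods such as [17] and [18] instead fit M to the supervision signal; what distinguishes the method of this paper is that its M is obtained in closed form for the constrained transformed outputs.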
Finally, extensive experimental results show that our approach combining the above
two procedures achieves better classification accuracy than the state-of-the-art methods, while
our metric learning method itself also obtains competitive classification performance on MLC
data sets compared to its counterparts designed specifically for MLC.
In summary, our contributions are as follows:
• We present a new transformation approach for MDC, namely MLKT, which possesses
the following favorable characteristics: i) it can keep the space size of MDC invariant,
ii) it can reflect the explicit within-dimension relationships, iii) it is relatively easier for
subsequent modeling in the transformed space, iv) it can overcome the class overfitting
and class imbalance problems suffered by the LP-based transformation approach, and v)
it is decomposable for each output dimension of MDC.
• We also present a novel distance metric learning method for the transformed problem.
Moreover, it has a closed-form solution and is also applicable to MLC.
• Empirical evaluations show that our method can beat state-of-the-art MDC methods in
most cases, and our metric learning method itself has competitive classification performance
in comparison with its counterparts designed specifically for MLC.
The rest of the paper is structured as follows. Section 2 introduces the required background
in multi-dimensional classification. Section 3 introduces MLKT. Section 4 presents the details
of the distance metric learning method. Section 5 experimentally evaluates the proposed
schemes. Finally, Section 6 gives concluding remarks.
1 Background
In this section, we review basic multi-dimensional classifiers.
In MDC, we have N labeled instances D = {(x_i, y_i)}_{i=1}^N from which we wish to build
a classifier that associates multiple class values with each data instance. A data instance
is represented by a vector of d values x = (x_1, ..., x_d), drawn from some input domain
X_1 × ··· × X_d. The classes are represented by a vector of m values y = (y_1, ..., y_m) from
the domain Y_1 × ··· × Y_m, where each Y_j = {1, ..., K_j} is the set of possible values for the
jth class variable. Specifically, we seek to build a classifier f that assigns each instance x a
vector y of class values:

    f : X_1 × ··· × X_d → Y_1 × ··· × Y_m
    x = (x_1, ..., x_d) ↦ y = (y_1, ..., y_m)
The independent classifiers method (IC) is a straightforward method for MDC. It trains m
classifiers f := (f_1, ..., f_m), one for each class variable. Specifically, a standard multi-class
classifier f_j : X_1 × ··· × X_d → Y_j learns to associate one of the values y_j ∈ Y_j with each
data instance. However, IC is unable to capture the dependencies among classes and suffers
from low accuracy, as illustrated in [5, 26, 27].
MDC has attracted increasing attention recently, and many multi-dimensional classifiers for
modeling the output structure of MDC have been proposed in recent years. As presented in
the introduction, there are two main strategies:
1. Explicit representation of the dependence structure between class variables.
The classifier chains model (CC) [5, 6, 7, 28], classifier trellises (CT) [26], and multi-dimensional
Bayesian network classifiers (MBCs) [8, 9] are recently proposed methods following this
strategy for MDC. Specifically,
the classifier chains model (CC) learns m classifiers, one for each class variable. These
classifiers are linked in a random order, such that the jth classifier uses as input features not
only the instance but also the output predictions of the previous j − 1 classifiers, namely