中国科技论文在线
http://www.paper.edu.cn
A Distance Metric Learning Method for Multi-dimensional Classification
Zhongchen Ma, Songcan Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract: Multi-dimensional classification (MDC) refers to learning the relationship between individual input samples and their multi-dimensional discrete output variables. Different output dimensions can have different ranges and correspond to heterogeneous semantic structures. MDC is therefore more general and more challenging than multi-label classification (MLC). One of the keys to effective learning lies in how to fully exploit the explicit and/or implicit relationships both between and within dimensions. To this end, an effective strategy is to first transform the output space and then learn in the transformed space. This paper first proposes a new transformation with the following favorable characteristics: i) subsequent learning in the transformed space is relatively easier than in the original space, ii) it reflects the explicit within-dimension relationships, iii) it keeps the output space size invariant, iv) it overcomes the drawbacks of existing MDC transformation methods, and v) it is decomposable for each output dimension of MDC. Next, to effectively model the dependencies of the transformed problem, we propose a novel Mahalanobis distance metric learning method that admits a closed-form solution. Interestingly, the method is of independent interest in its own right. Finally, extensive experiments show that combining the above two procedures outperforms the state-of-the-art MDC methods in classification performance in most cases, while on MLC data sets our distance metric learning method by itself achieves classification performance competitive with methods designed specifically for MLC.
Keywords: computer application technology, multi-dimensional classification, problem transformation, distance metric learning, closed-form solution
CLC number: Computer Engineering
A distance metric learning method for
multi-dimensional classification
Zhongchen Ma , Songcan Chen
College of Computer Science and Technology, Nanjing University of Aeronautics and
Astronautics (NUAA), Nanjing 211106, China
Abstract: Multi-dimensional classification (MDC) refers to learning an association between
individual inputs and their multi-dimensional discrete output variables. Different output
dimensions can have different ranges and are usually heterogeneous, corresponding to
different semantics. Thus MDC is more general and more challenging than multi-label
classification (MLC). One of the keys to successful learning lies in how to take good
advantage of the explicit and/or implicit relationships both between and within dimensions.
To this end, an effective strategy is to first transform the output space and then learn in the
transformed space. In this paper, we first propose a new transformation approach that
possesses the following favorable characteristics: i) it is relatively easier for
Foundations: National Natural Science Foundation of China under Grant No. 61672281; The Specialized Research Fund for the Doctoral Program of Higher Education under Grant No. 20133218110032
Author Introduction: Zhongchen Ma (1990-), male, major research direction: pattern classification. Corresponding author: Songcan Chen (1962-), male, professor, major research direction: pattern classification, information science.
subsequent learning in the transformed space, ii) it can reflect the explicit within-dimension
relationships, iii) it can keep the output space size invariant, iv) it can overcome the
drawbacks of existing transformation approaches for MDC, and v) it is decomposable for each
output dimension of MDC. Next, to effectively model the dependencies of the transformed
problem, we present a novel Mahalanobis distance metric learning method for which a
closed-form solution can be obtained. Interestingly, the method itself can be of independent
interest. Finally, we conduct extensive experiments whose results show that our approach,
combining the above two procedures, can beat the state-of-the-art MDC methods in most
cases in terms of classification performance, while on MLC data sets our distance metric
learning method itself obtains competitive classification performance compared to its
counterparts designed specifically for MLC.
Key words: computer application technology, multi-dimensional classification, problem
transformation, distance metric learning, closed-form solution
0 Introduction
Multi-dimensional classification (MDC) refers to learning an association between individual
inputs and their multi-dimensional discrete output variables. It can be applied to a variety
of real-world problems. For example, in computer vision [1], a landscape image may convey
various kinds of information such as the month, the season, or the type of subject; in
information retrieval [2][3], documents can be classified into different kinds of categories
such as mood or topic; in computational advertising [4], a social media post may reveal the
user's gender, age, personality, happiness, or political polarity.
Clearly, MDC problems allow class variables to be heterogeneous (different semantics) and
diverse (different ranges), such as age and gender. Therefore, the MDC task is more general
than multi-label classification (MLC), which only involves binary class variables, namely
labels; consequently, MDC admits a larger number of possible class combinations for the same
number of output dimensions.
Formally, for m class variables taking K_1, K_2, ..., K_m possible values respectively, there
are (K_1 × K_2 × ··· × K_m) possible classes in total in the MDC setting, compared to 2^m
possible classes in the MLC setting. In fact, mathematically, MDC has a super-exponentially
large number of possible classes, while multi-label classification has an exponentially large
number. Naturally, MLC problems are special cases of MDC problems, but the reverse is
not true.
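As a concrete check of these counts, the following sketch computes the joint class-space sizes for a hypothetical problem (the dimension sizes are made up for illustration):

```python
from functools import reduce
from operator import mul

def mdc_class_count(dims):
    """Number of joint classes in MDC: K_1 * K_2 * ... * K_m."""
    return reduce(mul, dims, 1)

def mlc_class_count(m):
    """Number of joint label combinations in MLC with m binary labels: 2^m."""
    return 2 ** m

# Hypothetical example: three output dimensions with 3, 4, and 2 possible values.
dims = [3, 4, 2]
print(mdc_class_count(dims))       # 3 * 4 * 2 = 24
print(mlc_class_count(len(dims)))  # 2^3 = 8
```

With every K_j = 2 the two counts coincide, which is exactly the sense in which MLC is a special case of MDC.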
Like MLC, the core goal of MDC is to achieve effective classification performance by
modeling the output structure. The simplest modeling assumption is that the class variables are
completely unrelated, so that it suffices to design a separate, independent model for each class.
However, such an idealized assumption hardly holds for real-world problems in general, as
correlation (structure) often exists among class variables; for example, a user's age can have
a strong impact on his or her political polarity, with the young generally more radical and the
elderly often more conservative. Even within each output dimension, there exist explicit within-
dimension relationships among its values, since the values of a class variable are mutually
exclusive. Therefore, one of the keys to effective learning lies in how to take sufficient
advantage of the explicit and/or implicit relationships both among output dimensions and among
values within each output dimension.
To model such output structures, two main strategies have been proposed: (i)
explicitly modeling the dependence structures between class variables, e.g., by imposing a chain
structure [5][6][7], using a multi-dimensional Bayesian network structure [8][9], or adopting a
Markov random field [10]; (ii) implicitly modeling the output structure via transformation
approaches [11][12][13].
A major limitation of the former strategy is that it requires a pre-defined output structure
(e.g., a chain or a Bayesian network), thus partly losing flexibility in characterizing structure. In
contrast, the transformation approach of the latter strategy enjoys more flexibility due to its
ability to model various structures. Moreover, such a transformation method has demonstrated
convincing performance in [13]. Therefore, in this paper, we follow the transformation strategy
to model the output structures of MDC.
To the best of our knowledge, all the existing transformation methods can be classified as
label power-set (LP)-based transformation approaches. LP [11] transforms the MDC problem
into a corresponding multi-class classification problem by defining a new compound class
variable whose range contains exactly all the possible combinations of values of the original
class variables. Though it implicitly considers the interaction between different classes, LP
suffers from class imbalance and class overfitting problems, where class imbalance refers to
the great differences in the number of instances across classes and class overfitting refers to
classes with zero instances. To address these issues of LP, [13] proposed to first form
super-class partitions by modeling the dependence between class variables and then make each
super-class partition correspond to a compound class variable defined by LP. Although this
super-class partitioning can reduce the original problem to a set of subproblems, these newly
formed subproblems still need to be transformed by LP, so the approach naturally inherits
LP's problems.
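The LP transformation and its two failure modes can be sketched as follows. This is a minimal illustration with made-up label vectors; `lp_transform` and the sample data are our own sketch, not code from [11]:

```python
from collections import Counter
from itertools import product

def lp_transform(Y):
    """Map each multi-dimensional label vector to one compound class: its value tuple."""
    return [tuple(y) for y in Y]

# Hypothetical training labels: two dimensions with ranges {0, 1, 2} and {0, 1}.
Y = [(0, 0), (0, 0), (0, 0), (0, 0), (1, 1), (2, 0)]
counts = Counter(lp_transform(Y))

# Class imbalance: (0, 0) has 4 instances while (1, 1) and (2, 0) have 1 each.
print(counts)

# Class "overfitting": value combinations that never occur get zero training instances,
# yet LP still treats each of them as a distinct class.
all_combos = set(product(range(3), range(2)))
unseen = all_combos - set(counts)
print(sorted(unseen))  # (0, 1), (1, 0), (2, 1) never appear in Y
```

The super-class partitioning of [13] shrinks each compound variable to a subset of dimensions, but within each partition the same enumeration, and hence the same two problems, reappears.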
In this paper, we first present a novel transformation approach that removes the above
drawbacks of the LP-based transformation approach. Specifically, the output space of the
original MDC problem is transformed into {0, 1}^L with some additionally imposed constraints
(see Section 3 for details), where L is the dimensionality of the transformed problem. In this
way, our transformation approach possesses the following favorable characteristics: i) it can
keep the space size of MDC invariant, ii) it can reflect the explicit within-dimension
relationships, iii)
it is relatively easier for subsequent modeling in the transformed space, iv) it can overcome
the class overfitting and class imbalance problems suffered by the LP-based transformation
approach, and v) it is decomposable for each output dimension of MDC.
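The precise definition of the transformation is given in Section 3, outside this excerpt. One reading consistent with the listed properties (exclusive within-dimension values, decomposability per dimension, size preservation) is a per-dimension one-hot encoding into {0, 1}^L with L = K_1 + ··· + K_m, constrained to exactly one 1 per dimension block. The sketch below encodes that assumption; `encode`, `decode`, and the dimension sizes are ours, not the authors' definition:

```python
def encode(y, dims):
    """One-hot encode each output dimension; exactly one 1 per block (our assumed MLKT)."""
    out = []
    for value, K in zip(y, dims):
        block = [0] * K
        block[value] = 1  # within-dimension exclusivity: one active value per dimension
        out.extend(block)
    return out

def decode(z, dims):
    """Invert the encoding dimension by dimension (decomposable per output dimension)."""
    y, start = [], 0
    for K in dims:
        y.append(z[start:start + K].index(1))
        start += K
    return tuple(y)

dims = [3, 2]             # hypothetical: K_1 = 3, K_2 = 2, so L = 5
z = encode((2, 0), dims)
print(z)                  # [0, 0, 1, 1, 0]
print(decode(z, dims))    # (2, 0)
```

Under this reading, every original class vector maps to exactly one constrained binary vector and back, so the output space size is preserved rather than blown up as in LP.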
At first glance, our transformed problem seems similar to an MLC problem, since each
class variable is converted to binary variables (hence we name it the MLC-like transformation
approach, MLKT). However, a major difference from MLC is that our newly transformed problem
is subject to additional constraints, as described in a later section. To the best of our
knowledge, this is the first use of this transformation approach for MDC.
The key to our next step is to learn effectively in the transformed space by modeling the
output structure. Considering that our transformed problem is partially similar to MLC, we can
turn to MLC methods for inspiration in developing a method well suited to our scenario. At
present, the methods for modeling output structure in MLC include probabilistic graphical
models [14, 15], output kernel methods [16], Mahalanobis distance metric methods [17, 18], and
others [19, 20, 21, 22, 23, 24, 25]. Among them, the Mahalanobis distance metric based methods
[17] and [18] are better suited to our scenario owing to their superiority in computational
efficiency and classification performance. Unfortunately, however, neither [17] nor [18] is
directly applicable to our scenario, because their training and prediction procedures would have
to be re-designed, which is quite non-trivial. Therefore, instead of adapting them to our
scenario, we develop a novel alternative metric learning method that admits a closed-form
solution. Moreover, our metric learning method is also naturally applicable to MLC, and thus
can be of independent interest as well.
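The paper's own closed-form learner is presented in Section 4, outside this excerpt. As background only, the following sketch shows a generic Mahalanobis distance parameterized by a positive semi-definite matrix M, with the classical closed-form choice M = Σ⁻¹ (the inverse sample covariance); the data are synthetic:

```python
import numpy as np

def mahalanobis(x, y, M):
    """d_M(x, y) = sqrt((x - y)^T M (x - y)) for a positive semi-definite M."""
    d = x - y
    return float(np.sqrt(d @ M @ d))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))               # toy data, not from the paper
M = np.linalg.inv(np.cov(X, rowvar=False))  # classical closed form: inverse covariance

# With M = I the Mahalanobis distance reduces to the Euclidean distance.
x, y = X[0], X[1]
print(mahalanobis(x, y, np.eye(3)))
print(float(np.linalg.norm(x - y)))
```

Metric learning methods such as [17] and [18] instead fit M to the supervision signal; what distinguishes the method of this paper is that its M is obtained in closed form for the constrained transformed outputs.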
Finally, extensive experimental results show that our approach combining the above
two procedures achieves better classification accuracy than the state-of-the-art methods, while
our metric learning method itself also obtains competitive classification performance on MLC
data sets compared to its counterparts designed specifically for MLC.
In summary, our contributions are as follows:
• We present a new transformation approach for MDC, namely MLKT, which possesses
the following favorable characteristics: i) it can keep the space size of MDC invariant,
ii) it can reflect the explicit within-dimension relationships, iii) it is relatively easier for
subsequent modeling in the transformed space, iv) it can overcome the class overfitting
and class imbalance problems suffered by the LP-based transformation approach, and v)
it is decomposable for each output dimension of MDC.
• We also present a novel distance metric learning method for the transformed problem.
Moreover, it has a closed-form solution and is also applicable to MLC.
• Empirical evaluations show that our method can beat state-of-the-art MDC methods in
most cases, and our metric learning method itself has competitive classification performance
in comparison with its counterparts designed specifically for MLC.
The rest of the paper is structured as follows. Section 2 introduces the required background
in multi-dimensional classification. Section 3 introduces MLKT. Section 4 presents the details
of the distance metric learning method. Section 5 experimentally evaluates the proposed
schemes. Finally, Section 6 gives concluding remarks.
1 Background
In this section, we review basic multi-dimensional classifiers.
In MDC, we have N labeled instances D = {(x_i, y_i)}_{i=1}^N from which we wish to build
a classifier that associates multiple class values with each data instance. A data instance
is represented by a vector of d values x = (x_1, ..., x_d), drawn from some input domain
X_1 × ··· × X_d. The classes are represented by a vector of m values y = (y_1, ..., y_m) from
the domain Y_1 × ··· × Y_m, where each Y_j = {1, ..., K_j} is the set of possible values for the
jth class variable. Specifically, we seek to build a classifier f that assigns each instance x a
vector y of class values:

    f : X_1 × ··· × X_d → Y_1 × ··· × Y_m
    x = (x_1, ..., x_d) ↦ y = (y_1, ..., y_m)
The independent classifiers method (IC) is a straightforward method for MDC. It trains m
classifiers f := (f_1, ..., f_m), one for each class variable. Specifically, a standard multi-class
classifier f_j : X_1 × ··· × X_d → Y_j learns to associate one of the values y_j ∈ Y_j with each
data instance. However, IC is unable to capture the dependencies among classes and suffers
from low accuracy, as illustrated in [5, 26, 27].
MDC has attracted increasing attention recently, and many multi-dimensional classifiers for
modeling the output structure of MDC have been proposed in recent years. As presented in
the introduction, there are two main strategies:
1. Explicit representation of the dependence structure between class variables.
The classifier chains model (CC) [5, 6, 7, 28], classifier trellises (CT) [26], and multi-dimensional
Bayesian network classifiers (MBCs) [8, 9] are recently proposed methods following this
strategy for MDC. Specifically,
the classifier chains model (CC) learns m classifiers, one for each class variable. These
classifiers are linked in a random order, such that the jth classifier uses as input features not
only the instance but also the output predictions of the previous j − 1 classifiers, namely