highly dependent on the chosen metric. However, these methods are designed for classification tasks and cannot be used directly for regression. Only a few metric learning methods have been proposed specifically for regression so far. A typical one is MLKR [14], which learns a metric specifically for kernel regression. Unfortunately, the improvement in regression performance achieved by MLKR is limited, and it is still difficult for it to match sophisticated methods such as support vector regression [15] on many datasets.
To explore metric learning further for regression tasks, we consider learning a metric by incorporating support vector regression (SVR), one of the most popular regression algorithms. The metric is also important for SVR, especially with kernels: typical kernels for SVR have no prior knowledge about the meaning of the features and are assumed to be isotropic. Therefore, we focus on learning an embedded metric in SVR to improve regression performance. We propose a corresponding learning algorithm, termed SVRML, which simultaneously minimizes the error on a validation set and enforces sparsity on the learned metric matrix. The learning process combines Mahalanobis [16] metric learning with the training of SVR. More importantly, to make the metric learned by SVRML more effective, we propose a bagging-like ensemble metric learning framework. It extends the original bagging algorithm [17] by taking a positive semi-definite matrix, rather than a classifier or regressor, as the base learner.
The proposed SVRML algorithm has the following desirable properties: (1) SVRML learns a sparse Mahalanobis metric that is capable of removing potential redundancy or noise in the data. (2) SVRML can learn multiple base metrics in parallel using the bagging-like ensemble metric learning framework and obtain an aggregated metric that achieves better generalization performance for SVR. (3) It is easy to implement and can be treated as an alternative feature selection method, providing a convenient way to pre-process the data automatically. The primary contributions of this work are therefore as follows: (1) We propose a task-dependent metric learning algorithm for SVR. (2) We develop an effective bagging-like ensemble metric learning framework in which the resampling mechanism of the original bagging is specially modified for SVRML.
The rest of this paper is organized as follows: we provide an overview of related work in Section 2. Section 3 explains how to learn an embedded metric for SVR. The bagging-like ensemble metric learning framework is discussed in detail in Section 4. Experimental studies are presented in Section 5. Finally, we draw conclusions and outline future work in Section 6.
2. Related works
Over the last decade, several task-dependent metric learning algorithms have been proposed [2,4,18,14]. However, only a few of them are designed specifically for regression tasks. Support vector regression, which is very popular for regression tasks, also depends heavily on the metric. To the best of our knowledge, our work is the first to combine metric learning with support vector regression. Our proposed method, SVRML, also belongs to the family of task-dependent distance metric learning.
Weinberger and Tesauro constructed a metric learning algorithm for kernel regression, termed MLKR [14], which learns a task-specific (pseudo-)metric over the input space such that small distances between two vectors imply similar target values. The metric in MLKR is learned by directly minimizing the leave-one-out regression error. Similarly, Xu et al. [19] proposed a metric learning algorithm for support vector classification that minimizes the 0–1 classification error. Inspired by these works, we consider learning a metric for SVR by minimizing the regression error on a validation set. One drawback of such approaches, however, is that they tend to overfit the validation data [8].
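To make this validation-error criterion concrete, the sketch below shows one way a candidate metric could be scored, assuming the metric is parameterized as $M = L^{\top}L$ so that mapping $x \mapsto Lx$ makes the ordinary Euclidean RBF kernel behave like a Mahalanobis one. This is only an illustration: it uses scikit-learn's standard $\varepsilon$-SVR rather than the L2-SVR used in this paper, and all names are ours, not part of SVRML's actual optimization procedure (described in Section 3).

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

def validation_error(L, X_train, y_train, X_val, y_val, C=1.0, epsilon=0.1):
    """Score a candidate transform L (so M = L^T L is the metric) by the
    regression error it induces on a held-out validation set."""
    # Mapping x -> Lx turns the Euclidean RBF kernel into a Mahalanobis
    # RBF kernel with matrix M = L^T L.
    svr = SVR(kernel="rbf", C=C, epsilon=epsilon)
    svr.fit(X_train @ L.T, y_train)
    return mean_squared_error(y_val, svr.predict(X_val @ L.T))

# With L equal to the identity matrix, this reduces to scoring a plain
# Euclidean RBF SVR on the validation set.
```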
As a remedy, ensemble learning is an alternative that can be combined with the metric learning process, since ensemble learning is able to improve the generalization performance of learning systems [20]. Some ensemble learning methods, such as boosting [21], have already been introduced into metric learning. For example, Shen et al. [22] proposed a boosting-based technique, BoostMetric, which learns a metric using trace-one rank-one matrices as weak learners. Chang [23] developed a metric base learner specific to the boosting framework by improving a loss function iteratively. Mu et al. [24] proposed a local discriminative metric ensemble learning algorithm. However, none of these methods focuses on regression tasks. To fill this gap, we propose a bagging-like ensemble framework designed specifically for SVRML to improve regression performance; a rough sketch of its flavor is given below. Unlike existing methods such as BoostMetric, which learn the base metrics iteratively, our framework retains the parallelism of bagging. In our framework, the resampling mechanism of the original bagging is specially modified for SVRML to achieve better performance.
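The following is a minimal sketch of the general bagging-like idea, under two simplifying assumptions that are ours rather than SVRML's: the ordinary bootstrap is used for resampling (SVRML modifies this mechanism, see Section 4), and the base metrics are aggregated by simple averaging, which preserves positive semi-definiteness. The `learn_base_metric` routine is a hypothetical placeholder for any metric learner that returns a $d \times d$ positive semi-definite matrix.

```python
import numpy as np

def bagging_like_metric(X, y, learn_base_metric, n_bags=10, seed=None):
    """Sketch: learn one PSD base metric per bootstrap resample and aggregate
    them by averaging (an average of PSD matrices is again PSD)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    base_metrics = []
    for _ in range(n_bags):
        idx = rng.choice(n, size=n, replace=True)            # bootstrap resample
        base_metrics.append(learn_base_metric(X[idx], y[idx]))
    return sum(base_metrics) / n_bags                        # aggregated metric
```

The averaging step only serves to keep the aggregated matrix a valid metric; the loop over resamples is embarrassingly parallel, which is the parallelism referred to above.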
In addition to the above, our work is also inspired by kernel-parameter selection methods for SVR. For example, Chang and Lin [25] derived various leave-one-out bounds for SVR parameter selection to improve generalization performance. Kernel-parameter selection for SVR can be viewed from the metric learning perspective, since adjusting the inner product leads to different distance metrics; for an RBF kernel, for instance, choosing the width parameter amounts to rescaling the squared Euclidean distance. Unlike choosing a single or a few kernel parameters, our method optimizes the entire metric matrix and learns a nonlinear metric.
3. Metric learning for support vector regression (SVRML)
3.1. Support vector regression
Our method is based on L2-SVR [15], one of the most commonly used varieties of SVR. Given a set of training examples $\{(x_i, y_i)\}_{i=1}^{\ell}$ of size $\ell$, where the input vector $x_i \in \mathbb{R}^d$ and the target value $y_i \in \mathbb{R}$, L2-SVR solves the primal problem:
$$
\begin{aligned}
\min_{w, b, \xi, \xi^*} \quad & \frac{1}{2} w^{\top} w + \frac{C}{2} \sum_{i=1}^{\ell} \xi_i^2 + \frac{C}{2} \sum_{i=1}^{\ell} (\xi_i^*)^2 \\
\text{s.t.} \quad & -\varepsilon - \xi_i \le w^{\top} \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad i = 1, \ldots, \ell.
\end{aligned}
\tag{1}
$$
In order to solve the above problem efficiently, in practice we solve the dual problem of (1) instead:
$$
\begin{aligned}
\min_{\alpha, \alpha^*} \quad & \frac{1}{2} (\alpha - \alpha^*)^{\top} \widetilde{K} (\alpha - \alpha^*) + \varepsilon \sum_{i=1}^{\ell} (\alpha_i + \alpha_i^*) - \sum_{i=1}^{\ell} y_i (\alpha_i - \alpha_i^*) \\
\text{s.t.} \quad & \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^*) = 0, \\
& \alpha_i, \alpha_i^* \ge 0, \quad i = 1, \ldots, \ell,
\end{aligned}
\tag{2}
$$
where $k(x_i, x_j) = \phi(x_i)^{\top} \phi(x_j)$ is the kernel function, $\widetilde{K} = K + I/C$, $K$ is the $\ell \times \ell$ kernel matrix with $K_{ij} = k(x_i, x_j)$, and $I$ is the $\ell \times \ell$ identity matrix. The final prediction function is
$$
g(x) = w^{\top} \phi(x) + b = \sum_{i=1}^{\ell} (\alpha_i - \alpha_i^*) \, k(x_i, x) + b.
\tag{3}
$$
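As a small numerical illustration of (3), the following sketch computes $g(x)$ for a batch of query points, assuming the dual variables $\alpha$, $\alpha^*$ and the offset $b$ have already been obtained from some QP solver for (2); the function and argument names are ours and not part of any particular library.

```python
import numpy as np

def svr_predict(kernel, X_train, alpha, alpha_star, b, X_new):
    """Evaluate g(x) = sum_i (alpha_i - alpha_i^*) k(x_i, x) + b, as in Eq. (3).

    `kernel(Xi, Xj)` must return the matrix of pairwise kernel values
    (shape: len(Xi) x len(Xj)); `alpha` and `alpha_star` have one entry
    per training example.
    """
    K = kernel(X_train, X_new)            # shape (l, m)
    return (alpha - alpha_star) @ K + b   # one prediction per query point
```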
For convenience of exposition, we no longer distinguish L2-SVR from SVR in the following sections. Many kernel functions are used for SVR; in fact, any function $k(\cdot, \cdot)$ can serve as a well-defined kernel as long as it is positive semi-definite. In this paper, we uniformly use the RBF kernel due to its popularity and the particular property that it depends directly on the distance function. The RBF kernel is defined as follows:
$$
k(x_i, x_j) = \exp\!\left\{ -d^2(x_i, x_j) \right\},
\tag{4}
$$
where $d(\cdot, \cdot)$ is the distance metric of the data. In the standard RBF kernel, $d^2(x_i, x_j)$ is commonly the squared Euclidean distance scaled by a kernel width parameter $\sigma$ ($\sigma > 0$). When training the SVR, the prediction performance can be improved by choosing an effective parameter $\sigma$.
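To make the connection between the kernel and the metric explicit, the sketch below implements (4) first with the scaled squared Euclidean distance and then with a Mahalanobis distance $d^2(x_i, x_j) = (x_i - x_j)^{\top} M (x_i - x_j)$ defined by a positive semi-definite matrix $M$, which is the kind of metric SVRML embeds in the kernel. This is only an illustrative sketch under our own naming; the exact parameterization and learning of $M$ are described in the remainder of Section 3.

```python
import numpy as np

def rbf_kernel_euclidean(Xi, Xj, sigma=1.0):
    """k(x_i, x_j) = exp(-||x_i - x_j||^2 / sigma): the usual isotropic RBF kernel."""
    diff = Xi[:, None, :] - Xj[None, :, :]            # shape (l, m, d)
    return np.exp(-(diff ** 2).sum(-1) / sigma)

def rbf_kernel_mahalanobis(Xi, Xj, M):
    """k(x_i, x_j) = exp(-(x_i - x_j)^T M (x_i - x_j)) for a PSD matrix M."""
    diff = Xi[:, None, :] - Xj[None, :, :]            # shape (l, m, d)
    sq_dists = np.einsum("abi,ij,abj->ab", diff, M, diff)
    return np.exp(-sq_dists)

# With M = I / sigma, the Mahalanobis kernel reduces to the Euclidean one,
# so learning M generalizes the usual choice of a single kernel width.
```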