these AdaBoost. Still, some difficulties remain. When
Decision Trees are used as component classifiers, what
is a suitable tree size? When Radial Basis
Function (RBF) Neural Networks are used as component
classifiers, how could the complexity be controlled to avoid
overfitting? Moreover, how should we decide on the optimal
number of centers and the widths of the RBFs? All of these
have to be carefully tuned for AdaBoost to achieve good
performance. Furthermore, diversity is known to be an
important factor which affects the generalization perfor-
mance of Ensemble classifiers (Melville and Mooney, 2005;
Kuncheva and Whitaker, 2003). Several methods have been
proposed to quantify diversity (Kuncheva and Whitaker,
2003; Windeatt, 2005). It is also known that there is
an accuracy/diversity dilemma in AdaBoost (Dietterich,
2000): the more accurate two component classifiers
become, the less they can disagree with each other. Only
when accuracy and diversity are well balanced can
AdaBoost demonstrate excellent
generalization performance. However, the existing Ada-
Boost algorithms do not explicitly take sufficient measures
to deal with this problem.
The Support Vector Machine (SVM) (Vapnik, 1998) is
developed from the theory of Structural Risk Minimiza-
tion. By using a kernel trick to map the training samples
from an input space to a high-dimensional feature space,
SVM finds an optimal separating hyperplane in the feature
space and uses a regularization parameter, C, to control its
model complexity and training error. One of the popular
kernels used in SVM is the RBF kernel, which has a
parameter known as the Gaussian width, σ. In contrast to
RBF networks, SVM with the RBF kernel (RBFSVM in
short) can automatically determine the number and
location of the centers and the weight values (Scholkopf
et al., 1997). Also, it can effectively avoid overfitting by
selecting proper values of C and σ. From the performance
analysis of RBFSVM (Valentini and Dietterich, 2004), we
know that σ is a more important parameter than C:
although RBFSVM cannot learn well when a very low
value of C is used, its performance largely depends on the σ
value if a roughly suitable C is given. This means that, over
a range of suitable C, the performance of RBFSVM can be
changed by simply adjusting the value of σ.
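As a rough illustration of this point (a minimal sketch of our own, not the experiment of Valentini and Dietterich (2004); it assumes scikit-learn, whose RBF kernel parameter relates to the Gaussian width by gamma = 1/(2σ²)), one can hold a roughly suitable C fixed and sweep σ:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic binary problem, used only to make the sketch self-contained.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

C = 10.0  # a "roughly suitable" C, held fixed
for sigma in (0.1, 1.0, 10.0, 100.0):
    gamma = 1.0 / (2.0 * sigma ** 2)  # convert Gaussian width to sklearn's gamma
    acc = cross_val_score(SVC(C=C, kernel="rbf", gamma=gamma), X, y, cv=5).mean()
    print(f"sigma = {sigma:6.1f}   CV accuracy = {acc:.3f}")
```

Varying σ in this way typically moves the cross-validated accuracy across a wide range, whereas changes to C within a suitable band have a far smaller effect.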
Therefore, in this paper, we try to answer the following
questions: Can the SVM be used as an effective component
classifier in AdaBoost? If yes, what will be the general-
ization performance of this AdaBoost? Will this AdaBoost
show some advantages over the existing ones, especially on
the aforementioned problems? Furthermore, compared
with an individual SVM, what is the benefit of using
AdaBoost as a combination of multiple SVMs? In this
paper, RBFSVM is adopted as the component classifier for
AdaBoost. As mentioned above, the parameter σ in
RBFSVM has to be set beforehand. An intuitive way
is to simply apply a single σ to all RBFSVM component
classifiers. However, we observed that this approach cannot
lead to a successful AdaBoost, owing to the over-weak or
over-strong RBFSVM component classifiers encountered
during the Boosting process. Although a single best σ may
exist, we find that AdaBoost with this single best σ,
obtained by cross-validation, does not achieve the best
generalization performance, and the cross-validation itself
adds computational load. Therefore, using a single σ for all
RBFSVM component classifiers should be avoided if
possible.
The following fact opens the door to avoiding the search
for a single best σ while helping AdaBoost achieve even
better generalization performance: the classification
performance of RBFSVM can be conveniently changed by
adjusting the kernel parameter, σ.
Motivated by this, the proposed AdaBoostSVM approach
adaptively adjusts the σ values of the RBFSVM component
classifiers to obtain a set of moderately accurate
RBFSVMs for AdaBoost. As will be shown later, this
gives rise to a better SVM-based AdaBoost. Compared
with the existing AdaBoost approaches with Neural
Networks or Decision Tree component classifiers, our
proposed AdaBoostSVM can achieve better generalization
performance and it can be seen as a proof of concept of the
idea suggested by Valentini and Dietterich (2004) that
AdaBoost with heterogeneous SVMs could work well.
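To make this adaptive adjustment concrete, the sketch below is our own illustration of the idea (not the paper's reference implementation; the function name and the σ-update schedule, here a simple division by a step factor, are assumptions): start from a large σ, and whenever the current RBFSVM is over-weak (weighted error above 0.5), sharpen the kernel and retrain.

```python
import numpy as np
from sklearn.svm import SVC

def adaboost_svm(X, y, T=20, C=10.0, sigma_ini=10.0, sigma_min=0.1, sigma_step=2.0):
    """Labels y must be in {-1, +1}. Returns component SVMs and their weights."""
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    w = np.full(n, 1.0 / n)                 # uniform initial sample weights
    sigma, clfs, alphas = sigma_ini, [], []
    while len(clfs) < T and sigma > sigma_min:
        clf = SVC(C=C, kernel="rbf", gamma=1.0 / (2.0 * sigma ** 2))
        clf.fit(X, y, sample_weight=w)
        pred = clf.predict(X)
        err = w[pred != y].sum()            # weighted training error
        if err > 0.5:                       # over-weak component: decrease sigma
            sigma /= sigma_step
            continue
        err = max(err, 1e-10)               # guard against a zero-error component
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(-alpha * y * pred)      # standard AdaBoost re-weighting
        w /= w.sum()
        clfs.append(clf)
        alphas.append(alpha)
    return clfs, alphas
```

The final ensemble predicts sign(Σ_t α_t h_t(x)), as in standard AdaBoost.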
Furthermore, compared with an individual SVM, Ada-
BoostSVM can achieve much better generalization perfor-
mance on imbalanced data sets. We argue that in
AdaBoostSVM, the Boosting mechanism forces some
RBFSVM component classifiers to focus on the misclassi-
fied samples from the minority class, and this can prevent
the minority class from being treated as noise within the
dominant class and thus wrongly classified. This also
justifies, from another perspective, the significance of
exploring AdaBoost with SVM component classifiers.
Furthermore, since AdaBoostSVM provides a conveni-
ent way to control the classification accuracy of each
RBFSVM component classifier by simply adjusting the σ
value, it also provides an opportunity to deal with the well-
known accuracy/diversity dilemma in Boosting methods.
This is a happy "discovery" made during the investigation
of AdaBoost with RBFSVM-based component classifiers.
Through some parameter adjusting strategies, we can tune
the distributions of accuracy and diversity over these
component classifiers to achieve a good balance. We also
propose an improved version of AdaBoostSVM called
Diverse AdaBoostSVM in this paper. It is observed that,
benefiting from the balance between accuracy and diver-
sity, it can give better generalization performance than
AdaBoostSVM.
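As a concrete handle on the diversity side of this balance, the plain pairwise disagreement measure of Kuncheva and Whitaker (2003) can be computed as in the sketch below (our own illustration; the function name and the layout of preds, holding each component classifier's predictions on a common sample set, are assumptions):

```python
from itertools import combinations
import numpy as np

def disagreement(preds):
    """preds: array of shape (n_classifiers, n_samples), entries in {-1, +1}."""
    preds = np.asarray(preds)
    pairs = combinations(range(len(preds)), 2)
    # Average, over all classifier pairs, the fraction of samples they disagree on.
    return float(np.mean([np.mean(preds[i] != preds[j]) for i, j in pairs]))
```

Higher values indicate that the component classifiers err on different samples, which is the kind of diversity an ensemble benefits from.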
2. Background
2.1. AdaBoost
Given a set of training samples, AdaBoost (Schapire and
Singer, 1999) maintains a weight distribution, W, over
these samples. This distribution is initially set to be uniform.
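Concretely, for N training samples the initial distribution is W_1(i) = 1/N, i = 1, ..., N, so that every sample contributes equally in the first Boosting round.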