An example AdaBoost implementation in MATLAB, found on the MathWorks website.
You can paste the code below directly into MATLAB, although two of the figures will be missing.
Alternatively, put each of the functions below (six in total) into its own .m file and run the demo file.
Overview----------the directory contains the following files.
1. ADABOOST_te.m
2. ADABOOST_tr.m
3. demo.m
4. likelihood2class.m
5. threshold_te.m
6. threshold_tr.m
The purpose of this project is to provide source files for the learning algorithm known as
AdaBoost, which improves the performance of a user-defined classifier. To use AdaBoost,
the two main functions must first be run with suitable parameters. An explanation of each
source file is available via the "help" command. To see how they work, run demo.m, i.e.
>> demo. The first three lines of demo.m set the sizes of the training and testing sets
and the number of weak classifiers. If you find a bug, please send the author an email
right away.
Cuneyt Mertayak
email: cuneyt.mertayak@gmail.com
version: 1.0  date: 03/09/2008
%
% DEMONSTRATION OF ADABOOST_tr and ADABOOST_te
%
% Just type "demo" to run the demo.
%
% Using adaboost with linear threshold classifier
% for a two class classification problem.
%
% Bug Reporting: Please contact the author for bug reporting and comments.
%
% Cuneyt Mertayak
% email: cuneyt.mertayak@gmail.com
% version: 1.0
% date: 21/05/2007
% Creating the training and testing sets
%
tr_n = 200;
te_n = 200;
weak_learner_n = 20;
tr_set = abs(rand(tr_n,2))*100;
te_set = abs(rand(te_n,2))*100;
tr_labels = (tr_set(:,1)-tr_set(:,2) > 0) + 1; % label 2 if the first coordinate is larger, label 1 otherwise
te_labels = (te_set(:,1)-te_set(:,2) > 0) + 1;
% Displaying the training and testing sets
figure;
subplot(2,2,1);
hold on; axis square;
indices = tr_labels==1;
plot(tr_set(indices,1),tr_set(indices,2),'b*');
indices = ~indices;
plot(tr_set(indices,1),tr_set(indices,2),'r*');
title('Training set');
subplot(2,2,2);
hold on; axis square;
indices = te_labels==1;
plot(te_set(indices,1),te_set(indices,2),'b*');
indices = ~indices;
plot(te_set(indices,1),te_set(indices,2),'r*');
title('Testing set');
% Training and testing error rates
tr_error = zeros(1,weak_learner_n);
te_error = zeros(1,weak_learner_n);
for i=1:weak_learner_n
adaboost_model = ADABOOST_tr(@threshold_tr,@threshold_te,tr_set,tr_labels,i);
% when the second weak learner comes, i becomes 2; each call covers
% training, self-testing and weight updating
[L_tr,hits_tr] = ADABOOST_te(adaboost_model,@threshold_te,tr_set,tr_labels);
tr_error(i) = (tr_n-hits_tr)/tr_n; % training error: checking results on the same samples used for training
[L_te,hits_te] = ADABOOST_te(adaboost_model,@threshold_te,te_set,te_labels);
te_error(i) = (te_n-hits_te)/te_n; % testing error: validating the trained model on separate samples;
% note how te_labels was obtained
end
subplot(2,2,3);
plot(1:weak_learner_n,tr_error);
axis([1,weak_learner_n,0,1]);
title('Training Error');
xlabel('weak classifier number');
ylabel('error rate');
grid on;
subplot(2,2,4); axis square;
plot(1:weak_learner_n,te_error);
axis([1,weak_learner_n,0,1]);
title('Testing Error');
xlabel('weak classifier number');
ylabel('error rate');
grid on;
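As a cross-check on the labeling rule used in demo.m, here is a rough Python/NumPy sketch of the same data setup. The seed, variable names and the `error_rate` helper are my own illustrative choices, not part of the MATLAB code:

```python
import numpy as np

# Points uniform in [0, 100]^2; the label is 2 when the first coordinate
# exceeds the second, and 1 otherwise -- the same rule as
# (tr_set(:,1) - tr_set(:,2) > 0) + 1 in the MATLAB demo.
rng = np.random.default_rng(0)

tr_n = 200
tr_set = rng.random((tr_n, 2)) * 100
tr_labels = (tr_set[:, 0] - tr_set[:, 1] > 0).astype(int) + 1

# Error rate as used in the demo: fraction of misses.
def error_rate(predicted, true_labels):
    return np.mean(predicted != true_labels)

# Sanity check: the true rule itself has zero error.
print(error_rate((tr_set[:, 0] > tr_set[:, 1]) + 1, tr_labels))  # 0.0
```

The decision boundary is the diagonal x1 = x2, so the two starred clusters in the "Training set" panel sit above and below that line.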
function adaboost_model = ADABOOST_tr(tr_func_handle, te_func_handle, ...
    train_set, labels, no_of_hypothesis)
%
% ADABOOST TRAINING: A META-LEARNING ALGORITHM
% adaboost_model = ADABOOST_tr(tr_func_handle,te_func_handle,
% train_set,labels,no_of_hypothesis)
%
% 'tr_func_handle' and 'te_func_handle' are function handles for
% training and testing of a weak learner, respectively. The weak learner
% has to support the learning in weighted datasets. The prototypes
% of these functions has to be as follows.
%
% model = train_func(train_set,sample_weights,labels)
% train_set: a TxD-matrix where each row is a training sample in
% a D dimensional feature space.
% sample_weights: a Tx1 dimensional vector, the i-th entry
% of which denotes the weight of the i-th sample.
% labels: a Tx1 dimensional vector, the i-th entry of which
% is the label of the i-th sample.
% model: the output model of the training phase, which can
% consist of the estimated parameters.
%
% [L,hits,error_rate] = test_func(model,test_set,sample_weights,true_labels)
% model: the output of train_func
% test_set: a KxD dimensional matrix, each of whose rows is a
% testing sample in a D dimensional feature space.
% sample_weights: a Kx1 dimensional vector, the i-th entry
% of which denotes the weight of the i-th sample.
% true_labels: a Kx1 dimensional vector, the i-th entry of which
% is the label of the i-th sample.
% L: a Kx1-array with the predicted labels of the samples.
% hits: number of hits, calculated with the comparison of L and
% true_labels.
% error_rate: number of misses divided by the number of samples.
%
%
% 'train_set' contains the samples for training and it is NxD matrix
% where N is the number of samples and D is the dimension of the
% feature space. 'labels' is an Nx1 matrix containing the class
% labels of the samples. 'no_of_hypothesis' is the number of weak
% learners to be used.
%
% The output 'adaboost_model' is a structure with the fields
% - 'weights': 1x'no_of_hypothesis' matrix specifying the weights
% of the resulted weighted majority voting combination
% - 'parameters': 1x'no_of_hypothesis' structure matrix specifying
% the special parameters of the hypothesis that is
% created at the corresponding iteration of
% learning algorithm
%
% Specific Properties That Must Be Satisfied by The Function pointed
% by 'func_handle'
% ------------------------------------------------------------------
%
% Note: Labels must be positive integers from 1 up to the number of classes.
% Note-2: Weighting is done as specified in the AIMA book, Russell & Norvig (2nd edition).
%
% Bug Reporting: Please contact the author for bug reporting and comments.
%
% Cuneyt Mertayak
% email: cuneyt.mertayak@gmail.com
% version: 1.0
% date: 21/05/2007
%
adaboost_model = struct('weights',zeros(1,no_of_hypothesis),...
'parameters',[]); %cell(1,no_of_hypothesis));
sample_n = size(train_set,1);
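The listing above cuts off at the start of ADABOOST_tr's body. As a self-contained illustration of the boosting scheme its header describes (AIMA-style reweighting and log-odds votes), here is a hedged Python sketch; the stump learner and every name in it are my own stand-ins for threshold_tr/threshold_te, not the author's code:

```python
import math

def stump_train(X, w, y):
    # Toy weak learner: exhaustive search for the axis-aligned threshold
    # with the lowest *weighted* error -- this is what "learning in
    # weighted datasets" requires of the handle passed to ADABOOST_tr.
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for lo, hi in ((1, 2), (2, 1)):
                pred = [hi if row[f] > t else lo for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, (f, t, lo, hi))
    return best[1]

def stump_test(model, X, w, y):
    # Matches the [L, hits, error_rate] prototype in the header, with the
    # error taken as the weighted sum of misses.
    f, t, lo, hi = model
    pred = [hi if row[f] > t else lo for row in X]
    err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
    hits = sum(p == yi for p, yi in zip(pred, y))
    return pred, hits, err

def adaboost_train(tr_func, te_func, X, y, n_hyp):
    # Boosting loop in the AIMA style the header cites: correctly
    # classified samples are down-weighted by err/(1-err), weights are
    # renormalised, and each hypothesis votes with log((1-err)/err).
    n = len(y)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_hyp):
        model = tr_func(X, w, y)
        pred, _, err = te_func(model, X, w, y)
        if err >= 0.5:               # weak learner no better than chance
            break
        if err == 0:                 # perfect weak learner: keep it, stop
            ensemble.append((model, 1.0))
            break
        for i in range(n):
            if pred[i] == y[i]:
                w[i] *= err / (1.0 - err)
        total = sum(w)
        w = [wi / total for wi in w]
        ensemble.append((model, math.log((1.0 - err) / err)))
    return ensemble

def adaboost_predict(ensemble, te_func, X):
    # Weighted majority vote over the two classes {1, 2}.
    votes = [{1: 0.0, 2: 0.0} for _ in X]
    for model, alpha in ensemble:
        pred, _, _ = te_func(model, X, [0.0] * len(X), [0] * len(X))
        for v, p in zip(votes, pred):
            v[p] += alpha
    return [max(v, key=v.get) for v in votes]

# A tiny set where the x1 > x2 rule needs more than one stump.
X = [[1, 0], [2, 0], [0, 1], [0, 2], [3, 4], [4, 3]]
y = [(2 if a > b else 1) for a, b in X]
model = adaboost_train(stump_train, stump_test, X, y, 3)
print(adaboost_predict(model, stump_test, X) == y)  # True
```

This mirrors the shape of the demo: with one stump the diagonal boundary cannot be matched, but the weighted combination of three stumps classifies the toy set perfectly.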