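% adasyn: An implementation of the ADASYN (adaptive synthetic sampling) method.
% Input:  trainingData, Nr-by-D matrix of training data
%         trainingLabel, Nr-by-1 vector of class labels (two classes, coded as 1 and 2)
%         beta, the balance level; the number of synthetic samples is
%               round((num_major - num_minor) * beta)
%         kNN, the number of nearest minority-class neighbors considered
%              when generating each synthetic sample
% Output: adasynData, the generated synthetic minority-class data
%         adasynID, the class labels of the generated synthetic data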
% Date: 02/08/2015
% By Bo Tang (btang@ele.uri.edu) and Haibo He (he@ele.uri.edu)
% For any questions and/or comments for this code/paper, please feel free
% to contact Prof. Haibo He, Electrical Engineering, University of Rhode Island,
% Email: he@ele.uri.edu
% Web: http://www.ele.uri.edu/faculty/he/
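%
% Example (a minimal sketch on made-up toy data; it assumes the labels are
% coded as 1 and 2, and the variable names below are illustrative only):
%   rng(0);                                % fix the random seed
%   X = [randn(100, 2); randn(20, 2) + 2]; % 100 majority, 20 minority samples
%   y = [ones(100, 1); 2 * ones(20, 1)];   % label 1 = majority, 2 = minority
%   [Xsyn, ysyn] = adasyn(X, y, 1, 5);     % beta = 1, kNN = 5
%   % Here N = round((100 - 20) * 1) = 80, so Xsyn is 80-by-2 and ysyn is
%   % an 80-by-1 vector of 2s.
% The training set needs more than 11 samples in total and more than kNN
% minority samples, because both neighbor searches skip the query sample.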
function [adasynData, adasynID] = adasyn(trainingData, trainingLabel, beta, kNN)
numClass = length(unique(trainingLabel));
if (numClass ~= 2)
    % error() aborts execution, so no explicit return is needed afterwards
    error('error in adasyn: the input trainingLabel must be two-class!');
end
% Class labels are assumed to be coded as 1 and 2, so the position returned
% by max/min doubles as the class label.
classCounts = [sum(trainingLabel == 1), sum(trainingLabel == 2)];
[~, majorID] = max(classCounts);
[~, minorID] = min(classCounts);
ind_major = find(trainingLabel == majorID);
training_data_major = trainingData(ind_major, :);
num_major = length(ind_major); % number of majority-class samples
ind_minor = find(trainingLabel == minorID);
training_data_minor = trainingData(ind_minor, :);
num_minor = length(ind_minor); % number of minority-class samples
num_data = num_major + num_minor; % number of all training samples
% Number of synthetic minority-class samples to generate
N = round((num_major - num_minor) * beta);
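% Number of neighbors (over both classes) used to estimate how hard each
% minority sample is to learn; fixed at 11 in this implementation.
kNN1 = 11;
ratio = zeros(num_minor, 1);
for T = 1 : num_minor
    % Euclidean distance from the T-th minority sample to every training sample
    dist_all = sqrt(sum((ones(num_data, 1) * training_data_minor(T, :) - trainingData).^2, 2));
    [~, ind_sort] = sort(dist_all);
    % Skip the first entry, which is normally the sample itself (distance 0)
    ind_nn = ind_sort(2 : kNN1 + 1);
    % Fraction of majority-class samples among the kNN1 nearest neighbors
    ratio(T) = sum(trainingLabel(ind_nn) == majorID) / kNN1;
end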
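ratio_scale = ratio;
if (abs(sum(ratio_scale)) < 1e-6)
    % No minority sample has majority-class neighbors: sample uniformly
    ratio_normalized = ones(length(ratio_scale), 1) ./ length(ratio_scale);
else
    % Normalize so that the ratios sum to 1
    ratio_normalized = ratio_scale ./ sum(ratio_scale);
end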
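% Treat ratio_normalized as a discrete probability distribution over the
% minority samples and draw N indices from it by inverse-CDF sampling.
pdf = ratio_normalized;
cumDist = cumsum(pdf);
% Diff(i, j) = cumDist(i) - u_j for N uniform random draws u_j
Diff = cumDist * ones(1, N) - ones(length(pdf), 1) * rand(1, N);
% Shift non-positive entries above 1 so that min selects the first i with
% cumDist(i) > u_j
Diff = (Diff <= 0) * 2 + Diff;
% Take the minimum down each column explicitly, so that a single-row Diff
% is not collapsed to a scalar
[~, I] = min(Diff, [], 1);
sampleDataIndex = I';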
% Preallocate the outputs instead of growing them inside the loop
adasynData = zeros(N, size(trainingData, 2));
ind_nn = zeros(N, kNN);
for T = 1 : length(sampleDataIndex)
    % Distance from the selected minority sample (the seed) to all minority samples
    x_seed = training_data_minor(sampleDataIndex(T), :);
    dist_all = sqrt(sum((ones(num_minor, 1) * x_seed - training_data_minor).^2, 2));
    [~, ind_sort] = sort(dist_all);
    % kNN nearest minority neighbors, skipping the seed itself
    ind_nn(T, :) = ind_sort(2 : kNN + 1);
    % Pick one of the kNN neighbors uniformly at random
    random_select = randperm(kNN);
    ind_smote = ind_nn(T, random_select(1));
    x_neighbor = training_data_minor(ind_smote, :);
    % Create a random point on the segment between the seed and that neighbor
    adasynData(T, :) = x_seed - rand * (x_seed - x_neighbor);
end
adasynID = minorID * ones(size(adasynData,1), 1);
% The balanced training set is not returned by this function; the caller can
% form it as [trainingData; adasynData] and [trainingLabel; adasynID].
end