China Communications • May 2018
widely used in many research fields, including the prediction of
software defect distribution [1][2][3][4][5]. Software defect
distribution prediction plays an important role throughout the
software development process. Timely and accurate prediction of
defective software modules greatly improves the allocation of
software testing resources [6]. Many researchers have collected
training samples by extracting the software metric attributes of
software modules, and have constructed software defect distribution
prediction models using different kinds of machine learning
techniques. Machine learning was introduced to the field of software
defect prediction in [7][8]. David Bowes et al. [9] performed a
sensitivity analysis to compare the performance of different
algorithms, e.g., Random Forest, Naive Bayes, RPart and SVM
classifiers, when predicting defects in NASA, open source and
commercial datasets. Yongquan Yan et al. [10] proposed a method that
gives practical guidance for forecasting software aging using
machine learning algorithms. Liqiang Zhang et al. [11] presented a
measurement framework for evaluating these metrics. Zibin Zheng
Abstract: During the prediction of software defect distribution, the
data redundancy caused by multi-dimensional measurement leads to a
decrease in prediction accuracy. To solve this problem, this paper
proposes a novel software defect prediction model based on the
neighborhood preserving embedded support vector machine (NPE-SVM)
algorithm. The model uses SVM as the basic classifier of the software
defect distribution prediction model, and combines it with the NPE
algorithm to keep the local geometric structure of the data unchanged
during dimensionality reduction. This avoids the loss of SVM
precision caused by data loss after attribute reduction. Compared
with the single SVM and LLE-SVM prediction algorithms, the prediction
model in this paper improves the F-measure of software defect
distribution prediction by 3%~4%.
Keywords: data redundancy; SVM; NPE algorithm; dimensionality reduction
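The dimensionality reduction step described in the abstract can be sketched as follows. This is a minimal illustration of the standard NPE formulation (linear reconstruction weights over k nearest neighbors, followed by a generalized eigenproblem), not the authors' implementation. The function names `npe_fit` and `npe_transform`, the parameters `n_neighbors`, `n_components` and `reg`, and the synthetic data are all assumptions made for this example. The projected features would then be passed to an SVM classifier, which is not shown here.

```python
import numpy as np


def npe_fit(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Fit a neighborhood-preserving linear projection (hypothetical helper).

    X has shape (n_samples, n_features). Returns the feature mean and a
    projection matrix A of shape (n_features, n_components).
    """
    n, d = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean

    # Pairwise squared distances; exclude each point from its own neighbors.
    sq = np.sum(Xc ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (Xc @ Xc.T)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :n_neighbors]

    # Reconstruction weights: each sample as an affine combination
    # of its neighbors (weights sum to one).
    W = np.zeros((n, n))
    for i in range(n):
        Z = Xc[neighbors[i]] - Xc[i]
        G = Z @ Z.T
        # Regularize the local Gram matrix so the solve is well posed.
        G = G + reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Generalized eigenproblem: (Xc^T M Xc) a = lambda (Xc^T Xc) a,
    # with M = (I - W)^T (I - W). The smallest eigenvalues are kept.
    I_W = np.eye(n) - W
    M = I_W.T @ I_W
    left = Xc.T @ M @ Xc
    right = Xc.T @ Xc + reg * np.eye(d)  # regularized for stability

    # Reduce to a standard symmetric problem via Cholesky whitening.
    L = np.linalg.cholesky(right)
    L_inv = np.linalg.inv(L)
    S = L_inv @ left @ L_inv.T
    S = (S + S.T) / 2.0
    _, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    A = L_inv.T @ eigvecs[:, :n_components]
    return mean, A


def npe_transform(X, mean, A):
    """Project samples into the reduced space learned by npe_fit."""
    return (X - mean) @ A


if __name__ == "__main__":
    # Synthetic stand-in for software metric attributes (assumption).
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(60, 8))
    mean_demo, A_demo = npe_fit(X_demo, n_neighbors=6, n_components=3)
    print(npe_transform(X_demo, mean_demo, A_demo).shape)
```

Because the projection is linear, the same `mean` and `A` learned on training modules can be applied to unseen modules before classification, which is the practical difference from LLE, whose embedding is defined only for the training samples.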
I. INTRODUCTION
In recent years, machine learning technology has developed rapidly
and has been
Software Defect Distribution Prediction Model Based
on NPE-SVM
Hua Wei 1,2, Chun Shan 3, Changzhen Hu 3, Huizhong Sun 4,*, Min Lei 4,5
1 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
2 China Information Technology Security Evaluation Center, Beijing 100085, China
3 Beijing Key Laboratory of Software Security Engineering Technology, School of Software, Beijing Institute of Technology, Beijing 100081, China
4 Information Security Center, Beijing University of Posts and Telecommunications, Beijing 100876, China
5 Guizhou University, Guizhou Provincial Key Laboratory of Public Big Data, Guizhou Guiyang 550025, China
* Corresponding author, email: sunhuizhong@bupt.edu.cn
Received: Oct. 31, 2017
Revised: Jan. 24, 2018
Editor: Liang Jin
NETWORKS & SECURITY