SCIENTIFIC REPORTS | (2019) 9:17418 | https://doi.org/10.1038/s41598-019-54031-2
www.nature.com/scientificreports
An Early Intestinal Cancer Prediction Algorithm Based on Deep Belief Network
Jing-Jing Wan¹, Bo-Lun Chen²,³*, Yi-Xiu Kong²,³, Xing-Gang Ma¹,³ & Yong-Tao Yu²,³
The incidence of colorectal cancer (CRC) in China has increased in recent years, and its
mortality rate has become one of the highest among all cancers. CRC also increasingly affects people’s
health and quality of life, and the workloads of medical doctors have further increased due to the lack
of sufficient medical resources in China. The goal of this study was to construct an automated expert
system using a deep learning technique to predict the probability of early stage CRC based on the
patient’s case report and the patient’s attributes. Compared with previous prediction methods, which
are either based on sophisticated examinations or have high computational complexity, this method
is shown to provide valuable information, such as suggesting potentially important early signs, to assist
in early diagnosis, early treatment and prevention of CRC, hence helping medical doctors reduce the
workloads of endoscopies and other treatments.
CRC is a common malignant tumor in China. As people’s living standards have continued to improve and
their eating habits have changed, the incidence and mortality of CRC have continued to rise, seriously
endangering the health and quality of life of the Chinese people. According to Chinese cancer statistics
from 2015, the incidence and mortality of CRC ranked fifth among all malignant tumors, including nearly
400,000 new cases and nearly 200,000 deaths, a mortality-to-incidence ratio of roughly 50% [1]. In addition,
a recently published study showed a significant increase in the annual rate of CRC incidence among young
people [2]. Due to its high morbidity and mortality, CRC prevention is an urgent problem that needs to be
addressed.
CRC prognosis is closely related to its early diagnosis. Most CRC cases can be cured when they are discovered
at an early stage; the 5-year survival rate after early diagnosis can be as high as 90%. In contrast, when discovered
only in the later stages, the 5-year survival rate is less than 10% [3]. In the clinic, early diagnosis and early treatment
are generally conducted by screening to reduce the incidence and mortality of CRC. Colonoscopy is the primary
means of early diagnosis. However, domestic and foreign studies have shown that CRC screening programs for
early diagnosis are not sufficiently accurate; only a small number of cases are screened out among a large number
of people, resulting in low screening compliance among patients [4,5].
In addition, in China, the heavy workloads of medical professionals are well known [6], and a series of social
and economic problems have been reported [7–10]. These problems are mainly due to the insufficiency of medical
resources in China and the inefficient allocation of those resources. Moreover, such causes will likely be
difficult to address in the short term. Therefore, we believe that a technical approach can partially reduce doctors’
workloads by freeing doctors from repetitive work that does not require in-depth thinking. The goal of
this study is to reduce doctors’ workloads by designing an automated forecasting system that helps them make
decisions more easily.
Previous early CRC predictions were conducted on a case-by-case basis, using either statistical analyses or
patient records. However, a generalized predictive mechanism has yet to be developed because we do not yet
fully understand the mechanism of CRC [11]. Thus, a solution to the prediction problem has great practical value.
For example, in biological research, the nodes of protein interaction networks and metabolic networks are
linked through interaction relationships. Revealing the hidden interactions in such networks has high
experimental costs; however, the results of prediction methods can guide experiments and increase their success
rates, thereby reducing their costs. Studying disease-gene network losses and predicting suspicious links aids in
exploring the mechanism behind the disease, in predicting and evaluating corresponding treatments, and in finding
new drug targets, thereby opening up new avenues for drug research and development [12].

¹Department of Gastroenterology, The Affiliated Huai’an Hospital of Xuzhou Medical University, the Second People’s
Hospital of Huai’an, Huaian, 223002, China. ²College of Computer Engineering, Huaiyin Institute of Technology,
Huaian, 223003, China. ³These authors contributed equally: Bo-Lun Chen, Yi-Xiu Kong, Xing-Gang Ma and Yong-Tao
Yu. *email: [email protected]
The medical industry has incorporated high-tech solutions such as artificial intelligence and sensing
technologies, making medical services increasingly intelligent. The recent policy of “New Healthcare Reform” in China
has made intelligent healthcare accessible to ordinary people. Intelligent healthcare aims to capitalize on
artificial intelligence technology to assist in various types of medical decision making, including disease risk
prediction, intelligent healthcare consultation, medical image analysis, electronic medical record information
extraction, medical health data analysis, medical insurance evaluation, and making recommendations for medication.
In 2017, Esteva et al. developed a deep neural network that can successfully classify skin cancer from sample data [13],
demonstrating that deep learning methods have great potential for use in medical fields. Intelligent systems that
can make early disease predictions or help provide information for doctors during the diagnosis process are
valuable in both scientific research and clinical medicine.
In recent years, many research teams have attempted to pursue machine learning methods to classify cancer
patients as high or low risk. These technologies can play important roles in the research and treatment of
cancer [14]. The purpose of machine learning methods is to detect key features from complex sample data and to
reveal their contributions. Machine learning methods such as artificial neural networks, Bayesian networks,
support vector machines (SVM), and decision trees have been widely used in cancer research and provide effective
and accurate basic models for early prediction of various types of cancers.
The dimensions of the sample data increase with the number of examination data items during the early
diagnosis of cancer. However, because the specific examination items collected vary on a case-by-case basis, it is
natural to see data sparseness in the constructed sample dataset. Consequently, the noise in the data also increases,
which inevitably negatively impacts the performances of early CRC prediction algorithms. In addition, because
of the high dimensionality of the sample data, the time complexity of traditional prediction algorithms is usually
high. Therefore, we intend to devise a method to effectively address both data sparsity and high dimensionality
and to eliminate noise in prediction problems, allowing us to learn which sample features play key roles in early
CRC prediction.
Wang et al. defined the problem of feature selection as a combinatorial optimization or search problem in
intelligent healthcare, rather than using the common filter, wrapper and embedded feature selection
methods [15]. They applied several feature selection methods, including exhaustive search, heuristic search and hybrid
methods. The heuristic search methods include feature ordering metrics either with or without data extraction.
Kleftogiannis et al. combined an SVM with a genetic algorithm (GA) to perform feature selection and parameter
optimization [16]. Duan et al. proposed a backward elimination feature extraction method similar to the SVM recursive
feature elimination method (SVM-RFE) [17]. At each step, the method computes feature ranking scores from a
statistical analysis of the weight vectors of multiple linear SVMs trained on subsamples of the original training
data. Zhong et al. used an SVM to analyze protein characteristics based on the Pearson correlation coefficient to
eliminate redundant features [18]. Fong et al. combined the particle swarm optimization algorithm with three
different classification methods (pattern network, decision tree and naive Bayes) to search for the optimal feature
subset [19]. The results show that the method achieves high classification precision on specific datasets. Inspired by
evolutionary algorithms, Mohapatra et al. proposed a modified cat swarm optimization (MCSO) algorithm to
extract features from datasets, applied it to several biomedical datasets, and achieved favorable results [20]. Metsis
et al. proposed a feature extraction method based on structured sparsity-inducing norms and compared it
with existing feature extraction methods on four published aCGH datasets [21]. Boreto et al. proposed an analytical
geometric feature extraction method, supervised variational relevance learning (Suvrel), a variational
method that determines the metric tensor used to define the distance-based similarity during pattern
classification [22]. The variational method was applied to a cost function that penalizes large intraclass distances and
favors small interclass distances. Their approach yields a metric tensor that minimizes the cost function.
Bennasar et al. introduced the joint mutual information maximization (JMIM) and the normalized joint mutual
information maximization (NJMIM) methods, both of which use mutual information and the “maximum of the
minimum” criterion, thus alleviating the overestimation of feature significance, as shown both theoretically and
experimentally [23]. Xu et al. used the minimum redundancy maximum relevance (mRMR) metric, forward feature
extraction and an SVM, and found that this combination outperformed other classifiers such as Bayesian decision
theory, K nearest neighbor and random forest [24].
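As a rough illustration of the correlation-based idea mentioned above (eliminating features that are redundant with one another), the following minimal sketch drops any feature whose absolute Pearson correlation with an already kept feature reaches a threshold. It is not the procedure of any cited study: the function names, the 0.95 threshold and the toy values are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant feature carries no linear correlation
    return cov / (norm_x * norm_y)


def drop_redundant(columns, threshold=0.95):
    """Greedily keep features whose |r| with every kept feature is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: "age_months" is an exact rescaling of "age".
columns = {
    "age": [45, 52, 61, 38, 70],
    "age_months": [540, 624, 732, 456, 840],
    "hemoglobin": [13.5, 11.2, 12.8, 14.1, 10.4],
}
kept = drop_redundant(columns)
print(kept)
```

The greedy pass depends on feature order: the first of two highly correlated features is the one that survives, so in practice the order would be chosen deliberately (for example, by relevance to the label).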
In addition, to address the sparsity and noise of the data in such problems, matrix decomposition is
currently a commonly used technique; its implementation is relatively simple and its prediction accuracy
is relatively high. The best-known matrix decomposition methods include singular value decomposition
(SVD) [25,26], principal component analysis (PCA) [27], independent component analysis (ICA) [28], and others. Among
these, SVD requires completing the data to avoid the sample sparseness problem; however, this operation not only
increases the required data storage space but also potentially violates the practical significance of the sample data
in a specific environment. Meanwhile, because SVD has high computational complexity, it is not applicable to
datasets with large sample sizes. Therefore, building on SVD, Simon Funk proposed the latent factor model (LFM),
which absorbs the diagonal matrix of singular values into the factor matrices and learns the decomposition by
minimizing the RMSE on the training matrix [29]. In real prediction systems, no uniform standard exists for each new data
sample; therefore, Koren added the user’s historical scores to the LFM and proposed the SVD++ model [30].
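The latent factor idea described above can be sketched as follows: two low-rank factor matrices are learned by stochastic gradient descent on the observed entries only, so missing entries never need to be filled in. This is a minimal sketch under stated assumptions, not the implementation used in any cited work; `funk_lfm`, `rmse`, the learning rate, the regularization weight and the small matrix `R` are all hypothetical choices for illustration.

```python
import numpy as np


def funk_lfm(R, mask, k=2, lr=0.01, reg=0.02, n_epochs=300, seed=0):
    """Fit R ~ P @ Q.T on observed entries (mask == True) with regularized SGD."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    P = 0.1 * rng.standard_normal((n, k))  # row (sample) factors
    Q = 0.1 * rng.standard_normal((m, k))  # column (attribute) factors
    rows, cols = np.nonzero(mask)
    for _ in range(n_epochs):
        # Visit the observed entries in a fresh random order each epoch.
        for idx in rng.permutation(len(rows)):
            i, j = rows[idx], cols[idx]
            err = R[i, j] - P[i] @ Q[j]
            p_old = P[i].copy()
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * p_old - reg * Q[j])
    return P, Q


def rmse(R, mask, P, Q):
    """Root-mean-square error over the observed entries only."""
    diff = (R - P @ Q.T)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Hypothetical sparse matrix; zeros mark missing entries, not real zeros.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0

P0, Q0 = funk_lfm(R, mask, n_epochs=0)  # untrained factors as a baseline
P, Q = funk_lfm(R, mask)
rmse_before = rmse(R, mask, P0, Q0)
rmse_after = rmse(R, mask, P, Q)
print(rmse_before, rmse_after)
```

Because the loss is summed over observed entries only, the sparse matrix does not have to be completed first, which is the storage and interpretability concern raised for plain SVD above.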
However, the above series of feature extraction models do not consider the existence of negative values in the
sample data. In a prediction system, negative values in the sample matrix have no practical meaning in a real
situation. For example, during early cancer diagnosis, a certain patient attribute or a certain indicator with a negative
value may be meaningless when reconstructing the sample data. Therefore, Lee and Seung proposed a nonnegative
matrix factorization (NMF) method [31,32], which approximates the matrix by a low-rank product of
nonnegative matrices. This method not only greatly reduces the dimensionality of the matrix but also removes
redundant data, making the decomposed result more interpretable in practice. NMF technology has been widely
applied in the health care [33], medical imaging [34–36] and biomedical fields [37,38]; however, this technology has not
attracted widespread attention in early cancer prediction. Therefore, this paper integrates NMF and combines it
with a deep learning method to facilitate early CRC detection.
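To make the NMF idea concrete, the sketch below factorizes a nonnegative sample-by-attribute matrix X into nonnegative factors W and H using the classic multiplicative update rules for the Frobenius-norm objective. It is an illustration under assumptions, not the authors' implementation: the function name `nmf`, the iteration count, the random initialization and the synthetic 20 x 50 matrix are hypothetical, and the rank of 14 is used only because it is the reduced dimension reported in the Results.

```python
import numpy as np


def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative X (n x m) as W (n x k) @ H (k x m), W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative by construction;
        # eps guards against division by zero.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in data: 20 samples with 50 nonnegative attributes.
rng = np.random.default_rng(42)
X = rng.random((20, 50))

W, H = nmf(X, k=14)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(W.shape, H.shape, rel_err)
```

Each row of W can then serve as a 14-dimensional representation of one sample, and each row of H shows how strongly the original attributes load on one hidden feature.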
Multiple examples of deep learning applications exist in medical research, most of which focus on
automatically recognizing tumors in images or analyzing gene sequences, and these algorithms have achieved good results.
Xiao et al. developed a deep learning-based 5-class model to make cancer predictions using RNA sequence data [39].
Danaee et al. used a deep learning approach (a stacked denoising autoencoder) to analyze gene expression data
and identify genes potentially correlated with breast cancer [40]. Some researchers have applied deep learning
techniques to analyze cancer imagery. Bychkov et al. proposed a deep learning method to analyze CRC images, and
their results showed that state-of-the-art deep learning techniques are able to extract more prognostic
information from the tissue morphology of CRC than can an experienced medical professional [41]. Cruz-Roa et al.
presented and evaluated a deep learning model for automated basal cell carcinoma detection that learns the
image representation, performs image classification, and interprets the results [42]. Coudray et al. discovered that a
deep learning method can classify non-small cell lung cancer and predict its mutations from histopathology
images [43]. Other researchers have also employed deep learning methods to investigate other types of medical data
related to cancer prediction. Mamoshina et al. used deep neural networks (DNNs) to analyze ‘omics data and
achieved state-of-the-art results [44]. Burke et al. used artificial neural networks to analyze the American College
of Surgeons’ Patient Care Evaluation (PCE) data and obtained improved predictions of patient 5-year survival
rates [45].
However, in real conditions, especially those in developing countries, examination data such as tumor imagery
and genetic testing data are not easily obtained. Given the constraints on patients’ economic and medical
conditions, numerous patients do not have access to these techniques. In addition, test procedures such as tumor
imaging and genetic testing are typically performed only for patients already strongly suspected of having cancer.
Therefore, during the most important period (i.e., the prevention and early diagnosis period), these data provide
minimal help. In this paper, we attempt to use the simplest and most commonly available test data, the medical
examination report, to create a new prediction system to help doctors make decisions. The medical examination
report is a basic test that almost every patient undergoes; thus, our early cancer prediction system can be applied
to a broader range of patients.
CRC is a multifactor disease. In CRC prediction, an expert system that uses deep learning techniques to combine
patient attributes (such as age, gender, family history of CRC, BMI and past history) with patient case reports
to predict the likelihood of early cancer can greatly reduce missed diagnoses by clinicians during endoscopy and
treatment and can also provide effective help for the early diagnosis, early treatment and prevention of CRC.
This paper explores and analyzes patient data from a deep learning perspective, combining patient
attributes and case reports to construct an expert system that predicts the probability of early cancer. Due to its
relatively effective dimensionality reduction and noise cancellation techniques, this method shows great promise for
application in real scenarios.
Results
The sample dataset includes each sample’s attributes (e.g., age, gender, smoking history, and drinking history),
endoscopic features (e.g., lesion location, polyp size, and no leaf) and blood attributes (e.g., white blood cells and
hemoglobin). There are 50 features in all categories.
We compare our early cancer prediction (ECP) algorithm with four classic machine learning algorithms, i.e., an SVM,
K-nearest neighbors (KNN), ensembles for boosting (EB), and random forest (RF), and three deep learning methods,
i.e., a convolutional neural network (CNN), a recurrent neural network (RNN1), and a recursive neural network
(RNN2). Each method’s performance is averaged over 100 runs in which the data are randomly separated into a
training set (containing 90% of the samples) and a test set (containing the remaining 10%). Normally, precision and
recall are not necessarily related; however, in large-scale datasets, these two indicators are correlated. A false
negative (FN) means that the prediction model incorrectly assigned a sample from the positive category to the
negative category. Specifically, in this experiment, an FN means that a sample from a cancer patient was classified
as being from a noncancer patient. In the clinic, the false negative rate (FNR) is important because a false
negative may lead to a missed diagnosis. Therefore, in this paper, we mainly use the F1_Score and FNR as the
evaluation metrics of the algorithms. The experimental results are as follows:
From Table 1, we can see that our ECP algorithm achieves the highest F1_Score on the real sample
dataset. Both the Precision and Recall of our method are higher than those of the other algorithms. In addition, its
FNR is the smallest among all algorithms. After dimensionality reduction by NMF, we reduced the original
50-dimensional matrix to 14 dimensions and extracted the hidden features. This idea facilitates effective early
diagnosis, early treatment and prevention of cancer. Therefore, our algorithm not only reduces the spatial
complexity of the sample but also achieves better prediction results. False negatives can also be caused by instability
in the patient’s condition, and related data may be collected during the window period of other diseases, resulting
in data noise.
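The evaluation quantities discussed above follow from the four confusion-matrix counts. The sketch below computes precision, recall, F1 score and FNR for binary labels, and includes a helper for one random 90%/10% split of sample indices. The function names and the toy labels are hypothetical and serve only to illustrate the definitions; they are not taken from the study's data or code.

```python
import random


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels: 1 = cancer, 0 = noncancer."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Return (precision, recall, F1, FNR); empty denominators yield 0.0."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    fnr = fn / (fn + tp) if fn + tp else 0.0  # share of cancer cases missed
    return precision, recall, f1, fnr


def random_split(n_samples, test_fraction=0.1, seed=0):
    """Shuffle sample indices and return (train_indices, test_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


# Hypothetical labels: 4 cancer and 6 noncancer samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1, fnr = evaluate(y_true, y_pred)
train_idx, test_idx = random_split(100)
print(precision, recall, f1, fnr, len(train_idx), len(test_idx))
```

Note that FNR = 1 - recall, so reporting F1 together with FNR emphasizes the cost of missed cancer cases while still accounting for false alarms through precision.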
Next, we analyze the multidimensional features of the original dataset. In this paper, we input $m$ attributes
and $n$ samples, where $X_{ij}$ corresponds to the feature value of the $j$-th attribute of the $i$-th sample. Here,
$k$ is a hypothetical number of important features in the NMF, which is generally less than the number of attributes.
After NMF decomposition, $W_{ik}$ corresponds to the correlation probability of the $i$-th sample and the $k$-th
important feature, and $H_{kj}$ corresponds to the probabilistic correlation of the $j$-th attribute and the $k$-th
important feature. The result of the NMF is as follows: