Electronic copy available at: https://ssrn.com/abstract=2991237
Research Article Vikas Chaurasia, et al, Carib.j.SciTech,2013,Vol.1,208-217
Early Prediction of Heart Diseases Using Data Mining Techniques
Authors & Affiliation:
Vikas Chaurasia
Research Scholar, Sai Nath
University, Ranchi, Jharkhand,
India.
Saurabh Pal
Head, Dept. of MCA,
VBS Purvanchal University,
Jaunpur, India
Correspondence To:
Vikas Chaurasia
Keywords:
Heart disease, Survivability,
Data Mining, CART, ID3,
Decision table.
© 2013. The Authors.
Published under Caribbean
Journal of Science and
Technology
ISSN 0799-3757
http://caribjscitech.com/
ABSTRACT
The largest-ever study of deaths shows that heart diseases have emerged as the
number one killer in the world. About 25 per cent of deaths in the age group
of 25-69 years occur because of heart diseases. If all age groups are
included, heart diseases account for about 19 per cent of all deaths. It is
the leading cause of death among males as well as females. It is also the
leading cause of death in all regions, though the numbers vary. The
proportion of deaths caused by heart disease is highest in south India
(25 per cent) and lowest (12 per cent) in the central region of India.
The prediction of heart disease survivability has been a challenging
research problem for many researchers. Since the early days of the
related research, many advances have been recorded in several related
fields. Therefore, the main objective of this manuscript is to report on a
research project in which we took advantage of those technological
advances to develop prediction models for heart disease survivability.
We used three popular data mining algorithms, CART (Classification
and Regression Tree), ID3 (Iterative Dichotomiser 3) and decision table
(DT) extracted from a decision tree or rule-based classifier, to develop
the prediction models using a large dataset. We also used 10-fold
cross-validation to obtain an unbiased estimate of model performance.
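As an illustration of the 10-fold cross-validation mentioned above, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`k_fold_indices`, `cross_validate`, `majority_class_trainer`) and the toy data are hypothetical, and the majority-class baseline merely stands in for a real classifier such as CART or ID3. The sketch assumes the dataset has at least as many samples as folds.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k disjoint folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    return [indices[i::k] for i in range(k)]

def cross_validate(features, labels, train_fn, k=10, seed=0):
    """Return the mean held-out accuracy over k folds.

    train_fn(train_X, train_y) must return a callable predict(x) -> label.
    Assumes len(labels) >= k so that no fold is empty.
    """
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held_out_set]
        predict = train_fn([features[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        # Score only on samples the model never saw during training.
        correct = sum(predict(features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)

def majority_class_trainer(train_X, train_y):
    """Hypothetical baseline: always predict the most common training label."""
    most_common = max(set(train_y), key=train_y.count)
    return lambda x: most_common

# Toy data (made up for illustration): 14 negative and 6 positive cases.
toy_features = list(range(20))
toy_labels = [0] * 14 + [1] * 6
mean_accuracy = cross_validate(toy_features, toy_labels, majority_class_trainer)
```

Each sample is held out exactly once, so every accuracy figure is computed on data the model was not trained on; averaging the ten figures gives the cross-validated estimate.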
Introduction
According to a recent study by the Registrar General of India (RGI) and the Indian Council of Medical Research (ICMR), about
25 percent of deaths in the age group of 25-69 years occur because of heart diseases. In 2008, five of the top ten causes of
mortality worldwide, other than injuries, were non-communicable diseases; this will rise to seven out of ten by the year 2030.
By then, about 76% of deaths in the world will be due to non-communicable diseases (NCDs) [1]. Cardiovascular diseases
(CVDs), also on the rise, make up a major portion of non-communicable diseases. In 2010, of all projected worldwide deaths, 23
million are expected to be because of cardiovascular diseases. In fact, CVDs would be the single largest cause of death in the
world, accounting for more than a third of all deaths [2].
Figure 1: Mortality from major communicable and non-communicable diseases, 2030 (number of deaths in thousands)
Source: Global Burden of Diseases 2004. Projected Deaths 2030, Baseline Scenario. World Health Organization, 2008.
Cardiovascular disease includes coronary heart disease (CHD), cerebrovascular disease (stroke), hypertensive heart disease,
congenital heart disease, peripheral artery disease, rheumatic heart disease and inflammatory heart disease. The major causes of
cardiovascular disease are tobacco use, physical inactivity, an unhealthy diet and harmful use of alcohol [3]. Several researchers
are using statistical and data mining tools to help health care professionals in the diagnosis of heart disease [4].
Complex data mining benefits from past experience and from algorithms implemented in existing software and packages, with certain
tools gaining a greater affinity or reputation for particular techniques [5]. Data mining is routinely used in a large number of
fields, including engineering, medicine, crime analysis, expert prediction, Web mining and mobile computing, among others [6].
Medical diagnosis is regarded as an important yet complicated task that needs to be executed accurately and
efficiently. Automating it would be extremely advantageous. Data mining is an essential step of knowledge
discovery. In recent years it has attracted a great deal of interest in the information industry [7]. Data mining combines statistical
analysis, machine learning and database technology to extract hidden patterns and relationships from large databases [8]. Data
mining uses two strategies: supervised and unsupervised learning. In supervised learning, a labelled training set is used to learn model
parameters, whereas in unsupervised learning no labelled training set is used. Each data mining technique serves a different purpose
depending on the modeling objective. The two most common modeling objectives are classification and prediction. Classification
models predict categorical labels (discrete, unordered) while prediction models predict continuous-valued functions [9]. Several
data mining techniques are used in the diagnosis of heart disease, such as Naïve Bayes, Decision Tree, neural network, kernel
density, bagging algorithm and support vector machine, with differing levels of accuracy.
This paper presents a new model that enhances Decision Tree accuracy in identifying heart disease patients. Decision Tree
algorithms include CART (Classification and Regression Tree), ID3 (Iterative Dichotomiser 3) and C4.5. These algorithms differ
in how they select splits, when they stop splitting a node, and how they assign a class to a non-split node (Ho T. J., 2005). The rest of
the paper is organized as follows: the background section investigates applying data mining techniques in the diagnosis of heart
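To make the remark about split selection concrete, here is a small sketch of the two impurity measures conventionally associated with these algorithms: entropy-based information gain (the criterion ID3 is generally described as using) and the Gini index (the usual default for CART). This is background knowledge and is not taken from the paper; the helper names and the "sick"/"healthy" toy labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_score(parent, children, impurity):
    """Impurity reduction obtained by partitioning parent into children.

    With impurity=entropy this is the information gain; with
    impurity=gini it is the reduction in Gini impurity.
    """
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted

# Toy node with four "sick" and four "healthy" patients (made-up data).
parent = ["sick"] * 4 + ["healthy"] * 4
perfect_split = [["sick"] * 4, ["healthy"] * 4]
weak_split = [["sick", "sick", "sick", "healthy"],
              ["sick", "healthy", "healthy", "healthy"]]

info_gain_perfect = split_score(parent, perfect_split, entropy)
info_gain_weak = split_score(parent, weak_split, entropy)
gini_drop_perfect = split_score(parent, perfect_split, gini)
gini_drop_weak = split_score(parent, weak_split, gini)
```

Both measures rank the perfect split above the weak one here; on real data they can disagree on which attribute to split first, which is one way trees built by different algorithms end up with different shapes.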