A New Method of Mining Incomplete Data
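Summary: The data used in knowledge discovery often include noise and incomplete information, and the boundaries between different classes of such data are blurred and indistinct. When these data are clustered or classified, we often obtain coverings instead of partitions, which usually makes the information system insecure. This paper studies the optimal partitioning of incomplete data. First, the relationship between set cover and set partition is discussed, and the distance between a set cover and a set partition is defined. Second, the optimal partition of a given cover is studied by the combing and parting method, and the problem of acquiring the optimal partition from three different families of partition sets is discussed. Finally, the corresponding optimal algorithm is given. Real wireless signals contain a lot of noise, and many errors appear at the boundaries when these data are clustered by traditional methods. In the experiments, the proposed method greatly improves the correct rate, and the results demonstrate its validity.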
Vol.30 No.4 JOURNAL OF ELECTRONICS (CHINA) August 2013
NEW METHOD OF MINING INCOMPLETE DATA¹
Wang Lunwen  Zhang Xianji  Wang Lunwu  Zhang Lin*
(Research Division 404, Electronic Engineering Institute, Hefei 230037, China)
*(Department of Computer Science and Technology, Anhui University, Hefei 230039, China)
Abstract The data used in the process of knowledge discovery often include noise and incomplete
information. The boundaries between different classes of such data are blurred and indistinct. When
these data are clustered or classified, we often obtain coverings instead of partitions, which usually
makes our information system insecure. In this paper, the optimal partitioning of incomplete data is
studied. Firstly, the relationship between set cover and set partition is discussed, and the distance
between a set cover and a set partition is defined. Secondly, the optimal partition of a given cover is
studied by the combing and parting method, and the acquisition of the optimal partition from three
different families of partition sets is discussed. Finally, the corresponding optimal algorithm is given.
Real wireless signals often contain a lot of noise, and there are many errors at the boundaries when
these data are clustered by the traditional method. In our experiment, the proposed method greatly
improves the correct rate, and the experimental results demonstrate the method's validity.
Key words Clustering; Incomplete Information; Partition; Data Mining
CLC index TP18
DOI 10.1007/s11767-013-3006-5
I. Introduction
Applying data mining to incomplete information is difficult because the boundaries between different patterns are fuzzy and, moreover, different patterns may intersect. However, incomplete information occurs in every field, so research on effective mining methods is of great significance. When incomplete information is clustered, the result is usually a cover instead of a partition, but a partition is what we hope to obtain in reasoning or learning processes. Ref. [1] makes use of the transitive closure method to obtain a corresponding partition. Although a partition is obtained in this way, it is too coarse to use. The classical rough set method cannot manage incomplete, noisy information effectively. This method has since been extended, for example by extended rough set models based on the tolerance relation, the similarity relation, and the limited tolerance relation. However, these extended models have limitations: they can only solve problems of a certain level and cannot manage complicated incomplete information systems [2].

Based on set pair theory, Ref. [3] proposed a variable precision rough set (α, λ), which provides a heuristic attribute reduction algorithm based on similarity in the positive region. Ref. [4] proposed a minimal consistency covering method. This method fills in incomplete information and converts the covering problem into a partition problem based on data consistency, but the way in which different incomplete information is filled in affects the number of rules in the reduced rule set. Ref. [5] studied fuzzy clustering methods and compared their performance. It provided the Optimal Completion Strategy Fuzzy C-Means (OCSFCM) and the Nearest Prototype Strategy FCM (NPSFCM) for incomplete information and improved the clustering results, but their stability is not good enough. Ref. [6] proposed a visual clustering method for incomplete information that makes it easier to mine the distribution and structure of the data. However, it requires manual intervention. Ref. [7] discussed the structure and properties of fuzzy subsets from the viewpoint of granularity and hierarchy. The methods above mostly discuss the partition problem of incomplete information from a macroscopic view, so they are of low efficiency.

¹ Manuscript received date: January 10, 2013; revised date: May 25, 2013.
Supported by the National Natural Science Foundation of China (No. 61273302) and partially by the Natural Science Foundation of Anhui Province (No. 1208085MF98, 1208085MF94).
Communication author: Wang Lunwen, born in 1966, male, Ph.D., Associate Professor. Research Division 404, Huangshan Road 460, Hefei 230037, China.
Email: wanglunwen@163.com.
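The difference between a cover and a partition, and the reason a transitive-closure partition tends to be coarse, can be illustrated with a small sketch. This is not the algorithm of Ref. [1] or of this paper. It assumes that two elements are related when they share at least one block of the cover, and that the transitive closure of this relation defines the classes. The function names `is_partition` and `transitive_closure_partition` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_partition(blocks, universe):
    """Return True if blocks are non-empty, pairwise disjoint, and cover universe."""
    blocks = [set(b) for b in blocks]
    if any(not b for b in blocks):
        return False
    if set().union(*blocks) != set(universe):
        return False
    return all(a.isdisjoint(b) for a, b in combinations(blocks, 2))


def transitive_closure_partition(cover):
    """Merge every chain of overlapping blocks into a single class.

    Assumption (not taken from the paper): two elements are related when
    they share a block, and the classes are those of the transitive closure.
    """
    classes = []  # invariant: the classes are pairwise disjoint
    for block in cover:
        merged = set(block)
        rest = []
        for c in classes:
            if c & merged:
                merged |= c      # overlapping class is absorbed
            else:
                rest.append(c)   # untouched class is kept as-is
        classes = rest + [merged]
    return classes


# A cover of {1, ..., 6} whose blocks overlap in a chain.
cover = [{1, 2}, {2, 3}, {3, 4}, {5, 6}]
universe = range(1, 7)

coarse = transitive_closure_partition(cover)
print(is_partition(cover, universe))   # the overlapping cover is not a partition
print(is_partition(coarse, universe))  # the merged classes do form a partition
print(coarse)
```

In this example, elements 1 and 4 never appear in the same block of the cover, yet the chain of overlaps places them in the same class. This is the sense in which a partition obtained by transitive closure can be too coarse.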