Artificial Intelligence 174 (2010) 597–618, doi:10.1016/j.artint.2010.04.018
Positive approximation: An accelerator for attribute reduction in rough set theory

Yuhua Qian a,c, Jiye Liang a,∗, Witold Pedrycz b, Chuangyin Dang c

a Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Taiyuan, 030006, Shanxi, China
b Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
c Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Hong Kong

∗ Corresponding author. Tel.: +86 0351 7018176; fax: +86 0351 7018176.
E-mail addresses: jinchengqyh@sxu.edu.cn (Y.H. Qian), ljy@sxu.edu.cn (J.Y. Liang), pedrycz@ee.ualberta.ca (W. Pedrycz), mecdang@cityu.edu.hk (C.Y. Dang).
Article info

Article history:
Received 15 July 2009
Received in revised form 6 April 2010
Accepted 7 April 2010
Available online 9 April 2010

Keywords:
Rough set theory
Attribute reduction
Decision table
Positive approximation
Granular computing

Abstract
Feature selection is a challenging problem in areas such as pattern recognition, machine learning and data mining. Considering a consistency measure introduced in rough set theory, the problem of feature selection, also called attribute reduction, aims to retain the discriminatory power of the original features. Many heuristic attribute reduction algorithms have been proposed; however, quite often, these methods are computationally time-consuming. To overcome this shortcoming, we introduce a theoretic framework based on rough set theory, called the positive approximation, which can be used to accelerate a heuristic process of attribute reduction. Based on the proposed accelerator, a general attribute reduction algorithm is designed. Through the use of the accelerator, several representative heuristic attribute reduction algorithms in rough set theory have been enhanced. Note that each of the modified algorithms can choose the same attribute reduct as its original version, and hence possesses the same classification accuracy. Experiments show that these modified algorithms outperform their original counterparts. It is worth noting that the performance improvement of the modified algorithms becomes more visible when dealing with larger data sets.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Feature selection, also called attribute reduction, is a common problem in pattern recognition, data mining and machine learning. In recent years, we encounter databases in which both the number of objects and their dimensionality (number of attributes) grow larger. Tens, hundreds, and even thousands of attributes are stored in many real-world application databases [6,12,37]. Attributes that are irrelevant to recognition tasks may deteriorate the performance of learning algorithms [44,45]. In other words, storing and processing all attributes (both relevant and irrelevant) could be computationally very expensive and impractical. To deal with this issue, as was pointed out in [20], some attributes can be omitted without seriously impacting the resulting classification (recognition) error. Therefore, the omission of some attributes can be not only tolerable but even desirable relative to the costs involved in such cases [32].
In feature selection, we encounter two general strategies, namely wrappers [16] and filters. The former employs a learning algorithm to evaluate the selected attribute subsets, while the latter selects attributes guided by significance measures such as information gain [23,46], consistency [6,41], distance [15], dependency [30], and others. These
measures can be divided into two main categories: distance-based measures and consistency-based measures [20]. Rough set theory, proposed by Pawlak [33–36], is a relatively new soft computing tool for the analysis of vague descriptions of objects, and has become a popular mathematical framework for pattern recognition, image processing, feature selection, neurocomputing, data mining and knowledge discovery from large data sets [7,11,31]. Attribute reduction in rough set theory offers a systematic theoretic framework for consistency-based feature selection, which does not attempt to maximize class separability but rather to retain the discriminating ability of the original features for the objects of the universe [13,14,53].
Generally speaking, one always needs to handle two types of data, viz. those that assume numerical values and those that assume symbolic values. For numerical values, there are two types of approaches: one relies on fuzzy rough set theory, and the other is concerned with the discretization of numerical attributes. In order to deal with numerical or hybrid attributes, several approaches have been developed in the literature. Pedrycz and Vukovich regarded features as granular rather than numerical [37]. Shen and Jensen generalized the dependency function of the classical rough set framework to the fuzzy case and proposed a fuzzy-rough QUICKREDUCT algorithm [13,14,48]. Bhatt and Gopal provided a concept of fuzzy-rough sets formed on a compact computational domain, which is utilized to improve the computational efficiency [3,4]. Hu et al. presented a new entropy to measure the information quantity in fuzzy sets [21] and applied this particular measure to reduce hybrid data [22]. Data discretization is another important approach to dealing with numerical values, in which we usually discretize numerical values into several intervals and associate the intervals with a set of symbolic values, see [5,28]. In "classical" rough set theory, the attribute reduction method treats all attributes as assuming symbolic values. Through preprocessing of the original data, one can use classical rough set theory to select a subset of features that is the most suitable for a given recognition problem.
In the last twenty years, many techniques of attribute reduction have been developed in rough set theory. The concept of the β-reduct proposed by Ziarko provides a suite of reduction methods in the variable precision rough set model [60]. An attribute reduction method was proposed for knowledge reduction in random information systems [57]. Five kinds of attribute reducts and their relationships in inconsistent systems were investigated by Kryszkiewicz [18], Li et al. [24] and Mi et al. [29], respectively. By eliminating some rigorous conditions required by the distribution reduct, a maximum distribution reduct was introduced by Mi et al. in [29]. In order to obtain all attribute reducts of a given data set, Skowron [49] proposed a discernibility matrix method, in which any two objects determine one feature subset that can distinguish them. From the discernibility matrix viewpoint, Qian et al. [42,43] and Shao et al. [47] provided techniques of attribute reduction for interval ordered information systems, set-valued ordered information systems and incomplete ordered information systems, respectively. Kryszkiewicz and Lasek [17] proposed an approach to the discovery of minimal sets of attributes functionally determining a decision attribute. The above attribute reduction methods are usually computationally very expensive, which is intolerable when dealing with large-scale, high-dimensional data sets. To support efficient attribute reduction, many heuristic attribute reduction methods have been developed in rough set theory, cf. [19,20,22,25,26,39,52,54–56]. Each of these attribute reduction methods can extract a single reduct from a given decision table.¹ For convenience, from the viewpoint of heuristic functions, we classify these attribute reduction methods into four categories: positive-region reduction, Shannon's entropy reduction, Liang's entropy reduction and combination entropy reduction. Hence, we review only four representative heuristic attribute reduction methods.

¹ The attribute reduct obtained preserves a particular property of a given decision table. However, as Prof. Bazan noted, from the viewpoint of the stability of attribute reducts, the selected reduct may be of poor quality [1,2]. To overcome this problem, Bazan developed a method of dynamic reducts to obtain a stable attribute reduct from a decision table. How to accelerate the method of dynamic reducts is an interesting topic for further work.
(1) Positive-region reduction
The concept of positive region was proposed by Pawlak in [33] and is used to measure the significance of a condition attribute in a decision table. The idea of attribute reduction using the positive region originated with J.W. Grzymala-Busse in [9] and [10], whose algorithm ignores the additional computation required for selecting significant attributes. Hu and Cercone [19] then proposed a heuristic attribute reduction method, called positive-region reduction, which keeps the positive region of the target decision unchanged. An extension of this positive-region reduction to hybrid attribute reduction in the framework of fuzzy rough sets was given in [20]. Owing to the consistency of the ideas and strategies of these methods, we regard the method from [19] as their representative. These reduction methods were the first attempts at heuristic attribute reduction algorithms in rough set theory.
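As a concrete illustration, the following is a minimal sketch of the positive region POS_B(D) that such methods preserve: the union of the B-lower approximations of all decision classes. The representation (partitions as lists of sets of object indices) and the function name are illustrative assumptions, not the paper's notation.

from typing import Iterable, Set

def positive_region(cond_partition: Iterable[Set[int]],
                    dec_partition: Iterable[Set[int]]) -> Set[int]:
    """POS_B(D): union of the B-lower approximations of the decision classes.

    A block of U/B lies in the positive region exactly when it is contained
    in a single decision class, i.e. all of its objects carry the same label.
    """
    pos = set()
    for block in cond_partition:
        if any(block <= dec_class for dec_class in dec_partition):
            pos |= block
    return pos

Roughly speaking, a positive-region reduction then greedily adds the condition attribute whose inclusion enlarges |POS_B(D)| the most, stopping once the positive region of the full condition attribute set is preserved.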
(2) Shannon’s entropy reduction
Entropy-based reducts were first introduced in 1993/1994 by Skowron in his lectures at Warsaw University. Based on this idea, Slezak introduced Shannon's information entropy to search for reducts in the classical rough set model [50–52]. Wang et al. [54] used the conditional form of Shannon's entropy to compute the relative attribute reduction of a decision information system. In fact, several authors have also used variants of Shannon's entropy or mutual information to measure uncertainty in rough set theory and to construct heuristic attribute reduction algorithms [22,55,56]. Here we select only the attribute reduction algorithm from [54] as their representative. This reduction method keeps the conditional entropy of the target decision unchanged.
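For reference, here is a minimal sketch of the Shannon conditional entropy H(D|B) that this method keeps unchanged, using the same illustrative partition-as-sets representation as above; the function name and signature are assumptions for illustration.

from math import log2
from typing import Iterable, Set

def shannon_conditional_entropy(cond_partition: Iterable[Set[int]],
                                dec_partition: Iterable[Set[int]],
                                n: int) -> float:
    """H(D|B) = -sum_{X in U/B} p(X) sum_{Y in U/D} p(Y|X) log2 p(Y|X),
    with p(X) = |X|/|U|, p(Y|X) = |X ∩ Y|/|X|, and n = |U|."""
    h = 0.0
    for block in cond_partition:
        p_x = len(block) / n
        for dec_class in dec_partition:
            inter = len(block & dec_class)
            if inter:
                p_y_given_x = inter / len(block)
                h -= p_x * p_y_given_x * log2(p_y_given_x)
    return h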
(3) Liang’s entropy reduction
Liang et al. [25] defined a new information entropy to measure the uncertainty of an information system and applied this entropy to remove redundant features [26]. Unlike Shannon's entropy, this information entropy can measure both the uncertainty of an information system and the fuzziness of a rough decision in rough set theory. This reduction method preserves the conditional entropy of a given decision table. In fact, the mutual information form of Liang's entropy can also be used to construct a heuristic function for an attribute reduction algorithm; for simplicity, we omit its discussion here.
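As an illustration, a sketch of the conditional form of Liang's (complement) entropy preserved by this method. We assume, following Liang et al. [25,26], that E(D|B) = Σ_i Σ_j (|X_i ∩ Y_j|/|U|)(|X_i − Y_j|/|U|) with U/B = {X_1, ..., X_m} and U/D = {Y_1, ..., Y_n}; the representation is the same illustrative one used above.

from typing import Iterable, Set

def liang_conditional_entropy(cond_partition: Iterable[Set[int]],
                              dec_partition: Iterable[Set[int]],
                              n: int) -> float:
    """E(D|B): sum over blocks X of U/B and Y of U/D of
    (|X ∩ Y|/|U|) * (|X − Y|/|U|), where n = |U|."""
    e = 0.0
    for block in cond_partition:
        for dec_class in dec_partition:
            inter = len(block & dec_class)
            e += (inter / n) * ((len(block) - inter) / n)
    return e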
(4) Combination entropy reduction
In rough set theory, the objects in an equivalence class cannot, in general, be distinguished from each other, whereas objects in different equivalence classes can. Therefore, in a broad sense, the knowledge content of a given attribute set can be characterized by the total number of pairs of objects on the universe that can be distinguished from each other. Based on this consideration, Qian and Liang [39] presented the concept of combination entropy for measuring the uncertainty of information systems and used its conditional form to select a feature subset. This reduction method obtains an attribute subset that distinguishes the same number of pairs of objects as the original decision table. This measure takes a completely different point of view, mainly based on the intuitive knowledge-content nature of information gain.
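Since the prose above characterizes knowledge content by the number of distinguishable object pairs, here is a small sketch of that count under the same illustrative representation: all pairs of U minus the pairs that share an equivalence class. The function name is an assumption for illustration.

from math import comb
from typing import Iterable, Set

def distinguishable_pairs(partition: Iterable[Set[int]]) -> int:
    """Number of object pairs that fall into different blocks of a partition."""
    blocks = list(partition)
    n = sum(len(block) for block in blocks)
    return comb(n, 2) - sum(comb(len(block), 2) for block in blocks)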
Each of the methods above preserves a particular property of a given information system or decision table. However, these methods are still computationally very expensive, which is intolerable when dealing with large-scale, high-dimensional data sets. In this paper, we are not concerned with how to discretize numerical attributes or how to construct a heuristic function for attribute reduction. Our objective is to improve the time efficiency of heuristic attribute reduction algorithms. We propose a new rough set framework, called the positive approximation. The main advantage of this approach stems from the fact that the framework is able to characterize the granulation structure of a rough set using a granulation order. Based on the positive approximation, we develop a common accelerator for improving the time efficiency of heuristic attribute reduction, which provides a vehicle for making rough-set-based feature selection algorithms faster. By incorporating the accelerator into each of the four representative heuristic attribute reduction methods above, we construct their modified versions. Numerical experiments show that each of the modified methods chooses the same attribute subset as its original counterpart while greatly reducing computing time. We would like to stress that the improvement becomes more visible as the data sets under discussion get larger.
The study is organized as follows. Some basic concepts in rough set theory are briefly reviewed in Section 2. In Section 3,
we establish the positive approximation framework and investigate some of its main properties. In Section 4, through
analyzing the rank preservation of four representative significance measures of attributes, we develop a general modified
attribute reduction algorithm based on the positive approximation. Experiments on nine public data sets show that these
modified algorithms outperform their original counterparts in terms of computational time. Finally, Section 5 concludes this paper with some remarks and discussion.
2. Preliminaries
In this section, we will review several basic concepts in rough set theory. Throughout this paper, we suppose that the universe U is a finite nonempty set.

Let U be a finite and nonempty set called the universe and R ⊆ U × U an equivalence relation on U. Then K = (U, R) is called an approximation space [33–36]. The equivalence relation R partitions the set U into disjoint subsets. This partition of the universe is called a quotient set induced by R, denoted by U/R. It represents a very special type of similarity between elements of the universe. If two elements x, y ∈ U (x ≠ y) belong to the same equivalence class, we say that x and y are indistinguishable under the equivalence relation R, i.e., they are equal with respect to R. We denote the equivalence class including x by [x]_R. Each equivalence class [x]_R may be viewed as an information granule consisting of indistinguishable elements [59]. The granulation structure induced by an equivalence relation is a partition of the universe.
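As a concrete illustration of the quotient set U/R induced by an attribute subset, here is a minimal sketch; the dictionary-of-rows data layout and the function name are assumptions for illustration, not the paper's notation.

from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set

def quotient_set(objects: Dict[int, Dict[str, Hashable]],
                 attrs: Iterable[str]) -> List[Set[int]]:
    """U/R for the indiscernibility relation of an attribute subset:
    two objects share an equivalence class exactly when they agree on
    every attribute in `attrs`. `objects` maps object ids to rows."""
    blocks = defaultdict(set)
    for x, row in objects.items():
        blocks[tuple(row[a] for a in attrs)].add(x)
    return list(blocks.values())

For instance, quotient_set({1: {'a': 0}, 2: {'a': 0}, 3: {'a': 1}}, ['a']) yields the two equivalence classes {1, 2} and {3}.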
Given an approximation space K = (U, R) and an arbitrary subset X ⊆ U, one can construct a rough set of X on the universe from elemental information granules according to the following definition:

\underline{R}X = \bigcup \{ [x]_R \mid [x]_R \subseteq X \},
\overline{R}X = \bigcup \{ [x]_R \mid [x]_R \cap X \neq \emptyset \},

where \underline{R}X and \overline{R}X are called the R-lower approximation and R-upper approximation of X with respect to R, respectively. The ordered pair \langle \underline{R}X, \overline{R}X \rangle is called a rough set of X with respect to the equivalence relation R. Equivalently, they can also be written as

\underline{R}X = \{ x \mid [x]_R \subseteq X \},
\overline{R}X = \{ x \mid [x]_R \cap X \neq \emptyset \}.
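Following the definitions above, a minimal sketch of the two approximations, reusing the illustrative partition representation from the previous snippet:

from typing import Iterable, Set, Tuple

def approximations(quotient: Iterable[Set[int]],
                   x_set: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Return (lower, upper): the union of the equivalence classes
    contained in X, and the union of those intersecting X."""
    lower, upper = set(), set()
    for block in quotient:
        if block <= x_set:
            lower |= block
        if block & x_set:
            upper |= block
    return lower, upper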