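(2021, Journal of Tianjin University) Feature Subset Selection Algorithm Based on Symmetric Uncertainty and Three-Way Interaction Information, Gu Xiangyuan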
Journal of Tianjin University (Science and Technology)
Vol. 54, No. 2
Feb. 2021
Received: 2019-10-19; revised: 2020-03-03.
About the author: Gu Xiangyuan (b. 1990), male, Ph.D. candidate, gxiangyuan@tju.edu.cn.
Corresponding author: Guo Jichang, jcguo@tju.edu.cn.
Supported by the National Natural Science Foundation of China (No. 61771334).
DOI: 10.11784/tdxbz201910030
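Feature Subset Selection Algorithm Based on Symmetric Uncertainty and Three-Way Interaction Information
Gu Xiangyuan, Guo Jichang, Li Chongyi, Xiao Lijun
(School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)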
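Abstract (translated from the Chinese): Some existing feature subset selection algorithms perform poorly because they judge redundant features by a single metric only, such as symmetric uncertainty or the maximal information coefficient. To address this problem, a feature subset selection algorithm based on symmetric uncertainty and three-way interaction information is proposed. First, the symmetric uncertainty between each feature and the class label is computed, the features are sorted in descending order of this value, and irrelevant features are eliminated. Then, the symmetric uncertainty between features and the three-way interaction information among features and the class label are computed and, together with the symmetric uncertainty between features and the class label, are passed through comparison and ranking operations that eliminate redundant features and yield the selected features. Considering both metrics when judging redundancy, in combination with the comparison and ranking operations, reduces the number of cases in which a relevant feature is treated as redundant and eliminated, so that some highly effective relevant features are retained. To verify the performance of the proposed algorithm, it is compared with four other feature subset selection algorithms on 3 UCI datasets and 9 ASU datasets using three classifiers: J48, IB1, and Naïve Bayes. The experimental results show that the proposed algorithm achieves very good feature selection results while selecting few features and taking little time.
Keywords: feature subset selection; three-way interaction information; symmetric uncertainty; feature selection; ranking
CLC number: TP391    Document code: A    Article ID: 0493-2137(2021)02-0214-07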
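The first stage described in the abstract (score each feature against the class label by symmetric uncertainty, sort in descending order, drop irrelevant features) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the common definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) for discrete variables, and the function names and the `threshold` parameter are hypothetical, since the abstract does not state how irrelevant features are identified.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete (hashable) values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); lies in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    # I(X; Y) = H(X) + H(Y) - H(X, Y); clamp tiny negative rounding errors.
    mi = max(hx + hy - entropy(list(zip(x, y))), 0.0)
    return 2.0 * mi / (hx + hy)


def rank_relevant(features, labels, threshold=0.0):
    """Return (name, SU) pairs sorted by SU in descending order.

    `threshold` is an assumed cut-off: features whose SU with the class
    label does not exceed it are treated as irrelevant and dropped.
    """
    scored = [(name, symmetric_uncertainty(col, labels))
              for name, col in features.items()]
    kept = [(name, su) for name, su in scored if su > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    toy = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1], "c": [0, 0, 0, 1]}
    for name, su in rank_relevant(toy, [0, 0, 1, 1]):
        print(name, round(su, 4))
```

In the toy data, feature `a` duplicates the label, `b` is independent of it, and `c` is partially informative, so `b` is dropped and `a` ranks ahead of `c`. Continuous features would need to be discretized before this estimator applies.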
Feature Subset Selection Algorithm Based on Symmetric Uncertainty and Three-Way Interaction Information
Gu Xiangyuan, Guo Jichang, Li Chongyi, Xiao Lijun
(School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)
Abstract: Some existing feature subset selection algorithms consider only one metric, such as symmetric uncertainty or the maximal information coefficient, when evaluating redundant features, and as a result they do not deliver the desired performance. To solve this problem, a feature subset selection algorithm based on symmetric uncertainty and three-way interaction information (SUTII) is proposed. First, the symmetric uncertainty between features and the class label is evaluated, the features are ranked in descending order of this value, and irrelevant features are removed. Then, the three-way interaction information among features and the class label and the symmetric uncertainty between features are calculated, and they are used jointly with the symmetric uncertainty between features and the class label, through comparison and ranking, to remove redundant features. Because both three-way interaction information and symmetric uncertainty are considered when evaluating redundant features, and comparison and ranking are adopted, the situation in which relevant features are mistaken for redundant features and removed occurs less often, and some informative relevant features are retained. To validate its performance, SUTII is compared with four other feature subset selection algorithms. Three classifiers (J48, IB1, and Naïve Bayes), three UCI datasets, and nine ASU datasets are used in the experiments. The experimental results demonstrate that SUTII achieves good feature selection performance while selecting few features and consuming little time.
Keywords: feature subset selection; three-way interaction information; symmetric uncertainty; feature selection; ranking
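To make the second metric concrete, the sketch below estimates three-way interaction information for discrete variables. It assumes the sign convention I(X; Y; Z) = I(X, Y; Z) - I(X; Z) - I(Y; Z), which the abstract does not state (the opposite sign is also used in the literature), and all function names are hypothetical. How SUTII combines this quantity with symmetric uncertainty in its comparison and ranking steps is not specified in the abstract and is not reproduced here.

```python
import math
from collections import Counter


def entropy(*columns):
    """Joint Shannon entropy in bits of one or more aligned discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * math.log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def interaction_information(x, y, z):
    """I(X; Y; Z) = I(X, Y; Z) - I(X; Z) - I(Y; Z) (assumed sign convention).

    Under this convention a positive value means X and Y together tell more
    about Z than the sum of what each tells alone (synergy); a negative
    value means their information about Z overlaps (redundancy).
    """
    joint_xy = list(zip(x, y))  # treat the pair (X, Y) as a single variable
    return (mutual_information(joint_xy, z)
            - mutual_information(x, z)
            - mutual_information(y, z))


if __name__ == "__main__":
    x = [0, 0, 1, 1]
    y = [0, 1, 0, 1]
    xor_label = [0, 1, 1, 0]  # label depends on x and y only jointly
    print(interaction_information(x, y, xor_label))
    print(interaction_information(x, x, x))  # two copies of the same feature
```

With an XOR label, neither feature is informative alone but the pair determines the label, so the value is positive; with a duplicated feature the value is negative. This contrast illustrates why a feature that looks redundant under a pairwise measure may still be worth keeping.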