An ACO-Based Method for Mining High-Utility Itemsets


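High-utility itemset mining (HUIM) is one of the major research topics in data mining today. Unlike traditional frequent itemset mining (FIM), which considers only how often items occur, HUIM also takes quantity and profit into account in order to reveal the most profitable product combinations. This is valuable in retail, inventory management, recommender systems, and similar areas. Many earlier methods for mining high-utility itemsets have to handle an exponential search space when the number of distinct items and the database are both large. To address this, researchers proposed algorithms based on two evolutionary computation (EC) techniques: the genetic algorithm (GA) and particle swarm optimization (PSO). These algorithms can find a large number of high-utility itemsets within a limited time, but they cannot guarantee that the solution they return is globally optimal.

Against this background, the paper proposes a new algorithm based on another EC technique, ant colony optimization (ACO). ACO is a heuristic search method that imitates the foraging behavior of ants and uses the pheromone interaction between ants to find good paths. Unlike GA and PSO, ACO builds feasible solutions constructively and avoids generating unreasonable solutions as far as possible, so a well-defined ACO approach can obtain suitable solutions efficiently. The paper adopts the ant colony system (ACS), an extension of ACO, and on that basis proposes high-utility itemset mining by ACS (HUIM-ACS). In general, an EC algorithm cannot guarantee that the solution it provides is globally optimal. HUIM-ACS, however, maps the complete solution space onto a routing graph and adds two pruning processes, so it can guarantee that all high-utility itemsets have been found once no candidate edge remains from the starting point.

The OCR text of the original contains some recognition errors and omissions and has to be read with care. For example, the string "JID:KNOSYS[m5G;November14,2016;12:51]ARTICLEINPRESSKnowledge-BasedSystems000(2016)1–12" is publication metadata from the page header, not part of the article text. The article history (received, revised, and accepted dates) is listed separately.

The main points of the document are:

1. The concept of high-utility itemset mining (HUIM) and why it matters.
2. The difference between HUIM and traditional frequent itemset mining (FIM).
3. Applications of HUIM in business.
4. The use of evolutionary computation (EC) techniques for mining high-utility itemsets.
5. The strengths and limitations of the genetic algorithm (GA) and particle swarm optimization (PSO) for HUIM.
6. The principle of ant colony optimization (ACO) and its potential advantages for HUIM.
7. The ant colony system (ACS) and its application to HUIM.
8. The design and implementation of the HUIM-ACS algorithm.
9. The question of global optimality in evolutionary computation algorithms.
10. The publication information of the paper.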


ARTICLE IN PRESS

JID: KNOSYS [m5G; November 14, 2016;12:51 ]

Knowledge-Based Systems 000 (2016) 1–12

Contents lists available at ScienceDirect

Knowledge-Based Systems

journal homepage: www.elsevier.com/locate/knosys

An ACO-based approach to mine high-utility itemsets

Jimmy Ming-Tai Wu, Justin Zhan, Jerry Chun-Wei Lin ∗

Department of Computer Science, University of Nevada, Las Vegas, USA

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China

Article info

Article history:

Received 1 June 2016

Revised 13 September 2016

Accepted 31 October 2016

Available online xxx

Keywords:

Ant system

High-utility itemsets

Evolutionary computation

Ant colony system

Abstract

High-utility itemset mining (HUIM) is a major contemporary data mining issue. It differs from frequent itemset mining (FIM), which considers only the frequency factor: HUIM uses both the quantity and profit factors to reveal the most profitable products. Several approaches have been proposed to mine high-utility itemsets (HUIs), and most of them have to handle an exponential search space when the number of distinct items and the size of the database are both very large. Therefore, two evolutionary computation (EC) techniques, the genetic algorithm (GA) and particle swarm optimization (PSO), were previously proposed to mine HUIs. In these studies, GAs and PSOs could obtain a large number of high-utility itemsets in a limited time. In this paper, a novel algorithm based on another evolutionary computation technique, ant colony optimization (ACO), is proposed to resolve this issue. Unlike GAs and PSOs, ACOs produce a feasible solution in a constructive way and avoid generating unreasonable solutions as much as possible. Thus, a well-defined ACO approach can always obtain suitable solutions efficiently. An algorithm based on the ant colony system (ACS), which is extended from ACO, is proposed to efficiently find HUIs; it is called high-utility itemset mining by ACS (HUIM-ACS). In general, an EC algorithm cannot ensure that the solution it provides is the global optimal solution. The designed HUIM-ACS algorithm, however, maps the complete solution space into a routing graph and includes two pruning processes. Therefore, it guarantees that all of the HUIs have been obtained when there is no candidate edge from the starting point. In addition, HUIM-ACS does not evaluate the same feasible solution twice, which avoids wasting computational resources. Substantial experiments on real-life datasets show that the proposed algorithm outperforms the other heuristic algorithms for mining HUIs in terms of the number of discovered HUIs and convergence.

1. Introduction

How to find potential or implied information in a huge database is an emerging issue called knowledge discovery in databases (KDD). Most previous research focuses on frequent itemset mining (FIM) and association-rule mining (ARM). These algorithms were developed to mine the set of frequent itemsets whose occurrence frequencies are not less than the minimum support threshold, and to find the association rules whose confidence is not less than the minimum confidence threshold [1,2]. Since both FIM and ARM consider only the occurrence frequencies of itemsets, they are insufficient for identifying high-profit itemsets, especially itemsets that rarely appear but have high profit values. For instance, a department store might sell fewer jewels than most other goods in a month, but jewels usually yield a higher profit than other goods that are bought more often in the same period. In a real-life situation, information about high-utility itemsets (HUIs) is therefore more valuable than information about frequent itemsets.

∗ Corresponding author.

E-mail addresses: ming-tai.wu@unlv.edu (J.M.-T. Wu), justin.zhan@unlv.edu (J. Zhan), jerrylin@ieee.org (J.C.-W. Lin).

Unlike FIM and ARM, high-utility itemset mining (HUIM) [3–6] was proposed to discover the “useful” and “profitable” itemsets from a quantitative database. A user-specified minimum utility threshold is used to determine whether an itemset is a high-utility itemset (HUI) or not. An itemset is a HUI if its utility value is higher than the threshold. In a realistic situation, not only “profit” can be used as the utility value for mining high-utility itemsets; “weight”, “cost”, and other factors can also contribute to HUIs. Many algorithms have been developed to mine the complete set of HUIs. Chan et al. [7] first proposed the utility mining problem as an alternative to FIM. Yao et al. [4] discovered HUIs by taking the quantity of items as the internal utility and the unit profit of items as the external utility. Due to the “combinational problem” of discovering HUIs in the previous methods, Liu et al. [8] proposed the two-phase model, based on transaction-weighted utility (TWU), and the transaction-weighted downward closure (TWDC) property for mining HUIs. Lin et al. [9] presented a HUP-tree for mining HUIs. Lan et al. [10] designed a mining algorithm based on an index-projection mechanism and developed a pruning strategy to efficiently mine HUIs. Tseng et al. [11] then designed the UP-growth mining algorithm to retrieve HUIs based on the developed UP-tree structure. HUI-Miner [12] is an efficient list-based algorithm that mines HUIs without candidate generation. Some top-k high-utility approaches were proposed to avoid setting a minimum utility threshold [13–15]. Other related research into HUIM is still in progress [16–19].
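The utility definition described above (item quantity as internal utility, unit profit as external utility) and the transaction-weighted utility used by the two-phase model can be illustrated with a small sketch. The toy database, the profit table, and the function names are hypothetical and chosen only for illustration. The brute-force enumeration is not the method of any of the cited algorithms; it only shows what is being computed, and it follows the text in treating an itemset as a HUI when its utility is higher than the threshold.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity
# (the internal utility).
transactions = [
    {"A": 1, "B": 2},
    {"A": 2, "C": 1},
    {"B": 1, "C": 3},
    {"A": 1, "B": 1, "C": 1},
]
# Hypothetical unit profits (the external utility).
unit_profit = {"A": 5, "B": 2, "C": 10}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def transaction_utility(t, unit_profit):
    """Utility of a whole transaction."""
    return sum(qty * unit_profit[item] for item, qty in t.items())


def twu(itemset, transactions, unit_profit):
    """Transaction-weighted utility: total utility of the transactions containing the itemset."""
    return sum(
        transaction_utility(t, unit_profit)
        for t in transactions
        if all(item in t for item in itemset)
    )


def mine_huis(transactions, unit_profit, min_utility):
    """Brute-force enumeration of all itemsets (exponential; for illustration only)."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # TWU is an upper bound on the utility of the itemset, so an itemset
            # whose TWU is below the threshold cannot be a HUI and is skipped.
            if twu(itemset, transactions, unit_profit) < min_utility:
                continue
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility > min_utility:
                result[itemset] = utility
    return result


print(mine_huis(transactions, unit_profit, min_utility=30))
# → {('C',): 50, ('A', 'C'): 35, ('B', 'C'): 44}
```

In this toy database, B alone has utility 8 while {B, C} has utility 44, so utility itself is not downward-closed. The TWU of an itemset, in contrast, never increases when items are added, which is why it can serve as a pruning bound.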

Due to the “exponential problem” [20], traditional HUIM algorithms have to spend more computation time in a huge search space when the number of distinct items or the size of the database is very large. Evolutionary computation is an efficient way to find optimal solutions using the principles of natural evolution [21]. Strict termination conditions can be set to limit the computation time of a process while still obtaining a nearly optimal solution. The genetic algorithm (GA) [22], which is a kind of EC, is an optimization approach for solving NP-hard and non-linear problems; it is used to explore very large search spaces to find optimal solutions based on designed fitness functions with operators such as selection, crossover, and mutation. Kannimuthu and Premalatha adopted the genetic algorithm and developed high utility pattern extraction using genetic algorithms with ranked mutation using minimum utility threshold (HUPEumu-GRAM) to mine HUIs [20]. Another genetic algorithm, called HUPEwumu-GRAM, was also proposed to mine HUIs without a specific minimum utility threshold [20]. For those two algorithms, the crossover and mutation operations are required to randomly generate the next solutions in the evolution process. Besides, they need a large amount of computation to find satisfactory HUIs in the initial step, which is inefficient when the number of distinct items is very large.

Particle swarm optimization (PSO) is now one of the most commonly used optimization techniques [23]. Lin et al. proposed a PSO-based technique to mine high-utility itemsets: a binary PSO-based (BPSO) [24] algorithm, called HUIM-BPSO, which applies the TWU model to find HUIs effectively. BPSO is designed to solve discrete optimization problems using the traditional PSO process and retains the high convergence speed characteristic of PSO [25], so it can obtain good enough solutions in early iterations.

Ant colony algorithms, and hybridizations of ACO with other meta-heuristic algorithms, have also been applied in the data mining field [26]. ACS is an effective evolutionary computation algorithm that is well suited to discrete solution spaces (e.g., data mining problems). The performance of ACS is always better than that of other evolutionary computation algorithms if a well-defined heuristic function is set in the designed ACS approach. In this paper, an ACS [27] algorithm with a specifically designed routing graph is proposed for mining HUIs. The proposed algorithm not only enhances the performance of mining HUIs through ACS operators, but also provides a checking mechanism to see whether all of the HUIs in the database have been discovered. The key contributions of this paper are described below:

• Few algorithms have been developed to find HUIs based on evolutionary computation. In this paper, an ACS approach named HUIM-ACS is proposed to find HUIs using a specific routing graph and the TWU model.

• Two pruning rules are proposed in this algorithm. They not only avoid unnecessary evaluation of some itemsets but also provide a checking mechanism to see whether all of the HUIs have been discovered.

• Experiments were performed on several real-life databases to compare the performance of the proposed approach with previous algorithms. The results show that HUIM-ACS can find more HUIs from a huge database than other evolutionary computation algorithms.

The rest of this paper is organized as follows. Related work is briefly reviewed in Section 2. Preliminaries and the problem statement are presented in Section 3. The proposed HUIM-ACS is described in Section 4. Experiments are conducted and reported in Section 5. Finally, conclusions are given in Section 6.

2. Related work

2.1. Ant colony optimization

In the past, ant colony optimization has been successfully applied to several optimization problems. It is especially efficient and effective in finding nearly optimal solutions in a huge solution space. This section reviews work related to the original ant system and its extended version, the ant colony system.

2.1.1. Ant system

Ant system (AS) is based on observations of real ant colonies searching for food. It was first introduced by Colorni et al. [28,29]. A real ant population is capable of finding the shortest path between its nest and its destinations by depositing pheromones on the path. Each ant determines its next direction on the route according to the pheromone density. AS simulates the behavior of real ant populations, maps the solution space of the applied problem to a search graph, and enhances the solution-building process to increase search performance. AS not only uses pheromone information but also designs a heuristic function to guide each ant in better directions. Once all the ants have completed their tours, the amount of pheromone on the tours is modified. The algorithm is outlined in Algorithm 1. Details of the state transition rule and the pheromone-updating process are described in Section 2.1.2.

Algorithm 1: Ant system.

Initialize: put each ant on its starting node;
while end conditions are not met do
    while some ants have not already built their tour do
        choose an ant which has not finished its tour;
        build a solution of each ant incrementally by the state transition rule;
    update the pheromone information by each tour in this iteration;
output the best solution;
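The loop structure of Algorithm 1 can be sketched in Python on a small traveling salesman instance. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the parameter names and values (alpha, beta, rho, q), the fixed iteration budget used as the end condition, and the textbook ant-system rules (selection probability proportional to pheromone^alpha times (1/distance)^beta, evaporation followed by a deposit of q / tour length) are all assumptions, since the excerpt does not specify them. City coordinates are assumed to be distinct.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that returns to its first city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def ant_system(coords, n_ants=5, n_iters=20, alpha=1.0, beta=2.0,
               rho=0.5, q=1.0, seed=0):
    """Textbook ant system for a symmetric TSP (illustrative parameters)."""
    rng = random.Random(seed)
    n = len(coords)
    dist = [[math.dist(a, b) for b in coords] for a in coords]
    tau = [[1.0] * n for _ in range(n)]  # initial pheromone on every edge
    best_tour, best_len = None, math.inf

    for _ in range(n_iters):  # end condition: fixed iteration budget
        tours = []
        for _ in range(n_ants):  # ants build their tours one after another
            tour = [rng.randrange(n)]  # put the ant on a starting node
            while len(tour) < n:
                cur = tour[-1]
                candidates = [c for c in range(n) if c not in tour]
                # State transition: pheromone times heuristic (inverse distance).
                weights = [tau[cur][c] ** alpha * (1.0 / dist[cur][c]) ** beta
                           for c in candidates]
                tour.append(rng.choices(candidates, weights=weights)[0])
            tours.append(tour)

        # Pheromone update: evaporate on every edge, then deposit along each tour.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1.0 - rho
        for tour in tours:
            length = tour_length(tour, dist)
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                tau[a][b] += q / length
                tau[b][a] += q / length
            if length < best_len:
                best_tour, best_len = tour, length

    return best_tour, best_len  # output the best solution found


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
best_tour, best_len = ant_system(square)
print(best_tour, best_len)
```

On this four-city unit square the shortest closed tour is the perimeter, with length 4. Whether a given run finds it depends on the random choices, so the sketch reports the best tour found without guaranteeing optimality.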

2.1.2. Ant colony system

Ant colony system (ACS), proposed by Dorigo and Gambardella [27], is an extension of the ant system. It modifies the state transition rule and the pheromone-updating rule to improve the performance of the original AS approach. It is shown in Algorithm 2. Details of the algorithm are described below.

1. State transition rule

The state transition rule is used by an ant to probabilistically select its next node (state). The traveling salesman problem is taken as an example. Assume the k-th ant is currently in city j (node). The next city s (node) for the k-th ant to visit is:

Please cite this article as: J.M.-T. Wu et al., An ACO-based approach to mine high-utility itemsets, Knowledge-Based Systems (2016),

http://dx.doi.org/10.1016/j.knosys.2016.10.027
