2 J.M.-T. Wu et al. / Knowledge-Based Systems 0 0 0 (2016) 1–12
ARTICLE IN PRESS
JID: KNOSYS [m5G; November 14, 2016;12:51 ]
the two-phase (transaction-weighted utility TWU) model and the
transaction-weighted downward closure (TWDC) property for min-
ing HUIs. Lin et al. [9] presented a HUP-tree for mining the HUIs.
Lan et al. [10] designed the mining algorithm based on index-
projection mechanism and developed the pruning strategy to ef-
ficiently mine the HUIs. Tseng et al. [11] then designed the UP-
growth mining algorithm to retrieve the HUIs based on the devel-
oped UP-tree structure. HUI-Miner [12] is an efficient list-based al-
gorithm also proposed to mine the HUIs without candidate genera-
tion. Some top- k high-utility approaches were proposed to instead
of setting a minimal threshold [13–15] . Other related researches
into HUIM is still in progress [16–19] .
Due to the “exponential problem” [20] , the traditional algo-
rithms of HUIM have to spend more computation time in a huge
search space while the number of distinct items or the size of
a database is very large. Evolutionary computation is an efficient
way and able to find the optimal solutions using the principles
of natural evolution [21] . Strict termination conditions can be set
some strict termination conditions in order to limit the compu-
tation time for a process but still obtain a nearly optimal solu-
tion. The genetic algorithm (GA) [22] , which is a kind of EC, is an
optimization approach to solve the NP-hard and non-linear prob-
lems, and is used to investigate very large search spaces to find
the optimal solutions based on the designed fitness functions with
various operators such as selection, crossover, and mutation. In
the past, Kannimuthu and Premalatha adopted the genetic algo-
rithm and developed high utility pattern extraction using genetic
algorithms with ranked mutation using minimum utility threshold
(HUPE
umu
-GRAM) to mine HUIs [20] . Another genetic algorithm
called HUPE
wumu
-GRAM was also proposed to mine HUIs with-
out a specific minimum utility threshold [20] . For those two al-
gorithms, the crossover and mutation operations are required to
randomly generate the next solutions in the evolution process. Be-
sides, it needs amounts of computations to find the satisfied HUIs
in the initial step, which is insufficient when the number of dis-
tinct items is very large. Particle swam optimization algorithm is
now one of the most commonly used optimization techniques [23] .
Lin et al. proposed a PSO-based techniques to mine high-utility
itemsets. A binary PSO-based (BPSO) [24] algorithm is designed
for mining the HUIs. It called for HUIM-BPSO and applied TWU
model to find HUIs effectively. BPSO is designed to resolve dis-
crete optimization problem using traditional PSO process and re-
tains the characteristic of high convergence speed in PSO [25] , it
can obtain good enough solution in early iterations. Ant colony al-
gorithms or the hybridizations of ACO with other meta-heuristic
algorithms were also applied in data mining field [26] . ACS is an
effective evolutionary com putation algorithm and it is suitable to
apply in the discrete solution space {ex. data mining issues}. The
performance of ACS is always better than other evolutionary com-
putation algorithms if a well-defined heuristic function is set in
designed ACS approach. In this paper, an ACS [27] algorithm with
a specific designed routing graph is designed for mining HUIs. The
proposed algorithm not only enhances the performance for min-
ing HUIs by ACS operators, but also provides a checking mech-
anism to see if all of the HUIs in the database are being dis-
covered or not. The key contributions of this paper are described
below:
• Fewer algorithms have been developed to find the HUIs based
on evolutionary computation. In this paper, an ACS approach
namely HUIM-ACS is thus proposed to find the HUIs by a spe-
cific routing graph and TWU model.
• Two pruning rules are proposed in this algorithm. They can not
only avoid the unnecessary estimation for some itemsets but
also provide a checking mechanism to see whether all of the
HUIs are being discovered or not.
• Experiments were performed in several real-life databases to
compare the performance of proposed the approach with previ-
ous algorithms. Results showed HUIM-ACS can find more HUIs
from a huge database than other evolutionary computation al-
gorithms.
The rest of this paper is organized as follows. Related work is
briefly reviewed in Section 2 . Preliminaries and the problem state-
ment are presented in Section 3 . The proposed HUIM-ACS is de-
scribed in Section 4 . Experiments are conducted and provided in
Section 5 . Finally; conclusions are given in Section 6 .
2. Related work
2.1. Ant colony optimization
In the past, ant colony optimization has been successfully ap-
plied to solving several optimization problems. They are especially
efficient and effective in finding nearly optimal solutions from a
huge solution space. This section reviews work related to the orig-
inal ant system and its extended version ant colony system.
2.1.1. Ant system
Ant system (AS) is based on observations of real ant colonies
searching for food. It was first introduced by Colorni et al. [28,29] .
A real ant population is capable of finding the shortest path be-
tween its nest and destinations by depositing pheromones on the
path. Each ant determines the next direction on the route accord-
ing to the pheromone density. AS simulates the behavior of real
ant populations, maps the solution space from applied problems
to a searching graph, and enhances the building solution process
to increase the searching performance. AS not only uses the infor-
mation of pheromones, but designs a heuristic function to guide
each ant to the better directions. Once all the ants have terminated
their tours, the amount of pheromone on the tours will have been
modified. The brief algorithm is shown in Algorithm 1 . Details of
the state transition rule and the pheromone-updating process are
described in Section 2.1.2 .
Algorithm 1: Ant system.
1 Initializeput each ant on its starting node;
2 while end conditions are not met do
3 while some ants have not already built their tour do
4 choose an ant which has not finished its tour;
5 build a solution of each ant incrementally by the state
transition rule;
6 update the pheromone information by each tour in this
iteration;
7 output the best solution;
2.1.2. Ant colony system
Ant Colony System, proposed by Dorigo and Gambardella [27] ,
is an extended algorithm from ant system. It modified the state
transition rule and pheromone-updating rule to increase the per-
formance of the original AS approach. It is shown in Algorithm 2 .
Details of the algorithm are described below.
1. State transition rule
The state transition rule is used by an ant to probabilistically
select its next node (state). The traveling salesman problem is
taken as an example. Assume the k -th ant is currently in the
city j (node). The next city s (node) for the k -th ant to visit is:
Please cite this article as: J.M.-T. Wu et al., An ACO-based approach to mine high-utility itemsets, Knowledge-Based Systems (2016),
http://dx.doi.org/10.1016/j.knosys.2016.10.027