ing a slope. To achieve this, Grahl and his coworkers successively developed two identification strategies, i.e., those based on correlation triggering [13] and on the standard deviation ratio (SDR) [14]. Cai et al. [15] developed a different type of variance scaling method, named cross-entropy adaptive variance scaling, which calculates the scaling factor by minimizing the cross entropy between the current probability model and the predicted model for the next generation.
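For context, these strategies share a simple isotropic core: the estimated covariance matrix is multiplied by a single scalar factor that is enlarged while the search keeps making progress and shrunk otherwise. The sketch below illustrates this idea only; the update constants and the improvement-based trigger are illustrative assumptions, not the exact rules of [13-15].

```python
import numpy as np

def isotropic_avs(cov, c, improved, eta=1.1, c_min=0.1, c_max=10.0):
    """Isotropic adaptive variance scaling: one scalar factor for all
    directions. `c` is carried across generations; `improved` indicates
    whether the best fitness improved in the current generation."""
    # Enlarge the factor on improvement (likely traversing a slope),
    # shrink it otherwise (likely contracting around an optimum).
    c = min(c * eta, c_max) if improved else max(c / eta, c_min)
    return c * cov, c
```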
Besides directly tuning variances, some researchers achieved variance scaling by modifying the eigenvalues of the estimated covariance matrix. Wagner et al. [16] proposed an eigenspace EDA (EEDA) which adjusts variances by replacing the minimum eigenvalue with the maximum one. Dong et al. [17] developed an eigendecomposition framework for multivariate GEDA and claimed that most variance scaling methods proposed up to then could be unified within their framework by applying different eigenvalue tuning strategies. Liu et al. [18] introduced principal component analysis into EDA (PCA-EDA), trying to avoid premature convergence by regulating the maximum eigenvalue.
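To make the eigenvalue-based view concrete, the following minimal numpy sketch implements the EEDA-style adjustment described above; the original algorithm [16] may differ in details beyond this core replacement step.

```python
import numpy as np

def eeda_adjust(cov):
    """Eigenspace adjustment in the spirit of EEDA: replace the minimum
    eigenvalue with the maximum one so that the narrowest search
    direction is widened."""
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    vals[0] = vals[-1]                    # min eigenvalue := max eigenvalue
    return vecs @ np.diag(vals) @ vecs.T  # rebuild the covariance matrix
```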
It is easy to see that the efficiency of GEDA depends not only on its search scope, but also on its search directions. Unfortunately, it has been shown that, without careful intervention, the main search direction of GEDA tends to become perpendicular to the fitness improvement direction [15,19], which greatly reduces its search efficiency. To remedy this defect, some researchers made beneficial attempts to improve the estimation method for the covariance matrix. The covariance matrix adaptation evolution strategy (CMA-ES) [20], which can be considered a special EDA, employs a sophisticated covariance matrix estimation method, in which the rank-μ update operator updates the covariance matrix using the weighted high-quality solutions of the current generation and the corresponding mean of the last generation. By this means, the variance along the gradient direction can be increased. Bosman et al. [19] proposed an anticipated mean shift (AMS) operator which estimates the covariance matrix after shifting part of the selected solutions along the anticipated gradient direction, such that the main search direction can be corrected to a certain extent. Bosman et al. further integrated AVS, SDR and AMS into a powerful EDA variant known as AMaLGaM [19]. Ren et al. [21] improved the original AMS operator by directly shifting the mean of the selected solutions and taking the shifted mean as the center when estimating the covariance matrix. Liang et al. [22] recently reported an enhanced GEDA in which the inferior solutions of the current generation are repaired and utilized to estimate the covariance matrix, such that the search directions can be adaptively adjusted.
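As an illustration of the direction-correcting idea, the sketch below follows the AMS description above: part of the selected solutions is shifted along the anticipated improvement direction, approximated by the mean shift between consecutive generations, before the covariance matrix is estimated. The fraction alpha and the multiplier delta are illustrative defaults, not necessarily the values used in [19].

```python
import numpy as np

def ams_covariance(selected, mean_prev, mean_cur, alpha=0.5, delta=2.0):
    """Covariance estimation with an anticipated mean shift (AMS).

    selected  -- (n, d) array of selected solutions, best first
    mean_prev -- mean of the previous generation's selected solutions
    mean_cur  -- mean of the current generation's selected solutions
    """
    shifted = selected.copy()
    n_shift = max(1, int(alpha * len(selected)))
    # Shift part of the solutions along the anticipated improvement direction.
    shifted[-n_shift:] += delta * (mean_cur - mean_prev)
    # Estimating the covariance around the current mean with the shifted set
    # stretches the model along the direction of fitness improvement.
    diffs = shifted - mean_cur
    return diffs.T @ diffs / len(shifted)
```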
In addition to scaling variances and improving the covariance matrix estimation method, extensive efforts have been made to enhance the performance of EDA. Xu [23] combined EDA with a chaos perturbation operator for the purpose of enhancing population diversity. Chen et al. [24] proposed a fast interactive EDA which extracts the user's preferences on the decision variables from historical information to reduce the initial search space to a preferred subspace, such that the search process can be accelerated. Fang et al. [25] developed a mean shift strategy to speed up the convergence of EDA. Zhou et al. [7] suggested combining EDA with cheap and expensive local search. Auger and Hansen [26] developed a restart CMA-ES with increasing population size (IPOP-CMAES). Although IPOP-CMAES was developed a decade ago, it was recently reported to still be competitive with many state-of-the-art EAs proposed in recent years [27]. Karshenas et al. [28] investigated the effect of regularization methods on the model learning process of GEDA. Santana et al. [29] tried to improve EDA with the help of new selection strategies. Instead of using Gaussian or histogram models, Zhang and Zeng [30], Qian et al. [31] and PourMohammadBagher et al. [32] adopted particle filter, Copula theory and probabilistic graphical models, respectively, to capture the distribution of good solutions. Aiming at seeking multiple solutions, the techniques of detecting promising areas [33] and niching [34,35] were introduced into EDA to enhance its performance on multimodal problems. Moreover, EDAs have been integrated with other EAs such as particle swarm optimization (PSO) [36] and differential evolution (DE) [37] to fuse their advantages. Theoretical research has also been conducted to characterize the behavior of EDA. Rastegar [38] analyzed the convergence probability of two univariate EDAs and gave a sufficient condition for convergence. Echegoyen et al. [39] comprehensively studied the relationship between the behavior of EDA and the solution space of optimization problems.
To sum up, EDAs have been improved significantly in the past decades, but some shortcomings remain. So far, most existing variance scaling strategies are able to adjust the search scope of GEDA, but can hardly change its search directions. This is one of the key issues that severely restrict the performance of GEDA, yet it has not been fully recognized and studied; consequently, little related work has been reported in recent years. CMA-ES and AMaLGaM achieve great success by comprehensively regulating their search scopes and directions. However, their algorithmic frameworks are so complex that it is difficult for practitioners to understand the underlying mechanisms, let alone set the corresponding parameters. As for many other EDA variants, their performance also depends heavily on the search efficiency of the basic EDA, so they could be further improved if that efficiency were increased.
With the goal of enhancing EDA through simple operations, this paper proposes an anisotropic adaptive variance scaling (AAVS) strategy and develops a novel GEDA variant named AAVS-EDA. Different from most existing variance scaling strategies, which adjust all the variances by the same factor, AAVS first captures the landscape of the problem being solved along each eigendirection with a simple topology-based detection method, and then adaptively tunes the variances along different directions according to the corresponding detection results. If a slope is detected along an eigendirection, AAVS enlarges the corresponding variance; if, by contrast, a valley is detected, AAVS keeps the corresponding variance unchanged. In this way, the search along a slope is accelerated while a fine search around a valley is retained. More importantly, benefiting from its anisotropic scaling, AAVS naturally aligns the main search direction of EDA with the fitness improvement direction. To ensure convergence, AAVS-EDA also adopts an auxiliary global monitor which shrinks all the variances if no improvement is achieved in a generation. Thanks to these properties, AAVS-EDA shows desirable performance on a variety of benchmark functions.
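Anticipating the detailed description in Section 3, the following sketch conveys the scheme only schematically: detect_slope is a hypothetical stand-in for the topology-based detector of Section 3, and the enlargement factor and shrink rate are illustrative constants rather than the tuned values of AAVS-EDA.

```python
import numpy as np

def aavs_scale(cov, detect_slope, improved, enlarge=2.0, shrink=0.8):
    """Schematic anisotropic adaptive variance scaling.

    detect_slope -- hypothetical callable: detect_slope(direction) -> True
                    if the landscape along this eigendirection is a slope
    improved     -- True if the best fitness improved in this generation
    """
    vals, vecs = np.linalg.eigh(cov)
    for i in range(len(vals)):
        if detect_slope(vecs[:, i]):
            vals[i] *= enlarge   # slope: speed up along this eigendirection
        # valley: leave the variance unchanged for a fine local search
    cov = vecs @ np.diag(vals) @ vecs.T
    if not improved:
        cov *= shrink            # global monitor: shrink all variances
    return cov
```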
The remainder of this paper is organized as follows. Section 2 briefly reviews the basic knowledge of GEDA. Section 3 describes AAVS and the resulting AAVS-EDA in detail. Section 4 presents the experimental settings and analyzes the experimental results. Conclusions are finally drawn in Section 5.
2. Basic knowledge of GEDA
As a model-based EA, EDA assumes that good solutions approximately obey a certain probability distribution over the solution space. During the search process, it tries to learn this distribution and generate new solutions according to the learning results [2–4]. The general framework of EDA is outlined in Algorithm 1.
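Algorithm 1 is not reproduced here; the minimal Gaussian EDA loop below matches the verbal description that follows (random initialization, truncation selection, maximum-likelihood Gaussian model, resampling). Population size, selection ratio, generation budget and bounds are illustrative parameters.

```python
import numpy as np

def gaussian_eda(f, dim, pop_size=100, sel_ratio=0.5, max_gen=500,
                 bounds=(-5.0, 5.0)):
    """Minimal Gaussian EDA for minimization: fit a full-covariance
    Gaussian to the truncation-selected solutions and sample the next
    population from it."""
    rng = np.random.default_rng()
    pop = rng.uniform(bounds[0], bounds[1], size=(pop_size, dim))
    n_sel = int(sel_ratio * pop_size)
    for _ in range(max_gen):
        fitness = np.apply_along_axis(f, 1, pop)
        selected = pop[np.argsort(fitness)[:n_sel]]  # truncation selection
        mean = selected.mean(axis=0)                 # maximum-likelihood model
        cov = np.cov(selected, rowvar=False)
        pop = rng.multivariate_normal(mean, cov, size=pop_size)
    return mean

# Example: minimize the 10-dimensional sphere function.
best = gaussian_eda(lambda x: float(np.sum(x ** 2)), dim=10)
```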
EDA usually starts with an initial population filled with randomly generated solutions. After evaluation, the relatively good solutions are selected, generally according to a truncation selection rule. Then a new probability model is built to produce solutions for the next generation. EDA executes this iterative