arXiv:2107.03863v2 [stat.ML] 23 Aug 2021
Benchpress: A Scalable and Platform-Independent
Workflow for Benchmarking Structure Learning
Algorithms for Graphical Models

Felix L. Rios
University of Basel

Giusi Moffa
University of Basel

Jack Kuipers
ETH Zürich
Abstract
Describing the relationship between the variables in a study domain and modelling the data generating mechanism is a fundamental problem in many empirical sciences. Probabilistic graphical models are one common approach to tackle the problem. Learning the graphical structure is computationally challenging and a fervent area of current research with a plethora of algorithms being developed. To facilitate the benchmarking of different methods, we present a novel automated workflow, called benchpress, for producing scalable, reproducible, and platform-independent benchmarks of structure learning algorithms for probabilistic graphical models. Benchpress is interfaced via a simple JSON-file, which makes it accessible for all users, while the code is designed in a fully modular fashion to enable researchers to contribute additional methodologies. Benchpress currently provides an interface to a large number of state-of-the-art algorithms from libraries such as BDgraph, BiDAG, bnlearn, GOBNILP, pcalg, r.blip, scikit-learn, TETRAD, and trilearn, as well as a variety of methods for data generating models and performance evaluation. Alongside user-defined models and randomly generated datasets, the software tool also includes a number of standard datasets and graphical models from the literature, which may be included in a benchmarking workflow. We demonstrate the applicability of this workflow for learning Bayesian networks in four typical data scenarios. The source code and documentation is publicly available from http://github.com/felixleopoldo/benchpress.
Keywords: reproducibility, scalable benchmarking, probabilistic graphical models.
1. Introduction
Probabilistic graphical models play a central role in modern statistical data analysis. Their compact and elegant way to visualise and represent complex dependence structures in multivariate probability distributions has proven successfully applicable in many scientific domains, ranging from disciplines such as social sciences and image analysis to biology, medical diagnosis and epidemiology (see e.g. Elwert 2013; Friedman, Linial, Nachman, and Pe'er 2000; Friedman 2004; Moffa, Catone, Kuipers, Kuipers, Freeman, Marwaha, Lennox, Broome, and Bebbington 2017; Kuipers, Thurnherr, Moffa, Suter, Behr, Goosen, Christofori, and Beerenwinkel 2018b; Kuipers, Moffa, Kuipers, Freeman, and Bebbington 2019).
One of the main advantages of graphical models is that they provide a tool for experts and researchers from non-statistical fields to easily specify their assumptions in a specific problem
domain, in a way that is very transparent. However, in many realistic situations the number of variables is either too large to build networks by hand, as is for example the case in genomic applications, or simply no prior knowledge is available about the relationship between different variables. As a consequence, there has been a growing interest in automated strategies to infer the graph component of a probabilistic graphical model from data, so-called structure learning. For recent reviews of the wide flora of algorithms that have emerged in the last two decades, the reader is referred to Koski and Noble (2012); Constantinou, Liu, Chobtham, Guo, and Kitson (2021); Scutari, Graafland, and Gutiérrez (2019a).
Since the scientific principle of reproducibility is being recognised as instrumental to achieve high quality standards in research, there is an increasing demand to present new algorithms with publicly available source code and datasets (see e.g. Munafò, Nosek, Bishop, Button, Chambers, Percie du Sert, Simonsohn, Wagenmakers, Ware, and Ioannidis 2017; Lamprecht, Garcia, Kuzak, Martinez, Arcila, Martin Del Pico, Dominguez Del Angel, van de Sandt, Ison, Martinez, McQuilton, Valencia, Harrow, Psomopoulos, Gelpi, Chue Hong, Goble, and Capella-Gutierrez 2020). Ideally, with publicly available software we should be able to easily compare different methods with respect to their performance in different settings. The reality is that practical implementations of new approaches display high levels of heterogeneity in many aspects, which renders comparative studies both challenging and time consuming. One of the main difficulties is that different algorithms may be implemented in different programming languages, with different dependencies. Different packages may also differ with respect to the representation of graphical models, the data formats, and the way they interface to the user, e.g. through command-line arguments, a configuration file or a function in a specific programming language. Another complication is that the number of computations increases rapidly in comparison studies, in particular when input parameters are altered, which requires parallel computation capabilities and careful bookkeeping to record and organise the results for reporting. As a result, researchers may invest considerable time on tailored benchmarking scripts that intend to address these issues for a subset of algorithms. To complicate matters beyond the implementational issues, there is no single well-established metric to evaluate the performance of structure learning algorithms. Quite the contrary, there is a wide range of different metrics to choose from (see e.g. Tsamardinos, Brown, and Aliferis 2006; Constantinou et al. 2021) and researchers tend to use only a small subset to evaluate a new algorithm, typically with the aim of highlighting the problem their algorithm addresses.
With the problems described above in mind, the objective of this work is to develop a unified framework to facilitate the execution and benchmarking of different algorithms. We present a novel workflow called benchpress, which is implemented in snakemake (Köster and Rahmann 2012) and designed for making reproducible, platform-independent, and fully scalable benchmarks of structure learning algorithms. Benchpress is interfaced via an easy-to-use configuration file in JSON format (Pezoa, Reutter, Suarez, Ugarte, and Vrgoč 2016), where a benchmark setup is specified in a module-based probabilistic programming style, separating model specification, in terms of graph and corresponding parameters, from algorithm execution and evaluation. Snakemake enables benchpress to seamlessly scale the computations to server, cluster, grid and cloud environments, without any changes in the configuration file. The support for containerised software through e.g. Singularity (Kurtzer, Sochat, and Bauer 2017) or Docker (Merkel 2014) images, together with platform-independent representations of data sets and graphs, enables benchpress to compare algorithms from different libraries, possibly implemented in different programming languages with different dependencies.
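To make the configuration-driven style concrete, the sketch below assembles a minimal benchmark specification as a Python dictionary and serialises it to JSON. The section and field names (`data`, `algorithms`, `evaluation`, and their contents) are hypothetical placeholders chosen for illustration; they are not the actual benchpress schema, which is defined in the project documentation.

```python
import json

# Hypothetical benchmark specification: every key below is an illustrative
# placeholder, NOT the real benchpress JSON schema.
benchmark = {
    "data": {"model": "random_dag", "n_nodes": 10, "n_samples": 500},
    "algorithms": [
        {"id": "pc", "library": "pcalg", "alpha": 0.05},
        {"id": "hc", "library": "bnlearn", "score": "bic"},
    ],
    "evaluation": ["shd", "roc"],
}

# Serialise to the JSON text a workflow engine would consume.
config_json = json.dumps(benchmark, indent=2)
print(config_json)
```

The point of such a format is that adding a competing algorithm to the comparison is a one-line change to the configuration rather than a change to any analysis script.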
Benchpress is implemented in a modular coding style, which makes it easy for researchers to contribute additional modules for model generation, structure learning algorithms, and result evaluation. In the first publicly released version we already included 19 algorithms from some of the most popular libraries, such as BDgraph (Mohammadi and Wit 2019), BiDAG (Suter, Kuipers, Moffa, and Beerenwinkel 2021), bnlearn (Scutari 2010), GOBNILP (Cussens 2020), pcalg (Kalisch, Mächler, Colombo, Maathuis, and Bühlmann 2012), r.blip (Scanagatta, de Campos, Corani, and Zaffalon 2015), scikit-learn (Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Prettenhofer, Weiss, Dubourg et al. 2011), TETRAD (Glymour and Scheines 1986), and trilearn (Olsson, Pavlenko, and Rios 2019), along with some of the models and data sets from the standard literature.
Recently, Constantinou, Liu, Chobtham, Guo, and Kitson (2020) developed Bayesys, a software with similar intentions, albeit exclusively designed for Bayesian networks, a specific type of probabilistic graphical model (Pearl 1997). Furthermore, Bayesys is a Java implementation including only one algorithm for structure learning, called SaiyanH (Constantinou 2020), suited for high dimensional settings. Another tool called causal-compare, with a similar purpose as benchpress, was recently released as part of the TETRAD project (Ramsey, Malinsky, and Bui 2020). However, the functionality of causal-compare is restricted to the algorithm implementations in the TETRAD project, which represent only a subset of available algorithms (Ramsey et al. 2020). This makes its use more limited compared to benchpress, which only requires a Linux system, while there is no limitation on the individual algorithms' requirements.
Benchpress is distributed under the GPL v2.0 License, encouraging its use in both academic and commercial settings. Source code and documentation are available from https://github.com/felixleopoldo/benchpress.
The rest of the paper is structured as follows. Section 2 presents an introduction to graphical models along with some notational conventions. Section 3 provides background on current strategies for structure learning. Section 4 describes the structure of the JSON interface and the modules already available for benchmarking. In Section 5, we show how to use benchpress in four typical benchmarking scenarios. Installation and usage guidelines are provided in Section 6.
2. Probabilistic graphical models
This section provides a short introduction to graphical models and graph theory, with definitions as in e.g. Diestel (2005); Lauritzen (1996).
Let P be a p-dimensional distribution and let Y := (Y_i)_{i=1}^p be a random vector such that Y ∼ P. Further let G = (V, E) be a graph, where V = {Y_i}_{i=1}^p is the node set and E is the edge set, consisting of tuples of distinct elements of V. Then P, or alternatively Y, is said to be Markov with respect to G if separation, based on some graphical criteria, of two subsets of nodes, Y_A and Y_B, by a third set Y_C in G implies their conditional independence in P, where Y_A := (Y_i)_{i∈A}. The term probabilistic graphical model or graphical model has been used interchangeably in the literature to refer to either the tuple (G, P) or a collection of distributions P_G, usually restricted to some specific parametric family, where every P′ ∈ P_G is Markov with respect to G. In this text we will use the former meaning. In the case where P is restricted to a specific parametric family, it is usually uniquely determined by some parameter vector Θ, thus either notation may be used.
Assuming f is a density for P, conditional independence between the random variables Y_A and Y_B given Y_C is well-defined as f(y_A | y_B, y_C) = f(y_A | y_C), where (y_A, y_B, y_C) ∈ 𝒴_A × 𝒴_B × 𝒴_C and 𝒴_A denotes the range of Y_A. In contrast, node separation may have different definitions depending on the type of the graph considered. Typical classes of graphs are: undirected graphs, where edges are specified as unordered tuples and separation of Y_A and Y_B given Y_C means that every path between Y_A and Y_B must pass Y_C; and directed acyclic graphs (DAGs), where edges are represented as ordered tuples and separation is defined by means of a concept known as d-separation, see e.g. Pearl (1997). Next follows a short introduction to the types of graphical models currently implemented in benchpress.
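The undirected separation criterion above can be checked mechanically: delete the separating set Y_C from the graph and test whether any path from Y_A to Y_B survives. A minimal sketch (the graph representation and function name are my own, not part of benchpress):

```python
from collections import deque

def is_separated(edges, a, b, c):
    """Undirected separation: are node sets `a` and `b` disconnected
    once the separating set `c` is removed from the graph?"""
    blocked = set(c)
    # Build adjacency lists, dropping every edge that touches the separator.
    adj = {}
    for u, v in edges:
        if u in blocked or v in blocked:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Breadth-first search from `a`; separation fails if we ever reach `b`.
    seen = set(a) - blocked
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if u in b:
            return False
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return True

# Chain graph 1 - 2 - 3: node 2 separates 1 from 3, the empty set does not.
edges = [(1, 2), (2, 3)]
print(is_separated(edges, {1}, {3}, {2}))    # True
print(is_separated(edges, {1}, {3}, set()))  # False
```

For a Markov distribution, a `True` answer here would license the corresponding conditional independence statement Y_1 ⊥⊥ Y_3 | Y_2.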
Bayesian networks constitute one of the most popular classes of graphical models, referring to distributions that are Markov with respect to a DAG. The conditional independence encoded by the DAG implies that the density can be factorised into local conditional densities, where each node Y_i is directly dependent on its parents, pa(Y_i), as

f(y) = ∏_{i=1}^p f(y_i | pa(y_i)),

where f(y_i | pa(y_i)) denotes the conditional density of Y_i given pa(Y_i) = pa(y_i). This property is appealing for several reasons. Firstly, the partition into local conditional probability distributions provides an intuitive way to express and visualise knowledge about a specific problem domain. In particular, an expert could express a causal mechanism in terms of a DAG, which in turn provides a mathematical language to answer causal queries from non-experimental data (Pearl 1995). Further, it enables fast computation of conditional and marginal distributions using message-passing algorithms by exploiting the factorisation property of the distribution. However, when the DAG is not used for expressing causal knowledge, the direction of the edges may be misleading, since the conditional independencies of a distribution are in general not uniquely encoded by a single DAG but rather by a class of Markov equivalent DAGs, all of which encode the same conditional independence statements. One way to represent a Markov equivalence class is by a completed partially directed acyclic graph (CPDAG, see Chickering 1995). A CPDAG is a partially directed acyclic graph which can be obtained from a DAG by regarding an edge (Y_i, Y_j) as undirected if and only if both directed edges (Y_i, Y_j) and (Y_j, Y_i) are present in some of the DAGs in the same equivalence class. The CPDAG is also sometimes called the essential graph.
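The DAG factorisation above can be made concrete with a toy discrete network. The sketch below uses the chain Y1 → Y2 → Y3 with invented conditional probability tables (all numbers are illustrative, not from any benchpress model) and evaluates the joint distribution as the product of the local conditionals:

```python
import itertools

# Toy DAG Y1 -> Y2 -> Y3 over binary variables; the conditional
# probability tables are invented for illustration.
p_y1 = {0: 0.6, 1: 0.4}
p_y2_given_y1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_y3_given_y2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

def joint(y1, y2, y3):
    """Joint probability via the factorisation f(y) = prod_i f(y_i | pa(y_i))."""
    return p_y1[y1] * p_y2_given_y1[y1][y2] * p_y3_given_y2[y2][y3]

# A valid factorised joint must sum to one over all 2^3 configurations.
total = sum(joint(*cfg) for cfg in itertools.product([0, 1], repeat=3))
print(round(total, 10))  # 1.0
```

Because each factor only involves a node and its parents, the same structure is what message-passing algorithms exploit to compute marginals without enumerating all configurations, which the brute-force sum above does only because the example is tiny.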
Markov networks or Markov random fields are another popular class of graphical models, referring to distributions that are Markov with respect to an undirected graph. There is in general no decomposition into local densities for the models in this class. However, there exist functions {φ_C}_{C∈𝒞} such that

f(y) = ∏_{C∈𝒞} φ_C(y_C),

where 𝒞 is the set of maximal subgraphs in which all nodes are directly connected with each other, usually called cliques, which may be leveraged to calculate marginal distributions efficiently. There is a sub-class of undirected graphs called decomposable (DG) or chordal or triangulated, where each cycle with more than three nodes has a chord, an edge which breaks it. Distributions which are Markov with respect to decomposable graphs are called decomposable, and their densities are built from local, clique-specific densities, {f_C}_{C∈𝒞}, and their marginals, so that the joint density factorises as

f(y) = ∏_{C∈𝒞} f_C(y_C) / ∏_{S∈𝒮} f_S(y_S),

where 𝒮 is the multi-set of separators in G, found e.g. as the intersection of neighbouring cliques in a so-called junction tree, built from the cliques in the graph; see e.g. Lauritzen (1996) for details. In general Markov networks and Bayesian networks encode different sets of conditional independence statements, and there are sets which we can represent with one model but not the other. Decomposable graphs represent the intersection of those sets of statements which we can represent by both undirected graphs and DAGs (see e.g. Lauritzen 1996; Cowell, Dawid, Lauritzen, and Spiegelhalter 2003; Koller and Friedman 2009). The DAGs representing the intersection set are called perfect, meaning that all the parents of each node are connected by an edge. Further characterisations of decomposable graphs are found in e.g. Lauritzen (1996); Cowell et al. (2003) and Duarte and Solus (2021).
3. Structure learning
Structure learning for Bayesian networks has been proved to be NP-hard, even when the number of parents of each node is restricted to two (Chickering, Heckerman, and Meek 2004). Also the number of undirected graphs grows as O(2^{p^2}), even when restricted to decomposable graphs, making an exhaustive search for a specific graph infeasible (see e.g. Wormald 1985; Olsson, Pavlenko, and Rios 2018). Therefore, in practice we must rely on approximation methods, which we can divide into three main categories: constraint-based, score-based, and hybrid algorithms, discussed briefly below.
The seed of the first constraint-based algorithm was planted by Verma and Pearl (1991) in the context of causal Bayesian networks. In line with their work, constraint-based methods employ conditional independence tests among the variables to first estimate an undirected graph by, e.g., starting with the complete graph and excluding an edge (Y_i, Y_j) if there exists some set Y_A such that the hypothesis Y_i ⊥⊥_P Y_j | Y_A cannot be rejected according to a suitable test procedure. Such methods tend to be very fast, but testing can suffer from large numbers of false negatives, i.e. falsely not including edges when they should be present. Relaxing the significance level of the tests to reduce false negatives can however rapidly inflate false positives and runtimes.
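The edge-elimination scheme just described can be sketched as follows. The independence test is a stand-in oracle (a function one would replace with, e.g., a partial-correlation or G² test on data), and the code mirrors only the skeleton phase of PC-style algorithms, not their edge-orientation rules:

```python
import itertools

def learn_skeleton(nodes, indep, max_cond=2):
    """PC-style skeleton search: start from the complete graph and drop the
    edge (i, j) if some conditioning set of size <= max_cond renders i and j
    conditionally independent according to the `indep` oracle."""
    edges = set(itertools.combinations(nodes, 2))
    for i, j in sorted(edges):
        others = [n for n in nodes if n not in (i, j)]
        for size in range(max_cond + 1):
            if any(indep(i, j, set(cond))
                   for cond in itertools.combinations(others, size)):
                edges.discard((i, j))
                break
    return edges

# Stand-in oracle encoding the chain 1 -> 2 -> 3: the non-adjacent pair
# (1, 3) is independent given {2}; every other pair stays dependent.
def chain_oracle(i, j, cond):
    return {i, j} == {1, 3} and 2 in cond

print(sorted(learn_skeleton([1, 2, 3], chain_oracle)))  # [(1, 2), (2, 3)]
```

With a real statistical test in place of the oracle, the false-negative and false-positive trade-off discussed above enters through the test's significance level, and `max_cond` bounds the runtime blow-up from testing large conditioning sets.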
Score-based methods on the other hand aim to optimise a global score function defined on the graph space. As a workaround to cope with the immense number of graphs, the search is often limited to certain types of graphs. For example, in DAG models one may restrict the number of parents or combine DAGs into bigger and more easily represented classes (e.g. orders and partitions, Friedman and Koller 2003; Kuipers and Moffa 2017). For decomposable graphs a possibility is to limit the maximal clique size. Score functions may rely on penalised log-likelihoods or on Bayesian posterior probabilities of graphical structures conditional on the observed data (see e.g. Koller and Friedman 2009). Besides mere optimisation, Bayesian structure learning also focuses on sampling procedures to provide a finer characterisation of the posterior graph distribution. When an expression for the posterior distribution is available at least up to a normalising constant, we can implement Markov chain Monte Carlo (MCMC) schemes to sample from it, see e.g. Madigan, York, and Allard (1995); Giudici and Castelo