arXiv:2107.03863v2 [stat.ML] 23 Aug 2021
Benchpress: A Scalable and Platform-Independent
Workflow for Benchmarking Structure Learning
Algorithms for Graphical Models

Felix L. Rios
University of Basel

Giusi Moffa
University of Basel

Jack Kuipers
ETH Zürich
Abstract
Describing the relationship between the variables in a study domain and modelling the data generating mechanism is a fundamental problem in many empirical sciences. Probabilistic graphical models are one common approach to tackle the problem. Learning the graphical structure is computationally challenging and a fervent area of current research with a plethora of algorithms being developed. To facilitate the benchmarking of different methods, we present a novel automated workflow, called benchpress, for producing scalable, reproducible, and platform-independent benchmarks of structure learning algorithms for probabilistic graphical models. Benchpress is interfaced via a simple JSON-file, which makes it accessible for all users, while the code is designed in a fully modular fashion to enable researchers to contribute additional methodologies. Benchpress currently provides an interface to a large number of state-of-the-art algorithms from libraries such as BDgraph, BiDAG, bnlearn, GOBNILP, pcalg, r.blip, scikit-learn, TETRAD, and trilearn, as well as a variety of methods for data generating models and performance evaluation. Alongside user-defined models and randomly generated datasets, the software tool also includes a number of standard datasets and graphical models from the literature, which may be included in a benchmarking workflow. We demonstrate the applicability of this workflow for learning Bayesian networks in four typical data scenarios. The source code and documentation is publicly available from http://github.com/felixleopoldo/benchpress.
Keywords: reproducibility, scalable benchmarking, probabilistic graphical models.
1. Introduction
Probabilistic graphical models play a central role in modern statistical data analysis. Their compact and elegant way to visualise and represent complex dependence structures in multivariate probability distributions has proven successfully applicable in many scientific domains, ranging from disciplines such as social sciences and image analysis to biology, medical diagnosis and epidemiology (see e.g. Elwert 2013; Friedman, Linial, Nachman, and Pe'er 2000; Friedman 2004; Moffa, Catone, Kuipers, Kuipers, Freeman, Marwaha, Lennox, Broome, and Bebbington 2017; Kuipers, Thurnherr, Moffa, Suter, Behr, Goosen, Christofori, and Beerenwinkel 2018b; Kuipers, Moffa, Kuipers, Freeman, and Bebbington 2019).
One of the main advantages of graphical models is that they provide a tool for experts and researchers from non-statistical fields to easily specify their assumptions in a specific problem
domain, in a way that is very transparent. However, in many realistic situations the number of variables is either too large to build networks by hand, as is for example the case in genomic applications, or simply no prior knowledge is available about the relationship between different variables. As a consequence, there has been a growing interest in automated strategies to infer the graph component of a probabilistic graphical model from data, so-called structure learning. For recent reviews of the wide flora of algorithms that have emerged in the last two decades, the reader is referred to Koski and Noble (2012); Constantinou, Liu, Chobtham, Guo, and Kitson (2021); Scutari, Graafland, and Gutiérrez (2019a).
Since the scientific principle of reproducibility is being recognised as instrumental to achieve high quality standards in research, there is an increasing demand to present new algorithms with publicly available source code and datasets (see e.g. Munafò, Nosek, Bishop, Button, Chambers, Percie du Sert, Simonsohn, Wagenmakers, Ware, and Ioannidis 2017; Lamprecht, Garcia, Kuzak, Martinez, Arcila, Martin Del Pico, Dominguez Del Angel, van de Sandt, Ison, Martinez, McQuilton, Valencia, Harrow, Psomopoulos, Gelpi, Chue Hong, Goble, and Capella-Gutierrez 2020). Ideally, with publicly available software we should be able to easily compare different methods with respect to their performance in different settings. The reality is that practical implementations of new approaches display high levels of heterogeneity in many aspects, which renders comparative studies both challenging and time consuming. One of the main difficulties is that different algorithms may be implemented in different programming languages, with different dependencies. Different packages may also differ with respect to the representation of graphical models, the data formats, and the way they interface to the user, e.g. through command-line arguments, a configuration file or a function in a specific programming language. Another complication is that the number of computations increases rapidly in comparison studies, in particular when input parameters are altered, which requires parallel computation capabilities and careful bookkeeping to record and organise the results for reporting. As a result, researchers may invest considerable time on tailored benchmarking scripts that intend to address these issues for a subset of algorithms. To complicate matters beyond the implementational issues, there is no single well-established metric to evaluate the performance of structure learning algorithms. Quite the contrary, there is a wide range of different metrics to choose from (see e.g. Tsamardinos, Brown, and Aliferis 2006; Constantinou et al. 2021) and researchers tend to use only a small subset to evaluate a new algorithm, typically with the aim of highlighting the problem their algorithm addresses.
With the problems described above in mind, the objective of this work is to develop a unified framework to facilitate the execution and benchmarking of different algorithms. We present a novel workflow called benchpress, which is implemented in snakemake (Köster and Rahmann 2012) and designed for making reproducible, platform-independent, and fully scalable benchmarks of structure learning algorithms. Benchpress is interfaced via an easy-to-use configuration file in JSON format (Pezoa, Reutter, Suarez, Ugarte, and Vrgoč 2016), where a benchmark setup is specified in a module-based probabilistic programming style, separating model specification, in terms of graph and corresponding parameters, from algorithm execution and evaluation. Snakemake enables benchpress to seamlessly scale the computations to server, cluster, grid and cloud environments, without any changes in the configuration file. The support for containerised software through e.g. Singularity (Kurtzer, Sochat, and Bauer 2017) or Docker (Merkel 2014) images, together with platform-independent representations of data sets and graphs, enables benchpress to compare algorithms from different libraries, possibly implemented in different programming languages with different dependencies.
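To make the configuration-driven style concrete, the sketch below assembles a minimal benchmark specification as a Python dictionary and serialises it to JSON. The section and field names (`data`, `algorithms`, `evaluation`, and their contents) are hypothetical placeholders chosen for illustration; they are not the actual benchpress schema, which is defined in the project documentation.

```python
import json

# Hypothetical benchmark specification: every key below is an illustrative
# placeholder, NOT the real benchpress JSON schema.
benchmark = {
    "data": {"model": "random_dag", "n_nodes": 10, "n_samples": 500},
    "algorithms": [
        {"id": "pc", "library": "pcalg", "alpha": 0.05},
        {"id": "hc", "library": "bnlearn", "score": "bic"},
    ],
    "evaluation": ["shd", "roc"],
}

# Serialise to the JSON text a workflow engine would consume.
config_json = json.dumps(benchmark, indent=2)
print(config_json)
```

The point of such a format is that adding a competing algorithm to the comparison is a one-line change to the configuration rather than a change to any analysis script.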
Benchpress is implemented in a modular coding style, which makes it easy for researchers to contribute additional modules for model generation, structure learning algorithms, and result evaluation. In the first publicly released version we already included 19 algorithms from some of the most popular libraries, such as BDgraph (Mohammadi and Wit 2019), BiDAG (Suter, Kuipers, Moffa, and Beerenwinkel 2021), bnlearn (Scutari 2010), GOBNILP (Cussens 2020), pcalg (Kalisch, Mächler, Colombo, Maathuis, and Bühlmann 2012), r.blip (Scanagatta, de Campos, Corani, and Zaffalon 2015), scikit-learn (Pedregosa, Varoquaux, Gramfort, Michel, Thirion, Grisel, Blondel, Prettenhofer, Weiss, Dubourg et al. 2011), TETRAD (Glymour and Scheines 1986), and trilearn (Olsson, Pavlenko, and Rios 2019), along with some of the models and data sets from the standard literature.
Recently, Constantinou, Liu, Chobtham, Guo, and Kitson (2020) developed Bayesys, a software with similar intentions, albeit exclusively designed for Bayesian networks, a specific type of probabilistic graphical model (Pearl 1997). Furthermore, Bayesys is a Java implementation including only one algorithm for structure learning, called SaiyanH (Constantinou 2020), suited for high dimensional settings. Another tool called causal-compare, with a similar purpose as benchpress, was recently released as part of the TETRAD project (Ramsey, Malinsky, and Bui 2020). However, the functionality of causal-compare is restricted to the algorithm implementations in the TETRAD project, which represent only a subset of available algorithms (Ramsey et al. 2020). This makes its use more limited compared to benchpress, which only requires a Linux system, while there is no limitation on the individual algorithms' requirements.
Benchpress is distributed under the GPL v2.0 License, encouraging its use in both academic and commercial settings. Source code and documentation are available from https://github.com/felixleopoldo/benchpress.
The rest of the paper is structured as follows. Section 2 presents an introduction to graphical models along with some notational conventions. Section 3 provides background on current strategies for structure learning. Section 4 describes the structure of the JSON interface and the modules already available for benchmarking. In Section 5, we show how to use benchpress in four typical benchmarking scenarios. Installation and usage guidelines are provided in Section 6.
2. Probabilistic graphical models
This section provides a short introduction to graphical models and graph theory, with definitions as in e.g. Diestel (2005); Lauritzen (1996).
Let P be a p-dimensional distribution and let Y := (Y_i)_{i=1}^p be a random vector such that Y ∼ P. Further let G = (V, E) be a graph, where V = {Y_i}_{i=1}^p is the node set and E is the edge set, consisting of tuples of distinct elements of V. Then P, or alternatively Y, is said to be Markov with respect to G if separation, based on some graphical criteria, of two subsets of nodes, Y_A and Y_B, by a third set Y_C in G implies their conditional independence in P, where Y_A := (Y_i)_{i∈A}. The term probabilistic graphical model or graphical model has been used interchangeably in the literature to refer to either the tuple (G, P) or a collection of distributions P_G, usually restricted to some specific parametric family, where every P′ ∈ P_G is Markov with respect to G. In this text we will use the former meaning. In the case where P is restricted to a specific parametric family, it is usually uniquely determined by some parameter vector Θ, thus either notation may be used.
Assuming f is a density for P, conditional independence between the random variables Y_A and Y_B given Y_C is well-defined as f(y_A | y_B, y_C) = f(y_A | y_C), where (y_A, y_B, y_C) ∈ 𝒴_A × 𝒴_B × 𝒴_C and 𝒴_A denotes the range of Y_A. In contrast, node separation may have different definitions depending on the type of the graph considered. Typical classes of graphs are: undirected graphs, where edges are specified as unordered tuples and separation of Y_A and Y_B given Y_C means that every path between Y_A and Y_B must pass Y_C; and directed acyclic graphs (DAGs), where edges are represented as ordered tuples and separation is defined by means of a concept known as d-separation, see e.g. Pearl (1997). Next follows a short introduction to the types of graphical models currently implemented in benchpress.
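The undirected separation criterion above can be checked mechanically: delete the separating set Y_C from the graph and test whether any path from Y_A to Y_B survives. A minimal sketch (the graph representation and function name are my own, not part of benchpress):

```python
from collections import deque

def is_separated(edges, a, b, c):
    """Undirected separation: are node sets `a` and `b` disconnected
    once the separating set `c` is removed from the graph?"""
    blocked = set(c)
    # Build adjacency lists, dropping every edge that touches the separator.
    adj = {}
    for u, v in edges:
        if u in blocked or v in blocked:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Breadth-first search from `a`; separation fails if we ever reach `b`.
    seen = set(a) - blocked
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if u in b:
            return False
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return True

# Chain graph 1 - 2 - 3: node 2 separates 1 from 3, the empty set does not.
edges = [(1, 2), (2, 3)]
print(is_separated(edges, {1}, {3}, {2}))    # True
print(is_separated(edges, {1}, {3}, set()))  # False
```

For a Markov distribution, a `True` answer here would license the corresponding conditional independence statement Y_1 ⊥⊥ Y_3 | Y_2.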
Bayesian networks constitute one of the most popular classes of graphical models, referring to distributions that are Markov with respect to a DAG. The conditional independence encoded by the DAG implies that the density can be factorised into local conditional densities, where each node Y_i is directly dependent on its parents, pa(Y_i), as

f(y) = ∏_{i=1}^p f(y_i | pa(y_i)),

where f(y_i | pa(y_i)) denotes the conditional density of Y_i given pa(Y_i) = pa(y_i). This property is appealing for several reasons. Firstly, the partition into local conditional probability distributions provides an intuitive way to express and visualise knowledge about a specific problem domain. In particular, an expert could express a causal mechanism in terms of a DAG, which in turn provides a mathematical language to answer causal queries from non-experimental data (Pearl 1995). Further, it enables fast computation of conditional and marginal distributions using message-passing algorithms by exploiting the factorisation property of the distribution. However, when the DAG is not used for expressing causal knowledge, the direction of the edges may be misleading, since the conditional independencies of a distribution are in general not uniquely encoded by a single DAG but rather by a class of Markov equivalent DAGs, all of which encode the same conditional independence statements. One way to represent a Markov equivalence class is by a completed partially directed acyclic graph (CPDAG, see Chickering 1995). A CPDAG is a partially directed acyclic graph which can be obtained from a DAG by regarding an edge (Y_i, Y_j) as undirected if and only if both directed edges (Y_i, Y_j) and (Y_j, Y_i) are present in some of the DAGs in the same equivalence class. The CPDAG is also sometimes called the essential graph.
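The DAG factorisation above can be made concrete with a toy discrete network. The sketch below uses the chain Y1 → Y2 → Y3 with invented conditional probability tables (all numbers are illustrative, not from any benchpress model) and evaluates the joint distribution as the product of the local conditionals:

```python
import itertools

# Toy DAG Y1 -> Y2 -> Y3 over binary variables; the conditional
# probability tables are invented for illustration.
p_y1 = {0: 0.6, 1: 0.4}
p_y2_given_y1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_y3_given_y2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

def joint(y1, y2, y3):
    """Joint probability via the factorisation f(y) = prod_i f(y_i | pa(y_i))."""
    return p_y1[y1] * p_y2_given_y1[y1][y2] * p_y3_given_y2[y2][y3]

# A valid factorised joint must sum to one over all 2^3 configurations.
total = sum(joint(*cfg) for cfg in itertools.product([0, 1], repeat=3))
print(round(total, 10))  # 1.0
```

Because each factor only involves a node and its parents, the same structure is what message-passing algorithms exploit to compute marginals without enumerating all configurations, which the brute-force sum above does only because the example is tiny.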
Markov networks or Markov random fields are another popular class of graphical models, referring to distributions that are Markov with respect to an undirected graph. There is in general no decomposition into local densities for the models in this class. However, there exist functions {φ_C}_{C∈𝒞} such that

f(y) = ∏_{C∈𝒞} φ_C(y_C),

where 𝒞 is the set of maximal subgraphs in which all nodes are directly connected with each other, usually called cliques, which may be leveraged to calculate marginal distributions efficiently. There is a sub-class of undirected graphs called decomposable (DG) or chordal or triangulated, where each cycle with more than three nodes has a chord, an edge which breaks it. Distributions which are Markov with respect to decomposable graphs are called decomposable, and their densities are built from local, clique-specific densities, {f_C}_{C∈𝒞}, and their marginals, so that the joint density factorises as

f(y) = ∏_{C∈𝒞} f_C(y_C) / ∏_{S∈𝒮} f_S(y_S),

where 𝒮 is the multi-set of separators in G, found e.g. as the intersection of neighbouring cliques in a so-called junction tree, built from the cliques in the graph; see e.g. Lauritzen (1996) for details. In general Markov networks and Bayesian networks encode different sets of conditional independence statements, and there are sets which we can represent with one model but not the other. Decomposable graphs represent the intersection of those sets of statements which we can represent by both undirected graphs and DAGs (see e.g. Lauritzen 1996; Cowell, Dawid, Lauritzen, and Spiegelhalter 2003; Koller and Friedman 2009). The DAGs representing the intersection set are called perfect, meaning that all the parents of each node are connected by an edge. Further characterisations of decomposable graphs are found in e.g. Lauritzen (1996); Cowell et al. (2003) and Duarte and Solus (2021).
3. Structure learning
Structure learning for Bayesian networks has been proved to be NP-hard, even when the number of parents of each node is restricted to two (Chickering, Heckerman, and Meek 2004). Also the number of undirected graphs grows as O(2^{p^2}), even when restricted to decomposable graphs, making an exhaustive search for a specific graph infeasible (see e.g. Wormald 1985; Olsson, Pavlenko, and Rios 2018). Therefore, in practice we must rely on approximation methods, which we can divide into three main categories: constraint-based, score-based, and hybrid algorithms, discussed briefly below.
The seed of the first constraint-based algorithm was planted by Verma and Pearl (1991) in the context of causal Bayesian networks. In line with their work, constraint-based methods employ conditional independence tests among the variables to first estimate an undirected graph by, e.g., starting with the complete graph and excluding an edge (Y_i, Y_j) if there exists some set Y_A such that the hypothesis Y_i ⊥⊥_P Y_j | Y_A cannot be rejected according to a suitable test procedure. Such methods tend to be very fast, but testing can suffer from large numbers of false negatives, i.e. falsely not including edges when they should be present. Relaxing the significance level of the tests to reduce false negatives can however rapidly inflate false positives and runtimes.
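The edge-elimination scheme just described can be sketched as follows. The independence test is a stand-in oracle (a function one would replace with, e.g., a partial-correlation or G² test on data), and the code mirrors only the skeleton phase of PC-style algorithms, not their edge-orientation rules:

```python
import itertools

def learn_skeleton(nodes, indep, max_cond=2):
    """PC-style skeleton search: start from the complete graph and drop the
    edge (i, j) if some conditioning set of size <= max_cond renders i and j
    conditionally independent according to the `indep` oracle."""
    edges = set(itertools.combinations(nodes, 2))
    for i, j in sorted(edges):
        others = [n for n in nodes if n not in (i, j)]
        for size in range(max_cond + 1):
            if any(indep(i, j, set(cond))
                   for cond in itertools.combinations(others, size)):
                edges.discard((i, j))
                break
    return edges

# Stand-in oracle encoding the chain 1 -> 2 -> 3: the non-adjacent pair
# (1, 3) is independent given {2}; every other pair stays dependent.
def chain_oracle(i, j, cond):
    return {i, j} == {1, 3} and 2 in cond

print(sorted(learn_skeleton([1, 2, 3], chain_oracle)))  # [(1, 2), (2, 3)]
```

With a real statistical test in place of the oracle, the false-negative and false-positive trade-off discussed above enters through the test's significance level, and `max_cond` bounds the runtime blow-up from testing large conditioning sets.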
Score-based methods on the other hand aim to optimise a global score function defined on the graph space. As a workaround to cope with the immense number of graphs, the search is often limited to certain types of graphs. For example, in DAG models one may restrict the number of parents or combine DAGs into bigger and more easily represented classes (e.g. orders and partitions, Friedman and Koller 2003; Kuipers and Moffa 2017). For decomposable graphs a possibility is to limit the maximal clique size. Score functions may rely on penalised log-likelihoods or on Bayesian posterior probabilities of graphical structures conditional on the observed data (see e.g. Koller and Friedman 2009). Besides mere optimisation, Bayesian structure learning also focuses on sampling procedures to provide a finer characterisation of the posterior graph distribution. When an expression for the posterior distribution is available at least up to a normalising constant, we can implement Markov chain Monte Carlo (MCMC) schemes to sample from it, see e.g. Madigan, York, and Allard (1995); Giudici and Castelo