Swift sky localization of gravitational waves
using deep learning seeded importance sampling

Alex Kolmus,^{1,*} Grégory Baltus,^2 Justin Janquart,^{3,4} Twan van Laarhoven,^1
Sarah Caudill,^{3,4} and Tom Heskes^1

^1 Institute for Computing and Information Sciences (ICIS),
Radboud University Nijmegen, Toernooiveld 212, 6525 EC Nijmegen, The Netherlands
^2 STAR Institut, Bâtiment B5, Université de Liège, Sart Tilman B4000 Liège, Belgium
^3 Nikhef, Science Park 105, 1098 XG Amsterdam, The Netherlands
^4 Institute for Gravitational and Subatomic Physics (GRASP),
Utrecht University, Princetonplein 1, 3584 CC Utrecht, The Netherlands

(Dated: November 2, 2021)
arXiv:2111.00833v1 [gr-qc] 1 Nov 2021

* alex.kolmus@ru.nl
Fast, highly accurate, and reliable inference of the sky origin of gravitational waves would enable
real-time multi-messenger astronomy. Current Bayesian inference methodologies, although highly
accurate and reliable, are slow. Deep learning models have shown themselves to be accurate and
extremely fast for inference tasks on gravitational waves, but their output is inherently questionable
due to the blackbox nature of neural networks. In this work, we join Bayesian inference and deep
learning by applying importance sampling on an approximate posterior generated by a multi-headed
convolutional neural network. The neural network parametrizes Von Mises-Fisher and Gaussian
distributions for the sky coordinates and two masses for given simulated gravitational wave injections
in the LIGO and Virgo detectors. Within a few minutes, we generate skymaps for unseen
gravitational-wave events that closely resemble those produced by Bayesian inference. Furthermore,
we can detect poor predictions from the neural network and quickly flag them.
I. INTRODUCTION
Gravitational waves (GWs) have immensely advanced
our understanding of physics and astronomy since 2015
[1–4]. These GWs are observed by the Hanford (H)
and Livingston (L) interferometers of the Laser Inter-
ferometer Gravitational Wave Observatory (LIGO) [5]
and the Advanced Virgo (V) interferometer [6]. The
collaboration between these three detectors has enabled
triple-detector observations of GWs [2], making it pos-
sible to do proper sky localisation of their astrophysical
sources. This additional detector changes the sky distri-
bution from a broad band to a more narrow distribution
[2].
Better early sky localisation capabilities would allow
for real-time multi-messenger astronomy (MMA), observ-
ing astrophysical events through multiple channels - elec-
tromagnetic transients, cosmic rays, neutrinos - only sec-
onds after the GW is detected. MMA is limited to GWs
originating from binary neutron star (BNS) and neutron
star-black hole mergers. According to current literature,
it is unlikely that binary black holes (BBHs) emit an elec-
tromagnetic counterpart during their merger [7, 8]. Cur-
rently, astrophysicists try to collect the non-GW chan-
nels in the weeks after the event. A notable example
is GW170817 [9, 10]. This process takes an enormous
amount of effort, while the obtained data quality is of-
ten sub-optimal. Having all channels observed for the
full duration of the event would be a major leap forward.
Real-time MMA would enable a plethora of new science,
e.g. unravelling the nucleosynthesis of heavy elements
using r- and s-processes, more accurate and novel tests
of general relativity, and a deeper understanding of the
cosmological evolution [11–13]. As mentioned above, real-
time MMA relies on the generation of a skymap, which
imposes two requirements on the methodology used to obtain
one. First, it needs to be swift in order to allow observa-
tories to turn towards an event’s origin, preferably only
seconds after its observation. Second, the skymap needs
to be as accurate as possible since telescopes have a lim-
ited area they can observe. Below we present current
approaches in generating skymaps for GW events.
Most GW software libraries [14, 15] use Bayesian infer-
ence methods - in particular Markov chain Monte Carlo
(MCMC) and nested sampling [16] - to construct the pos-
terior over all GW parameters. These methods asymp-
totically approach the true distribution given a sufficient
number of samples [17]. Although theoretically optimal,
a chain with around 10^6 to 10^8 samples is required [14]
to closely approximate the true posterior distribution for
a GW event. Even when using Bilby [18], a modern
Bayesian inference library made for GW astronomy,
inference for a single BBH event takes hours [19];
BNS events take even longer. Bayesian
inference is the most accurate method available for GW
posterior estimation, but its run-time is prohibitively
long when it comes to MMA.
To overcome the speed limitations of the Bayesian ap-
proaches, Singer and Price developed BAYESTAR in
2016 [20], an algorithm that can output a robust skymap
for a GW event within a minute. BAYESTAR realizes
this speedup in two ways. First, it exploits the infor-
mation provided by the matched filtering pipeline used
in the detection of GWs. The inner product between
time strain and matched filters contains nearly all of
the information regarding arrival times, amplitudes and
phases, which are critical for skymap estimation. Sec-
ond, Singer and Price derive a likelihood function that
is semi-independent from the mass estimation and does
not rely on direct computation of GW waveforms, allow-
ing for massive speedups and parallelization. Although
BAYESTAR is fast, its predictions tend to be broader
and less precise than those made by Bilby [21].
Deep learning (DL) algorithms have shown themselves
to be exceptionally quick and powerful when handling
high-dimensional data [22, 23]. Therefore, they are an
interesting alternative to the Bayesian methods. Several
papers have proposed methods to estimate the GW pos-
terior, including the skymap, using DL algorithms. Ex-
amples of such algorithms are Delaunoy et al. [24] and
Green and Gair [25]. Delaunoy et al. [24] use a convolu-
tional neural network (CNN) to model the likelihood-
to-evidence ratio when given a strain-parameter pair.
By evaluating a large number of parameter options in
parallel, they can generate confidence intervals within a
minute. The reported confidence intervals are slightly
wider than those made by Bilby. A completely different
approach was taken by Green and Gair [25]. They show-
case complete 15-parameter inference for GW150914 us-
ing normalizing flows. They apply a sequence of invert-
ible functions to transform an elementary distribution
into a complex distribution [26] which, in this case, is
a BBH posterior. Within a single second, their method
is able to generate 5,000 independent posterior samples
that are in agreement with the reference posterior [27].
A Kolmogorov-Smirnov test confirms that these samples
closely resemble samples drawn from the exact
posterior. Both DL methods are fast
and seem to be accurate for the 100 - 1000 simulated
GW events they have been evaluated on. However, these
methods have a few issues: (1) they are both suscepti-
ble to changes in the power spectral density (PSD) and
signal-to-noise ratio (SNR), (2) both are close in perfor-
mance to Bilby but do not match it, (3) they can act unpre-
dictably outside of the trained strain-parameter pairs
and, even within this space, they can act unpredictably
due to the blackbox nature of neural networks (NNs).
Issues (1) and (2) have been addressed for the normaliz-
ing flow algorithm in a recent paper by Dax et al. [28];
however, the robustness guarantees remain behind those
of traditional Bayesian inference.
Our method tries to bridge the gap between Bayesian
inference and DL methods, allowing for fast inference
while still guaranteeing optimal accuracy. It is to be
noted that combining Bayesian inference and DL meth-
ods has recently gained traction in the GW community,
see for example reference [29]. The goal of our algorithm
is to restrict the parameter space such that, via sam-
pling, one can quickly obtain an accurate skymap. We
use a multi-headed CNN to parameterize an independent
sky and mass distribution for a given BBH event. The
model is trained on simulated precessing quasi-circular
BBH signals resembling the ones observed by the HLV
detectors. The parameterized sky and mass distribu-
tions are Gaussian-like and are assumed to approximate
the sky and mass distributions generated by Bayesian
inference. Using the parameterized sky and mass dis-
tributions, we construct a proposal posterior in which
all other BBH parameters are uniformly distributed. By
using importance sampling we can then sample from the
exact reference posterior. This implies that we effectively
match the performance of Bayesian inference in a short
time span, without exploring the entire parameter space.
We stress that this work is a proof of concept to show
the promises of combining NNs and Bayesian inference.
More flexible DL models and BNS events will be consid-
ered in future studies.
This paper is organised as follows. Section 2 discusses
the model architecture and importance sampling scheme.
Section 3 details the performed experiments, including
the model training. Section 4 covers the results of these
experiments and subsequently assesses the performance
of the model and importance sampling scheme by com-
paring it with skymaps generated using Bilby for a non-
spinning BBH system. Conclusions and future endeav-
ours are specified in Section 5.
II. METHODOLOGY
Our inference setup is a two-step method. In the ini-
tial step we infer simple distributions for the sky local-
ization and the masses of the BBH by using a neural
network. Subsequently, we apply importance sampling
to these simple distributions to compute a more accu-
rate posterior. The first subsection describes the role
and implementation of importance sampling. The sec-
ond subsection discusses the neural network setup and
our method for distribution estimation.
A. Importance sampling
High-dimensional distributions in which the majority
of the probability density is confined to a small volume of
the entire space are hard to sample from, which results
in long run times to get proper estimates when using
MCMC methods. A well-known method to cope with this
problem is importance sampling. By using a proposal dis-
tribution q that covers this high probability density re-
gion of the complex distribution p one can quickly obtain
useful samples. There are two requirements when using
importance sampling. First, the desired distribution p
needs to be known up to the normalization constant Z:
p(λ) = θ(λ)/Z. Second, the proposal distribution q needs
to be non-zero for all λ where p is non-zero. Importance
sampling can be understood as compensating for the dif-
ference between the distributions p and q by assigning an
importance weight w(λ) to each sample λ,

    w(λ) = θ(λ) / q(λ),                                    (1)
where the fraction is the likelihood ratio between the
unnormalized p and q. The distribution created by
the reweighted samples will converge to the p distribution
given enough samples [30].
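The reweighting scheme above can be sketched in a few lines of numpy. The toy target and proposal below are our own placeholders chosen for illustration, not the GW posterior:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target density theta(x), proportional to p(x):
# a narrow Gaussian centered at 2 (toy stand-in for the posterior).
def log_theta(x):
    return -0.5 * ((x - 2.0) / 0.1) ** 2

# Broad Gaussian proposal q covering the target's high-density region.
mu_q, sigma_q = 2.0, 1.0
samples = rng.normal(mu_q, sigma_q, size=50_000)
log_q = -0.5 * ((samples - mu_q) / sigma_q) ** 2 - np.log(sigma_q * np.sqrt(2 * np.pi))

# Importance weights w = theta / q (Eq. 1), normalized afterwards.
log_w = log_theta(samples) - log_q
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Reweighted samples approximate expectations under p.
mean_p = np.sum(w * samples)   # close to 2.0, the target mean
print(mean_p)
```

Note that only a fraction of the proposal samples carry appreciable weight; the narrower the target relative to the proposal, the fewer effective samples remain.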
Generating accurate posteriors for GW observations
using MCMC is very time consuming, and thus impor-
tance sampling is an interesting alternative. Importance
sampling requires us to have a viable proposal distribu-
tion. Published posteriors for known gravitational waves
show that the probability density in the posterior is rel-
atively well confined for both the sky location and the
two masses [31]. A Von Mises-Fisher (VMF) and a multi-
variate Gaussian (MVG) distribution are good first-order
approximations of the sky and mass distributions respec-
tively, and thus suitable to use as a proposal distribution
for importance sampling. We propose to construct this
proposal distribution by assuming a uniform distribution
over all non-spinning BBH parameters, except for the
sky angles, which are represented by a VMF distribution,
and the masses, which are represented by an MVG distri-
bution. Assuming that the sky angles, masses, and the
remaining BBH parameters are independent, our
proposal distribution becomes the product of these two
distributions. In the next subsection we discuss how we
create this proposal distribution using a neural network.
Importance sampling demands a likelihood function for
the proposal distribution and the desired distribution. In
the previous paragraph we discussed how to create the
proposal distribution; we now focus on the desired
distribution p. For the likelihood function of
the GW posterior p(s|λ) we take the definition given by
Canizares et al. [32]:
    p(s|λ) ∝ θ(s|λ) = exp( −⟨s − h(λ) | s − h(λ)⟩ / 2 ),        (2)
where s is the observed strain and h(λ) is the GW template
defined by the parameters λ. The inner product is weighted
by the PSD of the detector’s noise. In practice we use
the likelihood implementation provided by Bilby named
GravitationalWaveTransient.
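For illustration, a schematic numpy version of the noise-weighted inner product and the log-likelihood of Eq. (2) might look as follows. The frequency grid, flat PSD, and "template" are placeholder toys of our own, not Bilby's actual implementation:

```python
import numpy as np

def inner_product(a, b, psd, df):
    """Noise-weighted inner product <a|b> = 4 df Re sum conj(a) b / S_n(f)."""
    return 4.0 * df * np.real(np.sum(np.conj(a) * b / psd))

def log_likelihood(s, h, psd, df):
    """log theta(s|lambda) = -<s - h | s - h> / 2  (Eq. 2, up to a constant)."""
    r = s - h
    return -0.5 * inner_product(r, r, psd, df)

# Toy frequency-domain setup (placeholder values).
df = 1.0 / 8.0                               # resolution for an 8 s segment
f = np.arange(20.0, 512.0, df)               # analysis band in Hz
psd = np.full_like(f, 1e-46)                 # flat one-sided PSD (toy)
h = 1e-23 * np.exp(-2j * np.pi * f * 0.01)   # stand-in "template"
s = h.copy()                                 # noise-free strain = template

# A perfectly matching template gives zero residual, maximizing Eq. (2).
print(log_likelihood(s, h, psd, df))
```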
We now have all the parts needed to discuss how we uti-
lize importance sampling for a given strain s. A trained
neural network parameterizes the proposal distribution
q for the given strain. We draw n samples from the
proposal distribution; these represent possible GW
parameter configurations. For each sample we calculate
the logarithm of the importance weight,
log w(λ) = log θ(s|λ) − log q(λ) + C, (3)
instead of the importance weight w(λ) itself to prevent
numeric under- and overflow. The constant C is added to
set the highest log w(λ) to zero, preventing very large
negative values from underflowing to zero when we
exponentiate. Since we normalize the weights
afterwards the correct importance weights are still ob-
tained. The reweighted samples represent the desired
distribution p.
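The shift by C followed by normalization amounts to the standard log-sum-exp stabilization; a minimal sketch (the function name is ours):

```python
import numpy as np

def normalize_log_weights(log_w):
    """Turn log-importance-weights into normalized weights without underflow.

    Subtracting the maximum (the constant C in Eq. 3) puts the largest
    exponent at zero; the overall shift cancels in the normalization.
    """
    shifted = log_w - np.max(log_w)
    w = np.exp(shifted)
    return w / w.sum()

# Extremely negative log-weights would all underflow to zero without C.
log_w = np.array([-10_050.0, -10_001.0, -10_000.0])
w = normalize_log_weights(log_w)
print(w)   # the -10_050 entry is negligible; the other two dominate
```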
If the proposal distribution does not cover the true
distribution well enough, the importance samples will be
dominated by only one or a few weights if we restrict
the run-time. We can use this as a gauge to check if the
skymap produced by the neural network and importance
sampling is to be trusted.
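One common way to quantify such weight degeneracy is the Kish effective sample size; the paper does not specify its exact diagnostic, so the sketch below is only an illustrative assumption:

```python
import numpy as np

def effective_sample_size(w):
    """Kish effective sample size of normalized importance weights.

    ESS = 1 / sum(w_i^2); it collapses towards 1 when a handful of
    weights dominate, flagging a proposal that misses the posterior mass.
    """
    w = np.asarray(w)
    return 1.0 / np.sum(w ** 2)

uniform = np.full(1000, 1e-3)                      # well-matched proposal
degenerate = np.zeros(1000); degenerate[0] = 1.0   # one weight dominates

print(effective_sample_size(uniform))      # close to 1000
print(effective_sample_size(degenerate))   # 1.0
```

A low ESS relative to the number of drawn samples is exactly the "single to a few weights" regime described above, and can serve as an automatic trust flag for the produced skymap.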
B. Model
Previous work done by George et al. [33] shows that
convolutional neural networks (CNNs) are able to extract
the masses from a BBH event just as well as the matched-
filtering pipeline currently in use. Furthermore, work done by Fan
et al. [34] indicates that 1D CNNs are able to locate GW
origins. We therefore chose to use a 1D CNN to model
both the distribution across the sky for the origin of the
GWs and a multivariate normal distribution for the two
masses of the BBH system.
The network architecture of this 1D CNN is presented
in Figure 1 and consists of four parts: a convolutional
feature extractor and three neural network heads. These
heads are used to specify the two distributions. The fol-
lowing properties were tested or tuned for optimal per-
formance: number of convolutional layers, kernel size,
dilation, batch normalization, and dropout. The model
shown in Figure 1 produced the best result on a valida-
tion set.
The convolutional feature extractor generates a set of
features that characterize a given GW. This set of fea-
tures is passed on to the neural heads. Each head is
specialized to model a specific GW parameter. The first
head determines the sky distribution, the second head
the masses, and the third head the uncertainty over the
two masses. Below we will elaborate on each of these
heads and how they characterize these distributions.
The first head specifies the distribution of the GW ori-
gin. Since the sky is described by the surface of a 3D
sphere, a 2D Gaussian distribution is an ill fit. A suitable
alternative is the Von Mises-Fisher (VMF) distribution
[35] which is the equivalent of a Gaussian distribution on
the surface of a sphere. The probability density function
and the associated negative log-likelihood (NLL) of the
VMF distribution are

    p(x|µ, κ) = κ / (4π sinh(κ)) exp(κ x^T µ),                                (4)

    NLL_VMF(x, µ, κ) = −log(κ) + log(1 − exp(−2κ)) + κ + log(2π) − κ x^T µ,   (5)
where x and µ are normalized vectors in R^3, with
the former being the true direction and the latter being
the predicted direction. κ is the concentration parame-
ter, which determines the width of the distribution. It
plays the same role as the inverse of the variance for
a Gaussian distribution. We use this distribution by
letting the first head output a three-dimensional vector
D = (D_x, D_y, D_z). The norm of D specifies the con-
centration parameter κ, and its projection onto the unit
sphere gives the mean µ: κ = |D| and µ = D/|D|. These
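A minimal numpy sketch of this parametrization and of the NLL in Eq. (5); the helper names are ours, not from the authors' code, and log sinh(κ) is expanded as κ + log(1 − e^{−2κ}) − log 2 for numerical stability at large κ:

```python
import numpy as np

def vmf_params(D):
    """Map the head output D = (Dx, Dy, Dz) to (kappa, mu):
    kappa = |D| and mu = D / |D|."""
    D = np.asarray(D, dtype=float)
    kappa = np.linalg.norm(D)
    return kappa, D / kappa

def vmf_nll(x, mu, kappa):
    """Negative log-likelihood of the VMF distribution (Eq. 5)."""
    return (-np.log(kappa) + np.log1p(-np.exp(-2.0 * kappa))
            + kappa + np.log(2.0 * np.pi) - kappa * np.dot(x, mu))

# Example head output pointing along +z with moderate concentration.
kappa, mu = vmf_params([0.0, 0.0, 5.0])   # kappa = 5.0, mu = (0, 0, 1)
```

As a sanity check, the NLL is smallest when the true direction x coincides with the predicted mean µ and grows as x rotates away from it.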