# PILOT: Introducing Transformers for Probabilistic Sound Event Localization
This repository contains the codebase accompanying our publication:
> Christopher Schymura, Benedikt Bönninghoff, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa, "PILOT: Introducing Transformers for Probabilistic Sound Event Localization", *INTERSPEECH 2021*
[ [arXiv](https://arxiv.org/abs/2106.03903) ]
## 📓 Summary
Sound event localization aims at estimating the positions of sound sources in the environment with respect to an acoustic receiver (e.g., a microphone array). Recent advances in this domain have relied most prominently on deep recurrent neural networks. Inspired by the success of transformer architectures as an alternative to classical recurrent neural networks, the PILOT (*P*robab*i*listic *L*ocalization of S*o*unds with *T*ransformers) model is a transformer-based sound event localization framework in which temporal dependencies in the received multi-channel audio signals are captured via self-attention mechanisms. Additionally, the estimated sound event positions are represented as multivariate Gaussian variables, which yields a notion of uncertainty that many previously proposed deep learning-based systems for this application do not provide. The general architecture of PILOT is shown in the figure below.
<div align="center">
<img src="./images/overview.png" width="800" title="PILOT architecture">
<p>Overview of the general PILOT architecture.</p>
</div>
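To illustrate what representing a position as a multivariate Gaussian variable buys you, here is a minimal sketch of a Gaussian negative log-likelihood for a single two-dimensional position estimate. It is not taken from `utils/losses.py`; the function name `gaussian_nll_2d`, the restriction to two dimensions (for instance azimuth and elevation), and the plain-Python formulation are assumptions made for illustration. Angle wrap-around is ignored.

```python
import math


def gaussian_nll_2d(target, mean, cov):
    """Negative log-likelihood of a 2-D point under N(mean, cov).

    Hypothetical helper for illustration only; `cov` is a 2x2 covariance
    given as nested tuples ((a, b), (c, d)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0.0 or det <= 0.0:
        raise ValueError("covariance must be positive definite")

    dx = target[0] - mean[0]
    dy = target[1] - mean[1]

    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

    # 0.5 * (k * log(2*pi) + log|cov| + maha) with k = 2.
    return math.log(2.0 * math.pi) + 0.5 * (math.log(det) + maha)
```

With this loss, a large localization error is penalized less when the predicted covariance is wide than when it is narrow, while a wide covariance is itself penalized through the log-determinant term. This trade-off is what lets the covariance act as an uncertainty estimate.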
## 🚀 Getting started
You can train and evaluate the PILOT model using the [ANSIM](https://doi.org/10.5281/zenodo.1237703), [RESIM](https://doi.org/10.5281/zenodo.1237707) and [REAL](https://doi.org/10.5281/zenodo.1237793) sound event localization and detection datasets. We have prepared a script that downloads the respective datasets and stores them in a suitable folder structure. Simply run
`$ ./download_data.sh dataset-name`
where `dataset-name` specifies the desired dataset (either `ansim`, `resim` or `real`).
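If you would rather fetch several datasets from Python, the following sketch wraps the same script call. The helper names `build_download_command` and `download_all` are hypothetical and not part of this repository; the sketch assumes it is run from the repository root, where `download_data.sh` is located and executable.

```python
import subprocess

# Dataset names accepted by download_data.sh, as listed above.
VALID_DATASETS = ("ansim", "resim", "real")


def build_download_command(dataset_name, script="./download_data.sh"):
    """Return the argument list for one download call (hypothetical helper)."""
    if dataset_name not in VALID_DATASETS:
        raise ValueError(
            f"unknown dataset {dataset_name!r}; expected one of {VALID_DATASETS}"
        )
    return [script, dataset_name]


def download_all(datasets=VALID_DATASETS):
    """Run the download script once per dataset, stopping on the first failure."""
    for name in datasets:
        subprocess.run(build_download_command(name), check=True)
```

Calling `download_all(("ansim", "real"))` would then run the script twice, once per dataset name.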