# Dual-path-RNN-Pytorch
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation, implemented in PyTorch.
If you have any questions, please ask them by opening an issue.
If you find this project helpful, please give it a star.
Demo Pages: [Results of pure speech separation model](https://cslikai.cn/project/Pure-Audio/)
# Plan
- [x] 2020-02-01: Read the paper "[Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation](https://arxiv.org/abs/1910.06379)". Two companion write-ups (in Chinese) interpret the paper: a Zhihu article "[Reading notes: Dual-path RNN for Speech Separation](https://zhuanlan.zhih u.com/p/104606356)" and a blog post "[Reading notes: Dual-path RNN for speech separation](https://www.likai.show/archives/dual-path-rnn)". If you have any questions, feel free to discuss them with me.
- [x] 2020-02-02: Complete data preprocessing, data set code. Dataset Code: [/data_loader/Dataset.py](https://github.com/JusperLee/Dual-path-RNN-Pytorch/blob/master/data_loader/Dataset.py)
- [x] 2020-02-03: Complete Conv-TasNet Framework (Update **/model/model.py, Trainer_Tasnet.py, Train_Tasnet.py**)
- [x] 2020-02-07: Complete training code (update **/model/model_rnn.py**); test parameters and some details are still being adjusted.
- [x] 2020-02-08: Fixed bugs in the code.
- [x] 2020-02-11: Complete testing code.
# Dataset
We use the WSJ0 dataset for our training, validation, and test sets. Below are the WSJ0 download link and the scripts for creating the mixed audio.
- [Audio mix Sample](https://www.merl.com/demos/deep-clustering/media/female-female-mixture.wav)
- [WSJ0 Dataset](https://catalog.ldc.upenn.edu/LDC93S6A)
- [Create Dataset](https://www.merl.com/demos/deep-clustering/create-speaker-mixtures.zip)
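The "Create Dataset" scripts linked above are MATLAB code from MERL that mix pairs of WSJ0 utterances at a target SNR. As a rough illustration only (the function below is hypothetical, not part of those scripts or this repo), the core operation of mixing two signals at a given SNR looks like this:

```python
import math

def mix_at_snr(s1, s2, snr_db):
    """Mix two equal-length signals so s1 is snr_db louder than s2.

    Illustrative sketch only; the real WSJ0-2mix recipe is the MATLAB
    code linked above.
    """
    p1 = sum(x * x for x in s1) / len(s1)  # mean power of signal 1
    p2 = sum(x * x for x in s2) / len(s2)  # mean power of signal 2
    # gain so that the power ratio p1 / (g^2 * p2) matches the target SNR
    g = math.sqrt(p1 / (p2 * 10 ** (snr_db / 10)))
    s2_scaled = [g * x for x in s2]
    mixture = [a + b for a, b in zip(s1, s2_scaled)]
    return mixture, s2_scaled
```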
# Training
## Training for Conv-TasNet model
1. First, generate the scp files with the following command. Each line of an scp file contains a filename and its path.
```shell
python create_scp.py
```
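Conceptually, create_scp.py just pairs each wav filename with its full path; a minimal sketch of that idea (not the actual script, which lives in this repo) is:

```python
import os

def write_scp(wav_dir, scp_path):
    """Write one 'filename path' line per .wav file in wav_dir.

    Simplified sketch of what an scp-generation script does; see
    create_scp.py in this repo for the real version.
    """
    with open(scp_path, "w") as f:
        for name in sorted(os.listdir(wav_dir)):
            if name.endswith(".wav"):
                f.write(f"{name} {os.path.join(wav_dir, name)}\n")
```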
2. Then you can modify the training and model parameters in "[config/Conv_Tasnet/train.yml](https://github.com/JusperLee/Dual-Path-RNN-Pytorch/tree/master/config/Conv_Tasnet)".
```shell
cd config/Conv_Tasnet
vim train.yml
```
3. Then use the following command in the root directory to train the model.
```shell
python train_Tasnet.py --opt config/Conv_Tasnet/train.yml
```
## Training for Dual Path RNN model
1. First, generate the scp files with the following command. Each line of an scp file contains a filename and its path.
```shell
python create_scp.py
```
2. Then you can modify the training and model parameters in "[config/Dual_RNN/train.yml](https://github.com/JusperLee/Dual-Path-RNN-Pytorch/tree/master/config/Dual_RNN)".
```shell
cd config/Dual_RNN
vim train.yml
```
3. Then use the following command in the root directory to train the model.
```shell
python train_rnn.py --opt config/Dual_RNN/train.yml
```
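What makes training DPRNN tractable on long waveforms is the paper's segmentation stage: the encoded sequence is cut into short chunks with 50% overlap, so the intra-chunk and inter-chunk RNNs each only ever see short sequences. A simplified, list-based sketch of that chunking (the repo's real implementation in model_rnn.py operates on batched tensors):

```python
def segment(sequence, chunk_size):
    """Split a 1-D sequence into chunks of length chunk_size with 50%
    overlap, zero-padding the tail so every chunk is full.

    Simplified illustration of DPRNN's segmentation stage, not the
    repo's tensor implementation.
    """
    hop = chunk_size // 2
    seq = list(sequence)
    # pad with zeros until hops of size `hop` tile the sequence exactly
    while len(seq) < chunk_size or (len(seq) - chunk_size) % hop != 0:
        seq.append(0.0)
    return [seq[i:i + chunk_size]
            for i in range(0, len(seq) - chunk_size + 1, hop)]
```

The intra-chunk RNN then runs along each chunk and the inter-chunk RNN runs across chunks at the same position, so each RNN sees a sequence length on the order of the square root of the original length.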
# Inference
## Conv-TasNet
First modify the default parameters in test_tasnet.py, including the test file paths, the model checkpoint, etc.
### For multi-audio
```shell
python test_tasnet.py
```
### For single-audio
```shell
python test_tasnet_wav.py
```
## Dual-Path-RNN
First modify the default parameters in test_dualrnn.py, including the test file paths, the model checkpoint, etc.
### For multi-audio
```shell
python test_dualrnn.py
```
### For single-audio
```shell
python test_dualrnn_wav.py
```
# Pretrain Model
## Conv-TasNet
[Conv-TasNet model](https://drive.google.com/open?id=1MRe4jiwgtAFZErjz-LWuuyEG8VGSU0YS "Google Drive")
## Dual-Path-RNN
[Dual-Path-RNN model](https://drive.google.com/open?id=1TInJB-idggkKJ5YkNvnrTopum_HgX3_o "Google Drive")
# Result
## Conv-TasNet
![](https://github.com/JusperLee/Dual-Path-RNN-Pytorch/blob/master/log/Conv_Tasnet/loss.png)
Final result: **15.8690 dB**, 0.57 dB higher than the **15.3 dB** reported in the paper.
## Dual-Path-RNN
Final result: **18.98 dB**, 0.18 dB higher than the **18.8 dB** reported in the paper.
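The numbers above are SI-SNR improvements in dB, the metric used in both papers. A minimal pure-Python sketch of scale-invariant SNR (an illustrative re-derivation of the standard definition, not the repo's model/loss.py):

```python
import math

def si_snr(estimate, reference):
    """Scale-invariant SNR in dB: zero-mean both signals, project the
    estimate onto the reference, and treat the residual as noise."""
    est = [x - sum(estimate) / len(estimate) for x in estimate]
    ref = [x - sum(reference) / len(reference) for x in reference]
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    s_target = [dot / ref_energy * r for r in ref]    # projection onto reference
    e_noise = [e - s for e, s in zip(est, s_target)]  # residual counts as noise
    return 10 * math.log10(sum(s * s for s in s_target) /
                           sum(n * n for n in e_noise))
```

SI-SNRi, the reported number, is the improvement of this value for the separated output over the unprocessed mixture.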
# Reference
1. Luo Y, Chen Z, Yoshioka T. Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation[J]. arXiv preprint arXiv:1910.06379, 2019.
2. [Conv-TasNet code](https://github.com/JusperLee/Conv-TasNet "Conv-TasNet code") && [Dual-RNN code](https://github.com/yluo42/TAC/blob/master/utility/models.py "Dual-RNN code")