---
language: "en"
thumbnail:
tags:
- automatic-speech-recognition
- CTC
- Attention
- Transformers
- pytorch
- speechbrain
license: "apache-2.0"
datasets:
- aishell
metrics:
- wer
- cer
---
<iframe src="https://ghbtns.com/github-btn.html?user=speechbrain&repo=speechbrain&type=star&count=true&size=large&v=2" frameborder="0" scrolling="0" width="170" height="30" title="GitHub"></iframe>
<br/><br/>
# Transformer for AISHELL (Mandarin Chinese)
This repository provides all the necessary tools to perform automatic speech
recognition from an end-to-end system pretrained on AISHELL (Mandarin Chinese)
within SpeechBrain. For a better experience, we encourage you to learn more about
[SpeechBrain](https://speechbrain.github.io).
The model achieves the following performance:
| Release | Dev CER | Test CER | GPUs | Full Results |
|:-------------:|:--------------:|:--------------:|:--------:|:--------:|
| 05-03-21 | 5.60 | 6.04 | 2xV100 32GB | [Google Drive](https://drive.google.com/drive/folders/1zlTBib0XEwWeyhaXDXnkqtPsIBI18Uzs?usp=sharing)|
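
For readers unfamiliar with the metric in the table, the character error rate (CER) is the character-level edit distance between hypothesis and reference, divided by the number of reference characters. The snippet below is a minimal sketch of that definition, not the scoring code used by the recipe; the function names `edit_distance` and `cer` are hypothetical, and the transcripts are assumed to contain no spaces.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER in percent: total character errors / total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return 100.0 * errors / total


# One substituted character out of six reference characters.
score = cer(["今天天气很好"], ["今天天汽很好"])
print(f"{score:.2f}")
```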
## Pipeline description
This ASR system is composed of two distinct but linked blocks:
- A tokenizer (unigram) that transforms the transcriptions into subword units and is
trained on the training transcriptions of AISHELL.
- An acoustic model made of a transformer encoder and a joint decoder with CTC +
transformer. Hence, the decoding also incorporates the CTC probabilities.
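
Joint CTC + attention decoding is commonly implemented as a weighted sum of the two log-probabilities. The sketch below illustrates that idea at the level of whole hypotheses; it is a simplification, since a real beam search combines the scores step by step. The names `joint_score` and `rerank` and the weight of 0.4 are assumptions made for illustration and are not taken from the recipe.

```python
import math


def joint_score(att_logprob, ctc_logprob, ctc_weight=0.4):
    """Interpolate attention-decoder and CTC log-probabilities (ctc_weight is hypothetical)."""
    return (1.0 - ctc_weight) * att_logprob + ctc_weight * ctc_logprob


def rerank(hypotheses, ctc_weight=0.4):
    """Return the hypothesis with the highest joint score."""
    return max(hypotheses,
               key=lambda h: joint_score(h["att"], h["ctc"], ctc_weight))


# Toy hypotheses with made-up probabilities.
hyps = [
    {"text": "A", "att": math.log(0.6), "ctc": math.log(0.2)},
    {"text": "B", "att": math.log(0.5), "ctc": math.log(0.5)},
]
best_joint = rerank(hyps)                       # CTC evidence is taken into account
best_att_only = rerank(hyps, ctc_weight=0.0)    # attention decoder alone
print(best_joint["text"], best_att_only["text"])
```

In this toy case the attention decoder alone prefers hypothesis A, while the joint score prefers B because CTC assigns A a low probability.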
To train this system from scratch, [see our SpeechBrain recipe](https://github.com/speechbrain/speechbrain/tree/develop/recipes/AISHELL-1).
## Install SpeechBrain
First of all, please install SpeechBrain with the following command:
```
pip install speechbrain
```
We also encourage you to read our tutorials and learn more about
[SpeechBrain](https://speechbrain.github.io).
### Transcribing your own audio files (in Mandarin Chinese)
```python
from speechbrain.pretrained import EncoderDecoderASR
asr_model = EncoderDecoderASR.from_hparams(source="speechbrain/asr-transformer-aishell", savedir="pretrained_models/asr-transformer-aishell")
asr_model.transcribe_file("speechbrain/asr-transformer-aishell/example_mandarin.wav")
```
### Inference on GPU
To perform inference on the GPU, add `run_opts={"device":"cuda"}` when calling the `from_hparams` method.
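
As a minimal sketch, the option can be passed as shown below. The helpers `pick_run_opts` and `load_asr_model` are hypothetical names introduced here; the first falls back to the CPU when PyTorch reports no usable GPU, which is an assumption about the desired behaviour rather than something SpeechBrain does for you.

```python
def pick_run_opts():
    """Return run_opts selecting CUDA when PyTorch reports a usable GPU, else CPU."""
    try:
        import torch
        use_cuda = torch.cuda.is_available()
    except ImportError:
        use_cuda = False
    return {"device": "cuda" if use_cuda else "cpu"}


def load_asr_model(run_opts):
    """Load the pretrained model on the device given in run_opts (downloads on first use)."""
    from speechbrain.pretrained import EncoderDecoderASR
    return EncoderDecoderASR.from_hparams(
        source="speechbrain/asr-transformer-aishell",
        savedir="pretrained_models/asr-transformer-aishell",
        run_opts=run_opts,
    )


run_opts = pick_run_opts()
print(run_opts)
# asr_model = load_asr_model(run_opts)
# print(asr_model.transcribe_file("speechbrain/asr-transformer-aishell/example_mandarin.wav"))
```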
### Parallel Inference on a Batch
Please [see this Colab notebook](https://colab.research.google.com/drive/1hX5ZI9S4jHIjahFCZnhwwQmFoGAi3tmu?usp=sharing) to learn how to transcribe a batch of input sentences in parallel using a pre-trained model.
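
The general idea can be sketched as follows, assuming `asr_model` is an `EncoderDecoderASR` instance loaded as in the transcription example: the waveforms are zero-padded to a common length and passed to `transcribe_batch` together with their lengths relative to the longest one. The helper names `relative_lengths` and `transcribe_many` are hypothetical, and this is not a copy of the notebook.

```python
def relative_lengths(num_samples):
    """Length of each waveform as a fraction of the longest one in the batch."""
    longest = max(num_samples)
    return [n / longest for n in num_samples]


def transcribe_many(asr_model, paths):
    """Transcribe several audio files in one forward pass (sketch)."""
    import torch
    from torch.nn.utils.rnn import pad_sequence

    # load_audio returns a 1-D tensor resampled to the model's sampling rate.
    waveforms = [asr_model.load_audio(p) for p in paths]
    batch = pad_sequence(waveforms, batch_first=True)  # shape: [batch, time]
    wav_lens = torch.tensor(relative_lengths([w.shape[0] for w in waveforms]))
    predicted_words, _predicted_tokens = asr_model.transcribe_batch(batch, wav_lens)
    return predicted_words


# The relative lengths tell the model how much of each padded row is real audio.
print(relative_lengths([16000, 8000, 4000]))
```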
### Training
The model was trained with SpeechBrain (Commit hash: '986a2175').
To train it from scratch, follow these steps:
1. Clone SpeechBrain:
```bash
git clone https://github.com/speechbrain/speechbrain/
```
2. Install it:
```bash
cd speechbrain
pip install -r requirements.txt
pip install -e .
```
3. Run training:
```bash
cd recipes/AISHELL-1/ASR/transformer/
python train.py hparams/train_ASR_transformer.yaml --data_folder=your_data_folder
```
You can find our training results (models, logs, etc) [here](https://drive.google.com/drive/folders/1QU18YoauzLOXueogspT0CgR5bqJ6zFfu?usp=sharing).
### Limitations
The SpeechBrain team does not provide any warranty on the performance achieved by this model when used on other datasets.
# **About SpeechBrain**
- Website: https://speechbrain.github.io/
- Code: https://github.com/speechbrain/speechbrain/
- HuggingFace: https://huggingface.co/speechbrain/
# **Citing SpeechBrain**
Please cite SpeechBrain if you use it for your research or business.
```bibtex
@misc{speechbrain,
title={{SpeechBrain}: A General-Purpose Speech Toolkit},
author={Mirco Ravanelli and Titouan Parcollet and Peter Plantinga and Aku Rouhe and Samuele Cornell and Loren Lugosch and Cem Subakan and Nauman Dawalatabad and Abdelwahab Heba and Jianyuan Zhong and Ju-Chieh Chou and Sung-Lin Yeh and Szu-Wei Fu and Chien-Feng Liao and Elena Rastorgueva and François Grondin and William Aris and Hwidong Na and Yan Gao and Renato De Mori and Yoshua Bengio},
year={2021},
eprint={2106.04624},
archivePrefix={arXiv},
primaryClass={eess.AS},
note={arXiv:2106.04624}
}
```