([简体中文](./README_cn.md)|English)
> `conf/ws_ds2_application.yaml` needs onnxruntime>=1.11.0
# Streaming ASR Server
## Introduction
This demo shows how to start the streaming speech recognition service and access it, either with a single command using `paddlespeech_server` and `paddlespeech_client`, or with a few lines of Python code.
The streaming ASR server only supports the `websocket` protocol; the `http` protocol is not supported.
For the service interface definition, please refer to:
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
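At a high level, a session works like this: the client opens a websocket connection, sends a JSON start signal, streams the audio as binary chunks, and finally sends a JSON end signal, after which the server returns the recognition result. The sketch below is illustrative only; the exact field names and message shapes should be checked against the wiki page above rather than taken as authoritative:

```text
client -> server : {"name": "zh.wav", "signal": "start", "nbest": 1}   # start of session (illustrative fields)
client -> server : <binary audio chunk>                                # repeated while audio remains
client -> server : {"name": "zh.wav", "signal": "end", "nbest": 1}     # end of session (illustrative fields)
server -> client : JSON messages carrying partial and final transcripts
```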
## Usage
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install paddlespeech.
**If you install in easy mode, you need to prepare the yaml file by yourself; you can refer to the configuration files under `conf/`.**
### 2. Prepare config File
The configuration files can be found in `conf/ws_application.yaml` and `conf/ws_conformer_wenetspeech_application.yaml`.
At present, the integrated models include DeepSpeech2 and Conformer.
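The key fields of such a configuration file look roughly like the sketch below. This is an illustrative excerpt, not the full shipped file; consult the yaml files under `conf/` for the authoritative set of fields and values:

```yaml
# Illustrative sketch of a streaming ASR server config; see conf/ for the real files.
host: 0.0.0.0
port: 8090
protocol: 'websocket'          # streaming ASR only supports websocket
engine_list: ['asr_online']

asr_online:
  model_type: 'conformer_online_wenetspeech'
  lang: 'zh'
  sample_rate: 16000           # must match the input audio
  device: 'cpu'                # change to a GPU device to deploy on GPU
```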
The input of the ASR client demo should be a WAV file (`.wav`), and its sample rate must match the model's.
A sample file for this ASR client demo can be downloaded:
```bash
wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```
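Since the service rejects audio whose sample rate does not match the model (16 kHz for the models above), it can be handy to verify a file before sending it. The helper below is a minimal sketch using only the standard library; it is not part of PaddleSpeech:

```python
# Illustrative helper (not part of PaddleSpeech): verify a WAV file's sample
# rate before streaming it to the server.
import wave


def check_sample_rate(path: str, expected_hz: int = 16000) -> bool:
    """Return True if the WAV file's sample rate equals expected_hz."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == expected_hz
```

For example, `check_sample_rate("./zh.wav")` should return `True` for the sample file above.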
### 3. Server Usage
- Command Line (Recommended)
**Note:** The server is deployed on the `cpu` device by default; it can be deployed on `gpu` by modifying the `device` parameter in the service configuration file.
```bash
# in PaddleSpeech/demos/streaming_asr_server start the service
paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application.yaml
# if you want to increase decoding speed, use the config file below; it speeds up decoding at some cost in accuracy
paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application_faster.yaml
```
Usage:
```bash
paddlespeech_server start --help
```
Arguments:
- `config_file`: yaml file of the app. Default: `./conf/application.yaml`
- `log_file`: log file. Default: `./log/paddlespeech.log`
Output:
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
**Note:** The server is deployed on the `cpu` device by default; it can be deployed on `gpu` by modifying the `device` parameter in the service configuration file.
```python
# in PaddleSpeech/demos/streaming_asr_server directory
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
server_executor = ServerExecutor()
server_executor(
config_file="./conf/ws_conformer_wenetspeech_application.yaml",
log_file="./log/paddlespeech.log")
```
Output:
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
### 4. ASR Client Usage
**Note:** The response time will be slightly longer when the client is used for the first time.
- Command Line (Recommended)
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```bash
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
Usage:
```bash
paddlespeech_client asr_online --help
```
Arguments:
- `server_ip`: server ip. Default: 127.0.0.1
- `port`: server port. Default: 8090
- `input` (required): Audio file to be recognized.
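A streaming client does not send the whole file at once; it feeds the audio to the server in small fixed-duration chunks. The helper below sketches that chunking step for raw 16-bit mono PCM. It is an illustration, not part of PaddleSpeech, and the 80 ms chunk duration is an assumption, not necessarily the value the PaddleSpeech client uses:

```python
# Illustrative helper (not part of PaddleSpeech): split raw 16-bit mono PCM
# into fixed-duration chunks suitable for streaming over a websocket.
# The 80 ms default chunk duration is an assumption for demonstration.
def pcm_chunks(pcm: bytes, sample_rate: int = 16000, chunk_ms: int = 80):
    """Yield successive chunks of chunk_ms milliseconds of 16-bit PCM."""
    bytes_per_chunk = sample_rate * chunk_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm), bytes_per_chunk):
        yield pcm[start:start + bytes_per_chunk]
```

Each yielded chunk would be sent as one binary websocket message between the start and end signals.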