([简体中文](./README_cn.md)|English)
> `conf/ws_ds2_application.yaml` needs onnxruntime>=1.11.0
# Streaming ASR Server
## Introduction
This demo shows how to start the streaming speech recognition service and access it, either with a single command using `paddlespeech_server` and `paddlespeech_client`, or with a few lines of Python code.
The streaming ASR server only supports the `websocket` protocol; the `http` protocol is not supported.
For the service interface definition, please refer to:
- [PaddleSpeech Streaming Server WebSocket API](https://github.com/PaddlePaddle/PaddleSpeech/wiki/PaddleSpeech-Server-WebSocket-API)
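At a high level, a session works like this: the client opens a websocket connection, sends a JSON start signal, streams the audio as binary chunks, and finally sends a JSON end signal, after which the server returns the recognition result. The sketch below is illustrative only; the exact field names and message shapes should be checked against the wiki page above rather than taken as authoritative:

```text
client -> server : {"name": "zh.wav", "signal": "start", "nbest": 1}   # start of session (illustrative fields)
client -> server : <binary audio chunk>                                # repeated while audio remains
client -> server : {"name": "zh.wav", "signal": "end", "nbest": 1}     # end of session (illustrative fields)
server -> client : JSON messages carrying partial and final transcripts
```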
## Usage
### 1. Installation
see [installation](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/install.md).
It is recommended to use **paddlepaddle 2.4rc** or above.
You can choose one of the easy, medium, and hard ways to install paddlespeech.
**If you install in easy mode, you need to prepare the yaml file by yourself; you can refer to the configuration files under `conf/`.**
### 2. Prepare config File
The configuration files can be found in `conf/ws_application.yaml` and `conf/ws_conformer_wenetspeech_application.yaml`.
At present, the integrated models include DeepSpeech2 and Conformer.
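The key fields of such a configuration file look roughly like the sketch below. This is an illustrative excerpt, not the full shipped file; consult the yaml files under `conf/` for the authoritative set of fields and values:

```yaml
# Illustrative sketch of a streaming ASR server config; see conf/ for the real files.
host: 0.0.0.0
port: 8090
protocol: 'websocket'          # streaming ASR only supports websocket
engine_list: ['asr_online']

asr_online:
  model_type: 'conformer_online_wenetspeech'
  lang: 'zh'
  sample_rate: 16000           # must match the input audio
  device: 'cpu'                # change to a GPU device to deploy on GPU
```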
The input of the ASR client demo should be a WAV file (`.wav`), and its sample rate must match the model's.
A sample file for this ASR client demo can be downloaded:
```bash
wget -c https://paddlespeech.bj.bcebos.com/PaddleAudio/zh.wav
```
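Since the service rejects audio whose sample rate does not match the model (16 kHz for the models above), it can be handy to verify a file before sending it. The helper below is a minimal sketch using only the standard library; it is not part of PaddleSpeech:

```python
# Illustrative helper (not part of PaddleSpeech): verify a WAV file's sample
# rate before streaming it to the server.
import wave


def check_sample_rate(path: str, expected_hz: int = 16000) -> bool:
    """Return True if the WAV file's sample rate equals expected_hz."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == expected_hz
```

For example, `check_sample_rate("./zh.wav")` should return `True` for the sample file above.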
### 3. Server Usage
- Command Line (Recommended)
**Note:** The server is deployed on the `cpu` device by default; it can be deployed on `gpu` by modifying the `device` parameter in the service configuration file.
```bash
# in PaddleSpeech/demos/streaming_asr_server start the service
paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application.yaml
# if you want to increase decoding speed, use the config file below; it speeds up decoding at some cost in accuracy
paddlespeech_server start --config_file ./conf/ws_conformer_wenetspeech_application_faster.yaml
```
Usage:
```bash
paddlespeech_server start --help
```
Arguments:
- `config_file`: yaml file of the app. Default: `./conf/application.yaml`
- `log_file`: log file. Default: `./log/paddlespeech.log`
Output:
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
- Python API
**Note:** The server is deployed on the `cpu` device by default; it can be deployed on `gpu` by modifying the `device` parameter in the service configuration file.
```python
# in PaddleSpeech/demos/streaming_asr_server directory
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
server_executor = ServerExecutor()
server_executor(
config_file="./conf/ws_conformer_wenetspeech_application.yaml",
log_file="./log/paddlespeech.log")
```
Output:
```text
[2022-05-14 04:56:13,086] [ INFO] - create the online asr engine instance
[2022-05-14 04:56:13,086] [ INFO] - paddlespeech_server set the device: cpu
[2022-05-14 04:56:13,087] [ INFO] - Load the pretrained model, tag = conformer_online_wenetspeech-zh-16k
[2022-05-14 04:56:13,087] [ INFO] - File /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar.gz md5 checking...
[2022-05-14 04:56:17,542] [ INFO] - Use pretrained model stored in: /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/model.yaml
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,543] [ INFO] - /root/.paddlespeech/models/conformer_online_wenetspeech-zh-16k/asr1_chunk_conformer_wenetspeech_ckpt_1.0.0a.model.tar/exp/chunk_conformer/checkpoints/avg_10.pdparams
[2022-05-14 04:56:17,852] [ INFO] - start to create the stream conformer asr engine
[2022-05-14 04:56:17,863] [ INFO] - model name: conformer_online
[2022-05-14 04:56:22,756] [ INFO] - create the transformer like model success
[2022-05-14 04:56:22,758] [ INFO] - Initialize ASR server engine successfully.
INFO: Started server process [4242]
[2022-05-14 04:56:22] [INFO] [server.py:75] Started server process [4242]
INFO: Waiting for application startup.
[2022-05-14 04:56:22] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-05-14 04:56:22] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
[2022-05-14 04:56:22] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
```
### 4. ASR Client Usage
**Note:** The response time will be slightly longer when the client is used for the first time.
- Command Line (Recommended)
If `127.0.0.1` is not accessible, you need to use the actual service IP address.
```bash
paddlespeech_client asr_online --server_ip 127.0.0.1 --port 8090 --input ./zh.wav
```
Usage:
```bash
paddlespeech_client asr_online --help
```
Arguments:
- `server_ip`: server ip. Default: 127.0.0.1
- `port`: server port. Default: 8090
- `input` (required): Audio file to be recognized.
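A streaming client does not send the whole file at once; it feeds the audio to the server in small fixed-duration chunks. The helper below sketches that chunking step for raw 16-bit mono PCM. It is an illustration, not part of PaddleSpeech, and the 80 ms chunk duration is an assumption, not necessarily the value the PaddleSpeech client uses:

```python
# Illustrative helper (not part of PaddleSpeech): split raw 16-bit mono PCM
# into fixed-duration chunks suitable for streaming over a websocket.
# The 80 ms default chunk duration is an assumption for demonstration.
def pcm_chunks(pcm: bytes, sample_rate: int = 16000, chunk_ms: int = 80):
    """Yield successive chunks of chunk_ms milliseconds of 16-bit PCM."""
    bytes_per_chunk = sample_rate * chunk_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm), bytes_per_chunk):
        yield pcm[start:start + bytes_per_chunk]
```

Each yielded chunk would be sent as one binary websocket message between the start and end signals.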