<p align="center">
<img src="https://raw.githubusercontent.com/speechbrain/speechbrain/develop/docs/images/speechbrain-logo.svg" alt="SpeechBrain Logo"/>
</p>
[![Typing SVG](https://readme-typing-svg.demolab.com?font=Fira+Code&size=40&duration=7000&pause=1000&random=false&width=1200&height=100&lines=Simplify+Conversational+AI+Development)](https://git.io/typing-svg)
| 📘 [Tutorials](https://speechbrain.github.io/tutorial_basics.html) | 🌐 [Website](https://speechbrain.github.io/) | 📚 [Documentation](https://speechbrain.readthedocs.io/en/latest/index.html) | 🤝 [Contributing](https://speechbrain.readthedocs.io/en/latest/contributing.html) | 🤗 [HuggingFace](https://huggingface.co/speechbrain) | ▶️ [YouTube](https://www.youtube.com/@SpeechBrainProject) | 🐦 [X](https://twitter.com/SpeechBrain1) |
![GitHub Repo stars](https://img.shields.io/github/stars/speechbrain/speechbrain?style=social) *Please help our community project: star SpeechBrain on GitHub!*
**Exciting News (January 2024):** Discover what's new in SpeechBrain 1.0 [here](https://colab.research.google.com/drive/1IEPfKRuvJRSjoxu22GZhb3czfVHsAy0s?usp=sharing)!
#
# 🗣️💬 What SpeechBrain Offers
- SpeechBrain is an **open-source** [PyTorch](https://pytorch.org/) toolkit that accelerates **Conversational AI** development, i.e., the technology behind *speech assistants*, *chatbots*, and *large language models*.
- It is crafted for fast and easy creation of advanced technologies for **Speech** and **Text** Processing.
## 🌟 Vision
- With the rise of [deep learning](https://www.deeplearningbook.org/), once-distant domains like speech processing and NLP have grown very close: a well-designed neural network and large datasets are often all you need.
- We think it is now time for a **holistic toolkit** that, mimicking the human brain, jointly supports diverse technologies for complex Conversational AI systems.
- This spans *speech recognition*, *speaker recognition*, *speech enhancement*, *speech separation*, *language modeling*, *dialogue*, and beyond.
## 🚀 Training Recipes
- We share over 200 competitive training [recipes](https://github.com/speechbrain/speechbrain/tree/develop/recipes) on more than 40 datasets supporting 20 speech and text processing tasks (see below).
- We support both training from scratch and fine-tuning pretrained models such as [Whisper](https://huggingface.co/openai/whisper-large), [Wav2Vec2](https://huggingface.co/docs/transformers/model_doc/wav2vec2), [WavLM](https://huggingface.co/docs/transformers/model_doc/wavlm), [Hubert](https://huggingface.co/docs/transformers/model_doc/hubert), [GPT2](https://huggingface.co/gpt2), [Llama2](https://huggingface.co/docs/transformers/model_doc/llama2), and beyond. The models on [HuggingFace](https://huggingface.co/) can be easily plugged in and fine-tuned.
- For any task, you train a model with a single command:
```bash
python train.py hparams/train.yaml
```
- The hyperparameters are encapsulated in a YAML file, while the training process is orchestrated through a Python script (see the sketch after this list).
- We maintain a consistent code structure across different tasks.
- For better replicability, training logs and checkpoints are hosted on Dropbox.
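As a minimal sketch of how the two pieces fit together, here is the typical skeleton of a recipe's training script, using the [HyperPyYAML](https://github.com/speechbrain/HyperPyYAML) loader that SpeechBrain recipes rely on (file and key names are illustrative, not taken from a specific recipe):

```python
# Illustrative skeleton of a recipe's train.py, not a verbatim SpeechBrain recipe.
import sys
from hyperpyyaml import load_hyperpyyaml

# The YAML file passed on the command line holds all hyperparameters,
# e.g., `python train.py hparams/train.yaml`.
hparams_file = sys.argv[1]
with open(hparams_file) as fin:
    hparams = load_hyperpyyaml(fin)  # also instantiates objects declared in the YAML

# A real recipe would now build data pipelines and run a `Brain` subclass;
# here we just show that the loaded hyperparameters form a plain dictionary.
print(sorted(hparams.keys()))
```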
## <a href="https://huggingface.co/speechbrain" target="_blank"> <img src="https://huggingface.co/front/assets/huggingface_logo.svg" alt="drawing" width="40"/> </a> Pretrained Models and Inference
- Access over 100 pretrained models hosted on [HuggingFace](https://huggingface.co/speechbrain).
- Each model comes with a user-friendly interface for seamless inference. For example, transcribing speech with a pretrained model takes just a few lines of code:
```python
from speechbrain.pretrained import EncoderDecoderASR

# Download the pretrained model from HuggingFace (cached locally in `savedir`).
asr_model = EncoderDecoderASR.from_hparams(
    source="speechbrain/asr-conformer-transformerlm-librispeech",
    savedir="pretrained_models/asr-conformer-transformerlm-librispeech",
)
# Transcribe the example audio shipped with the model card.
asr_model.transcribe_file("speechbrain/asr-conformer-transformerlm-librispeech/example.wav")
```
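Other pretrained models follow the same pattern. For instance, here is a sketch of speaker verification with the pretrained ECAPA-TDNN model (the model card is real; the two audio paths are placeholders you would supply):

```python
from speechbrain.pretrained import SpeakerRecognition

# Download the pretrained speaker-verification model from HuggingFace.
verifier = SpeakerRecognition.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_models/spkrec-ecapa-voxceleb",
)
# Compare two utterances: returns a similarity score and a same-speaker decision.
score, same_speaker = verifier.verify_files("speaker1.wav", "speaker2.wav")
```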
## <a href="https://speechbrain.github.io/" target="_blank"> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d0/Google_Colaboratory_SVG_Logo.svg/1200px-Google_Colaboratory_SVG_Logo.svg.png" alt="drawing" width="50"/> </a> Documentation
- We are deeply dedicated to promoting inclusivity and education.
- We have authored over 30 [tutorials](https://speechbrain.github.io/) on Google Colab that not only describe how SpeechBrain works but also help users familiarize themselves with Conversational AI.
- Every class or function has clear explanations and examples that you can run. Check out the [documentation](https://speechbrain.readthedocs.io/en/latest/index.html) for more details 📚.
## 🎯 Use Cases
- 🚀 **Research Acceleration**: Speeding up academic and industrial research. You can develop and integrate new models effortlessly, comparing their performance against our baselines.
- ⚡️ **Rapid Prototyping**: Ideal for quick prototyping in time-sensitive projects.
- 📚 **Educational Tool**: SpeechBrain's simplicity makes it a valuable educational resource. It is used by institutions like [Mila](https://mila.quebec/en/), [Concordia University](https://www.concordia.ca/), [Avignon University](https://univ-avignon.fr/en/), and many others for student training.
#
# 🚀 Quick Start
To get started with SpeechBrain, follow these simple steps:
## 🛠️ Installation
### Install via PyPI
1. Install SpeechBrain using PyPI:
```bash
pip install speechbrain
```
2. Access SpeechBrain in your Python code:
```python
import speechbrain as sb
```
### Install from GitHub
This installation is recommended for users who wish to conduct experiments and customize the toolkit according to their needs.
1. Clone the GitHub repository and install the requirements:
```bash
git clone https://github.com/speechbrain/speechbrain.git
cd speechbrain
pip install -r requirements.txt
pip install --editable .
```
2. Access SpeechBrain in your Python code:
```python
import speechbrain as sb
```
Any modifications made to the `speechbrain` package will be automatically reflected, thanks to the `--editable` flag.
## ✔️ Test Installation
Ensure your installation is correct by running the following commands:
```bash
pytest tests
pytest --doctest-modules speechbrain
```
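For a quicker smoke test, you can simply import the package and print its version (assuming the installed release exposes `__version__`, as recent versions do):

```bash
python -c "import speechbrain; print(speechbrain.__version__)"
```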
## 🏃‍♂️ Running an Experiment
In SpeechBrain, you can train a model for any task with the same two steps (the script and YAML names vary slightly across recipes):
```bash
cd recipes/<dataset>/<task>/
python experiment.py params.yaml
```
The results will be saved in the `output_folder` specified in the YAML file.
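Hyperparameters defined in the YAML can also be overridden from the command line, a standard feature of SpeechBrain's argument parsing; the specific flags below are illustrative and must correspond to keys in the recipe's YAML:

```bash
# Override YAML fields at launch time; `output_folder` and `seed` are
# common keys in SpeechBrain recipes, but check the recipe's own YAML.
python experiment.py params.yaml --output_folder=results/my_run --seed=1234
```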
## 📘 Learning SpeechBrain
- **Website:** Explore general information on the [official website](https://speechbrain.github.io).
- **Tutorials:** Start with [basic tutorials](https://speechbrain.github.io/tutorial_basics.html) covering fundamental functionalities. Find advanced tutorials and topics in the Tutorials menu on the [SpeechBrain website](https://speechbrain.github.io).
- **Documentation:** Detailed information on the SpeechBrain API, contribution guidelines, and code is available in the [documentation](https://speechbrain.readthedocs.io/en/latest/index.html).
#
# 🔧 Supported Technologies
- SpeechBrain is a versatile framework designed for implementing a wide range of technologies within the field of Conversational AI.
- It excels not only in individual task implementations but also in combining various technologies into complex pipelines.
## 🎙️ Speech/Audio Processing
| Tasks | Datasets | Technologies/Models |
| ------------- |-------------| -----|
| Speech Recognition | [AISHELL-1](https://github.com/speechbrain/speechbrain/tree/develop/recipes/AISHELL-1), [CommonVoice](https://github.com/speechbrain/speechbrain/tree/develop/recipes/CommonVoice), [DVoice](https://github.com/speechbrain/speechbrain/tree/develop/recipes/DVoice), [KsponSpeech](https://github.com/speechbrain/speechbrain/tree/develop/recipes/KsponSpeech), … | … |