Speech Recognition: A Speech Recognition Algorithm Based on Tensorflow + Sequence-to-Sequence, with Project Source Code (Hands-On Project Resource) - CSDN Library

51 files in total

py: 36

txt: 4

png: 3

Speech recognition

Tensorflow

Uploaded 2024-05-17 10:18:54, 30.51MB ZIP

语音识别_基于Tensorflow+Sequence-to-Sequence算法实现语音识别算法_附项目源码_优质项目实战.zip (51 files)

语音识别_基于Tensorflow+Sequence-to-Sequence算法实现语音识别算法_附项目源码_优质项目实战

lstm_ctc_to_chars.py 7KB

__init__.py 45B

wave_GANerate.py 5KB

speech_encoder.py 4KB

lstm_mfcc_to_chars.py 4KB

extra

speech_to_phonemes.swift 2KB

cepstrum.py 2KB

generate_sound.py 7KB

phonemes.txt 1KB

subtitle-downloader.py 5KB

lstm-tflearn.py 1KB

densenet_layer.py 3KB

speech2text-seq2seq.py 5KB

tensorpeers

__init__.py 0B

pytt

utils.py 7KB

__init__.py 0B

tracker.py 6KB

bencode.py 4KB

.git 36B

server.py 1KB

master.py 2KB

sync.py 3KB

requirements.txt 159B

test.py 2KB

README.md 2KB

subtitle_srt_parser.py 1KB

bdlstm_utils.py 4KB

WarpCTC.txt 3KB

speech2text-tflearn.py 2KB

number_gan_tflearn.py 2KB

.idea

codeStyleSettings.xml 277B

spoken_numbers_pcm.tar 37.71MB

number_classifier_tflearn.py 1KB

lstm_mfcc_ctc_to_words.py 6KB

mfcc_feature_classifier.py 3KB

record.py 3KB

word_to_phonemes.swift 608B

spectro_gan.py 4KB

requirements.txt 321B

record-autoencoder.py 389B

images

spectrogram.demo.png 55KB

0_Karen_160.png 160KB

tensorboard.png 69KB

speaker_classifier_tflearn.py 2KB

lstm_to_chars.py 7KB

number_gan_layer.py 2KB

README.md 2KB

layer

net.py 19KB

generate_speech_data.py 5KB

spoken_numbers_spectros_64x64.tar 10.28MB

speech_data.py 16KB

# Tensorflow Speech Recognition

Speech recognition using Google's [tensorflow](https://github.com/tensorflow/tensorflow/) deep learning framework and [sequence-to-sequence](https://www.tensorflow.org/versions/master/tutorials/seq2seq/index.html) neural networks.

## Ultimate goal

Create a decent standalone speech recognition for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (100GB [here](http://www.openslr.org/12) and 21GB [here on openslr.org](http://www.openslr.org/7/), synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.); we just need a simple yet powerful model. It's only a question of time...

![Sample spectrogram, That's what she said, too laid?](images/0_Karen_160.png)

Sample spectrogram, Karen uttering 'zero' at 160 words per minute.

## Installation

### clone code

```
# download the project
git clone https://github.com/pannous/layer.git
git clone https://github.com/pannous/tensorpeers.git
```

### pyaudio

#### requirements

portaudio from http://www.portaudio.com/

```
git clone https://git.assembla.com/portaudio.git
cd portaudio
./configure --prefix=/path/to/your/local
make
make install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/local/lib
export LIBRARY_PATH=$LIBRARY_PATH:/path/to/your/local/lib
export CPATH=$CPATH:/path/to/your/local/include
source ~/.bashrc
```

#### install pyaudio

```
pip install pyaudio
```

## Getting started

Toy examples:
`./number_classifier_tflearn.py`
`./speaker_classifier_tflearn.py`

Some less trivial architectures:
`./densenet_layer.py`

Later:
`./train.sh`
`./record.py`

![Sample spectrogram or record.py](images/spectrogram.demo.png)
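
The spectrogram images referenced in this README show how audio energy is distributed over frequency as time passes. The project's own preprocessing code is not reproduced on this page, so the following is only a minimal sketch of the general idea, using nothing but the Python standard library. The function names (`hann`, `magnitude_spectrum`, `spectrogram`), the frame size, the hop length, and the synthetic 1 kHz test tone are all assumptions made for illustration and are not taken from the repository.

```python
import cmath
import math


def hann(n):
    """Hann window of length n, used to taper each frame before the DFT."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..len(frame)//2 (O(n^2), fine for a demo)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_size=64, hop=32):
    """Cut samples into overlapping windowed frames; return one spectrum per frame."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = samples[start:start + frame_size]
        frames.append(magnitude_spectrum([s * w for s, w in zip(chunk, window)]))
    return frames


# Synthetic input (an assumption for this sketch): a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(samples)

# The strongest bin of the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64  # 64 is the default frame_size
print(len(spec), "frames x", len(spec[0]), "bins; peak near", peak_hz, "Hz")
```

Each row of `spec` corresponds to one time step and each column to one frequency bin, which is the layout a spectrogram image visualizes. A real pipeline would typically use an FFT library instead of the naive DFT above and would read samples from a recording instead of a generated tone.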
