# Speaker-Recognition
A simple speaker recognition application in Python using Mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs). The MFCCs of each sample are extracted and a GMM is fitted to them. We recorded 4 samples from each of 9 people, each sample 2 seconds long. The samples were recorded in ordinary surroundings, so all of them contain some background noise. For each speaker, the first three samples are used for training and the fourth for testing. GMMs for these 9 people are already trained and stored in the /gmm_models directory. You can find the corresponding samples in the /samples directory.
On the given samples, our implementation reaches an accuracy of 95%-96%. Accuracy still depends on the quality of the samples provided and on the amount of training data.
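The approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `train_speaker_model` and `identify`, the speaker names, the number of mixture components and the diagonal covariance type are all assumptions. To keep the example runnable without audio files, synthetic random frames stand in for real MFCCs; in the application the feature matrix would presumably come from `python_speech_features.mfcc`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_speaker_model(features, n_components=2, seed=0):
    """Fit one GMM to a (frames x coefficients) feature matrix.

    n_components and the diagonal covariance are illustrative choices,
    not values taken from the project.
    """
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag",
                          random_state=seed)
    gmm.fit(features)
    return gmm


def identify(features, models):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    scores = {name: gmm.score(features) for name, gmm in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic stand-in for MFCC frames (13 coefficients per frame).
# With real audio this would be something like:
#   from python_speech_features import mfcc
#   features = mfcc(signal, samplerate)
rng = np.random.RandomState(0)


def fake_mfcc(center, n_frames=600):
    return center + rng.normal(scale=1.0, size=(n_frames, 13))


# Hypothetical speakers, each with a different feature distribution.
centers = {"alice": -3.0, "bob": 0.0, "carol": 3.0}
models = {name: train_speaker_model(fake_mfcc(c)) for name, c in centers.items()}

# A new "recording" from bob is scored against every enrolled model.
test_features = fake_mfcc(centers["bob"], n_frames=200)
winner, scores = identify(test_features, models)
print('detected as - "{}"'.format(winner))
```

One model per speaker keeps enrollment independent: adding a new user only requires fitting one more GMM, and identification is a maximum over per-model scores.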
Running instructions:
This application runs on Python 3.4 (Windows 10). The Python modules used are python_speech_features, PyAudio, sklearn, SciPy and NumPy.
Step 1 : Command Prompt start
Open a command prompt and change to the project's directory.
Step 2 : Registration
First, register a user by providing samples of the user's voice. Type:
python register.py
This runs the register.py script. It asks for a username and, once one is entered, starts recording. It takes 3 samples of the user's voice, each 2 seconds long. For convenience, the script prompts the user to say 'up' the first time, then 'down' and then 'left'. You can say anything, though: the application is text-independent, so you could just sing along for the 6 seconds. Once the 3 samples are taken, the script trains on them, then creates the Gaussian mixture model and dumps it into the gmm_models directory.
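The dump step could look something like the sketch below. It assumes the models are serialized with Python's standard `pickle` module and named `<username>.gmm`; the README does not state the serialization format, so both that and the helper names `save_model` and `load_models` are hypothetical. A plain dictionary stands in for the trained model, and a temporary directory is used so that the example does not touch the real gmm_models directory.

```python
import os
import pickle
import tempfile


def save_model(model, username, directory="gmm_models"):
    """Pickle a trained model to <directory>/<username>.gmm and return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, username + ".gmm")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path


def load_models(directory="gmm_models"):
    """Load every .gmm file in a directory into a {username: model} dict."""
    models = {}
    for filename in sorted(os.listdir(directory)):
        name, ext = os.path.splitext(filename)
        if ext == ".gmm":
            with open(os.path.join(directory, filename), "rb") as f:
                models[name] = pickle.load(f)
    return models


# Demo with a placeholder object instead of a real trained GMM.
with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model({"weights": [0.5, 0.5]}, "demo_user", directory=tmp)
    restored = load_models(tmp)
```

Because unpickling can execute arbitrary code, model files should only be loaded from sources you trust.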
Step 3 : Testing
Once the .gmm file has been created, you can test your voice. Type:
python speakerrecog.py
This script records the user's voice for 2 seconds, so say something during that time. It then prints the result as:
detected as - "username"