# shazam-air
~~music recognition and classifier project just like shazam but lighter and simpler~~
Music classifier project.
# Installation
Note: Step 2 will probably take a few minutes, so be patient.
## 1. Clone this repository
`$ git clone https://github.com/vsc-hvdc/shazam-air.git`
## 2. Create a conda environment from `environment.yml` in the project directory
`$ conda env create -f environment.yml -n $YOUR_ENV_NAME`
## 3. Activate the environment
`$ source activate $YOUR_ENV_NAME`
## 4. Add the package to the Python search path
`$ conda develop /Your/Project/Path/shazam-air/src`
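If `conda develop` is unavailable in your conda install, the same effect can be approximated per script (the placeholder path is the one from step 4; substitute your own checkout location):

```python
import sys

# Placeholder path from step 4 -- replace with your actual checkout location.
SRC = "/Your/Project/Path/shazam-air/src"
if SRC not in sys.path:
    sys.path.insert(0, SRC)  # make `import audioIO` etc. resolvable
```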
# Run Demos
Note: all the demo datasets, models, and wav files are provided under the `/data` directory.
Feel free to record or import your own demo file, though.
## 1. audioDemo
Just run `audioDemo.py` in the `demos/` directory.
You should see plots of two example audio files in both the time and frequency domains:
![raw](asset/raw_chunk.png)
![rec](asset/rec_chunk.png)
![raw_spec](asset/raw_spec.png)
![rec_spec](asset/rec_spec.png)
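Under the hood, plots like these come from the raw samples and their FFT. A minimal NumPy sketch, using a synthetic 440 Hz tone in place of the bundled wav files (reading one with `scipy.io.wavfile.read` works the same way):

```python
import numpy as np

def spectrum(samples, rate):
    """Return (freqs, magnitudes) of a mono signal via the real FFT."""
    mags = np.abs(np.fft.rfft(samples)) / len(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    return freqs, mags

rate = 8000
t = np.arange(rate) / rate              # one second of audio
samples = np.sin(2 * np.pi * 440 * t)   # a 440 Hz test tone
freqs, mags = spectrum(samples, rate)
peak = freqs[np.argmax(mags)]           # dominant frequency of the tone
```

Plotting `samples` against `t` gives the time-domain view; plotting `mags` against `freqs` gives the frequency-domain view.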
## 2. musicAI
Run `musicAI.py` in the `demos/` directory.
You should see the predicted electronic dance music genre in the output.
You can run it with the example recordings `/dubsteps.wav` and `/future.wav` under `demos/demo_chunks/`.
The model's accuracy still needs improvement, which is discussed later.
The recommended workflow is to construct your model outside the project, save the model data, and load it into this project. The interface for training your model is reserved but not yet complete; I hope to finish it soon.
You can read the prediction result from the last line of the terminal output, like this:
`Huh, this song sounds like BigRoom`
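That verdict line is simply the label with the highest predicted probability. A small sketch of this last step (the `GENRES` list here is an illustrative subset, not the project's actual label order):

```python
import numpy as np

# Hypothetical genre labels -- the real list comes from the training dataset.
GENRES = ["BigRoom", "Dubstep", "FutureHouse"]

def describe_prediction(probs):
    """Turn a model's softmax output into the README's one-line verdict."""
    genre = GENRES[int(np.argmax(probs))]
    return f"Huh, this song sounds like {genre}"

# e.g. probabilities a loaded model might return for one clip
print(describe_prediction(np.array([0.7, 0.2, 0.1])))
```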
## 3. train
You can try training your own model by running the `train.py` script.
This project already includes the dataset used to train the model. The dataset was downloaded from [Kaggle](https://en.wikipedia.org/wiki/Kaggle); see more information about it [here](https://www.kaggle.com/caparrini/beatsdataset).
You can see the PCA and LDA analyses of the dataset below.
![pca](asset/pca.png)
![lda](asset/lda.png)
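For reference, the PCA projection behind a plot like the one above can be sketched with plain NumPy via the SVD (the random matrix here is a stand-in for the real audio-feature matrix):

```python
import numpy as np

def pca_2d(features):
    """Project feature vectors onto their two leading principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal axes,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))   # stand-in for the extracted audio features
proj = pca_2d(X)                 # (100, 2) points to scatter-plot
```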
You should be able to see the training process in the terminal output.
```
Construct the model
Start training
Epoch 1/100
2019-03-21 11:12:32.030091: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2300/2300 [==============================] - 1s 250us/step - loss: 0.0417 - acc: 0.0478
Epoch 2/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0416 - acc: 0.0509
Epoch 3/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0415 - acc: 0.0648
Epoch 4/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0415 - acc: 0.0683
Epoch 5/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0414 - acc: 0.0648
Epoch 6/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0411 - acc: 0.0796
Epoch 7/100
2300/2300 [==============================] - 0s 79us/step - loss: 0.0408 - acc: 0.0909
.
.
.
Epoch 97/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0368 - acc: 0.2748
Epoch 98/100
2300/2300 [==============================] - 0s 79us/step - loss: 0.0368 - acc: 0.2683
Epoch 99/100
2300/2300 [==============================] - 0s 79us/step - loss: 0.0371 - acc: 0.2643
Epoch 100/100
2300/2300 [==============================] - 0s 78us/step - loss: 0.0373 - acc: 0.2435
```
Now you should have a model named `demo_exp.h5` in the `data/model/` directory.
The best model I trained reaches an accuracy of 80%, which is a good number for a dataset of only 2300 samples split across 23 classes.
My advice is to build a dataset of your own (though the trickiest, nearly impossible part is getting access to enough music data); the original dataset is not fully reliable, but it's the best I could find.
Or you can just keep tweaking your model!
![](asset/meme.gif)
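The Keras log above reports loss and accuracy once per epoch; the same bookkeeping can be illustrated with a plain-NumPy softmax classifier on synthetic data (the sample and class counts are borrowed from this README; the real features come from the beats dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 2300, 10, 23                              # samples, features, genre classes
X = rng.normal(size=(n, d))                         # stand-in for audio features
y = np.argmax(X @ rng.normal(size=(d, k)), axis=1)  # synthetic genre labels
Y = np.eye(k)[y]                                    # one-hot targets

W = np.zeros((d, k))
history = []
for epoch in range(100):
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.mean(np.log(probs[np.arange(n), y]))  # cross-entropy, as in the log
    acc = np.mean(np.argmax(probs, axis=1) == y)
    history.append((loss, acc))
    W -= 0.5 * (X.T @ (probs - Y) / n)              # full-batch gradient step
```

Each `history` entry mirrors one `loss`/`acc` line of the Keras output: loss falls and accuracy climbs as the epochs pass.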
## 4. Real-time spectrum demo
This demo is best run in a Jupyter notebook, because I ran into some strange bugs when running it as a script.
Open a terminal and run
`$ jupyter notebook`, which should already be installed if you created the environment in **Installation step 2**.
Open `real_time_demo.ipynb`, run through the demo, and you should see a dynamic spectrum while the demo chunk plays.
I found matplotlib poorly suited to real-time plotting: it is slow and memory-hungry. This demo is the best I can do for now, but it still has a noticeable delay. A different library, perhaps one based on OpenGL, might do better.
The spectrum is not ideal yet because the y-axis is not fixed; I haven't figured out how to disable its auto-scaling.
![REAL-TIME DEMO](asset/rt-spec.gif)
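One way to stop the axis from rescaling is to compute a global limit up front and pin it with matplotlib's `ax.set_ylim(0, y_max)`, which turns off y auto-scaling for that axis (`ax.autoscale(False)` is the blunter alternative). A NumPy sketch of computing per-frame spectra and a stable limit, using a synthetic tone in place of `real_time_demo.wav`:

```python
import numpy as np

def frame_spectra(samples, frame=1024, hop=512):
    """Magnitude spectrum of each hop-spaced frame (a poor man's STFT)."""
    window = np.hanning(frame)
    starts = range(0, len(samples) - frame + 1, hop)
    return np.array([np.abs(np.fft.rfft(samples[s:s + frame] * window))
                     for s in starts])

rate = 8000
t = np.arange(2 * rate) / rate
samples = np.sin(2 * np.pi * 440 * t)   # stand-in for the demo wav
spectra = frame_spectra(samples)
# One fixed limit for the whole animation keeps the axis from jumping:
y_max = 1.1 * spectra.max()
```

In the animation loop, call `ax.set_ylim(0, y_max)` once before drawing frames instead of letting each frame rescale the axis.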