dejavu
==========
An audio fingerprinting and recognition algorithm implemented in Python. See the explanation here:
[How it works](http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/)
Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input or reading from disk, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.
Note: for voice recognition, *Dejavu is not the right tool!* Dejavu excels at recognition of exact signals with reasonable amounts of noise.
## Quickstart with Docker
First, install [Docker](https://docs.docker.com/get-docker/).
```shell
# build and then run our containers
$ docker-compose build
$ docker-compose up -d
# get a shell inside the container
$ docker-compose run python /bin/bash
Starting dejavu_db_1 ... done
root@f9ea95ce5cea:/code# python example_docker_postgres.py
Fingerprinting channel 1/2 for test/woodward_43s.wav
Fingerprinting channel 1/2 for test/sean_secs.wav
...
# connect to the database and poke around
root@f9ea95ce5cea:/code# psql -h db -U postgres dejavu
Password for user postgres: # type "password", as specified in the docker-compose.yml !
psql (11.7 (Debian 11.7-0+deb10u1), server 10.7)
Type "help" for help.
dejavu=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------+-------+----------
public | fingerprints | table | postgres
public | songs | table | postgres
(2 rows)
dejavu=# select * from fingerprints limit 5;
hash | song_id | offset | date_created | date_modified
------------------------+---------+--------+----------------------------+----------------------------
\x71ffcb900d06fe642a18 | 1 | 137 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
\xf731d792977330e6cc9f | 1 | 148 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
\x71ff24aaeeb55d7b60c4 | 1 | 146 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
\x29349c79b317d45a45a8 | 1 | 101 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
\x5a052144e67d2248ccf4 | 1 | 123 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
(5 rows)
# then to shut it all down...
$ docker-compose down
```
If you want to be able to use the microphone with the Docker container, you'll need to do a [little extra work](https://stackoverflow.com/questions/43312975/record-sound-on-ubuntu-docker-image). I haven't had the time to write this up, but if anyone wants to make a PR, I'll happily merge.
## Docker alternative on local machine
Follow the instructions in [INSTALLATION.md](INSTALLATION.md).
Next, you'll need to create a MySQL database where Dejavu can store fingerprints. For example, on your local setup:

    $ mysql -u root -p
    Enter password: **********
    mysql> CREATE DATABASE IF NOT EXISTS dejavu;

Now you're ready to start fingerprinting your audio collection!
You may also use Postgres, of course. The same method applies.
## Fingerprinting
Let's say we want to fingerprint all of July 2013's VA US Top 40 hits.
Start by creating a Dejavu object with your configuration settings (Dejavu takes an ordinary Python dictionary for the settings).
```python
>>> from dejavu import Dejavu
>>> config = {
... "database": {
... "host": "127.0.0.1",
... "user": "root",
... "password": <password above>,
... "database": <name of the database you created above>,
... }
... }
>>> djv = Dejavu(config)
```
Next, give the `fingerprint_directory` method three arguments:
* input directory to look for audio files
* audio extensions to look for in the input directory
* number of processes (optional)
```python
>>> djv.fingerprint_directory("va_us_top_40/mp3", [".mp3"], 3)
```
For a large number of files, this will take a while. However, Dejavu is robust enough that you can kill and restart the process without losing progress: Dejavu remembers which songs it has already fingerprinted and converted, and so won't repeat itself.
You'll have a lot of fingerprints once it completes a large folder of mp3s:
```python
>>> print(djv.db.get_num_fingerprints())
5442376
```
Any subsequent calls to `fingerprint_file` or `fingerprint_directory` will fingerprint those songs and add them to the database as well. This is meant to simulate a system where new songs are fingerprinted and added to the database seamlessly as they are released, without stopping the system.
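
The skip-on-restart behaviour described above can be pictured as a lookup keyed on a content hash. The sketch below is illustrative only: `file_sha1` and `files_to_fingerprint` are hypothetical helper names, not part of Dejavu's API, and it assumes files are identified by the SHA-1 of their bytes (the recognition output further down includes a `file_sha1` field, which suggests something along these lines).

```python
import hashlib
import os


def file_sha1(path, block_size=2**20):
    """Return the uppercase hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest().upper()


def files_to_fingerprint(directory, extensions, known_hashes):
    """Yield (path, sha1) for matching files whose hash is not yet known.

    `known_hashes` stands in for whatever the database already holds
    (an assumption made for this sketch).
    """
    for root, _dirs, names in os.walk(directory):
        for name in sorted(names):
            if os.path.splitext(name)[1].lower() not in extensions:
                continue
            path = os.path.join(root, name)
            sha1 = file_sha1(path)
            if sha1 not in known_hashes:
                yield path, sha1
```

Keying on file content rather than file name means a renamed copy of an already fingerprinted song would still be skipped.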
## Configuration options
The configuration object to the Dejavu constructor must be a dictionary.
The following keys are mandatory:
* `database`: a dictionary whose keys are accepted by the database you are using. For example, with MySQL the keys can be anything that the [`MySQLdb.connect()`](http://mysql-python.sourceforge.net/MySQLdb.html) function accepts.
The following keys are optional:
* `fingerprint_limit`: controls how many seconds of each audio file to fingerprint. Leaving out this key, or setting it to `-1` or `None`, causes Dejavu to fingerprint the entire audio file. The default value is `None`.
* `database_type`: `mysql` (the default value) and `postgres` are supported. If you'd like to add another subclass for `BaseDatabase` and implement a new type of database, please fork and send a pull request!
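
To make the `fingerprint_limit` semantics concrete, here is a minimal sketch of truncating decoded audio to its first N seconds. `apply_fingerprint_limit` is a hypothetical helper written for this README, not Dejavu's own code, and it assumes the audio has already been decoded into a flat sequence of samples at a known sample rate.

```python
def apply_fingerprint_limit(samples, sample_rate, limit=None):
    """Keep only the first `limit` seconds of `samples`.

    A limit of None or -1 means "no limit", mirroring the option above.
    """
    if limit is None or limit == -1:
        return samples
    if limit < 0:
        raise ValueError("limit must be None, -1, or a non-negative number of seconds")
    # Number of samples covering `limit` seconds at this sample rate.
    return samples[: int(limit * sample_rate)]
```
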
An example configuration is as follows:
```python
>>> from dejavu import Dejavu
>>> config = {
... "database": {
... "host": "127.0.0.1",
... "user": "root",
... "password": "Password123",
... "database": "dejavu_db",
... },
... "database_type" : "mysql",
... "fingerprint_limit" : 10
... }
>>> djv = Dejavu(config)
```
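
It can help to check a configuration dictionary before passing it to the constructor. The helper below is a sketch that encodes only the rules stated in this section (mandatory `database` dictionary, optional `database_type` and `fingerprint_limit` with their documented defaults). `validate_config` and `SUPPORTED_DATABASE_TYPES` are hypothetical names and are not provided by Dejavu.

```python
SUPPORTED_DATABASE_TYPES = ("mysql", "postgres")


def validate_config(config):
    """Return a copy of `config` with the documented defaults filled in."""
    if not isinstance(config, dict):
        raise TypeError("config must be a dictionary")
    if not isinstance(config.get("database"), dict):
        raise ValueError('the "database" key is mandatory and must be a dictionary')

    checked = dict(config)
    checked.setdefault("database_type", "mysql")    # documented default
    checked.setdefault("fingerprint_limit", None)   # None = whole file

    if checked["database_type"] not in SUPPORTED_DATABASE_TYPES:
        raise ValueError(f"unsupported database_type: {checked['database_type']!r}")
    return checked
```

For example, `validate_config({"database": {"host": "127.0.0.1"}})` returns the same dictionary with `database_type` set to `"mysql"` and `fingerprint_limit` set to `None`.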
## Tuning
Inside `config/settings.py`, you may want to adjust the following parameters (some values are given below).

    FINGERPRINT_REDUCTION = 30
    PEAK_SORT = False
    DEFAULT_OVERLAP_RATIO = 0.4
    DEFAULT_FAN_VALUE = 5
    DEFAULT_AMP_MIN = 10
    PEAK_NEIGHBORHOOD_SIZE = 10

These parameters are described in detail within the file. Read those descriptions to understand the impact of changing these values.
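
As a rough illustration of why `DEFAULT_FAN_VALUE` and `FINGERPRINT_REDUCTION` matter, the sketch below pairs each spectrogram peak with the next few peaks in time and hashes each pair. It is a simplified stand-in for the approach described in the linked article, not the code in this repository: `generate_hashes`, its parameters, and the string format being hashed are assumptions made for illustration.

```python
import hashlib


def generate_hashes(peaks, fan_value=5, reduction=20):
    """Yield (hash, anchor_time) pairs from (frequency_bin, time_bin) peaks.

    Each peak is paired with up to `fan_value` later peaks; each pair is
    hashed with SHA-1 and truncated to `reduction` hex characters.
    """
    peaks = sorted(peaks, key=lambda p: p[1])  # order peaks by time
    for i, (f1, t1) in enumerate(peaks):
        for f2, t2 in peaks[i + 1 : i + 1 + fan_value]:
            key = f"{f1}|{f2}|{t2 - t1}".encode("utf-8")
            yield hashlib.sha1(key).hexdigest()[:reduction], t1
```

In this sketch, a larger fan value produces more hashes per peak (more storage and more chances to match), while a smaller reduction keeps fewer characters per hash (less storage, more collisions).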
## Recognizing
There are two ways to recognize audio using Dejavu. You can recognize by reading and processing files on disk, or through your computer's microphone.
### Recognizing: On Disk
Through the terminal:
```bash
$ python dejavu.py --recognize file sometrack.wav
{'total_time': 2.863781690597534, 'fingerprint_time': 2.4306554794311523, 'query_time': 0.4067542552947998, 'align_time': 0.007731199264526367, 'results': [{'song_id': 1, 'song_name': 'Taylor Swift - Shake It Off', 'input_total_hashes': 76168, 'fingerprinted_hashes_in_db': 4919, 'hashes_matched_in_input': 794, 'input_confidence': 0.01, 'fingerprinted_confidence': 0.16, 'offset': -924, 'offset_seconds': -30.00018, 'file_sha1': b'3DC269DF7B8DB9B30D2604DA80783155912593E8'}, {...}, ...]}
```
or in scripting, assuming you've already instantiated a Dejavu object:
```python
>>> from dejavu.logic.recognizer.file_recognizer import FileRecognizer
>>> song = djv.recognize(FileRecognizer, "va_us_top_40/wav/Mirrors - Justin Timberlake.wav")
```
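
The result can be post-processed directly. The sketch below assumes `djv.recognize` returns a dictionary shaped like the terminal output above; `best_match` and `confidence_ratios` are hypothetical helpers, not part of Dejavu. Ranking by `hashes_matched_in_input` is a choice made for this example, and reading the two confidence fields as ratios is an inference from the sample numbers, not a documented guarantee.

```python
def best_match(recognition):
    """Return the result with the most matched hashes, or None if empty."""
    results = recognition.get("results", [])
    return max(results, key=lambda r: r["hashes_matched_in_input"], default=None)


def confidence_ratios(result):
    """Recompute (input, fingerprinted) confidence as matched-hash ratios."""
    matched = result["hashes_matched_in_input"]
    return (
        round(matched / result["input_total_hashes"], 2),
        round(matched / result["fingerprinted_hashes_in_db"], 2),
    )


# Subset of the fields from the sample terminal output above.
sample = {
    "results": [
        {
            "song_id": 1,
            "song_name": "Taylor Swift - Shake It Off",
            "input_total_hashes": 76168,
            "fingerprinted_hashes_in_db": 4919,
            "hashes_matched_in_input": 794,
            "input_confidence": 0.01,
            "fingerprinted_confidence": 0.16,
        }
    ]
}

match = best_match(sample)
print(match["song_name"], confidence_ratios(match))
```

For the sample entry, the recomputed ratios agree with the reported `input_confidence` and `fingerprinted_confidence` values.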
### Recognizing: Through a Microphone
With scripting:
```python
>>> from dejavu.logic.recognizer.microphone_recognizer import MicrophoneRecognizer
>>> song = djv.recognize(MicrophoneRecognizer, seconds=10) # Defaults to 10 seconds.
```
and with the command line script, you specify the number of seconds to listen:
```bash
$ python dejavu.py --recognize mic 10
```
## Testing
Testing out different parameterizations of the fingerprinting algorithm is often useful as the corpus becomes larger and larger, and inevitable tradeoffs between speed and accuracy come into play.
![Confidence](plots/confidence.png)
Test your Dejavu settings on a corpus of audio files on a number of different metrics:
* Confidence of match (number finger