Audio fingerprinting and recognition in Python.zip (CSDN Library resource)

55 files in total

py: 25

png: 7

mp3: 5

Points required: 5 · 55 views · Uploaded 2024-07-09 09:40:33 · 73.45MB ZIP


Audio fingerprinting and recognition in Python.zip (55 files)

code_resourse

test_dejavu.sh 718B

run_tests.py 6KB

LICENSE.md 1KB

setup.py 2KB

INSTALLATION.md 2KB

example_docker_postgres.py 1KB

docker-compose.yaml 380B

dejavu

__init__.py 10KB

tests

__init__.py 0B

dejavu_test.py 10KB

third_party

__init__.py 0B

wavio.py 14KB

database_handler

__init__.py 0B

mysql_database.py 7KB

postgres_database.py 7KB

logic

__init__.py 0B

fingerprint.py 6KB

decoder.py 3KB

recognizer

__init__.py 0B

file_recognizer.py 991B

microphone_recognizer.py 2KB

config

__init__.py 0B

settings.py 3KB

base_classes

__init__.py 0B

base_recognizer.py 1KB

common_database.py 8KB

base_database.py 6KB

docker

postgres

init.sql 76B

Dockerfile 68B

.gitkeep 0B

python

Dockerfile 283B

mp3

Choc--Eigenvalue-Subspace-Decomposition.mp3 5.58MB

azan_test.wav 40.76MB

Josh-Woodward--I-Want-To-Destroy-Something-Beautiful.mp3 4.23MB

about.txt 116B

The-Lights-Galaxia--While-She-Sleeps.mp3 9.83MB

Brad-Sucks--Total-Breakdown.mp3 2.12MB

Sean-Fournier--Falling-For-You.mp3 4.26MB

dejavu.cnf.SAMPLE 172B

example_script.py 1KB

requirements.txt 122B

test

woodward_43s.wav 1.69MB

sean_secs.wav 8MB

dejavu.py 3KB

plots

blurred_lines_vertical.png 296KB

matching_graph.png 31KB

spectrogram_zoomed.png 405KB

spectrogram_peaks.png 498KB

accuracy.png 26KB

confidence.png 33KB

matching_time.png 40KB

MANIFEST.in 25B

.gitignore 42B

setup.cfg 32B

README.md 15KB

dejavu
==========

Audio fingerprinting and recognition algorithm implemented in Python, see the explanation here:
[How it works](http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/)

Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input or reading from disk, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.

Note: for voice recognition, *Dejavu is not the right tool!* Dejavu excels at recognition of exact signals with reasonable amounts of noise.

## Quickstart with Docker

First, install [Docker](https://docs.docker.com/get-docker/).

```shell
# build and then run our containers
$ docker-compose build
$ docker-compose up -d

# get a shell inside the container
$ docker-compose run python /bin/bash
Starting dejavu_db_1 ... done
root@f9ea95ce5cea:/code# python example_docker_postgres.py
Fingerprinting channel 1/2 for test/woodward_43s.wav
Fingerprinting channel 1/2 for test/sean_secs.wav
...

# connect to the database and poke around
root@f9ea95ce5cea:/code# psql -h db -U postgres dejavu
Password for user postgres:  # type "password", as specified in the docker-compose.yaml !
psql (11.7 (Debian 11.7-0+deb10u1), server 10.7)
Type "help" for help.

dejavu=# \dt
            List of relations
 Schema |     Name     | Type  |  Owner
--------+--------------+-------+----------
 public | fingerprints | table | postgres
 public | songs        | table | postgres
(2 rows)

dejavu=# select * from fingerprints limit 5;
          hash          | song_id | offset |        date_created        |       date_modified
------------------------+---------+--------+----------------------------+----------------------------
 \x71ffcb900d06fe642a18 |       1 |    137 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \xf731d792977330e6cc9f |       1 |    148 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \x71ff24aaeeb55d7b60c4 |       1 |    146 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \x29349c79b317d45a45a8 |       1 |    101 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
 \x5a052144e67d2248ccf4 |       1 |    123 | 2020-06-03 05:14:19.400153 | 2020-06-03 05:14:19.400153
(5 rows)

# then to shut it all down...
$ docker-compose down
```

If you want to be able to use the microphone with the Docker container, you'll need to do a [little extra work](https://stackoverflow.com/questions/43312975/record-sound-on-ubuntu-docker-image). I haven't had the time to write this up, but if anyone wants to make a PR, I'll happily merge.

## Docker alternative on local machine

Follow the instructions in [INSTALLATION.md](INSTALLATION.md).

Next, you'll need to create a MySQL database where Dejavu can store fingerprints. For example, on your local setup:

    $ mysql -u root -p
    Enter password: **********
    mysql> CREATE DATABASE IF NOT EXISTS dejavu;

Now you're ready to start fingerprinting your audio collection!

You may also use Postgres, of course. The same method applies.

## Fingerprinting

Let's say we want to fingerprint all of July 2013's VA US Top 40 hits.

Start by creating a Dejavu object with your configuration settings (Dejavu takes an ordinary Python dictionary for the settings).

```python
>>> from dejavu import Dejavu
>>> config = {
...     "database": {
...         "host": "127.0.0.1",
...         "user": "root",
...         "password": "<password above>",
...         "database": "<name of the database you created above>",
...     }
... }
>>> djv = Dejavu(config)
```

Next, give the `fingerprint_directory` method three arguments:

* input directory to look for audio files
* audio extensions to look for in the input directory
* number of processes (optional)

```python
>>> djv.fingerprint_directory("va_us_top_40/mp3", [".mp3"], 3)
```

For a large number of files, this will take a while. However, Dejavu is robust enough that you can kill and restart it without affecting progress: Dejavu remembers which songs it fingerprinted and converted and which it didn't, and so won't repeat itself.

You'll have a lot of fingerprints once it completes a large folder of mp3s:

```python
>>> print(djv.db.get_num_fingerprints())
5442376
```

Also, any subsequent calls to `fingerprint_file` or `fingerprint_directory` will fingerprint and add those songs to the database as well. It's meant to simulate a system where, as new songs are released, they are fingerprinted and added to the database seamlessly without stopping the system.

## Configuration options

The configuration object passed to the Dejavu constructor must be a dictionary.

The following keys are mandatory:

* `database`, with a value as a dictionary with keys that the database you are using will accept. For example with MySQL, the keys can be anything that the [`MySQLdb.connect()`](http://mysql-python.sourceforge.net/MySQLdb.html) function will accept.

The following keys are optional:

* `fingerprint_limit`: allows you to control how many seconds of each audio file to fingerprint. Leaving out this key, or alternatively using `-1` or `None`, will cause Dejavu to fingerprint the entire audio file. Default value is `None`.
* `database_type`: `mysql` (the default value) and `postgres` are supported. If you'd like to add another subclass for `BaseDatabase` and implement a new type of database, please fork and send a pull request!
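The rules for these keys can be sketched as a small normalizing helper. This is only an illustration of the documented behaviour, not Dejavu's own code: `normalize_config` and `SUPPORTED_DATABASE_TYPES` are hypothetical names introduced here, and Dejavu handles its configuration internally in whatever way its source defines.

```python
from copy import deepcopy

# Hypothetical constant: the two database types the README lists as supported.
SUPPORTED_DATABASE_TYPES = ("mysql", "postgres")


def normalize_config(config):
    """Return a copy of `config` with the documented defaults applied.

    Hypothetical helper for illustration only; it is not part of Dejavu.
    """
    if not isinstance(config, dict):
        raise TypeError("config must be a dictionary")
    # `database` is the only mandatory key and must itself be a dictionary.
    if not isinstance(config.get("database"), dict):
        raise ValueError("'database' is mandatory and must map to a dictionary")

    normalized = deepcopy(config)

    # `database_type` is optional and defaults to "mysql".
    database_type = normalized.setdefault("database_type", "mysql")
    if database_type not in SUPPORTED_DATABASE_TYPES:
        raise ValueError(f"unsupported database_type: {database_type!r}")

    # A missing key, -1, and None all mean "fingerprint the entire file".
    limit = normalized.get("fingerprint_limit")
    normalized["fingerprint_limit"] = None if limit in (None, -1) else limit
    return normalized
```

Running a configuration through such a helper before handing it to `Dejavu(config)` would surface a missing `database` key or a misspelled `database_type` early, before any database connection is attempted.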
An example configuration is as follows:

```python
>>> from dejavu import Dejavu
>>> config = {
...     "database": {
...         "host": "127.0.0.1",
...         "user": "root",
...         "password": "Password123",
...         "database": "dejavu_db",
...     },
...     "database_type": "mysql",
...     "fingerprint_limit": 10
... }
>>> djv = Dejavu(config)
```

## Tuning

Inside `config/settings.py`, you may want to adjust the following parameters (some values are given below).

    FINGERPRINT_REDUCTION = 30
    PEAK_SORT = False
    DEFAULT_OVERLAP_RATIO = 0.4
    DEFAULT_FAN_VALUE = 5
    DEFAULT_AMP_MIN = 10
    PEAK_NEIGHBORHOOD_SIZE = 10

These parameters are described in detail within the file. Read those descriptions to understand the impact of changing these values.

## Recognizing

There are two ways to recognize audio using Dejavu. You can recognize by reading and processing files on disk, or through your computer's microphone.

### Recognizing: On Disk

Through the terminal:

```bash
$ python dejavu.py --recognize file sometrack.wav
{'total_time': 2.863781690597534,
 'fingerprint_time': 2.4306554794311523,
 'query_time': 0.4067542552947998,
 'align_time': 0.007731199264526367,
 'results': [{'song_id': 1,
              'song_name': 'Taylor Swift - Shake It Off',
              'input_total_hashes': 76168,
              'fingerprinted_hashes_in_db': 4919,
              'hashes_matched_in_input': 794,
              'input_confidence': 0.01,
              'fingerprinted_confidence': 0.16,
              'offset': -924,
              'offset_seconds': -30.00018,
              'file_sha1': b'3DC269DF7B8DB9B30D2604DA80783155912593E8'},
             {...}, ...]}
```

or in scripting, assuming you've already instantiated a Dejavu object:

```python
>>> from dejavu.logic.recognizer.file_recognizer import FileRecognizer
>>> song = djv.recognize(FileRecognizer, "va_us_top_40/wav/Mirrors - Justin Timberlake.wav")
```

### Recognizing: Through a Microphone

With scripting:

```python
>>> from dejavu.logic.recognizer.microphone_recognizer import MicrophoneRecognizer
>>> song = djv.recognize(MicrophoneRecognizer, seconds=10)  # Defaults to 10 seconds.
```

and with the command line script, you specify the number of seconds to listen:

```bash
$ python dejavu.py --recognize mic 10
```

## Testing

Testing out different parameterizations of the fingerprinting algorithm is often useful as the corpus becomes larger and larger, and inevitable tradeoffs between speed and accuracy come into play.

![Confidence](plots/confidence.png)

Test your Dejavu settings on a corpus of audio files on a number of different metrics:

* Confidence of match (number finger
