AudioDup

A trivial approach for near-duplicate detection of audios

AudioDup - Near-duplicate Detection of Audios

This repository presents my trivial approach to near-duplicate detection of audio files, based on generating acoustic fingerprints.
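To give a rough idea of what an acoustic fingerprint can look like, here is a minimal sketch in pure Python. It is not the code used in this repository: the function names (`frame_peaks`, `fingerprint`, `similarity`), the frame sizes, and the peak-pair hashing scheme are all assumptions chosen for illustration. It takes the strongest frequency bin of each short frame, hashes pairs of nearby peaks, and compares two recordings by the overlap of their hash sets.

```python
import cmath
import hashlib
import math


def frame_peaks(samples, frame_size=64, hop=32):
    """Return the index of the strongest frequency bin for each frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Apply a Hann window to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive DFT over the positive-frequency bins (DC is skipped).
        best_bin, best_mag = 0, -1.0
        for k in range(1, frame_size // 2):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            if abs(acc) > best_mag:
                best_bin, best_mag = k, abs(acc)
        peaks.append(best_bin)
    return peaks


def fingerprint(samples, fan_out=3):
    """Hash pairs of nearby peaks into a set of short hex strings."""
    peaks = frame_peaks(samples)
    hashes = set()
    for i, f1 in enumerate(peaks):
        for dt in range(1, fan_out + 1):
            if i + dt < len(peaks):
                # Each hash encodes two peak bins and their distance in frames.
                key = f"{f1}|{peaks[i + dt]}|{dt}".encode()
                hashes.add(hashlib.sha1(key).hexdigest()[:10])
    return hashes


def similarity(a, b):
    """Jaccard similarity between two fingerprint sets (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

Because only the position of the strongest bin is hashed, a quieter copy of the same recording should produce the same fingerprint, while unrelated audio should share few hashes. A real system would use an FFT library and several peaks per frame; the naive DFT here only keeps the sketch dependency-free.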

Setup Instructions

We assume that you have access to a computer running macOS. However, you should generally be fine with any Unix/Linux-based system as well.
Make sure you have installed Python 3.7 and the latest version of pipenv.
Install the MySQL connector using brew install mysql-connector-c.
- Fix a potential bug by this.
Install PortAudio and FFmpeg using brew install portaudio && brew install ffmpeg.
Install all dependencies with pipenv install.
Set up a database and user for the program:

CREATE DATABASE dejavu;
CREATE USER 'dejavu'@'localhost' IDENTIFIED BY 'dejavu';
GRANT ALL PRIVILEGES ON dejavu.* TO 'dejavu'@'localhost';

To Run the Program

Collect fingerprints by running pipenv shell python3 collect.py.
Recognize sound from the microphone by running pipenv shell python3 recognize.py.

Testing

We use the FMA Dataset for testing. To save time and disk space, you do not have to download the whole dataset.
Put the files you downloaded into the data folder.
Run pipenv shell python3 collect.py to collect all fingerprints.
Run pipenv shell python3 test.py to collect test results.
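If you have downloaded more of the dataset than you want to fingerprint, one way to work with only a portion is to copy a random sample into the data folder. The helper below is a hypothetical sketch, not part of this repository: the name `copy_subset`, the `*.mp3` pattern, and the flat layout of the target folder are assumptions, so adjust them to whatever collect.py actually expects.

```python
import random
import shutil
from pathlib import Path


def copy_subset(source_dir, data_dir, count, seed=0, pattern="*.mp3"):
    """Copy a reproducible random sample of audio files into data_dir."""
    # Sort first so the same seed always selects the same files.
    files = sorted(Path(source_dir).rglob(pattern))
    rng = random.Random(seed)
    chosen = rng.sample(files, min(count, len(files)))

    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in chosen:
        # Files are copied flat; this assumes file names are unique.
        destination = target / path.name
        shutil.copy2(path, destination)
        copied.append(destination)
    return copied
```

For example, `copy_subset("downloads", "data", 100)` would copy up to 100 tracks from a hypothetical downloads folder. Fixing the seed keeps the sample stable between runs, which makes test results easier to compare.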

Licence

GNU General Public Licence 3.0