The software is an open source toolkit written in Python for Voice Activity Detection (VAD) in natural soundscapes and for data anonymisation. It is built around our own training pipeline, ecoVAD, which was developed in PyTorch, but we also provide wrappers around existing state-of-the-art VAD models to make anonymisation of data more accessible.
Feel free to use ecoVAD for your acoustic analyses and research. If you do, please cite as:
Cretois, B., Rosten, C. M., & Sethi, S. S. (2022). Voice activity detection in eco-acoustic data enables privacy protection and is a proxy for human disturbance. Methods in Ecology and Evolution, 00, 1–10. https://doi.org/10.1111/2041-210X.14005
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This repository contains all the tools necessary to train a deep learning model for Voice Activity Detection (VAD) from scratch, but also to use existing ones (namely pyannote, webrtcvad, and our own model trained using the ecoVAD pipeline).
If you want to test our pipeline you do not need any dataset: we provide some demo files on OSF at this link: https://osf.io/f4mt5/ so that you can try the pipeline yourself!
💡 Note that ecoVAD's model weights are also in the OSF folder and you will need to download them if you wish to use our ecoVAD model.
Nevertheless, if you want to train a realistic model from scratch you will need your own soundscape dataset, a human speech dataset (in our analysis we used LibriSpeech) and a background noise dataset (in our analysis we used both ESC50 and BirdCLEF).
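If you are assembling your own datasets, it can help to check that the audio folders are in place before launching a training run. The sketch below uses purely illustrative folder names (they are not the pipeline's defaults); point the paths at wherever you keep your soundscape, speech, and background noise recordings.

```python
# Minimal sanity check before training. The folder names below are
# illustrative only: adjust them to your own dataset locations.
from pathlib import Path

dataset_dirs = {
    "soundscapes": Path("data/soundscapes"),
    "speech (e.g. LibriSpeech)": Path("data/speech"),
    "background noise (e.g. ESC50 / BirdCLEF)": Path("data/noise"),
}

for name, folder in dataset_dirs.items():
    if not folder.is_dir():
        print(f"{name}: folder {folder} not found")
        continue
    # Count the audio files found recursively in each dataset folder.
    files = list(folder.rglob("*.wav")) + list(folder.rglob("*.flac"))
    print(f"{name}: {len(files)} audio files in {folder}")
```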
This code has been tested on Ubuntu 18.04 LTS and Windows 10 but should work on other platforms as well. Only Python 3.8 is officially supported, though the code may work with other Python versions.
- Clone the repository:
git clone https://github.com/NINAnor/ecoVAD
- Install requirements:
We use poetry as a package manager which can be installed with the instructions below:
cd ecoVAD
pip install poetry
poetry install --no-root
- Pydub and Librosa require an audio backend (FFmpeg):
sudo apt-get install ffmpeg
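As an optional sanity check (not part of the official instructions), you can verify from inside the poetry environment that FFmpeg is on the PATH and that the main audio libraries import cleanly:

```python
# Optional environment check: run with `poetry run python` and paste this in,
# or save it to a throwaway script (the file name is arbitrary).
import shutil

# FFmpeg must be discoverable on the PATH for Pydub/Librosa to decode audio.
print("ffmpeg found at:", shutil.which("ffmpeg"))

import librosa
import pydub

print("librosa version:", librosa.__version__)
print("pydub imported OK")
```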
First you need to have docker installed on your machine. Please follow the guidelines from the official documentation.
- Clone the repository:
git clone https://github.com/NINAnor/ecoVAD
- Build the image
cd ecoVAD
docker build -t ecovad -f Dockerfile .
To be able to run the pipeline with the demo data and to get the weights of the model we used in our analysis, it is necessary to download the assets folder located on OSF: https://osf.io/f4mt5/.
➡️ Just go to the link, click on assets.zip and click on download.
Now, simply unzip and place assets in the ecoVAD folder.
You are now set up to run our ecoVAD pipeline!
Our repository provides the scripts and instructions needed both to train a VAD model and to use existing ones. If you are only interested in making predictions using an existing model, please refer to the section on detecting human speech.
💡 Note that we recommend using the ecoVAD pipeline if you have a dataset large enough to train the model; otherwise pyannote is a very good alternative.
Please note that for all the steps below, we provide a Jupyter notebook in notebooks so that it is possible to understand and run the scripts step by step.
If you are using your own dataset and wish to have more control over the training and inference pipeline, please make sure to change the parameters in config_training.yaml and config_inference.yaml.
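If you want to see which parameters are available before editing, a quick way (assuming PyYAML is installed in the environment, which the poetry setup provides for reading the configs) is to load each file and print its contents:

```python
# Print the parameters of the training and inference configs before editing.
# Assumes PyYAML is available in the environment.
import yaml

for config_path in ("config_training.yaml", "config_inference.yaml"):
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    print(f"--- {config_path} ---")
    for key, value in cfg.items():
        print(f"{key}: {value}")
```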
To generate the synthetic dataset and train the model simply run:
poetry run python train_ecovad.py
Or alternatively, if you have Docker installed and the Docker image built:
docker run --rm -v $PWD/:/app ecovad python train_ecovad.py
👉 For this step you do not need to have trained your own VAD model; you can use existing models instead. Just make sure you specify the model you prefer to use in config_inference.yaml.
You can run the anonymisation script using:
poetry run python anonymise_data.py
Or alternatively, if you have Docker installed and the Docker image built:
docker run --rm -v $PWD/:/app ecovad python anonymise_data.py
The anonymisation script will by default output .json files that contain all the detections made by the models (the default output folder is ./assets/demo_data/detections/json/ecoVAD). These detections are then used to anonymise the data (the anonymised data are by default written to ./assets/demo_data/anonymised_data).
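If you want a quick look at what was detected before extracting or listening to segments, you can load the JSON files directly. The sketch below simply prints each file's contents, since the exact structure is whatever the anonymisation script writes:

```python
# Inspect the detection files written by the anonymisation script.
# The JSON structure is whatever anonymise_data.py produces; this just
# loads each file and prints it for a quick look.
import json
from pathlib import Path

detection_dir = Path("./assets/demo_data/detections/json/ecoVAD")
for json_file in sorted(detection_dir.glob("*.json")):
    with open(json_file) as f:
        detections = json.load(f)
    print(json_file.name, "->", detections)
```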
You can run the extract segment script using:
poetry run python extract_detection.py
Or alternatively, if you have Docker installed and the Docker image built:
docker run --rm -v $PWD/:/app ecovad python extract_detection.py
Note that you can choose the number of sampled detections in config_inference.yaml.
If you come across any issues with the ecoVAD pipeline, please open an issue.
For other inquiries, you can contact me at [email protected].