
PyTorch port of Google Research's VGGish model, used for extracting audio features.


sberbank-ai-lab/torchvggish

 
 


VGGish

A PyTorch-compatible port of VGGish[1], a feature embedding frontend for audio classification models. The weights are ported directly from the TensorFlow model, so embeddings created using torchvggish will be identical to those produced by the original model.

Usage

import urllib.request

import torch

# Load the pretrained VGGish model from torch.hub and switch to inference mode
model = torch.hub.load('harritaylor/torchvggish', 'vggish')
model.eval()

# Download an example audio file
url, filename = ("http://soundbible.com/grab.php?id=1698&type=wav", "bus_chatter.wav")
urllib.request.urlretrieve(url, filename)

# Compute embeddings for the downloaded audio file
embeddings = model.forward(filename)
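
To get a feel for how much output to expect from a clip, the sketch below estimates the number of embeddings from its duration. It assumes the framing defaults of the original TensorFlow VGGish release (16 kHz audio, a 25 ms STFT window with a 10 ms hop, and non-overlapping examples of 96 spectrogram frames, about 0.96 s each). These constants and the helper name `expected_num_embeddings` are illustrative assumptions; they are not read from torchvggish, so the actual count may differ if the port is configured differently.

```python
# Framing constants assumed from the original TensorFlow VGGish release;
# they are not read from torchvggish itself.
SAMPLE_RATE_HZ = 16000
STFT_WINDOW_SECONDS = 0.025
STFT_HOP_SECONDS = 0.010
EXAMPLE_WINDOW_FRAMES = 96  # 0.96 s worth of 10 ms spectrogram frames
EXAMPLE_HOP_FRAMES = 96     # examples do not overlap


def expected_num_embeddings(duration_seconds: float) -> int:
    """Estimate how many embeddings a clip yields (hypothetical helper)."""
    num_samples = int(round(duration_seconds * SAMPLE_RATE_HZ))
    window = int(round(STFT_WINDOW_SECONDS * SAMPLE_RATE_HZ))
    hop = int(round(STFT_HOP_SECONDS * SAMPLE_RATE_HZ))
    # Number of complete STFT frames that fit in the waveform.
    num_frames = max(0, 1 + (num_samples - window) // hop)
    # Number of complete 96-frame examples that fit in the spectrogram.
    return max(0, 1 + (num_frames - EXAMPLE_WINDOW_FRAMES) // EXAMPLE_HOP_FRAMES)


if __name__ == "__main__":
    for seconds in (0.5, 1.0, 10.0):
        print(seconds, expected_num_embeddings(seconds))
```

Under these assumptions a clip shorter than about 0.96 s yields no complete example, which is worth checking before passing very short files to the model.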

[1] S. Hershey et al., "CNN Architectures for Large-Scale Audio Classification," in International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017. Available: https://arxiv.org/abs/1609.09430, https://ai.google/research/pubs/pub45611
