
A Deep Convolutional Neural Network (DCNN) designed for the task of localizing human speech to 168 location classes using binaural microphone inputs.


ghunkins/Binaural-Source-Localization-CNN


Binaural-Source-Localization-CNN

Basic Information


Author: Gregory Hunkins

Organization: University of Rochester

License: MIT

Abstract: A Convolutional Neural Network (CNN) classification system was designed for source localization of human voices in 3-D space. A new dataset, VoiceBin100K, is introduced for this task and for future work in the field. The CNN takes variable-length binaural short-time Fourier transform (STFT) magnitude and phase features as input and predicts the location of the speaker's voice as one of 168 location classes.
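To make the input representation concrete, here is a minimal sketch of how binaural STFT magnitude and phase features could be computed with NumPy. It is not taken from this repository: the function names `stft` and `binaural_features`, the frame length, hop size, Hann window, sample rate, and the channel ordering are all assumptions chosen for illustration, and the project's actual preprocessing may differ.

```python
import numpy as np


def stft(signal, frame_len=512, hop=256):
    """Complex STFT of a 1-D signal, shape (n_frames, frame_len // 2 + 1).

    Hypothetical helper: frame_len, hop, and the Hann window are assumed values.
    Requires len(signal) >= frame_len.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def binaural_features(left, right, frame_len=512, hop=256):
    """Stack per-ear magnitude and phase into shape (4, n_frames, n_bins).

    Assumed channel order: left magnitude, left phase,
    right magnitude, right phase.
    """
    channels = []
    for ear in (left, right):
        spec = stft(ear, frame_len, hop)
        channels.append(np.abs(spec))    # magnitude
        channels.append(np.angle(spec))  # phase in [-pi, pi]
    return np.stack(channels)


# Synthetic one-second example: the right ear hears a delayed, quieter copy.
sample_rate = 16000  # assumed sample rate
t = np.arange(sample_rate) / sample_rate
left = np.sin(2 * np.pi * 440 * t)
right = 0.8 * np.sin(2 * np.pi * 440 * (t - 0.0005))

features = binaural_features(left, right)
print(features.shape)  # (4, 61, 257)
```

The number of frames grows with the recording length, which is what makes the features variable-length; a classifier consuming them would still need some way to handle that dimension (for example, pooling over time), which this sketch does not cover.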

Running The Code


Reference: https://cs.rochester.edu/~cxu22/t/577F17/bluehive_tutorial.html

Data


Please contact [email protected] for access to the data. A public link will be available shortly.
