Official implementation of
- Adaptive Convolutional Neural Network for Text-Independent Speaker Recognition
by Seong-Hu Kim, Yong-Hwa Park @ Human Lab, Mechanical Engineering Department, KAIST
Accepted paper at Interspeech 2021.
This code was written mainly with reference to the baseline code.
We apply two scaling maps, one for the frequency axis and one for the time axis, to the adaptive kernel in the ACNN module. The adaptive kernel is created by element-wise multiplication of each output channel of the content-invariant kernel with the scaling matrix formed from these two maps. The structure of the proposed ACNN module for speaker recognition is shown below.
This module is applied to VGG-M and ResNet for text-independent speaker recognition.
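For concreteness, here is a minimal PyTorch sketch of this mechanism: a shared content-invariant kernel is rescaled per output channel by frequency- and time-axis scaling maps predicted from the input. The pooling choice, sigmoid gating, and the small prediction layers (`freq_scale`, `time_scale`) are illustrative assumptions, not the exact implementation in this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv2d(nn.Module):
    """Sketch of an ACNN-style layer: the content-invariant kernel is
    rescaled per sample and per output channel by two axis-wise scaling
    maps (frequency and time). Layer sizes are illustrative only."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.k = kernel_size
        # Content-invariant kernel, shared across all inputs.
        self.weight = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        # Small heads that predict per-output-channel scaling vectors
        # along the frequency and time axes from pooled global context.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.freq_scale = nn.Linear(in_ch, out_ch * kernel_size)
        self.time_scale = nn.Linear(in_ch, out_ch * kernel_size)

    def forward(self, x):                        # x: (B, C, freq, time)
        b, c, _, _ = x.shape
        ctx = self.pool(x).flatten(1)            # (B, C) global context
        sf = torch.sigmoid(self.freq_scale(ctx)).view(b, -1, 1, self.k, 1)
        st = torch.sigmoid(self.time_scale(ctx)).view(b, -1, 1, 1, self.k)
        # Scaling matrix = outer product of the two axis-wise maps;
        # adaptive kernel = element-wise product with the shared kernel.
        kernel = self.weight.unsqueeze(0) * sf * st   # (B, O, C, k, k)
        # Grouped-conv trick: apply a different kernel to each sample.
        x = x.view(1, b * c, *x.shape[2:])
        kernel = kernel.reshape(-1, c, self.k, self.k)
        out = F.conv2d(x, kernel, padding=self.k // 2, groups=b)
        return out.view(b, -1, *out.shape[2:]) + self.bias.view(1, -1, 1, 1)

if __name__ == "__main__":
    layer = AdaptiveConv2d(in_ch=1, out_ch=16)
    spec = torch.randn(2, 1, 40, 100)            # (batch, ch, freq, time)
    print(layer(spec).shape)                     # torch.Size([2, 16, 40, 100])
```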
- pytorch >= 1.4.0
- torchaudio >= 0.4.0
- numpy >= 1.18
We used the VoxCeleb1 dataset in this paper. You can download the dataset by referring to the VoxCeleb site. All data should be gathered in one folder, and the dataset directories should be set in 'train_model.yaml'.
You can train the model and save it in the 'exps' folder by running:
python train_model.py
You need to adjust the training parameters in 'train_model.yaml' before training.
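Before launching a run, you can load and inspect the configuration; this is only a convenience sketch, and the actual key names are whatever 'train_model.yaml' defines.

```python
# Minimal sketch: print the training configuration before running.
# Requires PyYAML (pip install pyyaml). The key names in the comment
# below are guesses; check train_model.yaml for the real schema.
import yaml

with open('train_model.yaml') as f:
    cfg = yaml.safe_load(f)

# Dataset directories might appear under keys such as 'train_path'
# or 'test_path' -- verify against the file itself.
for key, value in cfg.items():
    print(f'{key}: {value}')
```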
Network | Top-1 (%) | Top-5 (%) | EER (%) | C_det |
---|---|---|---|---|
Adaptive VGG-M (N=18) | 86.51 | 95.31 | 5.68 | 0.510 |
Adaptive ResNet18 (N=18) | 85.84 | 95.29 | 6.18 | 0.589 |
Pretrained models are provided in the 'pretrained_model' folder. Separate example code for verification with these models is not provided.
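As a starting point, the sketch below shows one common way to run verification with a saved checkpoint: extract embeddings for two utterances and score them by cosine similarity. The encoder class, checkpoint path, and feature shapes are placeholders, not this repo's actual API; substitute the real network and a checkpoint from 'pretrained_model'.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in encoder; replace with the actual model class from this repo.
class DummyEncoder(nn.Module):
    def __init__(self, emb_dim=512):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                  nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(16, emb_dim)

    def forward(self, x):                        # x: (B, 1, freq, time)
        return self.fc(self.conv(x).flatten(1))  # (B, emb_dim)

model = DummyEncoder()
# Hypothetical checkpoint name -- use the real file in 'pretrained_model':
# model.load_state_dict(torch.load('pretrained_model/<checkpoint>.pt',
#                                  map_location='cpu'))
model.eval()

with torch.no_grad():
    emb_a = model(torch.randn(1, 1, 40, 300))    # enrollment utterance
    emb_b = model(torch.randn(1, 1, 40, 300))    # test utterance
    score = F.cosine_similarity(emb_a, emb_b, dim=-1)  # higher = same speaker
print(float(score))
```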
@inproceedings{kim21_interspeech,
author={Seong-Hu Kim and Yong-Hwa Park},
title={{Adaptive Convolutional Neural Network for Text-Independent Speaker Recognition}},
year=2021,
booktitle={Proc. Interspeech 2021},
pages={66--70},
doi={10.21437/Interspeech.2021-65}
}
Please contact Seong-Hu Kim at [email protected] with any questions.