I want to fine-tune the audio subnetwork for my own audio classification problem. To do this, I plan to use the `_construct_linear_audio_network`, `_construct_mel128_audio_network`, and `_construct_mel256_audio_network` functions to load the pre-trained Keras model and then append one or more fully-connected layers to perform the classification.
However, I don't understand the input shape of these models. According to `models.py`, the input shape is `input_shape = (1, asr * audio_window_dur)`, where `asr = 48000` and `audio_window_dur = 1`. What is `asr`, and why does it have that value? Can you please provide an example of using the Keras model on a `.wav` file?
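To make the shape question concrete, here is a minimal sketch of what `(1, asr * audio_window_dur)` would imply for a batch of inputs. It rests on assumptions that the source does not confirm: that `asr` is the audio sample rate in Hz, that `audio_window_dur` is the window length in seconds, and that the leading `1` is a single (mono) channel. The helper `frame_audio` and its `hop_dur` parameter are hypothetical names used only for illustration; they are not part of the library.

```python
import numpy as np

ASR = 48000           # value from models.py; ASSUMED to be the sample rate in Hz
AUDIO_WINDOW_DUR = 1  # value from models.py; ASSUMED to be the window length in seconds


def frame_audio(audio, hop_dur=0.5):
    """Split a mono 1-D signal into windows shaped (n_frames, 1, ASR * AUDIO_WINDOW_DUR).

    hop_dur (seconds between window starts) is a hypothetical parameter.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim != 1:
        raise ValueError("expected a mono, 1-D signal")

    win = int(ASR * AUDIO_WINDOW_DUR)  # 48000 samples per window
    hop = int(ASR * hop_dur)           # samples between window starts

    # Zero-pad clips shorter than one window so that one frame is always produced.
    if len(audio) < win:
        audio = np.pad(audio, (0, win - len(audio)))

    n_frames = 1 + (len(audio) - win) // hop
    frames = np.stack([audio[i * hop:i * hop + win] for i in range(n_frames)])

    # Insert the channel axis so each example matches input_shape = (1, 48000).
    return frames[:, np.newaxis, :]


# Two seconds of silence, standing in for samples decoded from a .wav file.
batch = frame_audio(np.zeros(2 * ASR))
print(batch.shape)  # (3, 1, 48000)
```

Under these assumptions, an array like `batch` would have the per-example shape the models expect. Reading the `.wav` file, downmixing to mono, and resampling to 48 kHz are deliberately left out of the sketch.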
I really appreciate any help you can provide.