
Generating audio while training the provided default model and default arguments #29

@mukul74

Description


Hello @relativeflux, thanks for reviving SampleRNN in TensorFlow.
I have a question about generating audio from a model trained on a single 8-second clip, just as a quick sanity check; validation also uses just that one audio file.

--data_dir ./chunks --num_epochs 100 --batch_size 1 --max_checkpoints 1 --checkpoint_every 10 --output_file_dur 10 --sample_rate 11025

Audio sample rate: 11025
I trained the model for around 40 epochs; training accuracy reaches 100% while validation accuracy sits at 4.132, as expected for a model overfitting a single file.
For reference:
Epoch: 40/100, Step: 82/86, Loss: 0.000, Accuracy: 100.000, (0.440 sec/step)
Epoch: 40/100, Step: 83/86, Loss: 0.000, Accuracy: 100.000, (0.449 sec/step)
Epoch: 40/100, Step: 84/86, Loss: 0.000, Accuracy: 100.000, (0.438 sec/step)
Epoch: 40/100, Step: 85/86, Loss: 0.000, Accuracy: 100.000, (0.434 sec/step)
Epoch: 40/100, Step: 86/86, Loss: 0.000, Accuracy: 100.000, (0.437 sec/step)

Epoch: 40/100, Total Steps: 86, Loss: 0.000, Accuracy: 100.000, Val Loss: 13.038, Val Accuracy: 4.132 (1 min 0.427 sec)

But when I listen to the audio generated from this checkpoint (sampled for 10 seconds), I hear only a short fragment of signal, mostly corrupted by noise and nothing else. If the model is overfitting this badly, I would have expected the generated audio to reproduce the training data exactly, or at least something very similar.
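For what it's worth, a model can reach 100% teacher-forced training accuracy and still free-run into noise: at generation time each sampled output is fed back as the next input, so small per-step errors compound (exposure bias). A minimal sketch of that effect, using an illustrative lookup-table "model" and made-up sizes (256 quantization levels, a toy sine waveform) standing in for the real network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "overfit" autoregressive model: it has memorized a training waveform
# as a next-sample lookup table over 256 quantization levels. All names and
# sizes here are illustrative assumptions, not the repo's actual code.
Q = 256
train = (128 + 100 * np.sin(np.linspace(0, 8 * np.pi, 400))).astype(int)

# next_level[q] = the training sample that followed level q
# (unseen levels just map to themselves).
next_level = np.arange(Q)
for a, b in zip(train[:-1], train[1:]):
    next_level[a] = b

def generate(steps, noise_p):
    """Free-running generation: feed each output back as the next input.
    With probability noise_p the sampler picks a near-miss level, mimicking
    the randomness of sampling from a softmax instead of taking the argmax."""
    x = int(train[0])
    out = []
    for _ in range(steps):
        x = int(next_level[x])            # the memorized next sample
        if rng.random() < noise_p:
            # small sampling error, fed back into the next step
            x = int(np.clip(x + rng.choice([-2, -1, 1, 2]), 0, Q - 1))
        out.append(x)
    return np.array(out)

clean = generate(399, noise_p=0.0)   # deterministic playback of the lookup
noisy = generate(399, noise_p=0.05)  # occasional errors compound over time
```

Even a 5% chance of a near-miss per step is enough to push the noisy trajectory off the memorized one, after which it never recovers, which may be part of what you are hearing on top of any sampling-temperature effects.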

Just wanted to ask: am I doing something wrong, or is this an expected result?
