Timeline misalignment in PPG extraction for audio with silence #18

@taisa1

I noticed that the PPG extractor sometimes produces misaligned output when there is silence before or after the audio section. I'm using the latest version installed via pip.

Does anyone have insights into what might be causing this issue?

Observations

The input audio waveform:
[image]
I processed this audio and got the following PPG output:
[image: debug01]
However, the PPG output does not align with the actual audio section, which starts at frame 85. The problem is not limited to the beginning; the whole output seems out of sync with the timeline.
The Mel spectrogram aligns correctly, so the issue might be with the model:
[image: debug01_mel_spectrogram]
To investigate, I cut the audio to the frame range [65, 293] and got this PPG output:
[image: debug01_cut]
This output aligns with the input audio and seems more accurate than the output for the uncut version.
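
For anyone trying to reproduce the cut, here is a minimal sketch of how a frame range can be mapped back to sample indices, and how the first non-silent frame can be estimated. It assumes a hop size of 160 samples per frame (10 ms at 16 kHz), which may not match the actual ppgs configuration, and it treats the frame range as half-open. The function names and the energy threshold are hypothetical helpers, not part of ppgs.

```python
def frames_to_samples(start_frame, end_frame, hopsize=160):
    """Map a half-open frame range [start_frame, end_frame) to sample indices.

    hopsize=160 is an assumption (10 ms at 16 kHz), not a confirmed ppgs value.
    """
    if start_frame < 0 or end_frame < start_frame:
        raise ValueError("expected 0 <= start_frame <= end_frame")
    return start_frame * hopsize, end_frame * hopsize


def cut_to_frames(samples, start_frame, end_frame, hopsize=160):
    """Return the part of the waveform covered by the given frame range."""
    start, end = frames_to_samples(start_frame, end_frame, hopsize)
    return samples[start:end]


def first_active_frame(samples, hopsize=160, threshold=1e-3):
    """Return the index of the first frame whose mean absolute amplitude
    exceeds threshold, or None if every frame is below it.

    The threshold is an arbitrary illustrative value.
    """
    for frame, start in enumerate(range(0, len(samples), hopsize)):
        chunk = samples[start:start + hopsize]
        if sum(abs(x) for x in chunk) / len(chunk) > threshold:
            return frame
    return None


if __name__ == "__main__":
    # Synthetic waveform: 85 silent frames followed by 300 non-silent frames.
    waveform = [0.0] * (85 * 160) + [0.5] * (300 * 160)
    print(first_active_frame(waveform))
    print(frames_to_samples(65, 293))
    print(len(cut_to_frames(waveform, 65, 293)))
```

Comparing the frame returned by `first_active_frame` with the frame where the PPG output first becomes non-silent gives a rough measure of the offset described above.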

Test Data

Environment

  • Python: 3.10.12
  • ppgs: 0.0.8
