extend SimpleCutSampler to work better with CutConcatenate #1520
KarelVesely84
commented
Sep 29, 2025
- by default the batch size is computed as (num_cuts x longest_dur); however, when cuts are concatenated it should be sum(utt_durs)
- keeping the default behavior as it was before
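
For illustration, the two ways of counting a batch's duration described above can be sketched as follows. This is a minimal standalone sketch, not the code from this PR: the function names `padded_batch_duration` and `concatenated_batch_duration` are hypothetical, and any gap inserted between concatenated cuts is ignored.

```python
from typing import Sequence


def padded_batch_duration(durations: Sequence[float]) -> float:
    """Default accounting: every cut is padded to the longest cut in the batch."""
    if not durations:
        return 0.0
    return len(durations) * max(durations)


def concatenated_batch_duration(durations: Sequence[float]) -> float:
    """Accounting for concatenated cuts: total audio is the sum of cut durations.

    Assumption: no gap/padding is added between concatenated cuts.
    """
    return float(sum(durations))


durations = [2.0, 3.0, 10.0]  # hypothetical utterance durations in seconds
print(padded_batch_duration(durations))        # 3 cuts x 10.0 s = 30.0
print(concatenated_batch_duration(durations))  # 2.0 + 3.0 + 10.0 = 15.0
```

With one long utterance in the batch, the padded estimate is dominated by that utterance, whereas the summed estimate tracks the audio that is actually present after concatenation.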
But in your implementation it's currently not
Actually the Should I rename the I used that code, and waiting for the 200s long utterance would most likely cause GPU OOM when training. OK, a unit test is needed...
b6feafc to 279d07e
unit test created, also wrote docstrings in TimeConstraint for the new feature right now,
@pzelasko how about the other samplers, should I add it to some of them too? List of samplers, for reference:
Hi, is it fine done this way?

Hi Karel,
- by default the batch size is computed as (num_cuts x longest_dur); however, when cuts are concatenated it should be sum(utt_durs) - keeping the default behavior as it was before
552744b to f23f0f1
…e_to_exceeding()` method - for `concatenate_cuts=True` the behavior of the `exceeded()` and `close_to_exceeding()` methods becomes the same - `close_to_exceeding()` seems to be used to decide whether the last (incomplete) batch of an epoch should be used for training or discarded.
Hi Piotr, Consequently, for From the code K.