fix: Fix pyav cannot access time_base as a decoder when converting dataset from v2.1 to v3 #2115
jackvial wants to merge 2 commits into huggingface:main
Possibly related to using an older version of torchcodec (0.21); will test again after updating to 0.5.
Hi @jackvial, thanks for testing our newest Dataset v3 tools ;) I dug into this issue and went down a rabbit hole (as always with PyAV and ffmpeg). That being said, I tried without this extra care for Thanks again for your help! Best, Caroline.
michel-aractingi left a comment
Thanks @jackvial! I tested your fix, and it works with the av 14.* versions that were causing the time_base issue. Approved.
@CarolinePascal @michel-aractingi Thank you. I tested converting a dataset on the latest main branch, which includes the recent change to video_utils.py that removes setting time_base, and didn't have any problems. So I think I can close this issue?
Thank you @jackvial, this fix works for me as well. @michel-aractingi, can we get the PR merged?
Hey @lukicdarkoo, which version of pyav are you using? If you're on a version lower than 0.5, can you try recreating your virtual environment and see if that resolves the problems you're seeing?
@jackvial Upgrading torchcodec also seems to work.
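For anyone comparing environments in this thread, here is a minimal sketch for printing the installed versions of the two packages discussed. It assumes the PyPI distribution names are `av` (for PyAV) and `torchcodec`; the helper name `installed_version` is hypothetical and not part of either library.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumed distribution names: "av" for PyAV, "torchcodec" for torchcodec.
for name in ("av", "torchcodec"):
    print(name, installed_version(name))
```

A result of `None` means the package is not installed in the active environment, which is worth ruling out before recreating the virtual environment.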
Hi everyone, it seems we forgot to close this PR. The issue was finally solved by enforcing Best, Caroline.
What this does
Fixes a bug when converting a dataset from v2.1 to v3: when a video file is already in AV1 format, pyav is not made to switch the codec context to is_encoder=True.
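As an illustration only, one way to avoid touching time_base on a decoder context is to check `is_encoder` first. This is a sketch under that assumption and not necessarily the change made in this PR; the helper name `set_time_base_if_encoder` is hypothetical.

```python
from fractions import Fraction


def set_time_base_if_encoder(codec_context, time_base):
    """Assign time_base only when the context reports itself as an encoder.

    Returns True if the value was assigned, False if it was skipped.
    """
    # Decoder contexts refuse time_base access, so skip them entirely.
    if getattr(codec_context, "is_encoder", False):
        codec_context.time_base = time_base
        return True
    return False
```

With a real PyAV stream, this would be called as something like `set_time_base_if_encoder(stream.codec_context, Fraction(1, fps))`, where `stream` and `fps` come from the calling code.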
Bug example
Example of a file that triggers the error. The file is already in AV1 format, so the pyav codec context doesn't switch to encoder mode?
How it was tested
Conversion now succeeds for the same dataset after making the code change.
The successfully converted dataset: https://huggingface.co/datasets/jackvial/screwdriver_panel_center_080225_15_e5/tree/main