jackvial
Contributor

@jackvial jackvial commented Oct 4, 2025

What this does

Fixes a bug when converting a dataset from v2.1 to v3.0: when a video file is already in AV1 format, PyAV opens its codec context as a decoder (it does not switch to is_encoder=True), so setting time_base fails.

Bug example

➤ python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=jackvial/screwdriver_panel_center_080225_15_e5
Trying to download v3.0 version of the dataset from the hub...
Dataset does not have an uploaded v3.0 version. Continuing with conversion.
Using local dataset at /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5
INFO 2025-10-04 12:34:54 1_to_v30.py:438 Converting info from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
INFO 2025-10-04 12:34:54 1_to_v30.py:169 Converting tasks from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
INFO 2025-10-04 12:34:54 1_to_v30.py:212 Converting data files from 5 episodes
convert data files: 100%|███████████████████████████████| 5/5 [00:00<00:00, 568.77it/s]
INFO 2025-10-04 12:34:54 1_to_v30.py:264 Converting videos from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
convert videos of observation.images.screwdriver: 100%|| 5/5 [00:00<00:00, 549.91it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/jack/code/lerobot/src/lerobot/datasets/v30/convert_dataset_v21_to_v30.py", line 571, in <module>
    convert_dataset(**vars(args))
  File "/home/jack/code/lerobot/src/lerobot/datasets/v30/convert_dataset_v21_to_v30.py", line 500, in convert_dataset
    episodes_videos_metadata = convert_videos(root, new_root, video_file_size_in_mb)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jack/code/lerobot/src/lerobot/datasets/v30/convert_dataset_v21_to_v30.py", line 274, in convert_videos
    eps_metadata = convert_videos_of_camera(root, new_root, camera, video_file_size_in_mb)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jack/code/lerobot/src/lerobot/datasets/v30/convert_dataset_v21_to_v30.py", line 355, in convert_videos_of_camera
    concatenate_video_files(
  File "/home/jack/code/lerobot/src/lerobot/datasets/video_utils.py", line 456, in concatenate_video_files
    ].time_base = (
      ^^^^^^^^^
  File "av/stream.pyx", line 127, in av.stream.Stream.__setattr__
  File "av/codec/context.pyx", line 541, in av.codec.context.CodecContext.time_base.__set__
RuntimeError: Cannot access 'time_base' as a decoder

Example of a file that triggers the error. The file is already in AV1 format, so perhaps the PyAV codec context doesn't switch to encoder mode?

Mac-mini:Downloads jackvial$ ffprobe -hide_banner episode_000000.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'episode_000000.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomav01iso2mp41
    encoder         : Lavf61.7.100
  Duration: 00:00:07.60, start: 0.000000, bitrate: 4661 kb/s
  Stream #0:0[0x1](und): Video: av1 (libdav1d) (Main) (av01 / 0x31307661), yuv420p(tv), 800x600, 4659 kb/s, 30 fps, 30 tbr, 15360 tbn (default)
      Metadata:
        handler_name    : VideoHandler
        vendor_id       : [0][0][0][0]
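
To check ahead of time which episode files are already AV1-encoded, the same codec information can be read programmatically. The following is a minimal sketch, assuming `ffprobe` is available on the PATH; the helper names `parse_video_stream_info` and `probe_video` are hypothetical and are not part of lerobot.

```python
import json
import subprocess


def parse_video_stream_info(ffprobe_json: str) -> dict:
    """Extract codec name and time base of the first listed stream from ffprobe JSON."""
    streams = json.loads(ffprobe_json).get("streams", [])
    if not streams:
        raise ValueError("no video stream found in ffprobe output")
    first = streams[0]
    return {
        "codec_name": first.get("codec_name"),
        "time_base": first.get("time_base"),
    }


def probe_video(path: str) -> dict:
    """Run ffprobe on `path` and return info about its first video stream.

    Hypothetical helper: assumes the `ffprobe` binary is installed.
    """
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",  # first video stream only
        "-show_entries", "stream=codec_name,time_base",
        "-of", "json",
        path,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_video_stream_info(result.stdout)
```

For a file like the one above, `probe_video("episode_000000.mp4")["codec_name"]` would be expected to report the AV1 codec, which is the case that triggered the error.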

How it was tested

The conversion now succeeds for the same dataset after making the code change.

╰➤ python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=jackvial/screwdriver_panel_center_080225_15_e5
Trying to download v3.0 version of the dataset from the hub...
Dataset does not have an uploaded v3.0 version. Continuing with conversion.
Using local dataset at /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5
INFO 2025-10-04 13:00:26 1_to_v30.py:438 Converting info from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
INFO 2025-10-04 13:00:26 1_to_v30.py:169 Converting tasks from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
INFO 2025-10-04 13:00:26 1_to_v30.py:212 Converting data files from 5 episodes
convert data files: 100%|██████████████████████████████████████████████| 5/5 [00:00<00:00, 735.77it/s]
INFO 2025-10-04 13:00:26 1_to_v30.py:264 Converting videos from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
convert videos of observation.images.screwdriver: 100%|████████████████| 5/5 [00:00<00:00, 991.14it/s]
convert videos of observation.images.side: 100%|███████████████████████| 5/5 [00:00<00:00, 778.40it/s]
convert videos of observation.images.top: 100%|████████████████████████| 5/5 [00:00<00:00, 413.96it/s]
convert videos: 100%|███████████████████████████████████████████████| 5/5 [00:00<00:00, 141699.46it/s]
INFO 2025-10-04 13:00:26 1_to_v30.py:405 Converting episodes metadata from /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5 to /home/jack/.cache/huggingface/lerobot/jackvial/screwdriver_panel_center_080225_15_e5_v30
Generating train split: 5 examples [00:00, 147.41 examples/s]
Creating parquet from Arrow format: 100%|██████████████████████████████| 1/1 [00:00<00:00, 380.95ba/s]
tag=v3.0 probably doesn't exist. Skipping exception (404 Client Error. (Request ID: Root=1-68e152aa-1c102c6236bcefff37ffe463;f0db66d3-a075-4d8e-92c8-054be90172d5)

Revision Not Found for url: https://huggingface.co/api/datasets/jackvial/screwdriver_panel_center_080225_15_e5/tag/v3.0.
Invalid rev id: v3.0)
Generating train split: 5 examples [00:00, 265.37 examples/s]
Generating train split: 1217 examples [00:00, 416657.25 examples/s]
Processing Files (6 / 6)                : 100%|██████████████████████████| 90.4MB / 90.4MB, 45.2MB/s  
New Data Upload                         : 100%|██████████████████████████| 65.7MB / 65.7MB, 32.9MB/s  
  ...ter_080225_15_e5/meta/tasks.parquet: 100%|██████████████████████████| 2.92kB / 2.92kB            
  ....images.side/chunk-000/file-000.mp4: 100%|██████████████████████████| 33.4MB / 33.4MB            
  ..._e5/data/chunk-000/file-000.parquet: 100%|██████████████████████████| 45.5kB / 45.5kB            
  ...episodes/chunk-000/file-000.parquet: 100%|██████████████████████████| 55.9kB / 55.9kB            
  ....screwdriver/chunk-000/file-000.mp4: 100%|██████████████████████████| 30.8MB / 30.8MB            
  ...n.images.top/chunk-000/file-000.mp4: 100%|██████████████████████████| 26.1MB / 26.1MB      

The successfully converted dataset: https://huggingface.co/datasets/jackvial/screwdriver_panel_center_080225_15_e5/tree/main

@jackvial jackvial mentioned this pull request Oct 4, 2025
@jackvial
Contributor Author

jackvial commented Oct 5, 2025

Possibly related to using an older version of torchcodec (0.21); I will test again after updating to 0.5.

@CarolinePascal
Collaborator

Hi @jackvial,

Thanks for testing our newest Dataset v3 tools ;) I dug into this issue and went down a rabbit hole (as always with PyAV and ffmpeg).
The core issue is that when opening an InputContainer, PyAV does not let the user choose the codec; it just picks the default one (e.g. libdav1d for av1). Unfortunately, although the picked codec necessarily supports decoding, it does not necessarily support encoding, hence the issue with time_base.
Digging a bit more, I realized this was also the reason add_stream_from_template would fail without opaque=True...
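
The failure mode can be sketched in isolation. This is only an illustration, assuming that PyAV raises `RuntimeError` when `time_base` is assigned on a stream whose codec context is in decoder mode, as the traceback in this PR shows. `try_set_time_base` is a hypothetical helper, not a lerobot or PyAV function, and not necessarily the change made in this PR.

```python
from fractions import Fraction
from typing import Any


def try_set_time_base(stream: Any, time_base: Fraction) -> bool:
    """Try to assign `time_base` on a stream; report whether it was accepted.

    Assumption: a decoder-mode codec context rejects the assignment with
    RuntimeError ("Cannot access 'time_base' as a decoder").
    """
    try:
        stream.time_base = time_base
    except RuntimeError:
        # The stream is backed by a decoder-only codec context: skip the assignment.
        return False
    return True
```

Returning `False` instead of raising lets the caller continue with whatever time base the output stream already has.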

That being said, I tried without this extra care for time_base and the results were totally fine. If you have 5 minutes to spare, could you try converting your datasets with this part removed:

stream_map[
    input_stream.index
].time_base = (
    input_stream.time_base
)  # set the time base to the input stream time base (missing in the codec context)

Thanks again for your help !

Best,

Caroline.

Collaborator

@michel-aractingi michel-aractingi left a comment

Thanks @jackvial! I tested your fix, and it works with the av 14.* versions that were causing the time_base issue. Approved.

@jackvial
Contributor Author

jackvial commented Oct 9, 2025

@CarolinePascal @michel-aractingi Thank you. I tested converting a dataset on the latest main branch, which includes the recent change to video_utils.py that removes setting time_base, and didn't have any problems, so I think I can close this issue?

python -m lerobot.datasets.v30.convert_dataset_v21_to_v30 --repo-id=jackvial/screwdriver_attach_panel_rs_080125_31_e5

@lukicdarkoo
Contributor

Thank you @jackvial, this fix works for me as well. @michel-aractingi, can we get the PR merged?

@jackvial
Contributor Author

> Thank you @jackvial, this fix works for me as well. @michel-aractingi, can we get the PR merged?

Hey @lukicdarkoo, which version of torchcodec are you using? If you're on a version lower than 0.5, can you try recreating your virtual environment and see if this resolves the problems you're seeing?

@lukicdarkoo
Contributor

@jackvial Upgrading torchcodec also seems to be working
