where to find video dataset ? #48

Open
dyyoungg opened this issue Dec 23, 2024 · 4 comments

Comments

@dyyoungg

In the Hugging Face dataset, I only see video UUIDs in the JSON files. How can I fetch the videos themselves?
Thanks!

@chenjoya
Collaborator

chenjoya commented Dec 23, 2024

Hi, the dataset is based on Ego4D (https://ego4d-data.org/). Sign the license agreement (https://ego4ddataset.com/ego4d-license/), and they will send you AWS credentials. Then run pip install awscli ego4d and enter the credentials via aws configure. Finally, follow https://ego4d-data.org/docs/CLI/ to download the videos ;)
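
To download only the videos the dataset references, you first need the list of UUIDs. A minimal Python sketch, assuming the dataset JSON is keyed by Ego4D video_uid at the top level (the inline sample and the collect_video_uids helper are illustrative, not part of this repo or the Ego4D tooling):

```python
import json


def collect_video_uids(annotations: dict) -> list[str]:
    """Return the top-level keys, assumed here to be Ego4D video_uid values."""
    return sorted(annotations.keys())


# Inline sample in the assumed shape; a real file would be read with json.load().
sample = json.loads("""
{
    "2d0c81c8-d7ff-4240-9ee7-4c71c152fb0a": {
        "8a889987-06aa-4778-bcd6-acd52bfe188f": [
            {"time": 0.83, "text": "You pick up a jigsaw with your right hand."}
        ]
    }
}
""")

video_uids = collect_video_uids(sample)
# Space-separated, so the list can be pasted onto a command line.
print(" ".join(video_uids))
```

The resulting list can then be used to restrict the download to just these videos; check the CLI docs linked above for the exact option.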

@dyyoungg
Author

Thank you for your careful reply!

@dyyoungg
Author

Another question. The dataset JSON file:

 "2d0c81c8-d7ff-4240-9ee7-4c71c152fb0a": {
        "8a889987-06aa-4778-bcd6-acd52bfe188f": [
            {
                "time": 0.8326981999999999,
                "text": "You pick up a jigsaw with your right hand."
            },

The ego4d.json file downloaded from AWS:

 {
            "clip_uid": "7829e427-1b08-4226-a631-74e25b5d9e89",
            "video_uid": "2d0c81c8-d7ff-4240-9ee7-4c71c152fb0a",
            "video_start_sec": 0.0,
            "video_end_sec": 108.52102864583334,
            "video_start_frame": -1,
            "video_end_frame": 3255,
            "clip_metadata": {
                "fps": 30.0,
                "num_frames": 3255,
                "video_codec": "vp9",
                "audio_codec": null,
                "display_resolution_width": 1920,
                "display_resolution_height": 1080,
                "sample_resolution_width": 1920,
                "sample_resolution_height": 1080,
                "mp4_duration_sec": 108.54,
                "video_start_sec": null,
                "video_duration_sec": 108.5,
                "audio_start_sec": null,
                "audio_duration_sec": 108.522,
                "video_start_pts": 0,
                "video_duration_pts": 1666560,
                "video_base_numerator": 1,
                "video_base_denominator": 15360,
                "audio_start_pts": 0,
                "audio_duration_pts": 5209056,
                "audio_base_numerator": 1,
                "audio_base_denominator": 48000
            },
            "s3_path": "s3://ego4d-unict/public/v2/clips/7829e427-1b08-4226-a631-74e25b5d9e89.mp4",
            "manifold_path": null
        },

The first id, "2d0c81c8-d7ff-4240-9ee7-4c71c152fb0a", is the video_uid, which I found in the ego4d.json file. But I can't find the second id, "8a889987-06aa-4778-bcd6-acd52bfe188f", anywhere in it. What does the second id mean?

@chenjoya
Collaborator

Hello, the second id is a clip_uid. The livechat data is generated from Ego4D GoalStep, a benchmark proposed after Ego4D was released, so that clip_uid does not appear in ego4d.json; you can find it in the Ego4D GoalStep annotations ;)
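
If you want to check which annotation file contains a given id, a generic recursive search over the parsed JSON works without knowing the file layout. This is only a sketch: find_value is a hypothetical helper, and the sample below reuses the livechat snippet from this thread rather than the real GoalStep schema, which is not shown here.

```python
from typing import Any, Iterator


def find_value(node: Any, target: str, path: tuple = ()) -> Iterator[tuple]:
    """Yield the path to every dict key or leaf value equal to target."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target:
                yield path + (key,)
            yield from find_value(value, target, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_value(item, target, path + (index,))
    elif node == target:
        yield path


VIDEO_UID = "2d0c81c8-d7ff-4240-9ee7-4c71c152fb0a"
CLIP_UID = "8a889987-06aa-4778-bcd6-acd52bfe188f"

# Sample in the shape quoted earlier in this thread: video_uid -> clip_uid -> events.
livechat = {
    VIDEO_UID: {
        CLIP_UID: [
            {"time": 0.8326981999999999,
             "text": "You pick up a jigsaw with your right hand."}
        ]
    }
}

matches = list(find_value(livechat, CLIP_UID))
print(matches)
```

For a real file, load it with json.load() and pass the result as node. An empty result means the id does not occur in that file as a key or a value.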
