Question: Is there a list for publicly available s3 links of datasets of litdata.StreamingDataset format? #430

Closed
2catycm opened this issue Dec 2, 2024 · 4 comments
Labels
question Further information is requested won't fix

Comments

2catycm commented Dec 2, 2024

If there were a list of popular datasets that have already been preprocessed with litdata and uploaded to Lightning Studio or S3, it would make this project much more usable for me.

For example, is there a streaming dataset for imagenet that is publicly available?

@2catycm 2catycm added the enhancement New feature or request label Dec 2, 2024
Collaborator

tchaton commented Dec 2, 2024

Hey @2catycm. Yes, there is, though I haven't processed many datasets so far.

Here are my published Studios: https://lightning.ai/thomasgridai

If I remember correctly, the dataset is available at s3://optimized-imagenet-1m/lightning_data_imagenet
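For readers who want to try that path, here is a minimal sketch of streaming it with litdata. It assumes the bucket is still readable from your environment and that the URI (recalled from memory above) is accurate. The helper names `split_s3_uri` and `open_imagenet_stream` are hypothetical, and the structure of each sample is not specified in this thread.

```python
from urllib.parse import urlparse

# URI as quoted in the comment above; it was recalled from memory, so it may be stale.
IMAGENET_URI = "s3://optimized-imagenet-1m/lightning_data_imagenet"


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3:// URI into (bucket, key prefix), rejecting anything else."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def open_imagenet_stream(uri: str = IMAGENET_URI, shuffle: bool = True):
    """Return a StreamingDataset reading from `uri` (needs litdata and S3 access)."""
    # Imported lazily so the URI helper above works without litdata installed.
    from litdata import StreamingDataset

    split_s3_uri(uri)  # fail early on a malformed URI
    return StreamingDataset(uri, shuffle=shuffle)
```

Calling `open_imagenet_stream()` and indexing the result (for example `dataset[0]`) is a quick way to check both access and the sample layout before wiring it into a dataloader.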

@bhimrazy bhimrazy added question Further information is requested and removed enhancement New feature or request labels Feb 8, 2025
Collaborator

tchaton commented Feb 9, 2025

Hey @2catycm. We also added support for Hugging Face datasets.
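A rough sketch of what that can look like, assuming a recent litdata release that accepts `hf://datasets/...` URIs as the `StreamingDataset` input (check the litdata README for the exact form supported by your version). The helper names and the repository id used in the usage note are hypothetical placeholders, not real datasets referenced in this thread.

```python
def hf_dataset_uri(repo_id: str, subdir: str = "") -> str:
    """Build an hf:// URI for a Hugging Face dataset repo (assumed URI layout)."""
    if repo_id.count("/") != 1:
        raise ValueError(f"expected '<owner>/<name>', got {repo_id!r}")
    uri = f"hf://datasets/{repo_id}"
    cleaned = subdir.strip("/")
    return f"{uri}/{cleaned}" if cleaned else uri


def open_hf_stream(repo_id: str, subdir: str = ""):
    """Return a StreamingDataset over a Hugging Face dataset (needs litdata and network)."""
    # Imported lazily so the URI helper above works without litdata installed.
    from litdata import StreamingDataset

    return StreamingDataset(hf_dataset_uri(repo_id, subdir))
```

For example, `open_hf_stream("some-org/some-dataset", "data")` would point litdata at the `data` folder of that (placeholder) repository.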

stale bot commented Apr 16, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the won't fix label Apr 16, 2025
Collaborator

bhimrazy commented Jun 4, 2025

Closing this issue for now. Please feel free to reopen in case you have any further questions. 😊

@bhimrazy bhimrazy closed this as completed Jun 4, 2025
3 participants