I'm working on training EVA-CLIP + SDXL as a visual autoencoder using the LAION-COCO dataset from Hugging Face. I've noticed that:
The dataset consists of parquet files containing image URLs and captions, not the images themselves
Many of the URLs are broken or otherwise inaccessible
As a result, only a small fraction of the images can be successfully downloaded
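For context, my current download loop looks roughly like the sketch below: fetch each URL, keep the samples that succeed, and track the ones that fail. This is a minimal illustration, not the exact pipeline; the `URL`/`TEXT` column names, the timeout, and the helper names are assumptions on my part.

```python
import urllib.request
from typing import Callable, List, Optional, Tuple


def fetch_image(url: str, timeout: float = 5.0) -> Optional[bytes]:
    """Try to download one image; return None on any failure (dead URL, timeout, etc.)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except Exception:
        return None


def partition_samples(
    rows: List[dict], fetch: Callable[[str], Optional[bytes]]
) -> Tuple[list, list]:
    """Split parquet rows into (downloaded, failed) based on whether fetch succeeds.

    Each row is assumed to be a dict with 'URL' and 'TEXT' keys, as read
    from the LAION-COCO parquet files.
    """
    ok, failed = [], []
    for row in rows:
        data = fetch(row["URL"])
        if data is not None:
            ok.append((row, data))
        else:
            failed.append(row)
    return ok, failed


# Demo with a stub fetcher standing in for real HTTP, so the logic is visible
# without network access; the URLs and captions here are made up.
rows = [
    {"URL": "http://example.com/a/img.jpg", "TEXT": "a dog"},
    {"URL": "http://example.com/b/gone.jpg", "TEXT": "a cat"},
]
stub = lambda url: b"\xff\xd8" if "/a/" in url else None  # only the first URL "works"
ok, failed = partition_samples(rows, stub)
print(len(ok), len(failed))
```

In practice I run `partition_samples` over the real parquet rows with `fetch_image`, and the failed list is where most of the dataset ends up.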
Questions:
For your training runs, did you use:
Only the successfully downloaded images from LAION-COCO's URLs?
Or did you have access to a more complete/pre-downloaded version of the dataset?
If you used an alternative dataset source:
Could you recommend where to obtain a more reliable version of LAION-COCO with higher URL availability?
Are there other suitable datasets you'd recommend for training SDXL conditioned on visual embeddings?
For the image-text pairs that fail to download:
Did you implement any fallback strategies (e.g., using cached versions from other sources)?
Or did you simply exclude these samples from training?
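For reference, my current fallback is just retry-then-exclude, roughly as sketched below. The retry count and backoff schedule are arbitrary choices of mine, not anything from your setup.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    url: str,
    fetch: Callable[[str], Optional[bytes]],
    retries: int = 3,
    backoff: float = 0.5,
) -> Optional[bytes]:
    """Retry a flaky fetch a few times with exponential backoff.

    Returns the image bytes on success, or None after exhausting retries,
    in which case the caller drops the sample from training.
    """
    for attempt in range(retries):
        data = fetch(url)
        if data is not None:
            return data
        time.sleep(backoff * (2 ** attempt))
    return None


# Demo with a stub fetcher that succeeds on the second attempt:
calls = {"n": 0}


def flaky(url: str) -> Optional[bytes]:
    calls["n"] += 1
    return b"img" if calls["n"] >= 2 else None


print(fetch_with_retries("http://example.com/img.jpg", flaky, backoff=0.0))
```

This recovers transiently failing URLs but obviously does nothing for permanently dead ones, which is why I'm asking whether you used cached copies or some other source.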