Closed
Description
Hi,
I’m new to HF Datasets and I tried to create a dataset from data versioned in lakeFS (with a MinIO S3 bucket as the storage backend).
Here I’m using around 30,000 PIL images from the MNIST data; however, it takes around 12 minutes to execute, which is a lot!
From what I understand, it loads the images into the cache and then builds the dataset.
Please find the execution screenshot below.
Is there a way to optimize this or am I doing something wrong?
Thanks!