Description
🐛 Bug
To Reproduce
- Run the code below and observe that the `my_cache` directory is not being used.
import litdata as ld

# Define the Hugging Face dataset URI
hf_dataset_uri = "hf://datasets/leonardPKU/clevr_cogen_a_train/data"

# Create a streaming dataset
# The dataset is 13.2 GB, so by the end of streaming the cache should have been cleared
dataset = ld.StreamingDataset(hf_dataset_uri, cache_dir="my_cache", max_cache_size="10GB")

# Stream the dataset using StreamingDataLoader
dataloader = ld.StreamingDataLoader(dataset, batch_size=4)

for sample in dataloader:
    pass
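To confirm whether `my_cache` is actually being written to, a small standard-library check like the one below could be run during or after the loop. This is only a sketch: `dir_size_bytes` and `describe_cache` are hypothetical helper names, not part of litdata, and the check assumes the cache path is relative to the current working directory.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all files under `path`.

    Returns 0 if `path` does not exist, because os.walk yields nothing then.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # A file may be deleted (e.g. evicted from the cache) mid-walk.
                pass
    return total


def describe_cache(path="my_cache"):
    """Hypothetical helper: report whether the cache directory exists and its size."""
    if not os.path.isdir(path):
        return f"{path}: missing"
    return f"{path}: {dir_size_bytes(path)} bytes"


if __name__ == "__main__":
    print(describe_cache("my_cache"))
```

If the directory is reported as missing, or stays at 0 bytes while samples are streaming, that would support the observation that `cache_dir` is being ignored.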
Expected behavior
Additional context
Environment detail
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (`conda`, `pip`, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information: