Replies: 2 comments
This seems to be an issue with dill. Simply put, caching in Hugging Face datasets relies on a library that sometimes fails. However, the discussion I found is quite old, and it is hard for me to verify that this is the problem: uqfoundation/dill#19 (comment)
What I did was to check what datasets/src/datasets/fingerprint.py Line 188 in 53f958e as the
What can I do to enforce reproducibility and cache utilization?
I will for the time being use the
I have tried to make a minimal reproduction, but have not really managed. So bear with me for a second.
I have a file foo.py with contents
If I call this a few times with the command line
rm -rf ~/.cache/huggingface/datasets/ && python foo.py && python foo.py
the output looks like
so clearly the caching mechanism fails intermittently. The second tqdm progress bar appears only when the map call finds its cache invalidated. I need to understand why the caching fails.
The contents of lib.render look like
There are a couple of confusing things:
lib/render.py fixes the problem
How can I debug the cache invalidation behavior? Where can I find an exact description of the caching logic?
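One rough way to start narrowing this down, as a sketch rather than the library's own logic: check whether the thing being hashed serializes to the same bytes in every fresh interpreter. The snippet below uses only the standard library. SNIPPET, digest_in_fresh_process and is_stable are hypothetical names made up for this illustration, and the set of strings is just a stand-in for an object whose pickled bytes can depend on PYTHONHASHSEED. It is an assumption, not a confirmed diagnosis, that something similar is happening in foo.py.

```python
import hashlib
import os
import subprocess
import sys

# Hypothetical stand-in for "whatever gets hashed into the cache key".
# A set of strings is used because its iteration order (and therefore its
# pickled bytes) can change with the interpreter's hash seed.
SNIPPET = """
import hashlib, pickle
obj = {"tokens": {"alpha", "beta", "gamma", "delta"}}
print(hashlib.sha256(pickle.dumps(obj)).hexdigest())
"""


def digest_in_fresh_process(snippet: str, hash_seed: str) -> str:
    """Run `snippet` in a new interpreter with a fixed PYTHONHASHSEED
    and return whatever it prints (expected: a hex digest)."""
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def is_stable(snippet: str, seeds=("0", "1", "2", "3")) -> bool:
    """True if the snippet prints the same digest under every seed tried."""
    return len({digest_in_fresh_process(snippet, s) for s in seeds}) == 1


if __name__ == "__main__":
    for seed in ("0", "1", "2", "3"):
        print(seed, digest_in_fresh_process(SNIPPET, seed))
    print("stable across seeds:", is_stable(SNIPPET))
```

To probe the real cache key, the body of SNIPPET could instead import the function passed to map and print datasets.fingerprint.Hasher.hash(that_function). I am assuming here that this helper reflects what map hashes; I have not checked that against the commit referenced above. If the digest changes between seeds, pinning PYTHONHASHSEED when running foo.py would be one quick experiment to see whether the intermittent invalidation goes away.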