Document the HF_DATASETS_CACHE environment variable in the datasets cache documentation #7532

Merged
4 commits merged into huggingface:main on May 6, 2025

Conversation

Harry-Yang0518
Contributor

This pull request updates the Datasets documentation to include the HF_DATASETS_CACHE environment variable. While the current documentation only mentions HF_HOME for overriding the default cache directory, HF_DATASETS_CACHE is also a supported and useful option for specifying a custom cache location for datasets stored in Arrow format.

This addition is based on the discussion in (#7457), where users noted the absence of this variable in the documentation despite its functionality. The update adds a new section to cache.mdx that explains how to use HF_DATASETS_CACHE with an example.

This change aims to improve clarity and help users better manage their cache directories when working in shared environments or with limited local storage.
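As a rough illustration of the usage described above, the sketch below sets `HF_DATASETS_CACHE` from Python before importing the library. The cache path is a hypothetical placeholder, and the sketch assumes the variable is read when `datasets` is first imported, so it has to be set beforehand. It is not the example added to cache.mdx.

```python
import os
import tempfile

# Hypothetical cache location, chosen only for illustration.
custom_cache = os.path.join(tempfile.gettempdir(), "my_datasets_cache")

# Assumption: the variable is read at import time, so set it
# before `datasets` is imported anywhere in the process.
os.environ["HF_DATASETS_CACHE"] = custom_cache

try:
    import datasets

    # datasets.config.HF_DATASETS_CACHE holds the resolved cache directory.
    resolved = os.fspath(datasets.config.HF_DATASETS_CACHE)
except ImportError:
    # The library is not installed here; only the environment variable is set.
    resolved = None

print("HF_DATASETS_CACHE =", os.environ["HF_DATASETS_CACHE"])
print("resolved by datasets:", resolved)
```

Exporting the variable in the shell before launching Python has the same effect and avoids any dependence on import order.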

Closes #7457.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@lhoestq
Member

lhoestq commented Apr 28, 2025

The clarification in your comment at #7480 (comment) sounds great. Would you like to update this PR to include it?

@Harry-Yang0518
Contributor Author

Hi @lhoestq, I’ve updated the documentation to reflect the clarifications discussed in #7480. Let me know if anything else is needed!

Member

@lhoestq lhoestq left a comment

Thanks !

I also took the liberty of removing unnecessary \

@lhoestq lhoestq merged commit b1bfe15 into huggingface:main May 6, 2025

Successfully merging this pull request may close these issues.

Document the HF_DATASETS_CACHE env variable
3 participants