Does s3fs support a cache directory? #949

Open
qiankunli opened this issue Mar 14, 2025 · 1 comment

Comments

@qiankunli

In our scenario, S3 files may be accessed by multiple steps. Does s3fs support configuring a cache directory, with a configurable size limit, that stores files retrieved from S3? As long as the cache directory has enough space, subsequent reads of a file through s3fs.open could then be served directly from the local cache, without downloading the file from S3 again after the first access.

@martindurant
Member

fsspec, the base for s3fs, supports caching: https://filesystem-spec.readthedocs.io/en/latest/features.html#caching-files-locally
The first example there is probably what you want. There are multiple cache implementations (https://filesystem-spec.readthedocs.io/en/latest/api.html#built-in-implementations), but none of them explicitly tracks the current byte count of the cache. That would be a reasonable feature to add to WholeFileCacheFileSystem ("filecache").
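
As a rough sketch of the above (not a definitive recipe): the first function wraps s3fs in the "filecache" layer so that a file is downloaded once and later opens read the local copy. The second is a stand-alone helper that measures how many bytes a cache directory currently holds, which is the piece the built-in caches do not do for you. The function names, the bucket path, and the cache directory are illustrative assumptions, and the helper is not part of fsspec.

```python
import os


def open_cached_s3(path, cache_dir, mode="rb", **s3_options):
    """Open an S3 object through fsspec's whole-file cache.

    `path` is "bucket/key"; `s3_options` are passed on to s3fs
    (for example anon=True). Hypothetical wrapper, for illustration only.
    """
    # Imported here so the size helper below works without fsspec/s3fs installed.
    import fsspec

    fs = fsspec.filesystem(
        "filecache",
        target_protocol="s3",
        target_options=s3_options,
        cache_storage=cache_dir,  # local directory holding the downloaded copies
    )
    return fs.open(path, mode)


def cache_size_bytes(cache_dir):
    """Sum the sizes of all files under cache_dir.

    Assumption: everything in the directory counts toward the total,
    including any metadata files the cache itself writes there.
    """
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Example use (hypothetical bucket and paths):
#   with open_cached_s3("my-bucket/data.bin", "/tmp/s3cache", anon=True) as f:
#       payload = f.read()
#   print(cache_size_bytes("/tmp/s3cache"))
```

Enforcing a size limit would still be up to the caller, for example by checking `cache_size_bytes` between steps and deleting old files once a threshold is exceeded.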
