[Core feature] Allow users to control storage for caching #6626

@Mathis-Z

Description

Motivation: Why do you think this is important?

I am using Flyte to process large datasets (a couple of GB) and have enabled caching for my tasks. However, my storage filled up quickly because old cache entries were never deleted.

Goal: What should the final outcome look like, ideally?

I propose the following features:

  • Users should be able to configure a limit (at least a soft limit) on the amount of storage consumed by the cache. Once the limit is exceeded, cache entries should be evicted using a least-recently-used policy, or possibly something size-aware (many small cache entries might be more valuable than one large entry).
  • The Flyte API (and CLI) should expose a way to clear the cache, both as a whole and for individual tasks, projects, ...
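
To make the proposal concrete, here is a minimal sketch of the eviction and clearing behaviour I have in mind. It is not based on any existing Flyte code: the class `SizeBoundedLRUIndex`, its methods, and the `project/task/key` naming scheme are all hypothetical. The index only tracks keys and sizes; actually deleting the evicted blobs from storage would be a separate step.

```python
from collections import OrderedDict
from typing import List


class SizeBoundedLRUIndex:
    """Hypothetical cache index (not a Flyte API): tracks entry sizes and
    evicts least-recently-used entries once a soft byte limit is exceeded."""

    def __init__(self, soft_limit_bytes: int) -> None:
        self.soft_limit_bytes = soft_limit_bytes
        self.total_bytes = 0
        # Maps cache key -> size in bytes, ordered from least to most recently used.
        self._entries: "OrderedDict[str, int]" = OrderedDict()

    def put(self, key: str, size_bytes: int) -> List[str]:
        """Record a new or overwritten entry; return the keys that were evicted."""
        if key in self._entries:
            self.total_bytes -= self._entries.pop(key)
        self._entries[key] = size_bytes
        self.total_bytes += size_bytes
        return self._evict()

    def touch(self, key: str) -> bool:
        """Mark a cache hit as recently used; return False if the key is unknown."""
        if key not in self._entries:
            return False
        self._entries.move_to_end(key)
        return True

    def clear(self, prefix: str = "") -> List[str]:
        """Remove every entry whose key starts with prefix (empty prefix clears all)."""
        removed = [k for k in self._entries if k.startswith(prefix)]
        for k in removed:
            self.total_bytes -= self._entries.pop(k)
        return removed

    def _evict(self) -> List[str]:
        evicted = []
        # Soft limit: the newest entry is kept even if it alone exceeds the limit.
        while self.total_bytes > self.soft_limit_bytes and len(self._entries) > 1:
            key, size = self._entries.popitem(last=False)
            self.total_bytes -= size
            evicted.append(key)
        return evicted


# Usage with assumed "project/task/key" style cache keys.
index = SizeBoundedLRUIndex(soft_limit_bytes=10)
index.put("proj-a/task1/k1", 4)
index.put("proj-a/task2/k2", 4)
index.touch("proj-a/task1/k1")             # task1 entry is now most recently used
evicted = index.put("proj-b/task3/k3", 4)  # limit exceeded, so the task2 entry is evicted
```

Clearing the cache for one project would then be something like `index.clear("proj-a/")`, and `index.clear()` would drop everything. A size-aware policy could replace `_evict` without changing the rest of the interface.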

Describe alternatives you've considered

AFAIK, there is currently no way to clear the cache.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), untriaged (This issue has not yet been looked at by the maintainers)

    Type

    No type

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
