Closed
Description
When working with multiple documents, it would be useful to have an abstraction that groups them into a Dataset.
The Dataset would then allow:
- Caching of Datasets after transformations have been applied to them (which I currently need), e.g. in a pipelined workflow
- Saving and loading Datasets efficiently.
@ArneBinder and I were discussing using HuggingFace Datasets as the base class, which would bring the full power of HuggingFace Datasets to pytorch-ie.
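To make the two requirements concrete, here is a rough sketch of what I have in mind, written with the standard library only so it runs on its own. Everything in it is hypothetical: `DocumentDataset`, the `Document` alias, the fingerprint scheme and the JSON cache layout are my own placeholders, not pytorch-ie or HuggingFace Datasets API. HuggingFace Datasets already covers both points with `Dataset.map` (which caches results) and `save_to_disk` / `load_from_disk`, so a real implementation would presumably lean on those instead.

```python
import hashlib
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical stand-in for a pytorch-ie document; real documents are richer.
Document = Dict[str, str]


@dataclass
class DocumentDataset:
    """Toy dataset: a list of documents plus a fingerprint of how it was built."""

    documents: List[Document]
    fingerprint: str = "raw"

    def map(self, fn: Callable[[Document], Document], cache_dir: Path) -> "DocumentDataset":
        # Assumption: the function name is enough to identify the transformation.
        # A real implementation would need to hash the function itself.
        key = f"{self.fingerprint}:{fn.__name__}".encode()
        new_fingerprint = hashlib.sha256(key).hexdigest()[:16]
        cache_file = cache_dir / f"{new_fingerprint}.json"
        if cache_file.exists():
            # Cache hit: skip the transformation and reload the stored result.
            return DocumentDataset.load(cache_file)
        # Cache miss: transform copies of the documents and store the result.
        result = DocumentDataset([fn(dict(doc)) for doc in self.documents], new_fingerprint)
        result.save(cache_file)
        return result

    def save(self, path: Path) -> None:
        payload = {"fingerprint": self.fingerprint, "documents": self.documents}
        path.write_text(json.dumps(payload))

    @classmethod
    def load(cls, path: Path) -> "DocumentDataset":
        payload = json.loads(path.read_text())
        return cls(payload["documents"], payload["fingerprint"])


calls = []  # records every document the transformation actually processes


def lowercase(doc: Document) -> Document:
    calls.append(doc["text"])
    doc["text"] = doc["text"].lower()
    return doc


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    raw = DocumentDataset([{"text": "Hello World"}, {"text": "PyTorch-IE"}])
    first = raw.map(lowercase, cache_dir)   # computes and writes the cache file
    second = raw.map(lowercase, cache_dir)  # served from the cache file
```

The second `map` call never invokes `lowercase`, which is the behaviour I need between pipeline steps. JSON is used here only to keep the sketch short; it is not meant as the efficient storage format asked for above.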