[Discussion] Dataset Abstraction

When working with multiple documents, it would be useful to have an abstraction for a collection of documents: a `Dataset`.

The `Dataset` would then allow:
* Caching Datasets after transformations have been applied to them (which I currently need), e.g. in a pipelined workflow
* Saving and loading Datasets efficiently.
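
To make the two points above more concrete, here is a rough, stdlib-only sketch of what such an abstraction could look like. Everything in it is an assumption for illustration: the `Dataset` class, its `map`, `save`, `load`, and `fingerprint` methods, the JSON Lines storage format, and the use of plain dicts as documents are all hypothetical and not existing pytorch-ie API.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical stand-in for a document type: a plain JSON-serializable dict.
Document = Dict[str, object]


class Dataset:
    """Hypothetical container for multiple documents (illustration only)."""

    def __init__(self, documents: List[Document]):
        self.documents = list(documents)

    def __len__(self) -> int:
        return len(self.documents)

    def save(self, path: Path) -> None:
        # Store one JSON object per line (JSON Lines).
        with open(path, "w", encoding="utf-8") as f:
            for doc in self.documents:
                f.write(json.dumps(doc, sort_keys=True) + "\n")

    @classmethod
    def load(cls, path: Path) -> "Dataset":
        with open(path, encoding="utf-8") as f:
            return cls([json.loads(line) for line in f if line.strip()])

    def fingerprint(self) -> str:
        # Hash of the serialized content, used as part of the cache key.
        payload = json.dumps(self.documents, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def map(self, fn: Callable[[Document], Document], cache_dir: Path) -> "Dataset":
        # Simplification: the key uses only the function's name, so editing
        # the function body does NOT invalidate an existing cache entry.
        raw_key = f"{self.fingerprint()}:{fn.__qualname__}".encode("utf-8")
        cache_file = Path(cache_dir) / f"{hashlib.sha256(raw_key).hexdigest()}.jsonl"
        if cache_file.exists():
            return Dataset.load(cache_file)
        # Pass a shallow copy so the original documents stay untouched.
        result = Dataset([fn(dict(doc)) for doc in self.documents])
        result.save(cache_file)
        return result


def lowercase(doc: Document) -> Document:
    doc["text"] = str(doc["text"]).lower()
    return doc


with tempfile.TemporaryDirectory() as tmp:
    ds = Dataset([{"text": "Hello World"}, {"text": "Second Doc"}])
    first = ds.map(lowercase, Path(tmp))   # computed and written to the cache
    second = ds.map(lowercase, Path(tmp))  # read back from the cache file
```

The second `map` call finds the cache file written by the first one and skips the transformation. A real implementation would need a more robust cache key and a more efficient storage format than JSON Lines.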

@ArneBinder and I were discussing using [HuggingFace Datasets](https://huggingface.co/docs/datasets/index) as the base class, which would bring the full power of HuggingFace Datasets to pytorch-ie.

