As datasets grow in size, it will become more common to have them split into multiple files. The Tahoe dataset is a good example of this: each plate's data is saved to its own file. There are times when it would be useful to access all of this data as a unified set.
BioNeMo Framework Version
v2.6.3
Category
API/Interface
Proposed Solution
This could be implemented at the level of SingleCellMemMapDataset, or it could be a higher-level class, e.g. SingleCellMemMapCollection, that chains together multiple instances of SingleCellMemMapDataset. The latter appears to be what scDataset does. Note that PyTorch has native support for chaining datasets, e.g. ConcatDataset for map-style datasets and ChainDataset for IterableDataset instances, so that could also be a good way to start.
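
To make the higher-level option concrete, here is a minimal sketch of how such a wrapper might map a global row index onto per-file datasets. It is not existing BioNeMo code: the class body, its constructor signature, and the use of plain lists as stand-ins for SingleCellMemMapDataset instances are all assumptions for illustration. The index lookup follows the same cumulative-size idea that PyTorch's ConcatDataset uses.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import Any, Sequence


class SingleCellMemMapCollection:
    """Hypothetical wrapper that exposes several datasets as one.

    Each member only needs __len__ and __getitem__, so a real
    SingleCellMemMapDataset could presumably be dropped in (assumption).
    """

    def __init__(self, datasets: Sequence[Sequence[Any]]) -> None:
        if not datasets:
            raise ValueError("at least one dataset is required")
        self.datasets = list(datasets)
        # cumulative_sizes[i] = total number of rows in datasets[0..i]
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self) -> int:
        return self.cumulative_sizes[-1]

    def __getitem__(self, index: int) -> Any:
        if index < 0:
            index += len(self)  # support negative indices like a list
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        # Find the first dataset whose cumulative size exceeds the index.
        dataset_idx = bisect_right(self.cumulative_sizes, index)
        offset = self.cumulative_sizes[dataset_idx - 1] if dataset_idx > 0 else 0
        return self.datasets[dataset_idx][index - offset]


# Stand-in data: two "plates", each stored separately (illustrative only).
plate_a = ["a0", "a1"]
plate_b = ["b0", "b1", "b2"]
collection = SingleCellMemMapCollection([plate_a, plate_b])
first_row_of_plate_b = collection[2]
```

If the per-file datasets are already PyTorch map-style datasets, `torch.utils.data.ConcatDataset` should give the same indexing behaviour without a custom class; a dedicated collection class would mainly be useful for checking that all files share the same features or for exposing per-file metadata.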