v0.2.17
This release contains new features and fixes for distributed training.
Important: This release fixes hangs in distributed training by ensuring the same number of batches is returned on each rank (#237). However, this and other fixes change how samples are assigned to ranks and are therefore breaking changes: resuming from checkpoints created with an older version of LitData is not supported if you are using the stateful data loader feature.
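For context, a minimal sketch of how the stateful data loader feature is typically used. The class names follow LitData's public `StreamingDataset` / `StreamingDataLoader` API; the dataset path and batch size are placeholders.

```python
# Minimal sketch of mid-epoch checkpointing with the stateful data loader.
# "./my_dataset" and batch_size are placeholders.
from litdata import StreamingDataset, StreamingDataLoader

dataset = StreamingDataset("./my_dataset")
loader = StreamingDataLoader(dataset, batch_size=32)

state = None
for step, batch in enumerate(loader):
    if step == 100:
        # Save the loader state alongside your model checkpoint.
        state = loader.state_dict()
        break

# On resume: rebuild the loader and restore the saved state. A state dict
# produced by LitData < 0.2.17 cannot be restored with this release, because
# #237 changed how samples are assigned to ranks.
loader = StreamingDataLoader(StreamingDataset("./my_dataset"), batch_size=32)
loader.load_state_dict(state)
for batch in loader:
    ...  # training continues from where it left off
```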
What's Changed
- Feat: Updates readme and a few nitpicks by @deependujha in #223
- docs: add "Specify cache directory" by @csy1204 in #229
- Enable compatibility with Numpy 2.0 by @weiji14 in #230
- Fix typo in resolver.py by @lud-ds in #239
- Feature: Add support for encryption and decryption of data at chunk/sample level by @bhimrazy in #219 (see the sketch after this list)
- Fix uneven batches in distributed dataloading by @awaelchli in #237
- feat: add a custom storage options param by @csy1204 in #246 (see the sketch after this list)
- Fix index errors on world size > 0 by @awaelchli in #252
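A minimal sketch of the new chunk/sample-level encryption from #219. The `FernetEncryption` helper and the `encryption=` arguments shown here are assumptions based on the PR description; check the LitData docs for the exact API.

```python
# Hypothetical sketch of chunk-level encryption (#219). FernetEncryption and
# the encryption= arguments are assumptions based on the PR description.
from litdata import optimize, StreamingDataset
from litdata.utilities.encryption import FernetEncryption

encryption = FernetEncryption(password="my-secret", level="chunk")

def to_sample(index):
    return {"index": index}

optimize(
    fn=to_sample,
    inputs=list(range(1_000)),
    output_dir="./encrypted_dataset",
    chunk_bytes="64MB",
    encryption=encryption,  # chunks are encrypted as they are written
)

# Reading back requires the same encryption object so chunks can be decrypted.
dataset = StreamingDataset("./encrypted_dataset", encryption=encryption)
```

Similarly, a sketch of the custom storage options parameter from #246, assuming it is forwarded to the underlying cloud client; the key names below are placeholders for illustration.

```python
# Hypothetical sketch of passing custom storage options (#246). The key names
# are placeholders; use whatever your cloud backend expects.
from litdata import StreamingDataset

dataset = StreamingDataset(
    "s3://my-bucket/my_dataset",
    storage_options={
        "endpoint_url": "https://s3.my-region.example.com",  # placeholder
        "aws_access_key_id": "...",
        "aws_secret_access_key": "...",
    },
)
```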
New Contributors
- @csy1204 made their first contribution in #229
- @weiji14 made their first contribution in #230
- @lud-ds made their first contribution in #239
Full Changelog: v0.2.16...v0.2.17