Releases: cc-hpc-itwm/tarantella
Tarantella 0.8.0
Major Changes
- Add support for TensorFlow v2.0-2.7
- Keras: callbacks: add support for custom callbacks and lambda callbacks
- Model parallelism: initial pipelining implementation
- Collectives: create TensorAllgatherer to perform Allgatherv operations for TensorFlow tensors
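An Allgatherv operation, as named in the last item above, differs from a plain Allgather in that each rank may contribute a different number of elements. The following is a minimal single-process sketch of those semantics in plain Python. It does not use Tarantella or GaspiCxx, and the function name `simulate_allgatherv` is hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def simulate_allgatherv(per_rank_data: Sequence[Sequence[float]]) -> List[List[float]]:
    """Simulate Allgatherv semantics for a list of per-rank contributions.

    Every rank receives the concatenation of all contributions, ordered by
    rank, even when the contributions have different lengths.
    """
    gathered = [value for contribution in per_rank_data for value in contribution]
    # Each simulated rank gets its own copy of the full gathered result.
    return [list(gathered) for _ in per_rank_data]


# Three simulated ranks contributing 1, 3, and 2 elements respectively.
inputs = [[1.0], [2.0, 3.0, 4.0], [5.0, 6.0]]
outputs = simulate_allgatherv(inputs)
```

In a real distributed setting each rank would hold only its own contribution and the exchange would happen over the network; the sketch only shows what the result looks like on every rank.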
Changes
- Keras: warn when BatchNormalization layers are used with too small micro-batch sizes
- Collectives: switch to PyGPI-based implementations
- Collectives: TensorAllreducer: add support for arbitrary lists and lists of tensors instead of only lists of arrays
- Collectives: TensorAllreducer: extend supported datatypes
- Runtime: improve path handling for dependencies
- Update README
- Update docs
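The BatchNormalization warning listed above concerns the micro-batch, that is, the share of the global batch processed by a single rank. The sketch below shows how such a check could look, assuming that the micro-batch size is the global batch size divided by the number of ranks and that each rank normalizes over its own micro-batch. The function names and the threshold of 16 are hypothetical and are not taken from Tarantella.

```python
import warnings


def micro_batch_size(global_batch_size: int, num_ranks: int) -> int:
    """Return the number of samples each rank processes per step."""
    if num_ranks <= 0:
        raise ValueError("num_ranks must be positive")
    return global_batch_size // num_ranks


def check_batch_norm_micro_batch(global_batch_size: int, num_ranks: int,
                                 min_size: int = 16) -> int:
    """Warn if the micro-batch is below `min_size` (an assumed threshold)."""
    size = micro_batch_size(global_batch_size, num_ranks)
    if size < min_size:
        warnings.warn(
            f"Micro-batch size {size} may be too small for stable "
            "BatchNormalization statistics."
        )
    return size


# A global batch of 256 samples on 8 ranks gives 32 samples per rank.
size = check_batch_norm_micro_batch(256, 8)
```

With the same global batch size, adding ranks shrinks the micro-batch, which is why a model that trains well on a few ranks can trigger the warning on many.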
Bug fixes
- Python bindings: ensure destruction of GaspiCxx-based objects
Tarantella 0.7.0
Major changes
- Add support for TF 2.3-2.4
- Automatically distribute datasets with any batch size and number of samples
- Support for the following Keras callbacks:
  - tf.keras.callbacks.CSVLogger
  - tf.keras.callbacks.EarlyStopping
  - tf.keras.callbacks.History
  - tf.keras.callbacks.LearningRateScheduler
  - tf.keras.callbacks.ModelCheckpoint
  - tf.keras.callbacks.ProgbarLogger
  - tf.keras.callbacks.ReduceLROnPlateau
  - tf.keras.callbacks.RemoteMonitor
  - tf.keras.callbacks.TensorBoard
  - tf.keras.callbacks.TerminateOnNaN
- Full distributed support for the tf.keras.Model API in tnt.Model
- CLI: Add --cleanup option and automatic framework shutdown on Ctrl+c
- Switch to a GaspiCxx-based backend
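Distributing datasets "with any batch size", as listed among the major changes above, implies handling batches that do not divide evenly among ranks. The sketch below illustrates one common way to do this: give the first few ranks one extra sample each. This is an assumption for illustration only and not necessarily the scheme Tarantella uses; the function name `split_batch` is hypothetical.

```python
from typing import List


def split_batch(batch_size: int, num_ranks: int) -> List[int]:
    """Split `batch_size` samples over `num_ranks` ranks as evenly as possible.

    The first `batch_size % num_ranks` ranks receive one extra sample, so the
    per-rank sizes differ by at most one and always sum to `batch_size`.
    """
    if num_ranks <= 0:
        raise ValueError("num_ranks must be positive")
    base, remainder = divmod(batch_size, num_ranks)
    return [base + (1 if rank < remainder else 0) for rank in range(num_ranks)]


# 10 samples over 4 ranks: two ranks get 3 samples, two ranks get 2.
sizes = split_batch(10, 4)
```

When per-rank sizes differ, gradient averaging generally has to weight each rank's contribution by its sample count to match single-device training.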
Minor changes
- Automatic framework initialization on import
- Update build scripts
- TensorAllreducer: add support for n-dimensional Tensors
- CLI: Add support for user defined environment variables (-x option)
- CLI: Add support for pinning ranks to socket (--pin-to-socket option)
Fixes
- Model saving for tnt.Model instances (recompiling is no longer needed)
- Add support for optimizers specified as strings
- CLI: Use all GPUs available by default
Tarantella 0.6.2
Various bug fixes and improvements:
- tarantella CLI
  - make sure -n and --n-per-node cannot both be set
  - support user environment variables
  - make sure
- tnt.TntModelCheckpoint
  - fix bug in TF 2.0 and 2.1
- distributed datasets
  - fix bug with map-like transformations
- SynchCommunicator
  - fix bug related to initialization order in constructor
- update docs (installation, quick start)
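The rule that -n and --n-per-node cannot both be set is a mutual-exclusion check on command-line options. The sketch below shows how such a check can be expressed with Python's standard argparse module. It is not the actual tarantella CLI implementation: the integer option types, the help texts, and the `build_parser` helper are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser in which -n and --n-per-node exclude each other."""
    parser = argparse.ArgumentParser(prog="example-launcher")
    group = parser.add_mutually_exclusive_group()
    # Option types and help texts are illustrative assumptions.
    group.add_argument("-n", type=int, help="total number of ranks")
    group.add_argument("--n-per-node", type=int, help="number of ranks per node")
    return parser


parser = build_parser()
args = parser.parse_args(["-n", "4"])  # accepted: only one of the two is set
```

Passing both options, for example `["-n", "4", "--n-per-node", "2"]`, makes argparse print a usage error and exit instead of silently preferring one value.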
Tarantella 0.6.1
Various bug fixes and improvements:
- documentation for tutorials on ResNet-50 and Transformer
- fix hostfile creation in tarantella CLI
- replace time-out barrier with blocking barrier by default
Tarantella 0.6.0
Distributed Data Parallelism for Deep Neural Network Training of TensorFlow 2/Keras models.