Releases: cc-hpc-itwm/tarantella

Tarantella 0.8.0

12 Jan 17:08

Major changes

  • Add support for TensorFlow v2.0-2.7
  • Keras: callbacks: add support for custom callbacks and lambda callbacks
  • Model parallelism: initial pipelining implementation
  • Collectives: create TensorAllgatherer to perform Allgatherv operations for TensorFlow tensors
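
For readers unfamiliar with Allgatherv: every rank contributes a buffer whose length may differ from rank to rank, and every rank ends up with the concatenation of all contributions in rank order. The single-process simulation below illustrates only these semantics. The function name `simulate_allgatherv` is hypothetical and is not part of the TensorAllgatherer API, which these notes do not describe.

```python
from typing import List, Sequence


def simulate_allgatherv(contributions: Sequence[Sequence[float]]) -> List[List[float]]:
    """Simulate Allgatherv semantics for len(contributions) ranks.

    contributions[i] is the (variable-length) buffer owned by rank i.
    Returns one output buffer per rank; all of them hold the same data.
    """
    # Concatenate the per-rank buffers in rank order.
    gathered = [value for rank_data in contributions for value in rank_data]
    # Every rank receives its own copy of the full result.
    return [list(gathered) for _ in contributions]


# Three simulated ranks contributing 1, 2, and 3 elements respectively.
per_rank = [[1.0], [2.0, 3.0], [4.0, 5.0, 6.0]]
results = simulate_allgatherv(per_rank)
```

Here every entry of `results` is the same six-element list, regardless of how many elements each rank contributed.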

Changes

  • Keras: warn when BatchNormalization layers are used with micro-batch sizes that are too small
  • Collectives: switch to PyGPI-based implementations
  • Collectives: TensorAllreducer: add support for arbitrary lists and lists of tensors instead of only lists of arrays
  • Collectives: TensorAllreducer: extend supported datatypes
  • Runtime: improve path handling for dependencies
  • Update README
  • Update docs

Bug fixes

  • Python bindings: ensure destruction of GaspiCxx-based objects

Tarantella 0.7.0

12 Jul 13:01

Major changes

  • Add support for TensorFlow 2.3-2.4
  • Automatically distribute datasets with any batch size and number of samples
  • Support for the following Keras callbacks:
    • tf.keras.callbacks.CSVLogger
    • tf.keras.callbacks.EarlyStopping
    • tf.keras.callbacks.History
    • tf.keras.callbacks.LearningRateScheduler
    • tf.keras.callbacks.ModelCheckpoint
    • tf.keras.callbacks.ProgbarLogger
    • tf.keras.callbacks.ReduceLROnPlateau
    • tf.keras.callbacks.RemoteMonitor
    • tf.keras.callbacks.TensorBoard
    • tf.keras.callbacks.TerminateOnNaN
  • Full distributed support for the tf.keras.Model API in tnt.Model
  • CLI: Add --cleanup option and automatic framework shutdown on Ctrl+C
  • Switch to a GaspiCxx-based backend
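
The "any batch size" item above implies that a global batch no longer has to divide evenly among ranks. One generic way to perform such a split is sketched below purely as an illustration: the helper `split_batch` is hypothetical, and these notes do not state which partitioning scheme Tarantella actually uses.

```python
from typing import List


def split_batch(batch_size: int, num_ranks: int) -> List[int]:
    """Split a global batch into per-rank sizes that differ by at most one."""
    if batch_size < 0 or num_ranks <= 0:
        raise ValueError("batch_size must be >= 0 and num_ranks must be > 0")
    base, remainder = divmod(batch_size, num_ranks)
    # The first `remainder` ranks take one extra sample each.
    return [base + 1 if rank < remainder else base for rank in range(num_ranks)]


sizes = split_batch(10, 4)
```

With a batch of 10 samples and 4 ranks, the per-rank sizes sum to 10 and no two ranks differ by more than one sample.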

Minor changes

  • Automatic framework initialization on import
  • Update build scripts
  • TensorAllreducer: add support for n-dimensional Tensors
  • CLI: Add support for user-defined environment variables (-x option)
  • CLI: Add support for pinning ranks to sockets (--pin-to-socket option)

Fixes

  • Fix model saving for tnt.Model (recompiling is no longer needed)
  • Add support for optimizers specified as strings
  • CLI: Use all GPUs available by default

Tarantella 0.6.2

05 Mar 14:15

Various bug fixes and improvements:

  • tarantella CLI
    • make sure -n and --n-per-node cannot both be set
    • support user environment variables
  • tnt.TntModelCheckpoint
    • fix bug in TF 2.0 and 2.1
  • distributed datasets
    • fix bug with map-like transformations
  • SynchCommunicator
    • fix bug related to initialization order in constructor
  • update docs (installation, quick start)
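
The mutual exclusion of -n and --n-per-node mentioned above can be illustrated with Python's standard `argparse` module. This is a sketch under assumptions, not the tarantella CLI's actual code: the program name, the help texts, and the `parse_or_none` helper are hypothetical, and only the two flag names come from the notes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser; not the real tarantella CLI.
    parser = argparse.ArgumentParser(prog="launcher-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-n", type=int,
                       help="total number of ranks (assumed meaning)")
    group.add_argument("--n-per-node", type=int,
                       help="number of ranks per node (assumed meaning)")
    return parser


def parse_or_none(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports the conflict on stderr and exits; swallow that here.
        return None


only_n = parse_or_none(["-n", "4"])
only_per_node = parse_or_none(["--n-per-node", "2"])
both = parse_or_none(["-n", "4", "--n-per-node", "2"])  # rejected
```

Passing either flag alone parses normally, while passing both makes `argparse` report an error, so `both` is `None`.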

Tarantella 0.6.1

09 Dec 14:55

Various bug fixes and improvements:

  • documentation for tutorials on ResNet-50 and Transformer
  • fix hostfile creation in tarantella CLI
  • replace time-out barrier with blocking barrier by default
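
The difference between a time-out barrier and a blocking barrier can be shown with a thread-based analogy from Python's standard library. This is only an illustration of the two behaviours and assumes nothing about how Tarantella or its communication backend implements barriers; the helper `wait_with_timeout` is hypothetical.

```python
import threading
import time


def wait_with_timeout(barrier: threading.Barrier, timeout_s: float) -> bool:
    """Return True if all parties arrived in time, False if the wait timed out."""
    try:
        barrier.wait(timeout=timeout_s)
        return True
    except threading.BrokenBarrierError:
        # A timed-out wait breaks the barrier for every participant.
        return False


# Time-out barrier: two parties are expected, but only one ever arrives.
lonely = threading.Barrier(2)
timed_out_result = wait_with_timeout(lonely, 0.05)

# Blocking barrier: the main thread waits as long as needed for a late party.
blocking = threading.Barrier(2)
arrived = []


def late_worker() -> None:
    time.sleep(0.1)  # simulate a participant that is slow to reach the barrier
    blocking.wait()
    arrived.append("late")


worker = threading.Thread(target=late_worker)
worker.start()
blocking.wait()  # no timeout: blocks until the late worker arrives
arrived.append("main")
worker.join()
```

The timed wait gives up when a peer is late, whereas the blocking wait lets both participants pass once the slow one arrives.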

Tarantella 0.6.0

17 Nov 16:34

Distributed Data Parallelism for Deep Neural Network Training of TensorFlow 2/Keras models.