HybridBackend is a high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters.
- Memory-efficient loading of categorical data
- GPU-efficient orchestration of embedding layers
- Communication-efficient training and evaluation at scale
- Easy to use with existing AI workflows
A minimal example:
import tensorflow as tf
import hybridbackend.tensorflow as hb
ds = hb.data.Dataset.from_parquet(filenames)
ds = ds.batch(batch_size)
# ...
# Place the embedding lookup on the first GPU.
with tf.device('/gpu:0'):
    embs = tf.nn.embedding_lookup_sparse(weights, input_ids)
# ...
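To make the lookup step in the example more concrete, the following is a pure-Python sketch of what a sparse embedding lookup with a mean combiner computes: each example carries a variable-length list of categorical ids, the matching rows of the embedding table are gathered, and the rows are averaged into one vector per example. It uses neither TensorFlow nor HybridBackend; the function name `lookup_sparse_mean` and the toy table are hypothetical and serve only as an illustration.

```python
from typing import List, Sequence


def lookup_sparse_mean(
    table: Sequence[Sequence[float]],
    ids_per_example: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Average the embedding rows selected by each example's ids.

    Hypothetical helper for illustration; not part of any library.
    Examples with no ids map to a zero vector.
    """
    dim = len(table[0])
    pooled = []
    for ids in ids_per_example:
        if not ids:
            pooled.append([0.0] * dim)
            continue
        summed = [0.0] * dim
        for row_id in ids:
            for d in range(dim):
                summed[d] += table[row_id][d]
        pooled.append([value / len(ids) for value in summed])
    return pooled


# Toy embedding table with 3 rows of dimension 2.
table = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

# Three examples: ids {0, 2}, id {1}, and an example with no ids.
pooled = lookup_sparse_mean(table, [[0, 2], [1], []])
print(pooled)  # [[1.5, 1.0], [0.0, 1.0], [0.0, 0.0]]
```

In a real model the table is a trainable variable and the ids arrive as a sparse tensor; the sketch only shows the arithmetic of the pooling step.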
Please see the documentation for more information.
pip install {PACKAGE}
{PACKAGE} | Dependency | Python | CUDA | GLIBC | Data Opt. | Embedding Opt. | Parallelism Opt. |
---|---|---|---|---|---|---|---|
hybridbackend-tf115-cu121 | TensorFlow 1.15 | 3.8 | 12.1 | >=2.31 | ✓ | ✓ | ✓ |
hybridbackend-tf115-cu100 | TensorFlow 1.15 | 3.6 | 10.0 | >=2.27 | ✓ | ✓ | ✗ |
hybridbackend-tf115-cpu | TensorFlow 1.15 | 3.6 | - | >=2.24 | ✓ | ✗ | ✗ |
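As a rough aid for choosing `{PACKAGE}`, the table can be encoded as data and matched against the local environment. The sketch below is an assumption-laden illustration: the helper `pick_package` is hypothetical, it treats the Python and CUDA columns as exact matches and the GLIBC column as a minimum version, and it expects the caller to supply the environment values rather than detecting them.

```python
from typing import Optional, Tuple

# Rows copied from the compatibility table; cuda=None marks the CPU-only build.
PACKAGES = [
    {"name": "hybridbackend-tf115-cu121", "python": "3.8", "cuda": "12.1", "glibc": (2, 31)},
    {"name": "hybridbackend-tf115-cu100", "python": "3.6", "cuda": "10.0", "glibc": (2, 27)},
    {"name": "hybridbackend-tf115-cpu", "python": "3.6", "cuda": None, "glibc": (2, 24)},
]


def pick_package(
    python: str, cuda: Optional[str], glibc: Tuple[int, int]
) -> Optional[str]:
    """Return the first package whose requirements match, or None.

    Hypothetical helper: exact match on Python and CUDA versions,
    minimum-version check on GLIBC.
    """
    for row in PACKAGES:
        if row["python"] == python and row["cuda"] == cuda and glibc >= row["glibc"]:
            return row["name"]
    return None


print(pick_package("3.8", "12.1", (2, 31)))  # hybridbackend-tf115-cu121
print(pick_package("3.6", None, (2, 27)))    # hybridbackend-tf115-cpu
print(pick_package("3.6", "10.0", (2, 24)))  # None (GLIBC older than 2.27)
```

The returned name can then be passed to `pip install`. A result of `None` means no row of the table matches the given environment.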
We also provide prebuilt Docker images for the latest DeepRec:
registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:1.0.0-deeprec-py3.6-cu114-ubuntu18.04
HybridBackend is licensed under the Apache 2.0 License.
- Please see the Contributing Guide before your first contribution.
- Please register as an adopter if your organization is interested in adoption. We will discuss the RoadMap with registered adopters in advance.
- Please cite HybridBackend in your publications if it helps:
@inproceedings{zhang2022picasso,
  title={PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems},
  author={Zhang, Yuanxing and Chen, Langshi and Yang, Siran and Yuan, Man and Yi, Huimin and Zhang, Jie and Wang, Jiamang and Dong, Jianbo and Xu, Yunlong and Song, Yue and others},
  booktitle={2022 IEEE 38th International Conference on Data Engineering (ICDE)},
  year={2022},
  organization={IEEE}
}
If you would like to share your experiences with others, you are welcome to contact us in DingTalk: