You can find more details in the Kubeflow examples mnist repository. The difference is that this end-to-end example uses Kubeflow Fairing to build a Docker image and launch a `TFJob` for distributed training, and then creates an `InferenceService` (the KFServing CRD) to deploy the model service.
This example guides you through:
- Taking an example TensorFlow model and modifying it to support distributed training.
- Using Kubeflow Fairing to build a Docker image and launch a `TFJob` to train the model.
- Using Kubeflow Fairing to create an `InferenceService` (KFServing CR) to deploy the trained model.
- Cleaning up the `TFJob` and `InferenceService` using the `kubeflow-tfjob` and `kfserving` SDK clients.
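For reference, a distributed-training job like the one Fairing launches is a `TFJob` custom resource. A minimal manifest sketch is shown below; the job name, namespace, replica counts, and image are illustrative assumptions (Fairing fills these in with the image it builds):

```yaml
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: mnist-train          # illustrative name
  namespace: kubeflow        # assumed namespace
spec:
  tfReplicaSpecs:
    Chief:
      replicas: 1
      template:
        spec:
          containers:
          - name: tensorflow                      # TFJob expects this container name
            image: <your-registry>/mnist:latest   # the image Fairing builds and pushes
    Worker:
      replicas: 2            # illustrative worker count
      template:
        spec:
          containers:
          - name: tensorflow
            image: <your-registry>/mnist:latest
```

The `tfReplicaSpecs` map is what makes the training distributed: the TFJob operator sets `TF_CONFIG` for each replica so the Chief and Workers can discover each other.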
- Launch a Jupyter notebook.
- Open the notebook `mnist_e2e_on_prem.ipynb`.
- Follow the notebook to train and deploy MNIST on Kubeflow.
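The deployment step in the notebook creates an `InferenceService` CR via the `kfserving` SDK. A roughly equivalent manifest sketch, for orientation, might look like the following; the service name, namespace, and `storageUri` are illustrative assumptions:

```yaml
apiVersion: serving.kubeflow.org/v1alpha2
kind: InferenceService
metadata:
  name: mnist               # illustrative name
  namespace: kubeflow       # assumed namespace
spec:
  default:
    predictor:
      tensorflow:
        storageUri: pvc://mnist-pvc/export   # hypothetical location of the exported model
```

KFServing resolves `storageUri`, pulls the exported SavedModel, and serves it behind an HTTP endpoint; deleting this resource (or the `TFJob` above) with the SDK clients is what the cleanup step does.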