Out of the box, mlserver supports the deployment and serving of scikit-learn models.
By default, it will assume that these models have been serialised using joblib.
In this example, we will cover how to train and serialise a simple model, and then serve it using mlserver.
The first step will be to train a simple scikit-learn model.
For that, we will use the handwritten digits classification example from the scikit-learn documentation, which trains an SVM model on a small MNIST-like dataset of 8x8 images.
# Original source code and more details can be found in:
# https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html
# Import datasets, classifiers and performance metrics
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split
# The digits dataset
digits = datasets.load_digits()
# To apply a classifier on this data, we need to flatten each image, to
# turn the data into a (samples, features) matrix:
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)
# Split data into train and test subsets
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)
# We learn the digits on the first half of the digits
classifier.fit(X_train, y_train)
To save our trained model, we will serialise it using joblib.
While this is not a perfect approach, it's currently the recommended method to persist models to disk in the scikit-learn documentation.
Our model will be persisted as a file named mnist-svm.joblib.
import joblib
model_file_name = "mnist-svm.joblib"
joblib.dump(classifier, model_file_name)
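Before handing the file to the server, it can be worth confirming that it loads back into an equivalent estimator. The following is a minimal sketch, assuming the same digits data and SVC settings as above; the smaller training slice, the temporary directory and the names restored, sample and round_trip_ok are illustrative choices and not part of the original example.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn import datasets, svm

# Rebuild a small version of the model from the training step so that
# this snippet runs on its own (only the first 500 samples are used).
digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
classifier = svm.SVC(gamma=0.001)
classifier.fit(data[:500], digits.target[:500])

# Dump to a temporary directory (an illustrative choice) and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "mnist-svm.joblib")
    joblib.dump(classifier, model_path)
    restored = joblib.load(model_path)

# The restored estimator should give the same predictions as the original.
sample = data[500:520]
round_trip_ok = np.array_equal(
    classifier.predict(sample), restored.predict(sample)
)
print("round trip ok:", round_trip_ok)
```

If the check fails, a common cause is a mismatch between the scikit-learn version used to dump the file and the one used to load it.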
Now that we have trained and saved our model, the next step will be to serve it using mlserver.
For that, we will need to create 2 configuration files:
settings.json: holds the configuration of our server (e.g. ports, log level, etc.).
model-settings.json: holds the configuration of our model (e.g. input type, runtime to use, etc.).
%%writefile settings.json
{
    "debug": "true"
}
%%writefile model-settings.json
{
    "name": "mnist-svm",
    "implementation": "mlserver_sklearn.SKLearnModel",
    "parameters": {
        "uri": "./mnist-svm.joblib",
        "version": "v0.1.0"
    }
}
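The %%writefile magic only works inside a Jupyter or IPython session. Outside a notebook, the same two files could be produced with the standard json module. This is only a sketch: the write_configs helper and its target_dir parameter are hypothetical names introduced here, not part of mlserver.

```python
import json
import tempfile
from pathlib import Path

# Same content as the two notebook cells above.
SETTINGS = {"debug": "true"}
MODEL_SETTINGS = {
    "name": "mnist-svm",
    "implementation": "mlserver_sklearn.SKLearnModel",
    "parameters": {
        "uri": "./mnist-svm.joblib",
        "version": "v0.1.0",
    },
}


def write_configs(target_dir):
    """Write settings.json and model-settings.json into target_dir.

    Hypothetical helper for illustration; returns the two file paths.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for file_name, content in (
        ("settings.json", SETTINGS),
        ("model-settings.json", MODEL_SETTINGS),
    ):
        path = target / file_name
        path.write_text(json.dumps(content, indent=2) + "\n")
        paths.append(path)
    return paths


# Demo in a temporary directory so that no existing file is overwritten.
with tempfile.TemporaryDirectory() as demo_dir:
    written = write_configs(demo_dir)
    reloaded = json.loads(written[1].read_text())
    print(reloaded["parameters"]["uri"])
```

Calling write_configs(".") instead would place both files in the current directory, next to the serialised model.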
Now that we have our config in place, we can start the server by running mlserver start with the config folder as its argument.
The command must either be run from the directory that contains our config files (passing . as the folder), or be given the path to the folder where they are.
mlserver start .
Since this command will start the server and block the terminal while it waits for requests, it will need to be run in the background or in a separate terminal.
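When the server is started in the background, a client script may run before the model has finished loading. One option is to poll the readiness route defined by the V2 inference protocol until it answers. The sketch below assumes the HTTP port 8080 that the request later in this example targets; wait_until_ready and its parameters are hypothetical names, not part of mlserver.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://localhost:8080/v2/health/ready",
                     timeout_s=30.0, interval_s=0.5):
    """Poll url until it answers with HTTP 200 or timeout_s elapses.

    Hypothetical helper; returns True if the server became ready and
    False otherwise, without raising on connection errors.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not reachable yet (or not ready); retry below.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A client script could call wait_until_ready() once before sending its first request and stop early if it returns False.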
We now have our model being served by mlserver.
To make sure that everything is working as expected, let's send a request from our test set.
For that, we can use the Python types that mlserver provides out of the box, or we can build our request manually.
import requests
x_0 = X_test[0:1]
inference_request = {
    "inputs": [
        {
            "name": "predict",
            "shape": x_0.shape,
            "datatype": "FP32",
            "data": x_0.tolist()
        }
    ]
}
endpoint = "http://localhost:8080/v2/models/mnist-svm/versions/v0.1.0/infer"
response = requests.post(endpoint, json=inference_request)
response.json()
As we can see above, the model predicted the input as the number 8, which matches what's on the test set.
y_test[0]
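The same check can be extended from one sample to several. The sketch below builds a V2 request body from plain nested lists and pulls the predictions out of a response body. build_request and extract_predictions are hypothetical helper names, and sample_response is a hand-written dictionary in the shape of a V2 response, not output captured from a running server.

```python
def build_request(rows, input_name="predict", datatype="FP32"):
    """Build a V2 inference request body from a list of equal-length rows."""
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("rows must be a non-empty list of equal-length lists")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(rows), len(rows[0])],
                "datatype": datatype,
                "data": [list(row) for row in rows],
            }
        ]
    }


def extract_predictions(response_body, output_index=0):
    """Return the flat list of values held by one output of a V2 response."""
    return list(response_body["outputs"][output_index]["data"])


# Two dummy 64-feature rows stand in for X_test[0:2].tolist().
rows = [[0.0] * 64, [1.0] * 64]
request_body = build_request(rows)
print(request_body["inputs"][0]["shape"])

# Hand-written stand-in for response.json(); the values are placeholders.
sample_response = {
    "model_name": "mnist-svm",
    "outputs": [
        {"name": "predict", "shape": [2], "datatype": "INT64", "data": [8, 8]}
    ],
}
predictions = extract_predictions(sample_response)
expected = [8, 8]  # placeholder labels; use y_test[0:2].tolist() in practice
print(predictions == expected)
```

In the notebook, request_body would be posted to the same endpoint as before, and the extracted list compared against the matching slice of y_test.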