Description
I'm trying to serve the following MLflow model with MLServer:
from typing import Any

from mlflow.pyfunc import PythonModel, PythonModelContext


class MyModel(PythonModel):
    def predict(
        self,
        context: PythonModelContext | None,
        model_input: list[dict[str, list[float]]],
        params: dict[str, Any] | None = None,
    ) -> list[dict[str, list[float]]]:
        return [{"output": [y * 2.0 for y in x["input"]]} for x in model_input]
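For context, the model is packaged roughly as follows (a minimal sketch; the artifact path is an assumption on my side). Saving the model is what lets MLflow infer the signature shown below from the predict() type hints:

import mlflow

# Hypothetical packaging step; "my_model" is just an example path.
# MLflow infers the signature from MyModel.predict()'s type hints on save.
mlflow.pyfunc.save_model(path="my_model", python_model=MyModel())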
Based on the Python type hints, MLflow generates the following signature:
signature:
  inputs: '[{"type": "map", "values": {"type": "array", "items": {"type": "double"}},
    "required": true}]'
  outputs: '[{"type": "map", "values": {"type": "array", "items": {"type": "double"}},
    "required": true}]'
  params: null
MLServer then converts this signature into the following model metadata:
{
  "name": "my_model",
  "versions": [],
  "platform": "",
  "inputs": [
    {
      "name": "input-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      }
    }
  ],
  "outputs": [
    {
      "name": "output-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      }
    }
  ],
  "parameters": {
    "content_type": "pd"
  }
}
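For reference, this metadata can be fetched from the V2 model metadata endpoint along these lines (host and port are assumptions based on MLServer's default HTTP settings):

import requests

# Assumes a local MLServer listening on its default HTTP port (8080).
metadata = requests.get("http://localhost:8080/v2/models/my_model").json()
print(metadata["outputs"])  # output-0 declares content_type "pd_json"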
This seems correct according to #2080.
When I try to perform an inference with the following request body:
{
  "inputs": [
    {
      "name": "input-0",
      "datatype": "BYTES",
      "shape": [-1, 1],
      "parameters": {
        "content_type": "pd_json"
      },
      "data": ["{\"input\": [1.2, 2.3, 3.4]}"]
    }
  ]
}
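The request is sent to MLServer's V2 inference endpoint, roughly as follows (host and port are assumptions based on the default HTTP settings):

import json
import requests

# Assumes MLServer is reachable locally on its default HTTP port (8080).
payload = {
    "inputs": [
        {
            "name": "input-0",
            "datatype": "BYTES",
            "shape": [-1, 1],
            "parameters": {"content_type": "pd_json"},
            "data": [json.dumps({"input": [1.2, 2.3, 3.4]})],
        }
    ]
}
response = requests.post("http://localhost:8080/v2/models/my_model/infer", json=payload)
print(response.status_code, response.text)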
I get the following error:
Traceback (most recent call last):
  File "/app/src/ai_serve/dataplane.py", line 67, in infer
    result = await super().infer(payload, name, version)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver/handlers/dataplane.py", line 112, in infer
    prediction = await model.predict(payload)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver_mlflow/runtime.py", line 203, in predict
    return self.encode_response(model_output, default_codec=TensorDictCodec)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver/model.py", line 227, in encode_response
    return default_codec.encode_response(self.name, payload, self.version)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ray/anaconda3/lib/python3.12/site-packages/mlserver_mlflow/codecs.py", line 45, in encode_response
    for name, value in payload.items()
                       ^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'items'
After some investigation: decoding the request works fine and predict() is called successfully. However, encoding the response fails because MLServer tries to encode it with the TensorDictCodec, which is incorrect here; it appears to fall back to the default codec instead of the codec declared in the model metadata (a small standalone repro follows the snippet below). I think this is related to the following comment:
MLServer/mlserver/codecs/utils.py
Lines 90 to 98 in 1d1f3ee
def encode_inference_response(
    payload: Any,
    model_settings: ModelSettings,
) -> Optional[InferenceResponse]:
    # TODO: Allow users to override codec through model's metadata
    codec = find_request_codec_by_payload(payload)

    if not codec:
        return None
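To make the mismatch concrete, here is a standalone repro: the model returns a list of dicts, while the dict-style encoding in the traceback iterates over payload.items():

# What the model returns for the request above: one dict per input row.
model_output = [{"output": [2.4, 4.6, 6.8]}]

# A dict-of-tensors encoding expects a mapping of output name -> values,
# so calling .items() on the list fails exactly as in the traceback:
try:
    model_output.items()  # type: ignore[attr-defined]
except AttributeError as exc:
    print(exc)  # 'list' object has no attribute 'items'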
Is my assumption correct? If so, how can we use the model's metadata to choose the right codec?
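For the sake of discussion, this is roughly the behaviour I would expect, sketched outside of MLServer; the registry and lookup below are simplified stand-ins I made up for illustration, not the actual MLServer API:

from typing import Any, Optional

# Hypothetical, simplified codec registry keyed by content_type
# (a stand-in for MLServer's real codec registry).
_CODECS_BY_CONTENT_TYPE: dict[str, Any] = {
    # "pd_json": <codec able to encode/decode JSON-encoded dataframes>, ...
}


def choose_response_codec(payload: Any, model_metadata: dict) -> Optional[Any]:
    # 1. Prefer the content_type declared on the model's output metadata.
    for output in model_metadata.get("outputs", []):
        content_type = output.get("parameters", {}).get("content_type")
        codec = _CODECS_BY_CONTENT_TYPE.get(content_type)
        if codec is not None:
            return codec
    # 2. Otherwise fall back to payload-based detection,
    #    as encode_inference_response does today.
    return None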