TorchServe with Kserve_wrapper v2 throws 'message': 'number of batch response mismatched' #2158
Comments
@gavrishp Both envelopes need fixes for KServe 0.10, and the v2 protocol has two examples: one with bytes input and another with tensor input. Let me know what you are working on; I can take up the rest.
@jagadeeshi2i I can take the v2 protocol changes. There's one use case I need inputs for.
The response for the above example is:
But with TorchServe batching of multiple requests, the handler's postprocess returns a list of outputs. Wouldn't we also need to hold some additional state to keep track of which input came from which request_id?
In the above example, a single HTTP request has multiple inputs in it, so the response carries the outputs in the same order, under the same request id. You are referring to TorchServe dynamic batching, which is not supported in the KServe integration.
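For illustration, a v2 request with multiple inputs and its response might look like the following sketch; the field values are hypothetical, not taken from this thread:
```python
# Hypothetical KServe v2 payloads: one HTTP request carries two inputs
# in its "inputs" array.
request = {
    "id": "d3b15cad",
    "inputs": [
        {"name": "input-0", "shape": [1], "datatype": "BYTES", "data": ["<image 1>"]},
        {"name": "input-1", "shape": [1], "datatype": "BYTES", "data": ["<image 2>"]},
    ],
}

# The response echoes the request id and returns the outputs in the same
# order as the inputs.
response = {
    "id": "d3b15cad",
    "model_name": "resnet50",
    "outputs": [
        {"name": "output-0", "shape": [1], "datatype": "INT64", "data": [463]},
        {"name": "output-1", "shape": [1], "datatype": "INT64", "data": [288]},
    ],
}
```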
This issue concerns TorchServe dynamic batching with the KServe integration. Is there a particular reason it is not supported? Is support planned for the future? If not, the TorchServe model config batch_size should not be allowed to be set to more than 1 right now; I suppose that setting is what causes this particular issue. My understanding is that this batching helps with better GPU utilization and higher throughput, and my testing results support this.
TorchServe with KServe has batching support, but the inputs are statically batched: KServe v2 requires sending all inputs in a single request. TorchServe on its own does dynamic batching, where it waits (up to maxBatchDelay) to collect batch_size requests. Setting batch_size to more than 1 here will make TorchServe wait for the batch to fill before responding. Regarding GPU utilization: both static and dynamic batching start processing only after all the input is received, so this will not affect GPU utilization.
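To make the distinction concrete, here is a sketch of what the handler's batch looks like in each mode; the payloads are hypothetical:
```python
# Dynamic batching (TorchServe on its own): the frontend collects up to
# batchSize concurrent requests, waiting at most maxBatchDelay ms, so the
# handler receives one list entry per client request.
dynamic_batch = [
    {"body": b'{"inputs": [{"name": "input-0", "data": ["..."]}]}'},  # client 1
    {"body": b'{"inputs": [{"name": "input-0", "data": ["..."]}]}'},  # client 2
]

# Static batching (KServe v2): one client packs all items into a single
# request, so the handler receives a single entry whose body already
# contains the whole batch.
static_batch = [
    {"body": b'{"inputs": [{"name": "input-0"}, {"name": "input-1"}]}'},
]
```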
Thanks for clarifying! What would you suggest as the correct fix for this issue?
Set batch_size to 1.
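Assuming that suggestion refers to the model config, the reporter's model_snapshot in config.properties (quoted below under "config.properties") would change only in the batchSize field:
```
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"resnet50": {"1.0": {"defaultVersion": true,"marName": "resnet50.mar","minWorkers": 6,"maxWorkers": 6,"batchSize": 1,"maxBatchDelay": 200,"responseTimeout": 2000}}}}
```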
@gavrishp Is the issue resolved now?
@jagadeeshi2i I had a query: is it by design that only the first element in the batch is selected in the KServe envelopes? https://github.com/pytorch/serve/blob/master/ts/torch_handler/request_envelope/kserve.py#L27 This still voids any use case with batch_size > 1.
The main feature of TorchServe is dynamic batching, especially if you have requests from multiple sources.
🐛 Describe the bug
TorchServe supports batching of multiple requests; the batch_size value is provided while registering the model.
The request envelope receives the input as a list of multiple request bodies, but the KServe v2 request envelope picks only the first item in the list of inputs:
https://github.com/pytorch/serve/blob/master/ts/torch_handler/request_envelope/kservev2.py#L104
The result is a single output sent back as the response, causing the mismatch.
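For reference, the reported behavior amounts to roughly the following (a paraphrase of the envelope's logic, not the actual source):
```python
# Paraphrased sketch, not the actual kservev2.py code: only the first
# element of the TorchServe batch is unwrapped, so a dynamic batch of 5
# requests produces 1 response instead of 5.
def parse_input(data):
    body = data[0].get("body")   # drops data[1:], the rest of the batch
    return body.get("inputs")
```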
Error logs
TorchServe Error
stdout MODEL_LOG - model: resnet50-3, number of batch response mismatched, expect: 5, got: 1.
Installation instructions
Followed instructions provided here - https://github.com/pytorch/serve/blob/master/kubernetes/kserve/kserve_wrapper/README.md
Model Packaging
Created a resnet50.mar using the default parameters and handler.
config.properties
inference_address=http://0.0.0.0:8085/
management_address=http://0.0.0.0:8085/
metrics_address=http://0.0.0.0:8082/
grpc_inference_port=7075
grpc_management_port=7076
enable_envvars_config=true
install_py_dep_per_model=true
enable_metrics_api=true
metrics_format=prometheus
NUM_WORKERS=1
number_of_netty_threads=4
job_queue_size=10
model_store=/mnt/models/model_store
model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"resnet50": {"1.0": {"defaultVersion": true,"marName": "resnet50.mar","minWorkers": 6,"maxWorkers": 6,"batchSize": 16,"maxBatchDelay": 200,"responseTimeout": 2000}}}}
Versions
Name: kserve
Version: 0.10.0
Name: torch
Version: 1.13.1+cu117
Name: torchserve
Version: 0.7.1
Repro instructions
Followed instructions provided here - https://github.com/pytorch/serve/blob/master/kubernetes/kserve/kserve_wrapper/README.md
Run the kserve_wrapper main.py and issue multiple curl infer requests using the v2 protocol.
Command used -
seq 1 10 | xargs -n1 -P 5 curl -H "Content-Type: application/json" --data @input_bytes.json http://0.0.0.0:8080/v2/models/resnet50/infer
Possible Solution
Changes are required to handle TorchServe batched inputs and to generate an output for each of the requests batched by TorchServe.
Changes are needed in the parse_input() and format_output() methods in kservev2.py.
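A minimal sketch of the direction such a fix could take, assuming one row per batched request and JSON bodies (the function names mirror the envelope's, but the bodies are illustrative, not the actual patch):
```python
import json

def parse_input(data):
    """Flatten the inputs from every batched request, remembering the
    split points so outputs can be routed back to the right request."""
    all_inputs, lengths = [], []
    for row in data:                        # one row per queued request
        body = row.get("data") or row.get("body")
        if isinstance(body, (bytes, bytearray)):
            body = json.loads(body)
        inputs = body.get("inputs", [])
        all_inputs.extend(inputs)           # flatten for the model
        lengths.append(len(inputs))         # remember the split points
    return all_inputs, lengths

def format_output(outputs, lengths):
    """Re-split the model outputs so each batched request gets its own
    v2 response, preserving input order."""
    responses, start = [], 0
    for n in lengths:
        responses.append({"outputs": outputs[start:start + n]})
        start += n
    return responses                        # one response per request
```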