ClearML Serving with Triton-GPU Renaming model.onnx to model.graphdef/model.bin #84

Open
@InsertNamePls

Description

I'm encountering an issue when deploying a GPT-2 ONNX model using ClearML Serving with Triton. The deployment process renames my model.onnx file to model.graphdef, causing Triton to fail when loading the model since it's expecting model.onnx.

The error prevents Triton from starting properly, causing the Triton container to continuously restart.

Expected Behavior

The model file should retain its original name (model.onnx) when copied to the Triton model repository.
Triton should be able to find and load the ONNX model without any filename mismatches.

Actual Behavior

ClearML Serving renames model.onnx to model.graphdef before copying it to the Triton model repository (/models/gpt2_onnx/1/).
Triton fails to locate the expected model.onnx, resulting in an error and continuous container restarts.

E0317 13:21:35.385474 45 model_lifecycle.cc:596] failed to load 'gpt2_onnx' version 1: Internal: failed to stat file /models/gpt2_onnx/1/model.onnx

But earlier, ClearML Serving logs show:

copy model into /models/gpt2_onnx/1/model.graphdef
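
The mismatch between the two log lines can be checked directly against the model repository. The sketch below is a hypothetical diagnostic helper, not part of ClearML Serving or Triton. It assumes the default filenames Triton looks for per platform when config.pbtxt does not set default_model_filename, and it uses the version directory from the logs above. The function name check_version_dir is made up for illustration.

```python
from pathlib import Path

# Default model filenames Triton looks for when config.pbtxt does not set
# default_model_filename (assumption: taken from Triton's model repository docs).
TRITON_DEFAULT_FILENAMES = {
    "onnxruntime_onnx": "model.onnx",
    "tensorflow_graphdef": "model.graphdef",
    "tensorflow_savedmodel": "model.savedmodel",
    "tensorrt_plan": "model.plan",
    "pytorch_libtorch": "model.pt",
}


def check_version_dir(version_dir, platform):
    """Return (expected_filename, files_found, is_present) for one version directory."""
    expected = TRITON_DEFAULT_FILENAMES[platform]
    directory = Path(version_dir)
    # A missing directory is reported as an empty listing rather than an error.
    found = sorted(p.name for p in directory.iterdir()) if directory.is_dir() else []
    return expected, found, expected in found


if __name__ == "__main__":
    # Path and platform are the ones from this issue; adjust for other endpoints.
    expected, found, ok = check_version_dir("/models/gpt2_onnx/1", "onnxruntime_onnx")
    print(f"expected={expected} found={found} present={ok}")
```

Run inside the Triton container (or against the mounted model volume), this would show whether the version directory holds model.graphdef while the onnxruntime_onnx platform expects model.onnx.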

Checked ClearML Serving Model Upload Process

Used the following command to upload the model with the correct name:

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model upload \
  --name "GPT2_ONNX" --project "GPT2-Serving" \
  --path ~/gpt/triton_models/gpt2_onnx/1/model.onnx

✅ Successfully uploaded model.onnx, as confirmed in the ClearML UI.
❌ However, during deployment, ClearML renamed it to model.graphdef.

Tried Setting default_model_filename via --aux-config

clearml-serving --id 12e416036c4b4cd38b9fd3a46c85a583 model add \
  --engine triton --endpoint "gpt2_onnx" \
  --model-id 75159e2de62142fb9958e416807e3d1a \
  --preprocess preprocess.py \
  --aux-config platform="onnxruntime_onnx" max_batch_size=8 default_model_filename="model.onnx"

This was rejected with:

ERROR: You have default_model_filename in your config pbtxt, please remove it. It will be added automatically by the system.

Uploaded the Entire Model Directory as a ClearML Dataset
Tried Debugging Triton Container but It Restarts Too Fast
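
One possible way around the fast restarts is to read the stored container log after a crash instead of attaching to the running container, since docker logs also works on a container that has exited or is restarting. A minimal sketch, assuming Docker is the runtime and that the Triton container is named clearml-serving-triton (an assumption; the actual name can be taken from docker ps -a). The helper names are hypothetical.

```python
import subprocess


def docker_logs_command(container, tail=200):
    """Build the argument list for `docker logs` on the given container."""
    return ["docker", "logs", "--timestamps", "--tail", str(tail), container]


def fetch_logs(container, tail=200):
    """Return the last `tail` log lines (stdout and stderr) of the container."""
    result = subprocess.run(
        docker_logs_command(container, tail),
        capture_output=True,
        text=True,
        check=False,  # do not raise if the container name is wrong; return what we got
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # Container name is an assumption; replace it with the one shown by `docker ps -a`.
    print(fetch_logs("clearml-serving-triton"))
```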

Any help on this issue would be appreciated!
