My Demo Bert Model Failed to Serve #2267

Open
cceasy opened this issue Nov 4, 2024 · 1 comment

cceasy commented Nov 4, 2024

I am trying to use TensorFlow Serving to serve a Keras BERT model, but prediction through the REST API fails; the details are below. Can you please help me resolve this problem?

predict output (ERROR)

{
"error": "Op type not registered 'TFText>RoundRobinTrim' in binary running on ljh-my-keras-bert-model. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib (e.g. tf.contrib.resampler), accessing should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed."
}
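
The op name has a TFText> prefix, so it presumably comes from the tensorflow_text package, and its ops are only registered once that package is imported in the running process (the stock tensorflow/serving binary never imports it). A quick check with TF's internal op registry illustrates the lazy registration the error message mentions (a sketch; op_def_registry is an internal module):

from tensorflow.python.framework import op_def_registry

print(op_def_registry.get("TFText>RoundRobinTrim"))  # None: the op is not yet registered
import tensorflow_text  # noqa: F401 - importing the package registers the TFText>* ops
print(op_def_registry.get("TFText>RoundRobinTrim"))  # now returns an OpDef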

my local versions

Python 3.10.14
tensorflow                                   2.18.0
tensorflow-datasets                          4.9.6
tensorflow-io-gcs-filesystem                 0.37.1
tensorflow-metadata                          1.16.1
tensorflow-text                              2.18.0
keras                                        3.6.0
keras-hub-nightly                            0.16.1.dev202410210343
keras-nlp                                    0.17.0

model definition

import os
os.environ["KERAS_BACKEND"] = "tensorflow"  # "jax" or "tensorflow" or "torch"


import tensorflow_datasets as tfds
import keras_nlp


imdb_train, imdb_test = tfds.load(
    "imdb_reviews",
    split=["train", "test"],
    as_supervised=True,
    batch_size=16,
)


import keras
# Load a model.
classifier = keras_nlp.models.BertClassifier.from_preset(
    "bert_tiny_en_uncased",
    num_classes=2,
    activation="softmax",
)
# Compile the model.
classifier.compile(
    loss="sparse_categorical_crossentropy",
    optimizer=keras.optimizers.Adam(5e-5),
    metrics=["sparse_categorical_accuracy"],
    jit_compile=True,
)
# Fine-tune.
classifier.fit(imdb_train.take(250), validation_data=imdb_test.take(250))
# Predict new examples.
classifier.predict(["What an amazing movie!", "A total waste of my time."])
# expected output: array([[0.34156954, 0.65843046], [0.52648497, 0.473515  ]], dtype=float32)
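
For reference, the TF Text ops appear to enter the graph through the classifier's preprocessor, which tokenizes and packs the raw strings; calling it directly shows the tensors that the serving function below feeds to the model (a quick sketch, continuing from the code above):

encoded = classifier.preprocessor(["What an amazing movie!"])
print({name: tensor.shape for name, tensor in encoded.items()})
# e.g. {'token_ids': (1, 512), 'padding_mask': (1, 512), 'segment_ids': (1, 512)}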

save the model to a local path

import tensorflow as tf
import keras_nlp


def preprocess(inputs):
    # Convert input strings to token IDs, padding mask, and segment IDs
    preprocessor = classifier.preprocessor
    encoded = preprocessor(inputs)
    return {
        'token_ids': encoded['token_ids'],
        'padding_mask': encoded['padding_mask'],
        'segment_ids': encoded['segment_ids']
    }


@tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
def serving_fn(inputs):
    preprocessed = preprocess(inputs)
    outputs = classifier(preprocessed)
    return outputs


# Save the model
model_export_path = "/Users/xxx/tf_saved_models/my-keras-bert-model/1"
tf.saved_model.save(
    classifier,
    export_dir=model_export_path,
    signatures={"serving_default": serving_fn}
)


print(f"Model saved to: {model_export_path}")

build the tensorflow serving docker image

FROM tensorflow/serving:latest


COPY my-keras-bert-model /models/model
RUN ls /models/model

# Set the model environment variables
# ENV OMP_NUM_THREADS 4
# ENV TF_NUM_INTEROP_THREADS 4
# ENV TF_NUM_INTRAOP_THREADS 4

# Start TensorFlow Serving
ENTRYPOINT ["tensorflow_model_server"]
CMD ["--port=8500", "--rest_api_port=8080", "--model_name=model", "--model_base_path=/models/model"]

predict request

POST http://localhost:8080/v1/models/model/versions/1:predict
Content-Type: application/json

{"instances": ["What an amazing movie!", "A total waste of my time."]}

janasangeetha self-assigned this Nov 12, 2024

janasangeetha commented Nov 13, 2024

Hi @cceasy,
Thank you for reporting. I was able to reproduce the issue. I will check on this internally and update here.
Below is the error:

2024-11-12 09:34:40.365027: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
E0000 00:00:1731404080.457354     107 mlir_bridge_pass_util.cc:68] Failed to parse __inference_serving_fn_19270: Op type not registered 'TFText>RoundRobinTrim' in binary running on 58d2778e1319. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib (e.g. `tf.contrib.resampler`), accessing should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
I0000 00:00:1731404080.461934     107 mlir_graph_optimization_pass.cc:401] MLIR V1 optimization pass is not enabled
.
.
2024-11-12 09:34:40.505939: E external/org_tensorflow/tensorflow/core/grappler/optimizers/tfg_optimizer_hook.cc:135] tfg_optimizer{any(tfg-consolidate-attrs,tfg-toposort,tfg-shape-inference{graph-version=0},tfg-prepare-attrs-export)} failed: INVALID_ARGUMENT: Unable to find OpDef for TFText>RoundRobinTrim
	While importing function: __inference_serving_fn_19270
	when importing GraphDef to MLIR module in GrapplerHook

Thank you!
