OP_REQUIRES failed at xla_ops : UNIMPLEMENTED: Could not find compiler for platform CUDA: NOT_FOUND #2214
Comments
@kmkolasinski,
Hi @singhniraj08, yes I tried this approach. Please check out my Dockerfile (at the CMD command) which I used for testing: https://github.com/kmkolasinski/triton-saved-model/blob/main/tf_serving/Dockerfile
Here is the docker compose file which I used to run serving: https://github.com/kmkolasinski/triton-saved-model/blob/main/docker-compose.yml
# Firstly, use https://github.com/kmkolasinski/triton-saved-model/blob/main/notebooks%2Fexport-classifier.ipynb to export various classifiers
docker compose up tf_serving_server
I prepared this repository which reproduces this issue: https://github.com/kmkolasinski/triton-saved-model/tree/main
Hi @YanghuaHuang, did you have time to take a look at this issue?
Sorry for the late reply. Assigning to @guanxinq to triage, who has better knowledge on this.
Hi @YanghuaHuang, thanks. I just wonder whether we can use XLA-compiled models in TF Serving or not. If yes, how can we achieve it? I couldn't find any information about this.
I think TF Serving does support XLA on CPU but not on GPU. But I could be wrong. @gharibian Hey Dero, can you help on this? Thanks!
Thanks for the answer. If this is true, the message
makes perfect sense to me now, and that's a pity. I assumed that TF Serving uses the same C++ backend to run the SavedModel graph as the TF libraries, so any SavedModel I can run via Python code I can also run via TF Serving. Let's wait for confirmation from @gharibian.
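For illustration, here is a minimal sketch (assumed, not taken from the linked repository) of loading and running such an XLA-compiled SavedModel directly in Python, which is the behavior I expected TF Serving to match; the path, signature name, and input shape are hypothetical:

```python
import numpy as np
import tensorflow as tf

# Load the exported SavedModel and grab its serving signature
# (path and input shape are assumptions for this example).
loaded = tf.saved_model.load("/tmp/resnet50_xla/1")
infer = loaded.signatures["serving_default"]

images = np.random.rand(1, 224, 224, 3).astype(np.float32)
outputs = infer(images=tf.constant(images))  # runs the XLA-compiled graph locally
print({k: v.shape for k, v in outputs.items()})
```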
Hey @gharibian, did you have time to take a look at this thread?
Bug Report
Does TensorFlow Serving support XLA-compiled SavedModels, or am I doing something wrong?
System information
TensorFlow Serving Docker image: 2.13.1-gpu
Describe the problem
Hi, I'm trying to run XLA-compiled models via TensorFlow Serving; however, it does not seem to work for me.
Here is the notebook I used to create XLA/AMP-compiled SavedModels of very simple classifiers like ResNet50:
https://github.com/kmkolasinski/triton-saved-model/blob/main/notebooks/export-classifier.ipynb
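For context, exporting such a model generally amounts to wrapping the forward pass in a `tf.function(jit_compile=True)` and saving it with an explicit serving signature. The sketch below is an illustrative assumption, not the exact notebook code; names like `export_dir` and the input spec are hypothetical:

```python
import tensorflow as tf

# Simple ResNet50 classifier (untrained weights, purely for illustration).
model = tf.keras.applications.ResNet50(weights=None)

@tf.function(jit_compile=True)  # request XLA compilation of the serving function
def serve(images):
    return model(images, training=False)

export_dir = "/tmp/resnet50_xla/1"  # hypothetical path
tf.saved_model.save(
    model,
    export_dir,
    signatures={
        "serving_default": serve.get_concrete_function(
            tf.TensorSpec([None, 224, 224, 3], tf.float32, name="images")
        )
    },
)
```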
When running the TF Serving server, I can see the following warning in the console:
I get a similar message on the client side:
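For reference, the client side is a plain predict request against the TF Serving REST API along these lines; the model name, port, and input key are assumptions for illustration, not values taken from this issue:

```python
import json
import numpy as np
import requests

# Build a dummy batch and send it to the (assumed) model endpoint.
payload = {"inputs": {"images": np.random.rand(1, 224, 224, 3).tolist()}}
resp = requests.post(
    "http://localhost:8501/v1/models/resnet50_xla:predict",
    data=json.dumps(payload),
)
# With the XLA-compiled model, the UNIMPLEMENTED error from the title shows up here.
print(resp.status_code, resp.text[:500])
```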
Exact Steps to Reproduce
You can find my repo where I compare Triton Server (Python backend) with TF Serving here: https://github.com/kmkolasinski/triton-saved-model. In the notebooks directory you will find:
Is this expected behavior? I am aware of this flag
however, I was not able to find any reasonable resources on how to use it to test my case.