Replies: 2 comments 1 reply
-
Any thought about this suggestion? I will start to implement a prototype for this. :)
-
BentoML added inference graph capability in the 1.0 release. Please read more here: https://docs.bentoml.org/en/latest/guides/graph.html
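
For reference, the composition pattern from that guide looks roughly like the sketch below: two runners served behind a single Service, fanned out in parallel with asyncio.gather. The sklearn framework and the model names model_a / model_b are illustrative assumptions; both models would need to exist in the local model store already.

```python
import asyncio

import bentoml
from bentoml.io import JSON

# Assumption: two sklearn models were saved beforehand,
# e.g. via bentoml.sklearn.save_model("model_a", clf).
runner_a = bentoml.sklearn.get("model_a:latest").to_runner()
runner_b = bentoml.sklearn.get("model_b:latest").to_runner()

svc = bentoml.Service("inference_graph", runners=[runner_a, runner_b])

@svc.api(input=JSON(), output=JSON())
async def predict(payload: dict) -> dict:
    features = payload["features"]
    # Parallel fan-out: both runners score the same input concurrently.
    result_a, result_b = await asyncio.gather(
        runner_a.predict.async_run([features]),
        runner_b.predict.async_run([features]),
    )
    return {"model_a": result_a.tolist(), "model_b": result_b.tolist()}
```

Serve it with `bentoml serve service:svc`; sequential steps are just ordinary awaits between `async_run` calls.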
-
Hi, BentoML team!
I'm trying to figure out how to implement an inference graph with BentoML. We need to support real-time inference across several BentoServices that are already deployed on K8s, composed as an acyclic graph. I think the following scenarios should be considered when running inference over several models at the same time:
1. Sequential inference: many one-model-in-one-BentoService services chained together
2. Sequential inference: one multiple-models-in-one-BentoService service
3. Parallel inference: one multiple-models-in-one-BentoService service
4. Parallel inference: many one-model-in-one-BentoService services
5. Graph definition: many one-model-in-one-BentoService services arranged as an acyclic graph
Can BentoML cover all of these scenarios in the future? I would like to discuss with you how to implement them, so that we can provide a more convenient interface to users. Please let me know if there is anything I am missing. Any comments are welcome.
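
To make the discussion concrete, here is a minimal sketch of what I mean today: a hand-rolled coordinator that mixes the sequential (scenario 1) and parallel (scenario 4) cases by calling BentoServices already deployed on K8s over HTTP. The service URLs and the /predict route are hypothetical placeholders; the point is only to show the composition logic that a built-in inference graph API could take over.

```python
import asyncio

import httpx

# Hypothetical in-cluster endpoints of already-deployed BentoServices.
PREPROCESS_URL = "http://preprocess-svc:3000/predict"
MODEL_A_URL = "http://model-a-svc:3000/predict"
MODEL_B_URL = "http://model-b-svc:3000/predict"

async def run_graph(raw_input: dict) -> dict:
    async with httpx.AsyncClient() as client:
        # Sequential step (scenario 1): preprocess before scoring.
        pre = await client.post(PREPROCESS_URL, json=raw_input)
        features = pre.json()

        # Parallel step (scenario 4): fan the features out to two models.
        resp_a, resp_b = await asyncio.gather(
            client.post(MODEL_A_URL, json=features),
            client.post(MODEL_B_URL, json=features),
        )
    return {"model_a": resp_a.json(), "model_b": resp_b.json()}

if __name__ == "__main__":
    print(asyncio.run(run_graph({"text": "hello"})))
```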