Currently the model_server CLI is built and installed as part of the archgw pip package. This pulls in many dependencies that are unnecessary when only llm_routing (egress_routing) is used. Splitting model_server off from archgw would simplify the archgw CLI, since it would no longer have to serve models directly. This will require some thought and planning on how to separate out model_server effectively.
Ideally, the archgw CLI would manage only llm_gateway and prompt_gateway; whenever it needs access to locally or remotely hosted models, it would go through a hosted model endpoint.
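To make the idea concrete, here is a rough sketch of how the CLI might look up a hosted model endpoint instead of serving the model itself. Everything in it is an assumption for discussion purposes: `ModelEndpoint`, `resolve_endpoint`, the model name, and the URLs are hypothetical and are not part of the current archgw code.

```python
from dataclasses import dataclass
from typing import Dict
from urllib.parse import urlparse


@dataclass(frozen=True)
class ModelEndpoint:
    """Hypothetical record describing where a model is hosted."""

    name: str
    url: str

    def is_local(self) -> bool:
        # Treat loopback hosts as "local"; anything else is "remote".
        host = urlparse(self.url).hostname
        return host in {"localhost", "127.0.0.1", "::1"}


def resolve_endpoint(name: str, endpoints: Dict[str, ModelEndpoint]) -> ModelEndpoint:
    """Return the configured endpoint for a model, or fail with a clear error.

    The CLI never starts a model process here; it only needs a URL to call.
    """
    try:
        return endpoints[name]
    except KeyError:
        raise LookupError(f"no hosted endpoint configured for model '{name}'") from None


# Hypothetical configuration: one model hosted locally, one remotely.
ENDPOINTS = {
    "example-local-model": ModelEndpoint("example-local-model", "http://localhost:9000/v1"),
    "example-remote-model": ModelEndpoint("example-remote-model", "https://models.example.com/v1"),
}
```

With something like this, the CLI would depend only on an endpoint URL, so the heavy model-serving dependencies could live in a separately installed model_server package.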
The following models would need to be hosted and managed outside of the archgw CLI: