Describe the bug
The change in model capacity handling does not work properly on all pipelines. I had FLUX.1-dev set to `capacity: 2`, and after updating to a version that included PR #3558, batch pipelines failed to start with an insufficient capacity error.
The change in PR #3558 assumes a model can only serve one request at a time. For the most part this is likely correct, and the Orchestrator needs a way to test whether a model can actually handle multiple concurrent requests. In prior tests, the image-to-text, segment-anything-2, and LLM runners could handle multiple requests per runner; the LLM pipeline does this natively, since its vLLM backend is built to serve multiple requests simultaneously.
We should also consider updating the error to be more helpful, indicating that the capacity set in the config is too large for the available GPUs.
If we want to switch to this method, the docs should be updated to state that an external container must be used when a runner has capacity > 1.
https://docs.livepeer.org/ai/orchestrators/models-config#param-capacity
To Reproduce
Set `capacity: 2` on any batch pipeline runner and try to start an ai-worker; a config sketch is shown below.
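A minimal sketch of a model config that reproduces this, assuming the aiModels.json format described in the docs linked above; the `capacity` value is the one that matters here, and the other field values are illustrative:

```json
[
  {
    "pipeline": "text-to-image",
    "model_id": "black-forest-labs/FLUX.1-dev",
    "warm": true,
    "capacity": 2
  }
]
```

With a config like this, startup fails with the insufficient capacity error instead of sending the configured capacity to a single runner.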
Expected behavior
Load the model once and pass the capacity set in the config through to that runner.