Multiple models support for LLM TGI #835
Conversation
I am confused by this PR. Why do we want the user to pass a model_config to support different models? Each OPEA microservice instance serves only one model during deployment, and the model_id does not change. The endpoint is also not configurable; it is predefined in the OPEA API spec, which is OpenAI API compatible. I don't think switching between models during an inference request is the right requirement.
After discussion, we think having such model-switch capability in the software layer is the right way to go. The model_config format may still be revised for simplicity; that can be done in a future PR.
@lvliang-intel - resolved the merge conflicts.
* Update gateway and docarray from mega and proto services to have model field for ChatQnAGateway and LLMParams respectively
* Add load_model_configs method in utils.py to validate and load the model_configs
* Update llms text-generation tgi file (llm.py) to support multiple models. Uses load_model_configs method from utils
* Update llms text-generation tgi template to add different templates for different models
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* Fixed llm_endpoint empty string issue on error scenario
* Function to get llm_endpoint and keep the code clean
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci

Signed-off-by: sgurunat <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
## Description
To support multiple LLM models for ChatQnA, the changes are incorporated into the llms TGI text-generation microservice. Multiple models can be listed in a model_configs.json file, whose contents are loaded through the MODEL_CONFIGS environment variable.
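As a rough illustration only (field names follow the required fields listed under Changes below; endpoints and token limits are invented), the configuration might be loaded along these lines:

```python
import json
import os

# Example entries; endpoints and token limits are made up for illustration.
example_model_configs = [
    {
        "model_name": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "displayName": "Llama 3.1 8B Instruct",
        "endpoint": "http://tgi-service-8b:80",
        "minToken": 1,
        "maxToken": 2048,
    },
    {
        "model_name": "meta-llama/Meta-Llama-3.1-70B-Instruct",
        "displayName": "Llama 3.1 70B Instruct",
        "endpoint": "http://tgi-service-70b:80",
        "minToken": 1,
        "maxToken": 4096,
    },
]

# A deployment would typically export the file contents, e.g.
#   export MODEL_CONFIGS=$(cat model_configs.json)
os.environ.setdefault("MODEL_CONFIGS", json.dumps(example_model_configs))
configs = json.loads(os.environ["MODEL_CONFIGS"])
print([c["model_name"] for c in configs])
```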
## Type of change
New feature (non-breaking change which adds new functionality)
## Changes
To support this, a model parameter has been added to ChatQnAGateway in gateway.py and to LLMParams in docarray.py.
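A minimal sketch of what the new field could look like, assuming LLMParams is a docarray BaseDoc; the surrounding fields are placeholders, not the actual class definition:

```python
from typing import Optional
from docarray import BaseDoc

class LLMParams(BaseDoc):
    # New in this PR: lets a request pick one of the configured models.
    model: Optional[str] = None
    # Placeholder fields; the real class carries the full set of
    # generation parameters used by the microservice.
    query: str = ""
    max_new_tokens: int = 1024
    streaming: bool = True
```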
Added a load_model_configs method in utils.py to validate all the required fields ('model_name', 'displayName', 'endpoint', 'minToken', 'maxToken') and then load the configurations. This lives in utils so that it can be reused.
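The validation can be sketched roughly as below; the helper is reconstructed from this description, not copied from the PR, so the exact implementation in utils.py may differ:

```python
import json
import os

REQUIRED_FIELDS = {"model_name", "displayName", "endpoint", "minToken", "maxToken"}

def load_model_configs() -> dict:
    """Validate MODEL_CONFIGS entries and index them by model name."""
    configs = json.loads(os.getenv("MODEL_CONFIGS", "[]"))
    for cfg in configs:
        missing = REQUIRED_FIELDS - cfg.keys()
        if missing:
            raise ValueError(
                f"Model config {cfg.get('model_name', '<unnamed>')} is missing fields: {sorted(missing)}"
            )
    # Index by model name so callers can look up the endpoint for a request.
    return {cfg["model_name"]: cfg for cfg in configs}
```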
Updated llm.py in llms text-generation tgi to support multiple models and route each request to the right endpoint.
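Conceptually, the routing boils down to something like the sketch below. The get_llm_endpoint name mirrors the helper mentioned in the commit list, but the body is an assumption, and the TGI_LLM_ENDPOINT fallback variable may not match the actual code:

```python
import os
from typing import Optional

def get_llm_endpoint(model: Optional[str], model_configs: dict) -> str:
    """Return the TGI endpoint for the requested model, if one is configured."""
    if model and model in model_configs:
        return model_configs[model]["endpoint"]
    # Fall back to single-endpoint behaviour; raising instead of returning ""
    # avoids the empty llm_endpoint issue noted in the commit history.
    endpoint = os.getenv("TGI_LLM_ENDPOINT", "")
    if not endpoint:
        raise RuntimeError(f"No TGI endpoint configured for model: {model}")
    return endpoint
```

Keeping the lookup in a small helper keeps llm.py clean, which matches the "Function to get llm_endpoint and keep the code clean" commit above.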
Updated the template.py file in llms text-generation tgi to add a new template for the models "meta-llama/Meta-Llama-3.1-70B-Instruct" and "meta-llama/Meta-Llama-3.1-8B-Instruct".
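A hypothetical illustration of a per-model template switch; the actual prompt text added in template.py is not reproduced here, and the Llama-style tags below are a simplified stand-in:

```python
from typing import Optional

LLAMA_3_1_MODELS = {
    "meta-llama/Meta-Llama-3.1-70B-Instruct",
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
}

def build_prompt(question: str, context: str, model: Optional[str] = None) -> str:
    if model in LLAMA_3_1_MODELS:
        # Simplified Llama 3.1 chat-style prompt; the real template differs.
        return (
            "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
            "Answer the question based on the context.\n"
            f"Context: {context}\nQuestion: {question}<|eot_id|>"
            "<|start_header_id|>assistant<|end_header_id|>\n\n"
        )
    # Default template used for other models.
    return (
        "Answer the question based on the context.\n"
        f"Context: {context}\nQuestion: {question}\nAnswer:"
    )
```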