Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Check services before deleting MLModel #3191

Open
xinyual opened this issue Oct 30, 2024 · 14 comments
Open

[RFC] Check services before deleting MLModel #3191

xinyual opened this issue Oct 30, 2024 · 14 comments

Comments

@xinyual
Copy link
Collaborator

xinyual commented Oct 30, 2024

Background

Currently, models in ML-commons are widely used, particularly by pipelines and the agent framework. However, we lack a mechanism to detect if a model is being utilized by other services. To address this, we propose a new feature that will identify whether a model is in use by another service. If so, an error should be raised—similar to when attempting to delete a connector that is still in use (like here).

All related service

We list all service using MLModel.

  • Pipeline
    • rag-pipeline
    • neural-sparse search pipeline
    • neural-sparse ingest pipeline
    • neural-sparse two phase pipeline
    • ml inference pipeline
    • text-embedding
  • Tools
    • MLModel Tool
    • PPLTool
    • RAGTool
    • .....

Solution

We can apply the same solution used for connectors and models. Generally, when attempting to delete a specific model ID, we can search both services (pipelines and agents) to check if they are using the current model ID. We have separate solution for each service since they are different

Pipeline:

For all the pipelines we developed, we found that they all use the key model_id to document their relevant model id. Therefore, we can use GetPipelineAction or GetSearchPipelineAction to retrieve all pipelines and then check the key “model_id” to detect whether they reference the current model ID.

Agent and Tools

Tools are more complex than pipelines because they use different keys to store the model IDs they reference. Furthermore, some tools use more than one model ID. For example, RAGTool requires both embedding_model_id and inference_model_id, while PPLTool uses model_id to document its model. The target field we need to check varies by tool, making it challenging to detect relevant models with a single template. Therefore, we propose a few options to address this complexity.

Option 1 (prefer)

We add an abstract function GETAllModelKeys in Base Class Factory.

interface Factory<T extends Tool> {
     List<String> GETAllModelKeys;
}

Each Tool’s factory needs to implement function and return a list of string which contains all fields can use model. For example, PPLTool should return [“model_id”] while RAGTool will return [“embedding_model_id”, “inference_model_id”].
When registering an agent, we call these functions to retrieve the relevant field names, parse the input parameters, and then write them to a fixed key, relatedModelIds. For example, A flow agent containing PPLTool's registration body is:

{
  "name": "Test_Agent_For_RAG_2",
  "type": "flow",
  "tools": [
      {
      "type": "PPLTool",
      "parameters": {
        "model_id": "test_model_id,
        "model_type": "FINETUNE"
      }
    }
  ]
}

After parsing in registering, we will write the following body into index

{
  "name": "Test_Agent",
  "type": "flow",
  "relatedModelIds": ["test_model_id,"],
  "tools": [
      {
      "type": "PPLTool",
      "parameters": {
        "model_id": "test_model_id"
      }
    }
  ]
}

Then we can use the following DSL template to query:

{
  "query": {
    "terms": {
            "relatedModelIds": ["delete_model_id"]
        }
  }
}

Pros: We don’t change any API. When develop new tools, we don’t have additional maintenance cost.
Cons: When customer call GET AGENT API, the “relatedModelIds” field will be shown and may confuse customer.

Option 2

For basic class Tool, we will implement an attribute “relatedModelIds”. All tools can be registered with a parameter:, it’s an optional one. For example,

{
  "name": "Test_Agent_For_RAG_2",
  "type": "flow",
  "tools": [
      {
      "type": "PPLTool",
      "parameters": {
        "model_id": "test_model_id,
        "model_type": "FINETUNE",
        "relatedModelIds": ["test_model_id"]
      }
    }
  ]
}

then we can use the following DSL template to query agent index:

{
  "query": {
    "terms": {
            "tool.parameters.relatedModelIds": ["delete_model_id"]
        }
  }
}

Pros: Relatively low code change.
Cons: If customer doesn’t provide this in request body, we cannot know which model is used by tool and the solution won’t become effective. It also requires redundant information in the request body, which may confuse users.

option 3

We maintain two separate mapping objects in ML-commons and skills. The key is the tool type, and the value is the field name that stores related model IDs. When searching for relevant agents, we iterate through all mappings and construct a DSL for each tool.
For example, we have

{
   "PPLTool": ["model_id"],
   "....."
}

Then we would create a DSL for PPLTool like

{
  "query": {
    "terms": {
            "tool.parameters.model_id": ["delete_model_id"]
        }
  }
}

Pros: It will not impact the user interaction experience.
Cons: High maintenance cost, as each new tool requires adding a corresponding value to the map.

@yuye-aws
Copy link
Member

Cons: If customer doesn’t provide this in request body, we cannot know which model is used by tool and the solution won’t become effective. It also requires redundant information in the request body, which may confuse users.

How about each tool provides a default value from code?

@yuye-aws
Copy link
Member

Hi @zane-neo and @dhrubo-os ! Is it possible that some field, when the user is searching for the agent, can be hidden? We don't wish to return redundant fields like relatedModelIds. For example, when the user is using the search agent API, it implicitly excludes relatedModelIds in the search request.

@yuye-aws
Copy link
Member

Pros: It will not impact the user interaction experience.
Cons: High maintenance cost, as each new tool requires adding a corresponding value to the map.

Plus one more con for option 3: need to maintain 2 maps, one in ml-commons and another in skills. It's not quite straightforward to merge two maps.

@yuye-aws
Copy link
Member

yuye-aws commented Oct 30, 2024

  • Pipeline

    • rag-pipeline
    • neural-sparse search pipeline
    • neural-sparse ingest pipeline
    • neural-sparse two phase pipeline
    • ml inference pipeline
    • text-embedding
  • Tools

    • MLModel Tool
    • PPLTool
    • RAGTool
    • .....

Plus one thing: model controller. Hi @xinyual! Can you include it in the RFC?

@xinyual
Copy link
Collaborator Author

xinyual commented Oct 30, 2024

  • Pipeline

    • rag-pipeline
    • neural-sparse search pipeline
    • neural-sparse ingest pipeline
    • neural-sparse two phase pipeline
    • ml inference pipeline
    • text-embedding
  • Tools

    • MLModel Tool
    • PPLTool
    • RAGTool
    • .....

Plus one thing: model controller. Hi @xinyual! Can you include it in the RFC?

model controller is to set limitation for multiply users and not a downstream task. We don't need to consider it here.

@xinyual
Copy link
Collaborator Author

xinyual commented Oct 30, 2024

Cons: If customer doesn’t provide this in request body, we cannot know which model is used by tool and the solution won’t become effective. It also requires redundant information in the request body, which may confuse users.

How about each tool provides a default value from code?

If so, it comes back to the option 1. Why option 2 is here is we want customer to input it and be clear about what relatedModelIds is. Besides, option 2 will keep the consistency between request body and GET/SEARCH agent API.

@yuye-aws
Copy link
Member

If so, it comes back to the option 1. Why option 2 is here is we want customer to input it and be clear about what relatedModelIds is.

IMO, the difference lies in whether the customer can override the default value of relatedModelIds.

@yuye-aws
Copy link
Member

model controller is to set limitation for multiply users and not a downstream task. We don't need to consider it here.

To the best of my knowledge, model controller relies on a single model id. If the model gets deleted, the model controller becomes useless. Let's also discuss whether the model controller should get deleted after deleting the model.

@zane-neo
Copy link
Collaborator

Hi @zane-neo and @dhrubo-os ! Is it possible that some field, when the user is searching for the agent, can be hidden? We don't wish to return redundant fields like relatedModelIds. For example, when the user is using the search agent API, it implicitly excludes relatedModelIds in the search request.

Yes this can be excluded in search and get request.

@zane-neo
Copy link
Collaborator

  • Pipeline

    • rag-pipeline
    • neural-sparse search pipeline
    • neural-sparse ingest pipeline
    • neural-sparse two phase pipeline
    • ml inference pipeline
    • text-embedding
  • Tools

    • MLModel Tool
    • PPLTool
    • RAGTool
    • .....

Plus one thing: model controller. Hi @xinyual! Can you include it in the RFC?

It make sense, we should consider model controller as it's targeting specific models.

@zane-neo
Copy link
Collaborator

Option1 LGTM, but let's refine the names like method name and field name. Also another thing we need consider is the new field is not visible to users and it's reserved, we should update the doc and add validation logic in agent creation to ensure users are not using it as their business key.

@xinyual
Copy link
Collaborator Author

xinyual commented Oct 30, 2024

  • Pipeline

    • rag-pipeline
    • neural-sparse search pipeline
    • neural-sparse ingest pipeline
    • neural-sparse two phase pipeline
    • ml inference pipeline
    • text-embedding
  • Tools

    • MLModel Tool
    • PPLTool
    • RAGTool
    • .....

Plus one thing: model controller. Hi @xinyual! Can you include it in the RFC?

It make sense, we should consider model controller as it's targeting specific models.

From my perspective, model controller is just a config for one user when using a model. Deleting the model won't influence the user's experience related with the deleted model so much.

@zane-neo
Copy link
Collaborator

  • Pipeline

    • rag-pipeline
    • neural-sparse search pipeline
    • neural-sparse ingest pipeline
    • neural-sparse two phase pipeline
    • ml inference pipeline
    • text-embedding
  • Tools

    • MLModel Tool
    • PPLTool
    • RAGTool
    • .....

Plus one thing: model controller. Hi @xinyual! Can you include it in the RFC?

It make sense, we should consider model controller as it's targeting specific models.

From my perspective, model controller is just a config for one user when using a model. Deleting the model won't influence the user's experience related with the deleted model so much.

True it doesn't have functionality impact not cleaning model controller, but it's a good practice to remove the unnecessary resources in system if the underlying dependency is been cleaned up.

@yuye-aws
Copy link
Member

True it doesn't have functionality impact not cleaning model controller, but it's a good practice to remove the unnecessary resources in system if the underlying dependency is been cleaned up.

Agree that model controller should be cleaned up. There are a few options for user experience.

  1. Stops the user from deleting the ML Model as long as there is a model controller referring to this model. This options follows the connector behavior.
  2. Implicitly deleting the model controller.
  3. Let's user define whether to automatically clean up the corresponding model controller

The problem also persist in agent and search processor. I prefer a unified way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Backlog
Development

No branches or pull requests

4 participants