
Add embedding support for message.content which is string/object array type #3045

Open · wants to merge 1 commit into base: master
Conversation

@hehua2008 hehua2008 commented Jan 28, 2025

Pull Request Type

  • ✨ feat
  • πŸ› fix
  • ♻️ refactor
  • πŸ’„ style
  • πŸ”¨ chore
  • πŸ“ docs

Relevant Issues

None

What is in this change?

Add embedding support for message.content which is string/object array type.

Additional Information

https://platform.openai.com/docs/api-reference/chat/create
[Screenshot 2025-01-29 15:15:24]
[Screenshot 2025-01-29 10:39:44]

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@hehua2008 hehua2008 force-pushed the feat-content-array-embedding branch from a5d6675 to ffddabf on January 28, 2025 14:22
@timothycarambat
Member

Blocking this PR until more explanation is provided, since there is no issue.

@hehua2008 What is the intended function of this PR? Since there is no linked issue and it touches a critical part of the code, it's paramount that the root cause is discussed here. How are you getting Array-type data into the chat window, and if so, how are you doing so?

Then we can have a larger discussion around why this is necessary, or whether there is some other pre-processing step we can apply here.

Thanks

@hehua2008 hehua2008 force-pushed the feat-content-array-embedding branch from ffddabf to 107203c on January 28, 2025 17:16
@hehua2008
Author

hehua2008 commented Jan 28, 2025

> Blocking this PR until more explanation is provided, since there is no issue.
>
> @hehua2008 What is the intended function of this PR? Since there is no linked issue and it touches a critical part of the code, it's paramount that the root cause is discussed here. How are you getting Array-type data into the chat window, and if so, how are you doing so?
>
> Then we can have a larger discussion around why this is necessary, or whether there is some other pre-processing step we can apply here.
>
> Thanks

This comes from writing code with Cline or Roo-Code in VSCode: sometimes the message.content sent by Cline or Roo-Code is an array like this:

```json
{
  "model": "mlx-python",
  "messages": [
    {
      "role": "system",
      "content": "You are Roo, a highly skilled software engineer with extensive knowledge in many programming languages, frameworks, design patterns, and best practices..."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "<task>\nWrite a MLX example in Python\n</task>"
        },
        {
          "type": "text",
          "text": "<environment_details>\n# VSCode Visible Files...</environment_details>"
        }
      ]
    }
  ],
  "temperature": 0,
  "stream": true
}
```
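One way to handle such a payload (a hypothetical sketch, not the PR's actual diff; the helper name `flattenContent` is an assumption for illustration) is to flatten an OpenAI-style content array into a single string before it reaches code that expects a plain query string, such as the semantic-search path:

```typescript
// An OpenAI chat content part. Only the "text" variant carries embeddable
// text; other variants (e.g. image_url) are skipped by this sketch.
type ContentPart =
  | { type: "text"; text: string }
  | { type: string; [key: string]: unknown };

// message.content may be a plain string or an array of content parts per
// the OpenAI Chat Completions API; normalize it into one string.
function flattenContent(content: string | ContentPart[]): string {
  if (typeof content === "string") return content;
  return content
    .filter(
      (part): part is { type: "text"; text: string } => part.type === "text"
    )
    .map((part) => part.text)
    .join("\n");
}
```

For the example request above, the two text parts would be joined into a single string separated by a newline; a string-valued `content` passes through unchanged.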

@hehua2008
Author

hehua2008 commented Jan 28, 2025

This is the relevant commit to support AnythingLLM on Roo-Code:
https://github.com/hehua2008/Roo-Code/tree/feature/AnythingLLM-RooCode

@timothycarambat
Member

timothycarambat commented Jan 28, 2025

@hehua2008 This seems like something that should be done as pre-processing in your fork though, no? I can see why it is here in the repo; I just wonder whether that is basically our responsibility or not.

The function modified is intended to take a single query string to be used for semantic search.

@hehua2008
Author

> @hehua2008 This seems like something that should be done as pre-processing in your fork though, no? I can see why it is here in the repo; I just wonder whether that is basically our responsibility or not.
>
> The function modified is intended to take a single query string to be used for semantic search.

Yes, it is AnythingLLM's responsibility.
Please see the OpenAI API documentation:
https://platform.openai.com/docs/api-reference/chat/create
[Screenshot 2025-01-29 10:39:44]

@hehua2008
Author

This commit may not be a good solution, but I want to raise it here to draw your attention. It would be great and valuable to make AnythingLLM support Cline, Roo-Code, etc. If you are interested, please help improve this commit or provide a better solution. Thank you very much!
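An alternative to changing the semantic-search function itself, in line with the maintainer's pre-processing suggestion, would be to normalize `messages` at the OpenAI-compatible endpoint boundary so that every `content` is a string before it enters string-only code paths. A minimal sketch, assuming hypothetical `ChatMessage`/`normalizeMessages` names not taken from the actual codebase:

```typescript
// Loose shape of an incoming content part; only text parts are kept.
type ContentPart = { type: string; text?: string };

interface ChatMessage {
  role: string;
  content: string | ContentPart[];
}

// Hypothetical boundary normalization: collapse array-valued content into a
// single newline-joined string, leaving string content untouched.
function normalizeMessages(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((m) =>
    typeof m.content === "string"
      ? m
      : {
          ...m,
          content: m.content
            .flatMap((p) =>
              p.type === "text" && typeof p.text === "string" ? [p.text] : []
            )
            .join("\n"),
        }
  );
}
```

Doing this once at the request boundary keeps downstream functions (embedding, semantic search, history storage) unaware that array-typed content ever existed.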

@hehua2008 hehua2008 force-pushed the feat-content-array-embedding branch from 107203c to 727a54f on January 29, 2025 07:00
@hehua2008
Author

@timothycarambat So, what is your plan to support content of array type, as described in https://platform.openai.com/docs/api-reference/chat/create ?
