
Send embeddings to GPT4All's Local API Server #12114

Closed
ThiloteE opened this issue Oct 28, 2024 · 7 comments
Labels
AI Related to AI Chat/Summarization bug Confirmed bugs or reports that are very likely to be bugs
Comments

ThiloteE commented Oct 28, 2024

Follow up to #11870 and #12078.
Sub-issue of #11872

Setup:

  1. Download and install GPT4All. Go to GPT4All settings and enable "Local API Server".
  2. Download a large language model and configure it both in GPT4All and JabRef. An outdated example is depicted at Support GPT4All Server API #11870 (comment). Nowadays it is possible to choose GPT4All as AI provider at "File > Preferences > AI". For testing, I recommend Replete-LLM-V2.5-Qwen-1.5b, since it is super small and fast, but note that it uses a different prompt template syntax than the phi-3 models.

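For context, GPT4All's Local API Server speaks an OpenAI-compatible chat protocol (by default on localhost port 4891). The sketch below only builds the request payload a client like JabRef would send; the URL, port, and model name are assumptions that may differ on your setup.

```python
import json

# Assumption: GPT4All's Local API Server listens on localhost:4891 (its default)
# and exposes the OpenAI-compatible /v1/chat/completions endpoint.
GPT4ALL_URL = "http://localhost:4891/v1/chat/completions"

def build_chat_request(model: str, system_prompt: str, user_message: str) -> dict:
    """Build an OpenAI-style chat completion payload (no network call made here)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 512,
    }

payload = build_chat_request(
    "Replete-LLM-V2.5-Qwen-1.5b",  # model name as shown in GPT4All; illustrative
    "You are a helpful assistant for a reference manager.",
    "Summarize the attached paper.",
)
print(json.dumps(payload, indent=2))
```

Sending this payload as JSON via HTTP POST to the endpoint above is what "chatting works with GPT4All" amounts to; the question in this issue is where the document embeddings end up in such a payload.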
Description of problem:

Chatting already works with GPT4All, but the embeddings of the PDF documents attached to JabRef's entries are seemingly not sent to GPT4All.


Hypotheses about the root of the problem:

  • Hypothesis 0: JabRef does not try to send embeddings.
  • Hypothesis 1: JabRef tries to attach the embeddings to the "system prompt" (as GPT4All calls it) / "Instructions for AI" (as we call it in JabRef). GPT4All does support system prompts, at least according to Enable system prompt for server. nomic-ai/gpt4all#2921, but I believe nomic-ai/gpt4all@328df85 might have disabled the ability for clients to send system prompts to GPT4All and instead forces the system prompt that is configured in GPT4All. In short, there is a conflict between GPT4All's system prompt and JabRef's system prompt (or possibly the other way round). Issue system prompt is ignored in API request nomic-ai/gpt4all#1855 shows somebody trying to send a system prompt via the API, but there have been fixes since then that were supposed to resolve it. Maybe it works now?! Worth a try.
  • Hypothesis 2: JabRef should send the embeddings to the user prompt, but doesn't.
  • Hypothesis 3: GPT4All's Local API Server is stateless and previous messages (including system prompt?!) are lost upon every new message. See Local server not remembering previous messages nomic-ai/gpt4all#2602 (comment). This is an unlikely hypothesis, as it requires multiple conditions to be fulfilled: (1) JabRef attaches embeddings to system prompt, (2) GPT4All is stateless and "forgets" the system prompt even before the first user message.
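The hypotheses above essentially differ in where the retrieved document excerpts land in the request. A sketch of the two payload shapes (function names and the context wording are illustrative, not JabRef's actual code):

```python
def payload_system_prompt(excerpts: list[str], question: str) -> list[dict]:
    # Hypothesis 1: excerpts are attached to the system prompt, which
    # GPT4All's server may override with its own configured system prompt.
    context = "Answer using this information:\n" + "\n".join(excerpts)
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ]

def payload_user_prompt(excerpts: list[str], question: str) -> list[dict]:
    # Hypothesis 2: excerpts are prepended to the user message instead,
    # so they survive even if the server forces its own system prompt.
    context = "Answer using this information:\n" + "\n".join(excerpts)
    return [{"role": "user", "content": context + "\n\n" + question}]
```

If GPT4All indeed forces its own system prompt, only the second shape would reliably deliver the excerpts to the model.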

Additional info:

@ThiloteE ThiloteE added AI Related to AI Chat/Summarization bug Confirmed bugs or reports that are very likely to be bugs labels Oct 28, 2024
@ThiloteE

@InAnYan


InAnYan commented Oct 28, 2024

JabRef sends the "embeddings" in the user message, so Hypothesis 1 couldn't apply. However, thank you for mentioning it; it's strange that GPT4All has those issues with system message customization... It could (will) affect the output.

For Hypothesis 3: every LLM API is stateless (REST stands for representational state transfer). In any case, JabRef handles the conversation state itself.

@ThiloteE, could you run JabRef in debug mode and look at the logs? There should be a log entry from Gpt4AllModel when you send a message. Can you see document pieces there (I call them "paper excerpts" in the templates PR)?

If I remember correctly, you just need to pass the argument --debug to JabRef (in Gradle it's probably: run --args='--debug').
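To sift through the debug output, filtering the log for the Gpt4AllModel logger is enough; a small sketch (the exact logger name and log line format are assumptions based on this thread):

```python
def find_gpt4all_log_lines(log_text: str) -> list[str]:
    """Return only the log lines emitted by the Gpt4AllModel logger."""
    return [line for line in log_text.splitlines() if "Gpt4AllModel" in line]

# Illustrative log excerpt; the real format of JabRef's debug log may differ.
sample_log = (
    "2024-10-28 DEBUG Gpt4AllModel - sending request\n"
    "2024-10-28 INFO  SomethingElse - unrelated\n"
)
print(find_gpt4all_log_lines(sample_log))
```

If the filtered lines contain the paper excerpts, JabRef is building the context; if not, the problem is upstream of the API call.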


ThiloteE commented Oct 28, 2024

@InAnYan Here are some logs: logs for embeddings GPT4All.txt

We can see that the entry metadata is added to the system message, not the user message.


InAnYan commented Oct 28, 2024

@ThiloteE thanks. I see, the system message is there.

Because it contains "answer using this information", there probably was a search in the documents.

Does chatting work with other models?


ThiloteE commented Oct 28, 2024

No, other models exhibit the same behaviour with GPT4All.

Results of testing today:

✅ OpenAI (GPT-4o mini):

  • Test 1: The model responds to BibTeX metadata in the system prompt.

✅ Ollama (via OpenAI API):

  • Test 1: The model responds to BibTeX metadata in the system prompt.
  • Test 1: Here is the logfile from running JabRef with the debug argument: Logs for embeddings Ollama.txt
  • Test 1: However, Ollama via the OpenAI API only works if something is written in the API key field; it does not matter what. Otherwise an error message forces the user to enter something in the API key field.
  • Test 1: Also, JabRef crashed with Ollama.
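The dummy-key requirement matches how many OpenAI-compatible clients behave: the key must be non-empty, but local servers such as Ollama ignore its value. A sketch of such a client-side check (illustrative, not JabRef's actual validation code):

```python
def validate_api_key(api_key: str, provider: str) -> str:
    # Assumption: the client rejects an empty key before sending any request,
    # even though a local server like Ollama never verifies the key's value.
    if not api_key:
        raise ValueError(
            f"{provider}: API key field must not be empty "
            "(any placeholder value works for local servers)"
        )
    return api_key

# Any non-empty string satisfies the check.
key = validate_api_key("not-a-real-key", "Ollama")
```

This would explain why typing anything at all into the API key field is enough to get past the error.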

❌ GPT4All:

  • Test 1: The model does NOT respond to BibTeX metadata in the system prompt.
  • Test 2: Works for Ruslan, but not for Thilo.

@koppor koppor added this to the 6.0-alpha milestone Oct 30, 2024

ThiloteE commented Oct 30, 2024

✅ llama.cpp (via OpenAI API):


ThiloteE commented Nov 10, 2024

Real Problem:

The embeddings are not created because of issue #12169 (melting-pot issue https://github.com/JabRef/jabref-issue-melting-pot/issues/537).

Solutions:

  • A) Have no CUDA installed on your system (and no CUDA paths set in your system's environment variables). JabRef (via DeepJavaLibrary, DJL) should then automatically detect the correct location of the CUDA path it manages, and the system will also find all dependencies.

  • B) (If you have multiple CUDA versions installed on your system) Add C:\Users\USER\.djl.ai\pytorch\2.4.0-cu124-win-x86_64 to "PATH" in the Windows 10 environment variables. This can be done by searching for "environment variables" and manually adding the folder path. See https://www.howtogeek.com/787217/how-to-edit-environment-variables-on-windows-10-or-11/
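Solution B can be sanity-checked with a small script that verifies the DJL-managed PyTorch folder is actually on PATH. The version folder name 2.4.0-cu124-win-x86_64 is the one from this issue; it will vary with your DJL/PyTorch build.

```python
import os
from pathlib import Path

def djl_dir_on_path(djl_dir: str, path_value: str) -> bool:
    """Check whether the DJL native-library folder appears as a PATH entry."""
    entries = [os.path.normcase(p) for p in path_value.split(os.pathsep) if p]
    return os.path.normcase(djl_dir) in entries

# Assumption: DJL unpacks the PyTorch natives under ~/.djl.ai/pytorch/<build>.
djl_dir = str(Path.home() / ".djl.ai" / "pytorch" / "2.4.0-cu124-win-x86_64")
print(djl_dir_on_path(djl_dir, os.environ.get("PATH", "")))
```

If this prints False after you edited the environment variables, remember that already-running terminals and applications do not pick up PATH changes; restart them first.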

Explanation about what happened to me:

My hypothesis about what happened: since I had multiple CUDA versions installed on my system and had not set the PATH and system environment variables, the embedding model was not functioning while the LLM was fully functional. This made it seem as if embeddings were not sent to GPT4All, while in reality no embeddings had ever been created in the first place. I confirmed the LLMs were functional while testing local API servers such as GPT4All, Ollama, and llama.cpp, as reported in issue #12114.

I also had many x86 Microsoft Visual C++ Redistributables installed, which are not needed on my x64 system and might also have caused conflicts, but the main problem was the PATH issue, which prevented the embedding model from functioning.

Links and comments that helped me find the answers:
