Fix spacing issues with deepseek models: #3470
Conversation
1) If the think tag is empty, don't show it at all. 2) Always trim the responses for whitespace, so if the model generates <think>blah</think> [new line] final response, the final response is trimmed of the newline.
Signed-off-by: Adam Treat <[email protected]>
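For illustration, a minimal standalone sketch of the whitespace behavior described above (not the actual chatllm.cpp code paths): QString::trimmed() strips leading and trailing whitespace but leaves interior newlines alone, which is what makes both the empty-think check and the newline removal work.

#include <QDebug>
#include <QString>

int main() {
    // What the model might emit after its reasoning block: a stray newline, then the answer.
    QString afterThink = QStringLiteral("\nfinal response");
    qDebug().noquote() << afterThink.trimmed();   // prints: final response

    // An empty think block trims down to an empty string, so there is nothing to show.
    QString emptyThink = QStringLiteral("  \n  ");
    qDebug() << emptyThink.trimmed().isEmpty();   // prints: true
    return 0;
}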
gpt4all-chat/src/chatllm.cpp
Outdated
@@ -976,7 +976,7 @@ class ChatViewResponseHandler : public BaseResponseHandler {
    {
        Q_UNUSED(bufferIdx)
        try {
-           m_cllm->m_chatModel->setResponseValue(response);
+           m_cllm->m_chatModel->setResponseValue(response.trimmed());
If we trim leading whitespace here, then we no longer need to do so in onRegularResponse:
-auto respStr = QString::fromUtf8(m_result->response);
-return onBufferResponse(removeLeadingWhitespace(respStr), 0);
+return onBufferResponse(QString::fromUtf8(m_result->response), 0);
I think the reason we don't currently remove trailing whitespace while generating is that it causes a "jump": if the model outputs e.g. 10 blank lines and then one new word, the display only changes once it generates the word.
Is there any reason related to tool calling that we would need to start doing that here?
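To make the "jump" concrete, here is a small standalone sketch (the token stream and repaint loop are hypothetical, not the gpt4all streaming API): if the displayed text is derived with trimmed(), nothing visibly changes while the model emits blank lines, and the whole tail then appears at once when a word arrives.

#include <QDebug>
#include <QString>
#include <QStringList>

int main() {
    QString response;                                    // raw text accumulated so far
    QString shown;                                       // what the UI last displayed
    const QStringList tokens = {"Answer:", "\n", "\n", "\n", "done"};
    for (const QString &tok : tokens) {
        response += tok;
        const QString next = response.trimmed();         // trailing whitespace stripped
        if (next != shown) {                             // UI only repaints on change
            shown = next;
            qDebug().noquote() << "repaint:" << shown;
        }
    }
    // Only two repaints: "Answer:" at the start, then "Answer:\n\n\ndone" all at once.
    return 0;
}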
Maybe we should just trim leading whitespace here?
See the new version. Taking the temporary is gross, though.
I don't mind; it's just like glib's g_strchug (or C functions like strtok).
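For reference, a sketch of the in-place style being discussed; the real removeLeadingWhitespace in chatllm.cpp may look different, this just shows the g_strchug-like shape where the caller's (possibly temporary) string is mutated and a reference to it is handed back.

#include <QString>

// Hypothetical shape of the helper: chops leading whitespace off the caller's string
// in place and returns a reference to it, much like GLib's g_strchug does for a char*.
static QString &removeLeadingWhitespace(QString &s)
{
    int i = 0;
    while (i < s.size() && s.at(i).isSpace())
        ++i;
    s.remove(0, i);
    return s;
}

// Usage with a named temporary, as in onRegularResponse:
//     auto respStr = QString::fromUtf8(m_result->response);
//     onBufferResponse(removeLeadingWhitespace(respStr), 0);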
Signed-off-by: Adam Treat <[email protected]>
Signed-off-by: AT <[email protected]>
            auto respStr = QString::fromUtf8(result.response);
            if (!respStr.isEmpty() && (std::as_const(respStr).back().isSpace() || finalBuffers.size() > 1)) {
                if (finalBuffers.size() > 1)
-                   m_chatModel->setResponseValue(finalBuffers.last());
+                   m_chatModel->setResponseValue(finalBuffers.last().trimmed());
                else
                    m_chatModel->setResponseValue(respStr.trimmed());
                emit responseChanged();
While we're here, shouldn't this be more like:
auto respStr = finalBuffers.size() > 1 ? finalBuffers.last() : QString::fromUtf8(result.response);
if (!respStr.isEmpty() && std::as_const(respStr).back().isSpace()) {
    m_chatModel->setResponseValue(respStr.trimmed());
    emit responseChanged();
}
The purpose of this code is to trim the trailing whitespace from the end of the response if needed, since the response handler won't do it while generating. I don't see why any check on result.response is necessary if we aren't using its value.
Yeah, could be simplified.