Skip to content

Fix spacing issues with deepseek models: #3470

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 6, 2025
Merged

Conversation

manyoso
Copy link
Collaborator

@manyoso manyoso commented Feb 6, 2025

  1. If the think tag is empty don't show it at all
  2. Always trim the responses for whitespace so if the model generates
    blah
    [new line]
    final response

The finale response is trimmed of the newline.

1) If the think tag is empty don't show it at all
2) Always trim the responses for whitespace so if the model generates
   <think>blah</think>
   [new line]
   final response

The finale response is trimmed of the newline.

Signed-off-by: Adam Treat <[email protected]>
@manyoso manyoso requested a review from cebtenzzre February 6, 2025 16:08
Signed-off-by: Adam Treat <[email protected]>
@@ -976,7 +976,7 @@ class ChatViewResponseHandler : public BaseResponseHandler {
{
Q_UNUSED(bufferIdx)
try {
m_cllm->m_chatModel->setResponseValue(response);
m_cllm->m_chatModel->setResponseValue(response.trimmed());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we trim leading whitespace here, then we no longer need to do so in onRegularResponse:

-auto respStr = QString::fromUtf8(m_result->response);
-return onBufferResponse(removeLeadingWhitespace(respStr), 0);
+return onBufferResponse(QString::fromUtf8(m_result->response), 0);

I think the reason we don't currently remove trailing whitespace while generating is that it causes a "jump" where if the model outputs e.g. 10 blank lines and one new word, it will only show any change once it generates the word.

Is there any reason related to tool calling that we would need to start doing that here?

Copy link
Collaborator Author

@manyoso manyoso Feb 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should just trim leading whitespace here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see new version. taking the temporary is gross though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind; it's just like glib's g_strchug (or C functions like strtok).

manyoso and others added 2 commits February 6, 2025 11:25
Comment on lines 1079 to 1085
auto respStr = QString::fromUtf8(result.response);
if (!respStr.isEmpty() && (std::as_const(respStr).back().isSpace() || finalBuffers.size() > 1)) {
if (finalBuffers.size() > 1)
m_chatModel->setResponseValue(finalBuffers.last());
m_chatModel->setResponseValue(finalBuffers.last().trimmed());
else
m_chatModel->setResponseValue(respStr.trimmed());
emit responseChanged();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're here, shouldn't this be more like:

    auto respStr = finalBuffers.size() > 1 ? finalBuffers.last() : QString::fromUtf8(result.response);
    if (!respStr.isEmpty() && std::as_const(respStr).back().isSpace()) {
        m_chatModel->setResponseValue(respStr.trimmed());
        emit responseChanged();
    }

The purpose of this code is to trim the trailing whitespace from the end of the response if needed, since the response handler won't do it while generating. I don't see why any check on result.response is necessary if we aren't using its value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, could be simplified.

@manyoso manyoso merged commit 5e7e4b3 into main Feb 6, 2025
4 of 12 checks passed
@cebtenzzre cebtenzzre deleted the fix_deepseek_spacing branch February 10, 2025 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants