Filter out unsafe AI messages from the history #709

SmittieC · 2024-10-08T08:19:17Z

We now save unsafe AI messages. These need to be filtered out of the chat history whenever we pass it to the LLM to generate a response, and out of the chat UI when debug mode is disabled.

Proposed approach:
We tag the offending messages with a new tag, something like SAFETY_LAYER_TRIGGERED. Whenever we send the history to the bot or show it to the participant, we filter out all AI messages with this tag, since they are unsafe.
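
A minimal sketch of what the filtering could look like, not the actual OpenChatStudio implementation. The `Message` class, the `history_for_llm` and `history_for_ui` helpers, and the exact tag spelling are all hypothetical and only illustrate the idea of tagging and then filtering:

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical tag name, following the suggestion above.
SAFETY_LAYER_TRIGGERED = "SAFETY_LAYER_TRIGGERED"


@dataclass
class Message:
    """Stand-in for a stored chat message (assumed shape, not the real model)."""

    role: str  # "human" or "ai"
    content: str
    tags: set[str] = field(default_factory=set)


def is_unsafe_ai_message(message: Message) -> bool:
    """True only for AI messages that carry the safety-layer tag."""
    return message.role == "ai" and SAFETY_LAYER_TRIGGERED in message.tags


def history_for_llm(history: list[Message]) -> list[Message]:
    """History sent to the LLM: tagged AI messages are always dropped."""
    return [m for m in history if not is_unsafe_ai_message(m)]


def history_for_ui(history: list[Message], debug_mode: bool = False) -> list[Message]:
    """History shown in the chat UI: tagged AI messages are kept only in debug mode."""
    if debug_mode:
        return list(history)
    return history_for_llm(history)
```

Keeping the check in one predicate means the LLM path and the UI path cannot drift apart; only the debug-mode exception differs between them.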


github-project-automation bot added this to OpenChatStudio Oct 8, 2024

SmittieC mentioned this issue Oct 8, 2024

bug: unsafe content shown when continuing the chat #708

Open
