Potential Fix For Crashes Caused By Non-UTF-8 IRC Input #3

Open
@GeorgiaM-honestly

Hello,

I was having a lot of problems with crashes caused by non-UTF-8 IRC input. This won't hit most users, because I'm running the bot in, shall we say, a completely insane channel, but perhaps this fix 1) works and 2) is desirable.

I am using the next branch here.

Error examples, both triggered by the same IRC input at the same time:

Instance 1 running on a bare metal system (I call it "BM"):

Traceback (most recent call last):
  File "/home/shithead-x/virtualenvs/shithead-X-next/./gpt2_bot.py", line 6, in <module>
    bot.start()
  File "/home/shithead-x/virtualenvs/shithead-X-next/gpt2_bot/irc.py", line 65, in start
    lines = self.socket.recv(524288).decode("UTF-8")
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 508-509: invalid continuation byte


Instance 2 running in a virtual machine, which I call "VM":

Traceback (most recent call last):
  File "./gpt2_bot.py", line 6, in <module>
    bot.start()
  File "/home/shithead-x/virtualenvs/shithead-X-next/gpt2_bot/irc.py", line 65, in start
    lines = self.socket.recv(524288).decode("UTF-8")
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 508-509: invalid continuation byte

Potential fix: Change line 65 in irc.py to:

lines = self.socket.recv(524288).decode("UTF-8", errors='ignore')

Another option for errors= is 'replace', but I have not experimented with that. Previously, under these conditions, each of the bots would last only a few minutes before hitting the issue and crashing. With the change, I let both of them run overnight under the same conditions without a single crash.
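
For anyone weighing the two handlers, here is a small sketch of how they differ. It is not taken from gpt2_bot: the `decode_irc` helper and the sample payload are made up for illustration, and only `bytes.decode` and its `errors` argument are real. As far as I understand it, 'ignore' silently drops the undecodable bytes, while 'replace' substitutes U+FFFD so the damage stays visible in the output.

```python
# Hypothetical payload (not from the real channel): an IRC line containing
# a stray Latin-1 byte (0xE9), which is not valid UTF-8 in this position.
raw = b"PRIVMSG #chan :caf\xe9 time\r\n"


def decode_irc(data: bytes, errors: str = "replace") -> str:
    """Hypothetical helper: decode received bytes as UTF-8 with the given error handler."""
    return data.decode("utf-8", errors=errors)


# The default ("strict") handler raises, which is what crashes the bot.
try:
    decode_irc(raw, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" drops the bad byte; "replace" inserts U+FFFD in its place.
ignored = decode_irc(raw, errors="ignore")
replaced = decode_irc(raw, errors="replace")

print(repr(ignored))
print(repr(replaced))
```

Either handler should keep the loop in irc.py from raising; the choice mostly comes down to whether the bot should see a marker where the bad bytes were.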

Credit should go to irc.libera.chat #python for helping me.

What do you think?
