libpostal logs "invalid UTF-8" warning for a string having "\0" or "\u0000" and stays in waiting state  #36

@myasirkhan

Using jpostal, if I call parseAddress like this:

AddressParser.getInstance().parseAddress("Rue du Médecin-Colonel Calbairac Toulouse France\u0000")

I see this warning logged repeatedly:

WARN  invalid UTF-8
   at transliterate (transliterate.c:791) errno: No such file or directory
WARN  invalid UTF-8
   at transliterate (transliterate.c:791) errno: No such file or directory
WARN  invalid UTF-8
   at transliterate (transliterate.c:791) errno: No such file or directory
WARN  invalid UTF-8
   at transliterate (transliterate.c:791) errno: No such file or directory

After that, the thread remains in a waiting state. This happens only when the address contains a \u0000 (or simply \0) character. The simplest solution seems to be to not send the \0 character, or to replace it before calling parseAddress.
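
The workaround described above could look roughly like the following. This is only a minimal sketch: the class name AddressSanitizer and the method stripNul are hypothetical helpers, not part of jpostal or libpostal, and the sketch assumes that dropping the NUL character (rather than replacing it with a space) is acceptable for the address data.

```java
// Hypothetical helper (not part of jpostal): removes NUL characters
// from an address string before it is handed to the parser.
public final class AddressSanitizer {

    private AddressSanitizer() {
        // utility class, no instances
    }

    /**
     * Returns the input with every NUL (U+0000) character removed.
     * A null input is returned unchanged, and a string without any
     * NUL character is returned as-is without copying.
     */
    public static String stripNul(String address) {
        if (address == null) {
            return null;
        }
        if (address.indexOf('\u0000') < 0) {
            return address;
        }
        return address.replace("\u0000", "");
    }
}
```

With that in place, the call from this report would become AddressParser.getInstance().parseAddress(AddressSanitizer.stripNul(input)), where input is whatever string the caller received.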

Labels: stale (resolved because the reported issue has been inactive for an extended period)
