wordcounter assumes stream is bytes #76

@huard

Description

In the _handler, we have

        def words(f):
            for line in f:
                for word in wordre.findall(line.decode('UTF-8')):
                    yield word

which assumes that each line has a decode method, i.e. that the stream yields bytes. However, the supported_format (TEXT) does not explicitly specify that the encoding is UTF-8. So if the content is passed as an embedded string in the request, with no encoding information, the process fails.
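One possible fix, sketched here under assumptions rather than taken from the project's code, is to decode only when a line is actually bytes. The `wordre` pattern below is a hypothetical stand-in for the one used by the process, and the `encoding` parameter is an illustrative addition that defaults to UTF-8 to match the current behaviour.

```python
import io
import re

# Hypothetical stand-in for the `wordre` pattern used by the process.
wordre = re.compile(r"\w+")


def words(f, encoding="UTF-8"):
    """Yield words from an iterable of lines that may be bytes or str."""
    for line in f:
        # Decode only when the line is bytes; str lines are used as-is.
        if isinstance(line, bytes):
            line = line.decode(encoding)
        for word in wordre.findall(line):
            yield word


# The same generator works for a binary stream and a text stream.
from_bytes = list(words(io.BytesIO(b"one two\nthree\n")))
from_text = list(words(io.StringIO("one two\nthree\n")))
```

With this check, content embedded as a plain string in the request and content read from a binary file would both go through the same code path.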
