
Input stream unusable after reading a JSON document #47

@daggaz

Description


Hey,

So I was playing with the following code:

from io import StringIO

from json_stream.tokenizer import tokenize

import json_stream


def json_document_iterator(f):
    try:
        while True:
            yield json_stream.load(f, tokenizer=tokenize)
    except StopIteration:
        pass


data = """
{"bob": 1, "bobby": 4}
{"bobo": 3}
"""


f = StringIO(data)
for document in json_document_iterator(f):
    for k, v in document.items():
        print(f"{k} = {v}")
    print("end of document")

Output:

bob = 1
bobby = 4
end of document
bobo = 3
end of document

If I use the default (rust) tokenizer instead, I only get the first document.

It appears the whole stream is consumed by the rust tokenizer before returning?
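For comparison, here is a stdlib-only sketch (this is an assumption about the desired behavior, not part of json_stream) of splitting concatenated JSON documents without over-consuming: `json.JSONDecoder.raw_decode` returns the index where each document ends, so parsing one document never touches the text after it:

```python
import json

def iter_json_documents(text):
    """Yield successive JSON documents from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(text):
        # Skip whitespace between documents.
        if text[idx].isspace():
            idx += 1
            continue
        # raw_decode parses exactly one document and reports where it stopped.
        doc, idx = decoder.raw_decode(text, idx)
        yield doc

data = '{"bob": 1, "bobby": 4}\n{"bobo": 3}\n'
print(list(iter_json_documents(data)))
# → [{'bob': 1, 'bobby': 4}, {'bobo': 3}]
```

This works on an in-memory string rather than a stream, so it sidesteps the buffering question entirely; the point is only that each parse stops at the end of its own document.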

This prints nothing:

f = StringIO(data)
list(json_stream.load(f))
print(f.read())
