Description
The byte buffer chunking logic presumes JSONL-structured input, and it should not.
Essentially, the file partitioner needs to know whether it is reading a JSON stream or a JSONL stream, and it should apply the optimal partitioning logic only when the input is designated as JSONL. Otherwise, a separate thread should scan and parse the object stream from the beginning.
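As a rough illustration of that split, here is a minimal Python sketch. The names `jsonl_chunk_starts` and `plan_partitions` and the `is_jsonl` flag are hypothetical and are not taken from the existing partitioner. It relies on the fact that a raw newline byte cannot appear inside a JSON string, so in JSONL input every newline is a record separator and a chunk may safely start on the byte after one. For any other input it returns a single partition at offset 0, which stands in for the sequential scan from the beginning.

```python
def jsonl_chunk_starts(data: bytes, chunk_size: int) -> list[int]:
    """Return chunk start offsets that each fall on the first byte of a JSONL record."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    starts = [0]
    pos = chunk_size  # nominal boundary of the next chunk
    while pos < len(data):
        # Search from pos - 1 so that a boundary already sitting just after
        # a newline is accepted unchanged.
        newline = data.find(b"\n", pos - 1)
        if newline == -1:
            break  # no further record separator: the rest belongs to the last chunk
        start = newline + 1
        if start >= len(data):
            break  # the newline was the final byte of the input
        starts.append(start)
        pos = start + chunk_size
    return starts


def plan_partitions(data: bytes, chunk_size: int, is_jsonl: bool) -> list[int]:
    """Use newline-aligned chunking only for input designated as JSONL."""
    if is_jsonl:
        return jsonl_chunk_starts(data, chunk_size)
    # Plain JSON streams: one partition, to be scanned sequentially from offset 0.
    return [0]
```

With three 8-byte records and a nominal chunk size of 10, the JSONL path yields starts at 0 and 16, while the same bytes treated as a plain JSON stream yield only 0.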
Alternatively, JsonNode stream processing can be used to derive offsets that fall on or before the chunking gaps. For very large files, it would still be useful to provide valid chunking offsets for streams of independent objects. It would also be useful to walk one level into a large enclosing object that contains logical object streams internally, whether as arrays or as values.
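The offset derivation for a stream of independent objects might look like the sketch below. It uses Python's standard `json.JSONDecoder.raw_decode` in place of JsonNode stream processing, purely for illustration. The helper names `object_starts` and `snap_to_objects` are hypothetical. Offsets are character offsets into a decoded string, which match byte offsets only for ASCII input. Each nominal chunk boundary is snapped back to the closest object start on or before it.

```python
import json
from bisect import bisect_right


def object_starts(text: str) -> list[int]:
    """Scan a stream of concatenated JSON values and record where each one begins."""
    decoder = json.JSONDecoder()
    starts = []
    pos, end = 0, len(text)
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos] in " \t\r\n":
            pos += 1
        if pos >= end:
            break
        starts.append(pos)
        _, pos = decoder.raw_decode(text, pos)  # pos becomes the end of this value
    return starts


def snap_to_objects(starts: list[int], chunk_size: int, total: int) -> list[int]:
    """Move each nominal boundary back to the nearest object start on or before it."""
    if not starts:
        return []
    chosen = [starts[0]]
    for nominal in range(chunk_size, total, chunk_size):
        i = bisect_right(starts, nominal) - 1
        if i >= 0 and starts[i] > chosen[-1]:
            chosen.append(starts[i])
    return chosen
```

Objects in this kind of stream may span several lines, which is why newline-based alignment is not enough here. The scan itself is still sequential; only the work done per chunk afterwards can run in parallel.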