forked from intel/tinycbor
-
Notifications
You must be signed in to change notification settings - Fork 2
Document and improve abstract reader/writer interface #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
sjlongland
wants to merge
52
commits into
thiagomacieira:dev
from
widesky:feature/WSHUB-455-chunked-codec
Closed
Document and improve abstract reader/writer interface #2
sjlongland
wants to merge
52
commits into
thiagomacieira:dev
from
widesky:feature/WSHUB-455-chunked-codec
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This method allows for writing raw data directly to the encoding buffer. This can be useful if you have something stored as CBOR encoded data. Fixes intel#162. Signed-off-by: Tofik Sonono <[email protected]>
Signed-off-by: phirsov <[email protected]>
printf s.b. used for indentation because no newline at end is needed Signed-off-by: phirsov <[email protected]>
bytestring got escaped hex string literal output format to not to confuse e.g. "\x11" bytestring with 11 integer Signed-off-by: phirsov <[email protected]>
string got plain old C output format to avoid confusion e.g. "true" string with CBOR true value Signed-off-by: phirsov <[email protected]>
Signed-off-by: Mahavir Jain <[email protected]>
5d24326
to
64299b9
Compare
…mes were unclear. containerEncoder could be the containing Encoder
… not always supported
Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
…nt data as single float Motivation: half-precision floating point format is used to minimize storage and traffic mostly. Application level manipulates with single and double precision usually. So, two routines added to public API to encode/decode given single precision value in the half precision format Signed-off-by: S.Phirsov Signed-off-by: Thiago Macieira <[email protected]>
The listing was outdated. Signed-off-by: Thiago Macieira <[email protected]>
The last version of Qt to support MSVC 2015 is no longer maintained, so I'm skipping that. I'm therefore rededicating MSVC 2017 for 32-bit. There's also no more no-tests build.
We don't need to compare the map lengths directly. The memcmp is sufficient, since the source data is big endian. Of course, verifying for sorting requires the map has known length. Signed-off-by: Thiago Macieira <[email protected]>
Because of all the inline functions, #include'ing <cbor.h> without linking in all .c files may result in undefined symbol linker errors. Signed-off-by: Thiago Macieira <[email protected]>
QCOMPARE macro has a return in case of failure. If a test fails inside the encodeOne function, we log that error, but were proceeding to perform more tests (which could fail again and produce more errors). Signed-off-by: Thiago Macieira <[email protected]>
As noted in the comment, we need to be sure we don't allow a length too big from the stream to overflow and become smaller than the number of bytes we're looking for. This is also the first step in creating an API that reads from something other than a linear buffer. Signed-off-by: Thiago Macieira <[email protected]>
This commit does not change all the readers yet, this is just the first step. As an interesting side-effect, we ended up reading the half-float into it->extra and need not re-read it. The cbor_value_get_half_float() function can be inlined in a later commit. Signed-off-by: Thiago Macieira <[email protected]>
We don't need to skimp on bits in the CborValue::flags, so save the fact that the preparser found a 64-bit number in there. This saves us from having to re-read the descriptor byte again in _cbor_value_decode_int64_internal(). Signed-off-by: Thiago Macieira <[email protected]>
It was originally non-inline because I had thought of doing conversions from half-float to float and onwards to double, but I never actually made that in the API. Instead, even the get_float() and get_double(), we only memcpy anyway and leave it up to the upper layer to convert, as needed. This change triggered an use-when-uninitialised false positive warning that I needed to work around in the validator. Signed-off-by: Thiago Macieira <[email protected]>
The extract_length() function is only used in string context, so rename accordingly. Signed-off-by: Thiago Macieira <[email protected]>
The extract_length() was the only case that called extract_number_and_advance() without verifying we had a proper number. This commit reworks the implementation so extract_number_and_advance() reuses the number previously read by preparse_value(), and extract_length() gets inlined to the only place that uses it. Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
Instead of consuming it at the end of the last element of a map or array of unknown length. This allows us to obtain the pointer to or offset of the Break byte. Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
We need to re-parse if the input buffer was too short to read the current element's information. When that happens, the current element will be CborInvalidType, so we can't easily resume. Signed-off-by: Thiago Macieira <[email protected]>
Instead of just one function (_cbor_value_get_string_chunk), we now have _cbor_value_begin_string_iteration, _cbor_value_finish_string_iteration, _cbor_value_get_string_chunk_size, and _cbor_value_get_string_chunk. The "begin" function positions the pointer at the first chunk. That's what makes "get_size" possible, since it doesn't need to check for any state. The "finish" funcntion allows the caller to distinguish an error parsing the string from an error parsing the next value. Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Dmitry Shachnev <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
Signed-off-by: Thiago Macieira <[email protected]>
Instead of just one function (_cbor_value_get_string_chunk), we now have _cbor_value_begin_string_iteration, _cbor_value_finish_string_iteration, _cbor_value_get_string_chunk_size, and _cbor_value_get_string_chunk. The "begin" function positions the pointer at the first chunk. That's what makes "get_size" possible, since it doesn't need to check for any state. The "finish" funcntion allows the caller to distinguish an error parsing the string from an error parsing the next value. Signed-off-by: Thiago Macieira <[email protected]>
`cbor_parser_init` and `cbor_parser_init_reader` are substantially similar, however the latter misses clearing out `it->flags`, leaving it uninitialised so possibly unsafe. Rather than copying & pasting that from `cbor_parser_init`, lets just use one routine that does the "common" part, then each routine can focus on the specifics needed.
Describe the input parameters for the function and how they are used as best we understand from on-paper analysis of the C code.
The `token` parameter is not sufficient since it is effectively shared by all `CborValue` instances. Since `tinycbor` often uses a temporary `CborValue` context to perform some operation, we need to store our context inside that `CborValue` so that we don't pollute the global state of the reader.
In its place, put an arbitrary `void *` pointer for reader context. The reader needs to store some context information which is specific to the `CborParser` instance it is serving. Right now, `CborValue::source::token` serves this purpose, but the problem is that we also need a per-`CborValue` context and have nowhere to put it. Better to spend an extra pointer (4 bytes on 32-bit platforms) in the `CborParser` (which there'll be just one of), then to do it in the `CborValue` (which there may be several of) or to use a `CborReader` object that itself carries two pointers (`ops` and the context, thus we'd need an extra 3 pointers).
We simplify this reader in two ways: 1. we remove the `consumed` member of `struct Input`, and instead use the `CborValue`'s `source.token` member, which we treat as an unsigned integer offset into our `QByteArray`. 2. we replace the reader-specific `struct Input` with the `QByteArray` it was wrapping, since that's the only thing now contained in our `struct Input`. If a `CborValue` gets cloned, the pointer referred to by `source.token` similarly gets cloned, thus when we advance the pointer on the clone, it leaves the original alone, so computing the length of unknown-length entities in the CBOR document can be done safely.
What is not known, is what the significance is of `CborEncoderAppendType`. It basically tells the writer the nature of the data being written, but the default implementation ignores this and just blindly appends it no matter what. That raises the question of why it's important enough that the writer function needs to know about it.
This reads a CBOR file piece-wise, seeking backward and forward through the file if needed. Some seeking can be avoided by tuning the block size used in reads so that the read window shifts by smaller amounts.
64299b9
to
027704f
Compare
This pull request is superseded by intel#208 |
thiagomacieira
pushed a commit
that referenced
this pull request
Mar 14, 2025
When function text_string_to_escaped successfully parses a string and fails to parse the next value (cbor_value_finish_string_iteration returns an error), it correctly propagates the error but the string is never freed. This can be reproduced with: make CC='clang -g -fsanitize=address' printf '\x82\x60\xff' | ./bin/cbordump -j clang's Address Sanitizer reports: ================================================================= ==20317==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1 byte(s) in 1 object(s) allocated from: #0 0x560b654b9916 in __interceptor_realloc (/tinycbor/bin/cbordump+0xa4916) (BuildId: f9933666b5d987b21f68c2887de4aebe93bc2bef) #1 0x560b654f5c18 in escape_text_string /tinycbor/src/cbortojson.c:331:15 #2 0x560b654f3e29 in text_string_to_escaped /tinycbor/src/cbortojson.c:377:19 intel#3 0x560b654f267d in value_to_json /tinycbor/src/cbortojson.c:674:19 intel#4 0x560b654f34c2 in array_to_json /tinycbor/src/cbortojson.c:545:25 intel#5 0x560b654f2085 in value_to_json /tinycbor/src/cbortojson.c:627:19 intel#6 0x560b654f1baf in cbor_value_to_json_advance /tinycbor/src/cbortojson.c:816:12 intel#7 0x560b654ea928 in dumpFile /tinycbor/tools/cbordump/cbordump.c:76:19 intel#8 0x560b654ead2b in main /tinycbor/tools/cbordump/cbordump.c:149:9 intel#9 0x7fa9d7629d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 SUMMARY: AddressSanitizer: 1 byte(s) leaked in 1 allocation(s). Fix this by freeing the string when cbor_value_finish_string_iteration fails. Fixes: e072bc1 ("CBOR-to-JSON: do properly escape JSON strings")
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
tinycbor
fork branchthiagomacieira/dev
adds support for abstract readers and writers, but the implementation had some limitations in cases where multipleCborValue
cursors were iterating over the same CBOR document, causing state contamination. The interface also was not documented.This documents the new interface and further builds upon it by slightly re-arranging
CborParser
members and opening thetoken
inCborValue
to use by the reader interface for any required purpose.