Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CDRIVER-4448] Fix interaction of bson_visit with bson_validate #1090

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

vector-of-bool
Copy link
Collaborator

Refer to CDRIVER-4448 for details

If bson_iter_visit_all() sets an error offset in the iterator, store
that as the error offset for the validation.
@@ -3584,7 +3584,19 @@ _bson_iter_validate_document (const bson_iter_t *iter,
state->phase = BSON_VALIDATE_PHASE_LF_REF_KEY;
}

(void) bson_iter_visit_all (&child, &bson_validate_funcs, state);
bson_iter_visit_all_v2 (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the result of this call be part of the condition below? e.g.

   if (!bson_iter_visit_all_v2 (
          &child, &bson_validate_funcs, BSON_ITER_VISIT_NOFLAGS, state) &&
       child.err_off > 0 && state->err_offset < 0) {

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By careful reading of the source, the return value from this function is nearly meaningless, because it can return true or false both for success and for failure. It's just quite unfortunate that we're trying to cram too much information into a single bool.

Comment on lines 3596 to 3597
"Iteration failed on element at offset %d",
(int) (iter->off + child.err_off));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Iteration failed on element at offset %d",
(int) (iter->off + child.err_off));
"Iteration failed on element at offset %" PRIu32,
iter->off + child.err_off);


if (child.err_off > 0 && state->err_offset < 0) {
// Iteration on a direct element of 'child' failed
state->err_offset = child.err_off;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is currently no test coverage of this branch.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I believe the addition of BSON_ITER_VISIT_NOFLAGS makes this branch unreachable, as child.err_off is only assigned when visit_all fails to validate an element, and we're now disabling all of those checks. Now I'm getting more suspicious. I'll add more tests, just to make sure.

.. note::

If a requested element fails validation, the ``bson_iter_visit_all_v2`` call
will return and indicate the position of the erring element via ``iter``.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
will return and indicate the position of the erring element via ``iter``.
will return and indicate the position of the erring element via ``iter->err_off``.

bson_iter_visit_all_v2 (
&child, &bson_validate_funcs, BSON_ITER_VISIT_NOFLAGS, state);

if (child.err_off > 0 && state->err_offset < 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (child.err_off > 0 && state->err_offset < 0) {
if (child.err_off > 0 && state->err_offset <= 0) {

Should it be state->err_offset <= 0 for consistency with the condition used below (line 3604)? Or should the condition below be changed to state->err_offset < 0?

- If bson_iter finds an invalid key string, it will indicate this as a corrupt
  element. This differs from the rest of its element validation, which will
  otherwise just halt visitation.
- If bson_iter generates an error, we assert that it also informed the validator
  about this error, rather than trying to recover that error information
  ourselves.
bson_iter_visit_all_v2 (bson_iter_t *iter,
const bson_visitor_t *visitor,
enum bson_iter_visit_flags flags,
void *data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider an alternative API with options:

typedef struct _bson_visitor_opts_t bson_visitor_opts_t;

BSON_EXPORT (bson_visitor_opts_t *)
bson_visitor_opts_new (bson_iter_visit_flags flags);

BSON_EXPORT (void)
bson_visitor_opts_destroy (bson_visitor_opts_t *opts);

BSON_EXPORT (bool)
bson_iter_visit_all_with_opts (bson_iter_t *iter,
                               const bson_visitor_t *visitor,
                               const bson_visitor_opts_t *opts,
                               void *data);

The with_opts name is consistent with other cases where functions required extension. For example: bson_as_json was extended with options as bson_as_json_with_opts.

An options struct allows extension without creating a new function.

if (flags & BSON_ITER_VISIT_VALIDATE_UTF8) {
if (!bson_utf8_validate (utf8, utf8_len, allow_nul)) {
iter->err_off = iter->off;
return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return true;
break;

If UTF-8 validation fails, I expect false to be returned (indicating an error) and the visit_corrupt function to be called.

Suggest changing this, and other cases below.

bson_iter_visit_all_v2 (bson_iter_t *iter,
const bson_visitor_t *visitor,
enum bson_iter_visit_flags flags,
void *data)
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider also checking the return values of bson_init_static for the BSON_TYPE_ARRAY and BSON_TYPE_DOCUMENT cases, like the following:

         if (!bson_init_static (&b, docbuf, doclen)) {
            iter->err_off = iter->off;
            break;
         }
         if (VISIT_DOCUMENT (iter, key, &b, data)) {
            return true;
         }

Otherwise, an invalid BSON document or array length may be ignored.

iter->err_off = iter->off;
return true;
if (flags & BSON_ITER_VISIT_VALIDATE_REGEX) {
if (!bson_utf8_validate (regex, strlen (regex), allow_nul)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!bson_utf8_validate (regex, strlen (regex), allow_nul)) {
if (!bson_utf8_validate (regex, strlen (regex), allow_nul) || !bson_utf8_validate (options, strlen (options), allow_nul)) {

@@ -3584,7 +3584,15 @@ _bson_iter_validate_document (const bson_iter_t *iter,
state->phase = BSON_VALIDATE_PHASE_LF_REF_KEY;
}

(void) bson_iter_visit_all (&child, &bson_validate_funcs, state);
bson_iter_visit_all_v2 (
&child, &bson_validate_funcs, BSON_ITER_VISIT_VALIDATE_KEYS, state);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If state.flags has BSON_VALIDATE_UTF8, should BSON_ITER_VISIT_VALIDATE_STRINGS be passed instead?

Same question for BSON_ITER_VISIT_ALLOW_NUL_IN_UTF8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants