The parser doesn't allow parse_error() to return true and continue parsing. #3989

Open
2 tasks done
francislan opened this issue Mar 22, 2023 · 6 comments · May be fixed by #4522
Labels: kind: bug; state: needs more info (the author of the issue needs to provide more details)

Comments

@francislan

Description

The documentation has contradictory information:

In one place, it says that parse_error() must return false, but in another, it says that "the return value indicates whether the parsing should continue, so the function should usually return false."

However, the code seems to expect parse_error() to return false, and it does not continue parsing if it returns true. See parser::sax_parse_internal(), which propagates the return value of parse_error() as-is instead of stopping only when that value is false.

Example:

//  case token_type::value_float:

if (JSON_HEDLEY_UNLIKELY(!std::isfinite(res)))
{
    return sax->parse_error(...);
}

It should be:

//  case token_type::value_float:

if (JSON_HEDLEY_UNLIKELY(!std::isfinite(res)))
{
    if (!sax->parse_error(...))
    {
        return false;
    }
    break;
}

This would allow a SAX handler to deal with non-finite numbers itself, for example by stringifying them, instead of failing the whole parse. Of course, a cleaner way would be for the parser to handle that natively by introducing a new method: virtual bool number_nonfinite(const string_t& s) = 0;
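To make the difference between the two control flows concrete, here is a minimal, self-contained sketch. It is a toy model, not the library's code: `ToyHandler`, `toy_parse`, and the `keep_going` flag are hypothetical names invented for illustration, and the "tokens" are plain number literals. It only shows how stopping solely on a `false` return lets a handler absorb a non-finite number and move on.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for a SAX handler. This is NOT the library's
// json_sax interface; it only models the bool-returning callbacks.
struct ToyHandler {
    std::vector<std::string> events;  // record of the callbacks received
    bool keep_going = false;          // value returned from parse_error()

    bool number_float(double /*value*/, const std::string& token) {
        events.push_back("number_float(" + token + ")");
        return true;
    }

    bool parse_error(const std::string& token) {
        // A real handler could stringify the offending token here.
        events.push_back("parse_error(" + token + ")");
        return keep_going;
    }
};

// Toy driver that mirrors the proposed flow: abort only when
// parse_error() returns false, otherwise go on to the next token.
bool toy_parse(const std::vector<std::string>& tokens, ToyHandler& sax) {
    for (const std::string& token : tokens) {
        const double res = std::strtod(token.c_str(), nullptr);
        if (!std::isfinite(res)) {
            if (!sax.parse_error(token)) {
                return false;  // handler asked to abort
            }
            continue;          // handler dealt with it; keep parsing
        }
        if (!sax.number_float(res, token)) {
            return false;
        }
    }
    return true;
}
```

With `keep_going` left at `false`, the input `{"1.5", "1e999", "2.5"}` stops after the second token; with `keep_going = true`, the third token is still delivered to the handler.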

Reproduction steps

Have a SAX handler return true from parse_error().

Expected vs. actual results

Expected: the parser to continue parsing the rest of the input.
Actual: the parser stops.

Minimal code example

No response

Error messages

No response

Compiler and operating system

N/A

Library version

developer branch

Validation

@nlohmann
Owner

The documentation is not clear enough: the function must return false. Allowing the parser to proceed after an error needs more work.

@francislan
Author

Thanks for clarifying. Is there a bug that tracks "allow the parser to proceed after an error" or "allow parsing non-finite numbers"?

@nlohmann
Owner

Not that I am aware of.

@nlohmann nlohmann self-assigned this Nov 29, 2024
@nlohmann nlohmann linked a pull request Nov 29, 2024 that will close this issue
@nlohmann
Owner

@francislan Is #4522 what you have in mind?

@francislan
Author

@nlohmann Yes, exactly! Thanks!

@nlohmann
Owner

I am not entirely sure how this can be useful, but maybe I misunderstand your use case.

Assume an input like [{1}, "a"]. This is wrong on many levels, and so far, parsing would stop after { was parsed but no string key was seen, so we would see the events

  • start_array()
  • start_object()
  • parse_error() with [json.exception.parse_error.101] parse error at line 1, column 3: syntax error while parsing object key - unexpected number literal; expected string literal), last token: 1

If we continued parsing after that error, we would then see the events:

  • key(1)
  • parse_error() with [json.exception.parse_error.101] parse error at line 1, column 4: syntax error while parsing object separator - unexpected '}'; expected ':'), last token: 1}
  • parse_error() with [json.exception.parse_error.101] parse error at line 1, column 5: syntax error while parsing value - unexpected ','; expected '[', '{', or a literal), last token: 1},
  • parse_error() with [json.exception.parse_error.101] parse error at line 1, column 9: syntax error while parsing value - unexpected string literal; expected end of input), last token: "a"

I think we would need a means to "stabilize" the parser. For instance, we could ignore the invalid value {1}, continue parsing the array, and return ["a"] as the result. This is of course very dependent on the client's requirements, and I am not sure how it could be realized.
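One possible ingredient of such stabilization, sketched under heavy assumptions: after an error inside a bracketed value, skip ahead to the end of that value and resume from there. The helper below is hypothetical (`skip_bracketed` is not part of the library) and works on the raw text rather than on the lexer's token stream. It only counts nesting depth and does not check that bracket types match.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index just past the bracketed value
// that opens at text[start]. Tracks nesting depth and skips over string
// contents. Returns text.size() if the value is never closed.
std::size_t skip_bracketed(const std::string& text, std::size_t start) {
    assert(start < text.size() && (text[start] == '{' || text[start] == '['));
    int depth = 0;
    bool in_string = false;
    for (std::size_t i = start; i < text.size(); ++i) {
        const char c = text[i];
        if (in_string) {
            if (c == '\\') {
                ++i;                 // skip the escaped character
            } else if (c == '"') {
                in_string = false;   // closing quote
            }
        } else if (c == '"') {
            in_string = true;        // opening quote
        } else if (c == '{' || c == '[') {
            ++depth;
        } else if (c == '}' || c == ']') {
            if (--depth == 0) {
                return i + 1;        // just past the matching close
            }
        }
    }
    return text.size();
}
```

For the input [{1}, "a"], calling the helper at the position of { returns the index of the following comma. A real recovery strategy would still have to decide what to do with that separator and which events, if any, to emit for the skipped value.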

@francislan What is your use case?

@nlohmann nlohmann added the "state: needs more info" label (the author of the issue needs to provide more details) Nov 30, 2024